C# noob questions - very basic - web scrape example

jason

While I wait for my books to arrive (anticipating the RTFM
responses...), I am trying to understand the C# code below, to help get
me started. The code is supposed to scrape stock quote info off the MSN
site.

Can somebody please explain these three lines of code.

this.symbol = symbol; //(Is this saying set symbol to the value
passed to the function? Did I even say that right? It seems to draw no
distinction between the argument defined as symbol and the local
variable. Is this really necessary?)

QuoteFetch q = new QuoteFetch(args[0]); //(Is this saying define q
as type QuoteFetch with the first argument passed?)

Console.WriteLine("{0} = {1}", args[0], q.Last); //(especially
q.Last: is this saying call Last with a value of q?)


<code below .. btw thank you in advance to the original author>

/*
A Programmer's Introduction to C# (Second Edition)
by Eric Gunnerson

Publisher: Apress L.P.
ISBN: 1-893115-62-3
*/

// 32 - .NET Frameworks Overview\Reading Web Pages
// copyright 2000 Eric Gunnerson
using System;
using System.Net;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

class QuoteFetch
{
    public QuoteFetch(string symbol)
    {
        this.symbol = symbol;
    }

    public string Last
    {
        get
        {
            string url =
                "http://moneycentral.msn.com/scripts/webquote.dll?ipage=qd&Symbol=";
            url += symbol;

            ExtractQuote(ReadUrl(url));
            return(last);
        }
    }

    string ReadUrl(string url)
    {
        Uri uri = new Uri(url);

        //Create the request object

        WebRequest req = WebRequest.Create(uri);
        WebResponse resp = req.GetResponse();
        Stream stream = resp.GetResponseStream();
        StreamReader sr = new StreamReader(stream);

        string s = sr.ReadToEnd();

        return(s);
    }

    void ExtractQuote(string s)
    {
        // Line like: "Last</TD><TD ALIGN=RIGHT NOWRAP><B>&nbsp;78 3/16"

        Regex lastmatch = new Regex(@"Last\D+(?<last>.+)<\/B>");
        last = lastmatch.Match(s).Groups[1].ToString();
    }

    string symbol;
    string last;
}

public class ReadingWebPages
{
    public static void Main(string[] args)
    {
        if (args.Length != 1)
            Console.WriteLine("Quote <symbol>");
        else
        {
            // GlobalProxySelection.Select = new DefaultControlObject("proxy", 80);
            QuoteFetch q = new QuoteFetch(args[0]);
            Console.WriteLine("{0} = {1}", args[0], q.Last);
        }
    }
}
 
Paul E Collins

Can somebody please explain these three lines of code.
this.symbol = symbol;

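It means that the class member variable called "symbol" should be set
to the value of the "symbol" parameter passed to the constructor. The
"this." prefix is what tells the compiler you mean the member rather
than the parameter, so it is necessary as long as the two names match.
Giving them both the same name might be seen as bad practice, since
it's slightly confusing.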
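
A minimal sketch of the two ways to write such a constructor may help here. QuoteHolder, RenamedHolder and tickerSymbol are made-up names for illustration only; they are not part of the posted code.

```csharp
using System;

// Hypothetical class: parameter and field share a name, as in the posted code.
class QuoteHolder
{
    string symbol;  // member variable (field)

    public QuoteHolder(string symbol)
    {
        // Inside this constructor the bare name "symbol" means the parameter.
        // "this.symbol" means the field, so this line copies parameter -> field.
        // Writing "symbol = symbol;" instead would only assign the parameter
        // to itself and leave the field unset.
        this.symbol = symbol;
    }

    public string Symbol { get { return symbol; } }
}

// Hypothetical alternative: different names, so "this." is not needed.
class RenamedHolder
{
    string symbol;

    public RenamedHolder(string tickerSymbol)
    {
        symbol = tickerSymbol;
    }

    public string Symbol { get { return symbol; } }
}

class ThisDemo
{
    static void Main()
    {
        Console.WriteLine(new QuoteHolder("MSFT").Symbol);    // prints MSFT
        Console.WriteLine(new RenamedHolder("MSFT").Symbol);  // prints MSFT
    }
}
```

Both versions behave the same; which one reads better is a matter of taste.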
QuoteFetch q = new QuoteFetch(args[0]);

"args" is the set of command-line arguments sent to the program (if,
for example, you run it from a command prompt), so "args[0]" is the
first of those (space-delimited) items. The line of code creates a new
QuoteFetch object, using the constructor that accepts a string
parameter (because args is a string array, so each element is a
string).
Console.WriteLine("{0} = {1}", args[0], q.Last);

This overload of WriteLine uses the first string as a formatting
template (so that {0} is replaced with the first item and {1} with the
second item), so it will basically print out "args[0] = q.Last", when
you fill in those values. As above, args[0] is the first command-line
parameter (indexing it would fail if none were supplied, which is why
Main checks args.Length first), and q.Last is the value of the "Last"
property of that particular instance, called "q", of QuoteFetch.
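
As a rough illustration of the formatting template, here is a small sketch. FormatDemo and Describe are made-up names, and the symbol and price are invented sample values rather than real output from the program.

```csharp
using System;

class FormatDemo
{
    // Builds the same text that the WriteLine call would print.
    public static string Describe(string symbol, string last)
    {
        // {0} is replaced by symbol, {1} by last.
        return string.Format("{0} = {1}", symbol, last);
    }

    static void Main()
    {
        // Both lines print: MSFT = 78 3/16
        Console.WriteLine(Describe("MSFT", "78 3/16"));
        Console.WriteLine("{0} = {1}", "MSFT", "78 3/16");
    }
}
```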

Eq.
 
jason

Thank you!

You wrote:
q.Last is the value of the "Last" property of that particular
instance, called "q", of QuoteFetch

What would that be? Why is it necessary to ask for the last? Any chance
there could be more than one property (value?) available? Is this
because the Regex could produce more than one? Perhaps some OOP thing
I'm not understanding?
 
Paul E Collins

q.Last is the value of the "Last" property of that particular
instance, called "q", of QuoteFetch
What would that be? Why is it necessary to ask for the last?
Any chance there could be more than one property (value?)
available? [...]

Well, I didn't mean the last property of all the properties. "Last" is
just a name. Look at the code you posted, and the "public string
Last". When you write "q.Last", you're saying "get the value of the
property called Last for this particular variable q", so it's
basically the same as a method call to that "get" block inside "public
string Last".
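
To make the "same as a method call" point concrete, here is a minimal sketch. Ticker, GetLast and the reads counter are made-up names and are not part of the posted code; the counter is only there to show that the "get" block runs on every read.

```csharp
using System;

// Hypothetical class for illustration only.
class Ticker
{
    int reads;  // counts how many times the value has been asked for

    // Property: read with "t.Last", no parentheses.
    public string Last
    {
        get
        {
            reads++;
            return "read #" + reads;
        }
    }

    // The same logic written as a method: called with "t.GetLast()".
    public string GetLast()
    {
        reads++;
        return "read #" + reads;
    }
}

class PropertyDemo
{
    static void Main()
    {
        Ticker t = new Ticker();
        Console.WriteLine(t.Last);       // prints: read #1
        Console.WriteLine(t.GetLast());  // prints: read #2
        Console.WriteLine(t.Last);       // prints: read #3
    }
}
```

In the posted QuoteFetch class this means every read of q.Last runs ReadUrl and ExtractQuote again.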

I'm talking in pretty general terms here, just looking at your code as
pure code, and not trying to work out what the whole program does. I
suppose that "Last" is supposed to be the latest / most recently
downloaded stock ticker value or something.

Eq.
 
jason

Paul E Collins wrote:
"get the value of the
property called Last for this particular variable q", so it's
basically the same as a method call to that "get" block inside "public
string Last".

Yes. That makes good sense.


One last question. What is "get" telling us under

public string Last
{
get

I see ReadUrl is also returning something, but I don't see a "get".

Thanks again!
 
Paul E Collins

One last question. What is "get" telling us under
public string Last
{
get [...]

You should look up the difference between methods (traditionally
called "functions") and properties. "get" marks the accessor of a
property: the block of code that runs whenever the property is read.
ReadUrl is an ordinary method, so it simply returns a value and needs
no "get". In general, properties are used to expose a class's private
member variable to external code, without actually making the variable
public (which would cause trouble if the internals of the class were
reorganised later), while methods are just calls that perform
particular tasks. But really, read the docs on this :) it's quite
important.
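
A rough sketch of that split, assuming a made-up Quote class with an Update method (neither is in the posted code): the field stays private, a read-only property exposes it, and a method does the work of changing it.

```csharp
using System;

// Hypothetical class for illustration only.
class Quote
{
    string last = "n/a";  // private field; outside code cannot touch it directly

    // Read-only property: outside code can read the value but not assign it.
    public string Last
    {
        get { return last; }
    }

    // Method: performs a task, here validating and storing a new value.
    public void Update(string newValue)
    {
        if (string.IsNullOrEmpty(newValue))
            throw new ArgumentException("newValue must not be empty");
        last = newValue;
    }
}

class QuoteDemo
{
    static void Main()
    {
        Quote q = new Quote();
        Console.WriteLine(q.Last);  // prints: n/a
        q.Update("78 3/16");        // invented sample value
        Console.WriteLine(q.Last);  // prints: 78 3/16
        // q.Last = "x";            // would not compile: the property has no "set"
    }
}
```

If the field were later renamed or replaced, code that uses q.Last would not need to change.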

Hope it helps,

Eq.
 
