read web page

  • Thread starter: Mike

I need to read a web page, search it, gather some information, and put
that into a text file.
The web page is set up as a table and displays information on files stuck
in a queue. What I need to do is read the page (without the user seeing it,
if possible), search for certain words, and get the matching row of data.

example:
table looks like this:
docNumber FileName
---------------------------------
1234 <docname>text.xml</docname>


I would need to search the page for text.xml and grab the docNumber and put
that into the text file.
Is this possible to do? If so any suggestions on how to do it?
thx
 
I'm not sure if I understood your requirements correctly. It might be
worthwhile to check out the following classes in MSDN

HttpWebRequest
HttpWebResponse

Basically, you have to read the web page with an HTTP request.

To parse the page, you need an HTML parser. The .NET XML parser can be made
to read HTML pages too, but only when the page is well-formed XML (XHTML,
for example).
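
The request step with those two classes might look roughly like the sketch below. The class name PageFetcher, the method names, and the UTF-8 decoding are assumptions for illustration, not something taken from the question. The request runs inside your own program, so no browser window is shown to the user.

```csharp
using System.IO;
using System.Net;
using System.Text;

public static class PageFetcher
{
    // Downloads the page at the given URL and returns its body as a string.
    // Assumption: the page is served as UTF-8; change the encoding if it is not.
    public static string DownloadPage(string url)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "GET";

        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (Stream body = response.GetResponseStream())
        {
            return ReadAll(body, Encoding.UTF8);
        }
    }

    // Reads a stream to the end and decodes it with the given encoding.
    public static string ReadAll(Stream body, Encoding encoding)
    {
        using (StreamReader reader = new StreamReader(body, encoding))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Call PageFetcher.DownloadPage with the address of your queue page, then search or parse the returned string.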

-vJ
 
If it is just a standard HTML page you can just search it as a string.

I attached some code that I have used in the past to get data from a
webpage. In this example I was just searching the _download string for
certain keywords.

using System;
using System.Net;
using System.Text;
using System.Web; // HttpUtility

/// <summary>
/// Defines the different connection types available for retrieving whois data.
/// </summary>
public enum CONNECTION_TYPE
{
    /// <summary>Uses GET as the method for sending data to the server.</summary>
    GET,
    /// <summary>Uses POST as the method for sending data to the server.</summary>
    POST
}


// Member of a larger class. _server, _additionalPostData, _connectionType,
// _download and _isConnected are fields of that class (not shown here).
protected void Connect()
{
    string serverUrl = null, postData = null;
    byte[] myDataBuffer = null;
    WebClient httpClient = new WebClient();
    try
    {
        // Generate the post data.
        postData = HttpUtility.UrlEncode(_additionalPostData);
        // Connect to the server.
        switch (_connectionType)
        {
            case CONNECTION_TYPE.GET:
                // GET: send the data in the query string.
                serverUrl = _server + "?" + postData;
                myDataBuffer = httpClient.DownloadData(serverUrl);
                break;
            case CONNECTION_TYPE.POST:
                // POST: send the data as a form-encoded request body.
                serverUrl = _server;
                httpClient.Headers.Add("Content-Type", "application/x-www-form-urlencoded");
                byte[] postArray = Encoding.ASCII.GetBytes(postData);
                myDataBuffer = httpClient.UploadData(serverUrl, "POST", postArray);
                break;
        }
        // Keep the response as a string so it can be searched for keywords.
        _download = Encoding.ASCII.GetString(myDataBuffer);
        _isConnected = true;
    }
    catch (Exception)
    {
        _isConnected = false;
        throw; // rethrow without resetting the stack trace
    }
}
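
For the original question, searching the downloaded string could look roughly like this. It is only a sketch: QueuePageSearch, FindDocNumbers and SaveDocNumbers are made-up names, and the pattern assumes each row holds the docNumber followed by a <docname> element with no other digits in between, as in the example table. Adjust the pattern to the real markup of the page.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

public static class QueuePageSearch
{
    // Returns the doc number of every row whose <docname> element holds fileName.
    // Assumption: the number comes before <docname> with no digits in between.
    public static List<string> FindDocNumbers(string html, string fileName)
    {
        string pattern = @"(\d+)\D*?<docname>\s*"
                         + Regex.Escape(fileName)
                         + @"\s*</docname>";

        List<string> numbers = new List<string>();
        foreach (Match m in Regex.Matches(html, pattern, RegexOptions.IgnoreCase))
        {
            numbers.Add(m.Groups[1].Value);
        }
        return numbers;
    }

    // Writes the matching doc numbers to a text file, one per line.
    public static void SaveDocNumbers(string html, string fileName, string outputPath)
    {
        File.WriteAllLines(outputPath, FindDocNumbers(html, fileName));
    }
}
```

With the code above, that would be something like QueuePageSearch.SaveDocNumbers(_download, "text.xml", outputPath), where outputPath is wherever you want the text file to go.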
 