Downloading only part of an HTML page

Nik0001

Hello everyone!

I have the following problem:

I need to download several HTML pages and extract the meta tags from
their markup. I would rather download only the meta tags than the
whole page, but the standard approach in C# (HttpWebRequest) only
seems to let me download the whole page. Is there an alternative?

Thanks in advance.
 
Marc Gravell

Yes; to get just the HTTP headers, you can use the "HEAD" verb, if
the server supports it ;-p

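    // needs: using System; using System.Net;
    var req = HttpWebRequest.Create("http://www.google.com/");
    req.Method = "HEAD"; // ask for the response headers only, no body
    using (var resp = req.GetResponse())
    {
        // print each response header as name=value
        foreach (string key in resp.Headers.Keys)
        {
            Console.WriteLine("{0}={1}", key, resp.Headers[key]);
        }
    }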

If you want the meta tags from the body, then just get the body and
parse it. The good news is that if you want the body, you don't need
to mess with HttpWebRequest etc. (which frankly I find confusing);
WebClient is simpler:

    using (WebClient client = new WebClient())
    {
        string body = client.DownloadString("http://www.google.com/");
    }

Of course, now the problem becomes parsing the HTML (which may or may
not be XHTML)...
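
On the download side, one possible middle ground between HEAD and DownloadString is to read the response stream only as far as the closing </head> tag and then stop. The following is a minimal sketch, not a definitive implementation: the HeadReader class, the ReadThroughHead method and the 64 KB cap are names and values made up for illustration, and the StreamReader assumes the page is UTF-8.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

static class HeadReader
{
    // Hypothetical helper: reads from 'reader' until "</head>" has been seen
    // (case-insensitive) or roughly 'maxChars' characters have been read.
    public static string ReadThroughHead(TextReader reader, int maxChars)
    {
        const string marker = "</head>";
        var sb = new StringBuilder();
        var buffer = new char[1024];
        int read;

        while (sb.Length < maxChars &&
               (read = reader.Read(buffer, 0, buffer.Length)) > 0)
        {
            // Start the search a little before the new chunk in case the
            // marker straddles two chunks.
            int scanFrom = Math.Max(0, sb.Length - marker.Length);
            sb.Append(buffer, 0, read);

            int idx = sb.ToString().IndexOf(
                marker, scanFrom, StringComparison.OrdinalIgnoreCase);
            if (idx >= 0)
            {
                // Keep everything up to and including "</head>".
                return sb.ToString(0, idx + marker.Length);
            }
        }

        // No closing tag found before the cap or the end of the stream.
        return sb.ToString();
    }

    static void Main()
    {
        var req = WebRequest.Create("http://www.google.com/");
        using (var resp = req.GetResponse())
        using (var reader = new StreamReader(resp.GetResponseStream()))
        {
            string head = ReadThroughHead(reader, 64 * 1024);
            Console.WriteLine("Read {0} characters", head.Length);
        }
    }
}
```

The server may already have sent more than was actually read before the response is disposed, so any saving depends on the page and the connection.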

Marc
 
Peter Bromberg [C# MVP]

Meta tags normally appear inside the <head> element, but even if you were
successful in "chunk" downloading, you'd have to keep reading until you reach
the </head> closing tag, and since there could be a lot of script and CSS
inside the head element, it most likely would not save you much. One of the
best ways to do this sort of thing is to use Simon Mourier's Html Agility Pack
library (on Codeplex.com, I believe), as it produces an HtmlDocument that
works with XPath just like an XmlDocument.
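
As a rough illustration of that approach, the sketch below loads a made-up HTML string into an HtmlDocument and selects the meta elements with XPath. It assumes the third-party Html Agility Pack assembly is referenced; the MetaTagDemo class and the sample markup are hypothetical.

```csharp
using System;
using HtmlAgilityPack; // third-party library, not part of the .NET Framework

class MetaTagDemo
{
    static void Main()
    {
        // Made-up sample markup; in practice this would be the downloaded page.
        string html =
            "<html><head>" +
            "<meta name=\"description\" content=\"demo page\">" +
            "<meta name=\"keywords\" content=\"c#, html\">" +
            "</head><body></body></html>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null when nothing matches, so check first.
        HtmlNodeCollection metas = doc.DocumentNode.SelectNodes("//meta");
        if (metas != null)
        {
            foreach (HtmlNode meta in metas)
            {
                // GetAttributeValue falls back to the given default
                // when the attribute is missing.
                string name = meta.GetAttributeValue("name", "");
                string content = meta.GetAttributeValue("content", "");
                Console.WriteLine("{0}={1}", name, content);
            }
        }
    }
}
```

Because the parser is tolerant of malformed markup, the same code should cope with pages that are not valid XHTML.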
-- Peter
To be a success, arm yourself with the tools you need and learn how to use
them.

Site: http://www.eggheadcafe.com
http://petesbloggerama.blogspot.com
http://ittyurl.net
 
