Parsing a webpage

Enigma Boy · Aug 14, 2007

Hi folks,

I am retrieving a web page from a site using HttpWebRequest. What I want to
do with the retrieved page is list all the hyperlinks in it. If I do a
simple regex search for <a then I also get links that are commented out in
the markup, and I don't want those. I only want links that are actually
active. This is for a reciprocal link check.

Can someone please point me in the right direction?

Thanks.

Jesse Houwing · Aug 14, 2007

Hello Enigma,

Have a look at the HTML Agility Pack. It allows you to parse HTML as if it
were XML, so commented-out markup ends up in comment nodes rather than in
link elements.

http://www.codeplex.com/Wiki/View.aspx?ProjectName=htmlagilitypack
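
A minimal sketch of how that might look, assuming the HTML Agility Pack assembly is referenced and the page source has already been read into a string (for example from the HttpWebRequest response). The class name LinkLister, the method name ActiveLinks and the sample markup are made up for illustration and are not part of the library.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // third-party library: the HTML Agility Pack

static class LinkLister
{
    // Returns the href value of every <a> element that has one.
    // Markup inside <!-- ... --> is parsed as a comment node, not as
    // elements, so the XPath query below does not see links in comments.
    public static List<string> ActiveLinks(string html)
    {
        HtmlDocument doc = new HtmlDocument();
        doc.LoadHtml(html);

        List<string> links = new List<string>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
            return links;

        foreach (HtmlNode anchor in anchors)
            links.Add(anchor.GetAttributeValue("href", string.Empty));

        return links;
    }

    static void Main()
    {
        // Hypothetical sample page: one live link, one commented-out link.
        string html =
            "<p><a href=\"http://example.com/live\">live</a>" +
            "<!-- <a href=\"http://example.com/dead\">dead</a> --></p>";

        foreach (string href in ActiveLinks(html))
            Console.WriteLine(href);
    }
}
```

With the sample markup above, only the first link should be listed. For a reciprocal link check you would then compare each href against your own URL.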

Alvin Bruney [MVP] · Aug 15, 2007

You can also throw a regex at it, using a pattern from this site: regexlib.com
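
A rough sketch of the regex route, using only System.Text.RegularExpressions. The two patterns are my own illustration, not ones taken from regexlib.com, and the names RegexLinkLister and ActiveLinks are made up. The idea is to strip HTML comments first and only then look for anchors. Regexes are fragile against real-world HTML (script blocks, unquoted attributes, conditional comments), so treat this as a quick approximation only.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class RegexLinkLister
{
    // Matches an HTML comment; Singleline lets '.' span line breaks.
    static readonly Regex Comment =
        new Regex("<!--.*?-->", RegexOptions.Singleline);

    // Matches <a ... href="..."> or <a ... href='...'> and captures the URL.
    // Assumes the href value is quoted.
    static readonly Regex Anchor = new Regex(
        "<a\\b[^>]*?\\bhref\\s*=\\s*[\"']([^\"']*)[\"']",
        RegexOptions.IgnoreCase);

    public static List<string> ActiveLinks(string html)
    {
        // Remove commented-out markup before searching for links.
        string visible = Comment.Replace(html, string.Empty);

        List<string> links = new List<string>();
        foreach (Match m in Anchor.Matches(visible))
            links.Add(m.Groups[1].Value);
        return links;
    }
}
```

Calling RegexLinkLister.ActiveLinks(pageSource) gives back the href values found outside comments, where pageSource is whatever string you read from the response.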

Enigma Boy · Aug 16, 2007

Jesse, you are a lifesaver.

Thanks,

Jesse Houwing · Aug 16, 2007

Hello Enigma,

You're welcome. I suppose it worked like a charm.
