Retrieve tag A from html

Matteo Migliore · Oct 5, 2007

Hi.

I need a regular expression to extract all A tags from HTML
code. I need the href and the text as two distinct objects.

Suggestions?

Thx! ;-)
Matteo Migliore.

Martin Honnen · Oct 5, 2007

Matteo said:
I need a regular expression to extract all A tags from HTML
code. I need the href and the text as two distinct objects.

Why do you want to use regular expressions to parse HTML when there are
APIs for that, like SgmlReader
<URL:http://www.gotdotnet.com/Community/...mpleGuid=B90FDDCE-E60D-43F8-A5C4-C3BD760564BC>?
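For comparison, here is a minimal sketch of the parser-based approach. It uses Python's standard `html.parser` module as a stand-in and is not SgmlReader itself; the names `LinkCollector` and `extract_links` are made up for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects (href, text) pairs for every <a> element that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        # The parser lower-cases tag and attribute names before calling us.
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self._href = href
                self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open <a href=...>.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Hypothetical helper: return a list of (href, text) tuples."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


sample = (
    "<p><A HREF='a.html'>First</A> and "
    '<a class="x" href=b.html>Second <b>bold</b></a></p>'
)
print(extract_links(sample))
```

Because the parser handles upper-case tags, single-quoted and unquoted attribute values, and nested markup, the href and the text come back as two separate values without any pattern tuning.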

Matteo Migliore · Oct 5, 2007

Why do you want to use regular expressions to parse HTML when there
are APIs for that like SgmlReader

Thanks! I looked at SgmlReader, but it is too much for my problem,
and I prefer to use the .NET classes and Regex.

I downloaded the project but I don't like it very much :-)

Thx! ;-)
Matteo Migliore.

Ignacio Machin ( .NET/ C# MVP ) · Oct 5, 2007

Hi,

All you have to do is get the text from "<a" up to "</a>".

Jesse Houwing · Oct 5, 2007

Hello Ignacio Machin ( .NET/ C# MVP ),

Hi,

All you have to do is get the text from "<a" up to "</a>".

Which would come down to something like this:

<a[^>]+href\s*=\s*"(?<href>[^"]+)"[^>]*>(?<text>(?:(?!</a).)*)

It would save the href to a group named href and the text to a group named
text.
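A short sketch of how the two named groups can be read back, with the inner group written as `(?:(?!</a).)*`. It is in Python rather than .NET, so named groups are spelled `(?P<name>...)` instead of `(?<name>...)`; the sample HTML and the variable names are assumptions for illustration.

```python
import re

# Same pattern as above, in Python's named-group spelling.
ANCHOR = re.compile(
    r'<a[^>]+href\s*=\s*"(?P<href>[^"]+)"[^>]*>(?P<text>(?:(?!</a).)*)'
)

html = (
    '<p><a href="http://example.com/">Example</a> and '
    '<a class="nav" href="/about">About <b>us</b></a></p>'
)

# Each match exposes the href and the link text as two separate groups.
links = [(m.group("href"), m.group("text")) for m in ANCHOR.finditer(html)]
print(links)
```

Note that the text group holds the raw markup between the tags, so inner elements such as `<b>` are kept as-is.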

Matteo Migliore · Oct 6, 2007

Which would come down to something like this:

<a[^>]+href\s*=\s*"(?<href>[^"]+)"[^>]*>(?<text>(?:(?!</a).)*)

It would save the href to a group named href and the text to a group
named text.

Sorry, but with this regex I can't retrieve all the links. I'm comparing the
results using the WebClient class and WebBrowser (Document.Links). In the
second case I get all the links; in the first I do not.

Thx a lot! ;-)
Matteo Migliore.
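One plausible reason for the missing links is that the pattern only accepts lower-case tags with a double-quoted href, and that "." does not cross line breaks by default. The sketch below compares it with a more tolerant variant; it is in Python (named groups are `(?P<name>...)`), and `LOOSE`, `loose_links`, and the sample HTML are made up for illustration, not a complete HTML parser.

```python
import re

# The pattern from earlier in the thread, in Python's named-group spelling.
STRICT = re.compile(
    r'<a[^>]+href\s*=\s*"(?P<href>[^"]+)"[^>]*>(?P<text>(?:(?!</a).)*)'
)

# A more tolerant variant: accepts double-quoted, single-quoted, or unquoted
# href values, ignores case, and lets "." span line breaks.
LOOSE = re.compile(
    r'''<a\b[^>]*?\shref\s*=\s*(?:"(?P<dq>[^"]*)"|'(?P<sq>[^']*)'|(?P<uq>[^\s>]+))[^>]*>(?P<text>(?:(?!</a).)*)''',
    re.IGNORECASE | re.DOTALL,
)


def loose_links(html):
    """Hypothetical helper: return (href, text) pairs found by LOOSE."""
    out = []
    for m in LOOSE.finditer(html):
        # Exactly one of the three href alternatives took part in the match.
        href = m.group("dq") or m.group("sq") or m.group("uq") or ""
        out.append((href, m.group("text")))
    return out


html = """<a href="one.html">One</a>
<A HREF="two.html">Two</A>
<a href='three.html'>Three</a>
<a href=four.html>Four</a>
<a href="five.html">Five
continued</a>"""

strict = [(m.group("href"), m.group("text")) for m in STRICT.finditer(html)]
print(len(strict), "links with the strict pattern")
print(len(loose_links(html)), "links with the tolerant pattern")
```

In .NET the equivalent switches would be RegexOptions.IgnoreCase and RegexOptions.Singleline. Another possible cause, which no pattern can address, is links added by script: WebBrowser exposes the document after the page has run, while WebClient returns only the downloaded markup.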
