Retrieve A tags from HTML

  • Thread starter: Matteo Migliore

Matteo Migliore

Hi.

I need a regular expression to extract all the A tags from HTML
code. I need the href and the text as two distinct objects.

Suggestions?

Thx! ;-)
Matteo Migliore.
 
Why do you want to use regular expressions to parse HTML when there
are APIs for that, like SgmlReader?

Thanks! I looked at SgmlReader, but it is too much for my problem,
and I prefer to use the .NET classes and Regex.

I downloaded the project, but I don't like it very much :-).

Thx! ;-)
Matteo Migliore.
 
Hi,

All you have to do is get the text from "<a" up to "</a>".
 
Hello Ignacio Machin (.NET/C# MVP),

Which would come down to something like this:

<a[^>]+href\s*=\s*"(?<href>[^"]+)"[^>]*>(?<text>(?:(?!</a).)*)

It would save the href to a group named href and the text to a group named
text.
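
As a minimal sketch of how the pattern above could be used from C#, the snippet below runs it with Regex.Matches and reads the two named groups. The class name LinkExtractor, the method Extract, and the sample HTML are hypothetical and only serve the illustration; the pattern itself is the one posted, unchanged.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LinkExtractor
{
    // The pattern from the post, written as a verbatim string
    // (a doubled "" stands for one literal quote character).
    private static readonly Regex AnchorPattern = new Regex(
        @"<a[^>]+href\s*=\s*""(?<href>[^""]+)""[^>]*>(?<text>(?:(?!</a).)*)");

    // Returns one (href, text) pair per match, in document order.
    public static List<KeyValuePair<string, string>> Extract(string html)
    {
        var links = new List<KeyValuePair<string, string>>();
        foreach (Match m in AnchorPattern.Matches(html))
        {
            links.Add(new KeyValuePair<string, string>(
                m.Groups["href"].Value,
                m.Groups["text"].Value));
        }
        return links;
    }

    public static void Main()
    {
        // Hypothetical sample input.
        string html =
            "<p><a href=\"http://example.com/\">Example</a> and " +
            "<a class=\"x\" href=\"/about\">About <b>us</b></a></p>";

        foreach (var link in Extract(html))
        {
            Console.WriteLine(link.Key + " -> " + link.Value);
        }
    }
}
```

For the sample input this should list two links. The text group keeps any inner markup, so the second link's text comes back as "About <b>us</b>" rather than plain text.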
 
Sorry, but with this regex I can't retrieve all the links. I'm comparing
the WebClient class plus the regex against WebBrowser (Document.Links). In
the second case I get all the links; in the first I don't.

Thx a lot! ;-)
Matteo Migliore.
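
Some likely reasons for the difference, offered as assumptions rather than a diagnosis: the pattern as posted is case-sensitive, only accepts double-quoted href values, and its "." does not cross line breaks, while Document.Links is built from the parsed page and can also contain links that scripts add after loading, which no regex over the downloaded markup can see. A sketch of a more tolerant variant follows; the class TolerantLinkExtractor, its members, and the sample HTML are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TolerantLinkExtractor
{
    // The pattern as posted in the thread, kept for comparison.
    public static readonly Regex OriginalPattern = new Regex(
        @"<a[^>]+href\s*=\s*""(?<href>[^""]+)""[^>]*>(?<text>(?:(?!</a).)*)");

    // Assumed variant: accepts double-quoted, single-quoted, or unquoted
    // href values, ignores case, and lets the text span several lines.
    // .NET allows the same group name in several alternatives.
    public static readonly Regex TolerantPattern = new Regex(
        @"<a\b[^>]*?\shref\s*=\s*" +
        @"(?:""(?<href>[^""]*)""|'(?<href>[^']*)'|(?<href>[^\s>]+))" +
        @"[^>]*>(?<text>.*?)</a\s*>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Returns (href, trimmed text) pairs found by the tolerant pattern.
    public static List<KeyValuePair<string, string>> Extract(string html)
    {
        var links = new List<KeyValuePair<string, string>>();
        foreach (Match m in TolerantPattern.Matches(html))
        {
            links.Add(new KeyValuePair<string, string>(
                m.Groups["href"].Value,
                m.Groups["text"].Value.Trim()));
        }
        return links;
    }

    public static void Main()
    {
        // Hypothetical input: upper-case tag with single quotes,
        // unquoted href, and link text on its own line.
        string html =
            "<A HREF='/one'>One</A>\n" +
            "<a href=/two>Two</a>\n" +
            "<a href=\"/three\">\nThree\n</a>";

        Console.WriteLine("Original pattern: " +
            OriginalPattern.Matches(html).Count + " match(es)");
        foreach (var link in Extract(html))
        {
            Console.WriteLine(link.Key + " -> " + link.Value);
        }
    }
}
```

Even this variant is only a heuristic. It still cannot see links created by script, and markup such as a ">" inside an attribute value can confuse it, so a count that differs from Document.Links is not necessarily a bug in the pattern.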
 