regex question

  • Thread starter: Dirk Reske

Dirk Reske

Hello,

I don't know if this question fits here.
How can I extract all links from a web page's source?
<a href="microsoft.com">Go here</a>

From this I want only microsoft.com.

Thanks
 
Regex regex = new Regex(@"<a[^>]*microsoft\.com[^>]*>",
    RegexOptions.IgnoreCase);

Thomas P. Skinner [MVP]
 
Yes, but microsoft.com was only an example.
I want every link.

 
I want everything between each "a href=\"" and the closing "\">".
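
For reference, the request above (everything between href=" and the closing quote of each anchor tag) can be expressed with a capture group. The following is only a minimal sketch, not a pattern posted by anyone in this thread: the class name LinkExtractor and the method ExtractHrefs are hypothetical, and it assumes every href value is wrapped in double quotes.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original thread.
public static class LinkExtractor
{
    // Matches an opening <a ...> tag up to its href attribute and captures
    // the attribute value in group 1.
    // Assumption: the value is enclosed in double quotes.
    private static readonly Regex HrefPattern = new Regex(
        @"<a\b[^>]*?\shref\s*=\s*""([^""]*)""",
        RegexOptions.IgnoreCase);

    // Returns the href value of every anchor tag found in the given HTML.
    public static List<string> ExtractHrefs(string html)
    {
        var links = new List<string>();
        foreach (Match match in HrefPattern.Matches(html))
        {
            links.Add(match.Groups[1].Value);
        }
        return links;
    }

    public static void Main()
    {
        string html = "<a href=\"microsoft.com\">Go here</a>";
        foreach (string link in ExtractHrefs(html))
        {
            Console.WriteLine(link); // prints "microsoft.com"
        }
    }
}
```

Regex.Matches returns every non-overlapping match, so a page with many anchors yields one list entry per tag. Unquoted or single-quoted values are not covered by this sketch.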

 
http://www.regexlib.com is the place to look for this type of stuff.

The blog post linked below is a more specific examination by Wayne King of pulling out
image tags, but you can use the same regex features to examine link tags.
http://blogs.regexadvice.com/wayneking/archive/2004/04/12/948.aspx


--
Justin Rogers
DigiTec Web Consultants, LLC.
Blog: http://weblogs.asp.net/justin_rogers
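
To illustrate the point that a pattern for image tags carries over to link tags, here is a hedged sketch of a single helper that takes the tag and attribute names as parameters. It does not reproduce the pattern from the linked blog post; the names AttributeExtractor and ExtractAttribute are hypothetical, and it assumes the attribute value is quoted with either single or double quotes.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not taken from the linked blog post.
public static class AttributeExtractor
{
    // Returns the values of the given attribute on every occurrence of the tag.
    // Assumption: values are quoted with ' or "; unquoted values are skipped.
    public static List<string> ExtractAttribute(string html, string tag, string attribute)
    {
        // (?<quote>...) remembers which quote opened the value and
        // \k<quote> requires the same character to close it.
        string pattern =
            "<" + Regex.Escape(tag) + @"\b[^>]*?\s" + Regex.Escape(attribute) +
            @"\s*=\s*(?<quote>[""'])(?<value>.*?)\k<quote>";

        var values = new List<string>();
        foreach (Match match in Regex.Matches(html, pattern, RegexOptions.IgnoreCase))
        {
            values.Add(match.Groups["value"].Value);
        }
        return values;
    }
}
```

Under these assumptions, ExtractAttribute(html, "img", "src") collects image sources and ExtractAttribute(html, "a", "href") collects link targets from the same page source.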
