Regex - Matching URLs

Mick Walker

Hi,
I am using the following function to match any URLs within a string
containing the HTML of a web page:

public List<string> DumpHrefs(String inputString)
{
    Regex r;
    Match m;
    List<string> LstURLs = new List<string>();

    r = new Regex("href\\s*=\\s*(?:\"(?<1>[^\"]*)\"|(?<1>\\S+))",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);
    for (m = r.Match(inputString); m.Success; m = m.NextMatch())
    {
        LstURLs.Add(m.Groups[1].ToString());
    }
    return LstURLs;
}
However, the problem with this is that it returns all links on the page, and
I only wish to return fully qualified links such as
http://www.domain.com/page.html, not relative links.
Does anyone know how I can modify my regex to do so?
Regards
 
Kevin Spencer

Hi Mick,
(?i)href\s*=\s*"?(?<1>http://[^"]+\"?[^>]*)>


First, rather than using an alternation, I just gave a rule that it could
have 0 or 1 quotes at the beginning and end. The (?i) indicates that the
regex is not case-sensitive. Group 1 consists of the character sequence
"http://" followed by one or more characters that are not a quote mark,
followed by zero or 1 quote marks, followed by zero or more characters that
are not ">". The expression ends with the ">" character.
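
A minimal sketch of what group 1 captures with the pattern above, assuming a made-up sample anchor tag. Because the group runs up to the closing ">", it also picks up the closing quote and any attributes that follow the URL. TightPattern is a hypothetical variant, not from the reply, that stops the group at the first quote, whitespace, or ">". The class and method names are illustrative only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HrefDemo
{
    // The pattern from the reply, unchanged, written as a C# verbatim string.
    public const string ReplyPattern =
        @"(?i)href\s*=\s*""?(?<1>http://[^""]+\""?[^>]*)>";

    // Hypothetical tightened variant (an assumption, not from the thread):
    // group 1 ends at the first quote, whitespace, or '>'.
    public const string TightPattern =
        @"(?i)href\s*=\s*""?(?<1>http://[^""\s>]+)";

    // Returns the text captured by group 1 of the first match, or null.
    public static string FirstGroup(string pattern, string html)
    {
        Match m = Regex.Match(html, pattern);
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        // Sample input invented for this example.
        string html =
            "<a href=\"http://www.domain.com/page.html\" class=\"x\">link</a>";

        // Prints: http://www.domain.com/page.html" class="x"
        Console.WriteLine(FirstGroup(ReplyPattern, html));

        // Prints: http://www.domain.com/page.html
        Console.WriteLine(FirstGroup(TightPattern, html));
    }
}
```

If only the bare URL is wanted, something like the tightened variant may be a better fit. It still only recognises links that start with "http://".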

--
HTH,

Kevin Spencer
Chicken Salad Surgeon
Microsoft MVP
 
Mick Walker

Thanks for your reply, Kevin.

I'm having a little trouble with the regex you gave me; I'm not too sure
which characters I need to escape in order to use it within my function.

Any help would go down great.

Kind Regards
Mick
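
One possible way to handle the escaping, sketched on the assumption that Kevin's pattern is used unchanged. In a regular C# string literal, every backslash is written as \\ and every double quote as \". In a verbatim literal (prefixed with @), backslashes are left alone and only a double quote is doubled. The class name HrefEscaping and the sample input are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class HrefEscaping
{
    // Regular literal: \ is written as \\ and " is written as \".
    public const string Regular =
        "(?i)href\\s*=\\s*\"?(?<1>http://[^\"]+\\\"?[^>]*)>";

    // Verbatim literal: backslashes stay as they are, " is written as "".
    public const string Verbatim =
        @"(?i)href\s*=\s*""?(?<1>http://[^""]+\""?[^>]*)>";

    // Same loop as the original function, with the new pattern swapped in.
    // RegexOptions.IgnoreCase is dropped because (?i) already covers it.
    public static List<string> DumpHrefs(string inputString)
    {
        var urls = new List<string>();
        var r = new Regex(Verbatim, RegexOptions.Compiled);
        for (Match m = r.Match(inputString); m.Success; m = m.NextMatch())
        {
            urls.Add(m.Groups[1].Value);
        }
        return urls;
    }

    public static void Main()
    {
        // Both literals produce the same pattern text.
        Console.WriteLine(Regular == Verbatim); // True

        // Sample input invented for this example: one relative link and
        // one fully qualified link. Only the second one is returned.
        string html = "<a href=\"/rel.html\">a</a>" +
                      "<a href=\"http://www.domain.com/page.html\">b</a>";
        foreach (string url in DumpHrefs(html))
        {
            Console.WriteLine(url);
        }
    }
}
```

The verbatim form is usually easier to read for regular expressions, since the backslashes appear exactly as they do in the pattern itself.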
 
