Regex help needed

  • Thread starter Roberto Cavalieri

Roberto Cavalieri

Hi everybody, I have a small problem with a simple regex. I need to extract
the href attribute value from an HTML tag.

This should be very simple; a quick Google search turns up:

.NET Framework Developer's Guide
Example: Scanning for HREFs

http://msdn.microsoft.com/en-us/library/t9e807fx.aspx

OK, let's look at the example:

private static void DumpHRefs(string inputString)
{
    Match m;
    string HRefPattern = "href\\s*=\\s*(?:\"(?<1>[^\"]*)\"|(?<1>\\S+))";

    m = Regex.Match(inputString, HRefPattern,
                    RegexOptions.IgnoreCase | RegexOptions.Compiled);
    while (m.Success)
    {
        Console.WriteLine("Found href " + m.Groups[1] + " at "
            + m.Groups[1].Index);
        m = m.NextMatch();
    }
}


Well, this is my code:

System.Text.RegularExpressions.Regex Regex;
System.Text.RegularExpressions.Match Match;

string ToCheck = "<a href='/Jobs/796/Software-Developer-M-F.aspx'>Software Developer (M/F)</a>";
string Pattern = "{0}\\s*=\\s*(?:\"(?<1>[^\"]*)\"|(?<1>\\S+))";

Regex = new System.Text.RegularExpressions.Regex(
    string.Format(Pattern, "href"),
    System.Text.RegularExpressions.RegexOptions.IgnoreCase |
    System.Text.RegularExpressions.RegexOptions.Compiled);

for (Match = Regex.Match(ToCheck); Match.Success; Match = Match.NextMatch())
{
    string MatchValue = Match.Groups[1].Value;
}


The match succeeds, but the captured value is
*****Jobs\796\Software-Developer-M-F.aspx'>Software***** ?????

Can someone explain what is wrong with my code?

Thank you in advance, and good work to all.

See you soon
 

Arne Vajhøj

You probably want either:

string Pattern = "{0}\\s*=\\s*(?:'(?<1>[^']*)'|(?<1>\\S+))";

or:

string Pattern = "{0}\\s*=\\s*(?:\"(?<1>[^\"]*)\"|(?<1>[\\w/\\.\\-']+))";

Arne
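
To illustrate why the original pattern misbehaves: it only treats double quotes as delimiters, so a single-quoted value falls through to the \S+ alternative, which runs on until the next whitespace. The sketch below is not from the thread. It assumes a hypothetical combined pattern (CombinedPattern) that accepts double-quoted, single-quoted, or unquoted values, and a hypothetical helper FirstHref. It is a rough illustration, not a general HTML parser.

```csharp
using System;
using System.Text.RegularExpressions;

static class HrefDemo
{
    // Pattern from the question: only double quotes are handled explicitly.
    public const string OriginalPattern =
        "href\\s*=\\s*(?:\"(?<1>[^\"]*)\"|(?<1>\\S+))";

    // Hypothetical combined pattern (an assumption, not from the thread):
    // double-quoted, single-quoted, or unquoted value ending at whitespace or '>'.
    public const string CombinedPattern =
        "href\\s*=\\s*(?:\"(?<1>[^\"]*)\"|'(?<1>[^']*)'|(?<1>[^\\s>]+))";

    // Returns the first captured href value, or null when nothing matches.
    public static string FirstHref(string input, string pattern)
    {
        Match m = Regex.Match(input, pattern, RegexOptions.IgnoreCase);
        return m.Success ? m.Groups[1].Value : null;
    }

    static void Main()
    {
        string toCheck =
            "<a href='/Jobs/796/Software-Developer-M-F.aspx'>Software Developer (M/F)</a>";

        // \S+ swallows the quotes and everything up to the first space:
        // prints '/Jobs/796/Software-Developer-M-F.aspx'>Software
        Console.WriteLine(FirstHref(toCheck, OriginalPattern));

        // The single-quote alternative stops at the closing quote:
        // prints /Jobs/796/Software-Developer-M-F.aspx
        Console.WriteLine(FirstHref(toCheck, CombinedPattern));
    }
}
```

Note that a character class that itself contains the quote character, as in the second suggestion above, will include the surrounding quotes in the captured value, so the caller would need to trim them.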
 
