Help with RegEx

nomad

Hi,

I need to be able to retrieve values from a string made up of HTML. A
colleague has mentioned using regular expressions to retrieve the
value but this is proving quite difficult. If someone could point me
in the right direction in regard to the values below, it would be
greatly appreciated.

<td class="brandorange">Quote reference: 123456789</td> - I need to
retrieve the 123456789 value.

<input type="radio" name="selections.excessBuildings" value="1">
£100<input type="radio" name="selections.excessBuildings" value="2">
£150<input type="radio" name="selections.excessBuildings" value="3"
checked="checked">£300<input type="radio"
name="selections.excessBuildings" value="4"> - I need to be able to
retrieve the checked value i.e. value '3' is checked so I need to
retrieve 300.
 
Lav G

I wrote this method for you.
It works.

You have to pass the HTML string as a parameter to this method.

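// Requires: using System.Text.RegularExpressions;
string GetCheckedValue(string inputStringHtml)
{
    // Matches through the closing '>' of the tag that carries checked="checked".
    string pattern = ".*checked=\"checked\"[^\\>]*\\>";
    string poundValue = "£0"; // returned when no checked option is found
    Regex expression = new Regex(pattern);
    if (expression.IsMatch(inputStringHtml))
    {
        // split[1] is the text that follows the checked input tag.
        string[] split = expression.Split(inputStringHtml, 2);
        expression = new Regex("\\<");
        if (expression.IsMatch(split[1]))
        {
            // Keep only the text before the next tag, e.g. "£300".
            string[] values = expression.Split(split[1]);
            poundValue = values[0];
        }
    }
    return poundValue;
}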
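
The same idea can probably be written more compactly with capture groups, and that also covers the quote reference from the original question. This is only a sketch against the two HTML fragments posted above: the class name QuoteHtml and both method names are made up for illustration, and the patterns assume the markup looks exactly like the samples (the literal label "Quote reference:" and the attribute written as checked="checked").

```csharp
using System;
using System.Text.RegularExpressions;

static class QuoteHtml
{
    // Digits that follow the literal label "Quote reference:".
    static readonly Regex QuoteRef = new Regex(@"Quote reference:\s*(\d+)");

    // Text between the checked radio button's closing '>' and the next '<',
    // skipping an optional leading pound sign.
    static readonly Regex CheckedAmount =
        new Regex(@"checked=""checked""[^>]*>\s*£?\s*([^<]*)");

    // Returns the reference number, or null if the label is not found.
    public static string GetQuoteReference(string html)
    {
        Match m = QuoteRef.Match(html);
        return m.Success ? m.Groups[1].Value : null;
    }

    // Returns the amount shown after the checked option, or null if none is checked.
    public static string GetCheckedAmount(string html)
    {
        Match m = CheckedAmount.Match(html);
        return m.Success ? m.Groups[1].Value.Trim() : null;
    }
}
```

With the fragments from the question, GetQuoteReference should give "123456789" and GetCheckedAmount should give "300". Both return null rather than a default value when nothing matches, so the caller can tell "not found" apart from a real amount.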

Lav G
http://lavbox.blogspot.com
 
Tim Roberts

nomad said:
I need to be able to retrieve values from a string made up of HTML. A
colleague has mentioned using regular expressions to retrieve the
value but this is proving quite difficult. If someone could point me
in the right direction in regard to the values below, it would be
greatly appreciated.

In general regular expressions are a terrible way to parse HTML, because
most HTML is not very regular. A simple regex that matches your current
example would fail if the site decides to change the HTML just a bit.

I suggest you investigate one of the available HTML parsers that can deal
with this variability:

http://www.theserverside.net/discussions/thread.tss?thread_id=36886
 
Rudy Velthuis

Tim said:
In general regular expressions are a terrible way to parse HTML,
because most HTML is not very regular.

<<
Some people, when confronted with a problem, think "I know, I'll
use regular expressions." Now they have two problems.
-- Jamie Zawinski
 
