Parsing a string

John Rogers

Can someone show me how to parse a string to find a specific value?

<b><a id="wt2500_WC_xc2500_GVB_drtl00_WQR_xt400_G"
href="/WW/XZ/LinkToDetailsList.asp">Details List Filers</a></b>

That is my string. I have thousands of lines to go through, and I am looking
to get back the following value: "Details List Filers"

These are unique in the string:

String Begin: <b><a id="
String End: </a></b>

Once I find a string that starts with the begin sequence, I need to parse
the rest of it to get the value that I want. To be honest, I don't have a
clue what to do. Can someone provide a small example that will get me started?


Appreciate the help.


John-
 
Misbah Arefin

Try using a regular expression to get the list of matching strings.
Try something like <b><a id=.*>(?<innerText>.*)</a></b>


using System;
using System.Text.RegularExpressions;

Regex oRE = new Regex("<b><a id=.*>(?<innerText>.*)</a></b>");
// A regular C# string literal cannot span lines, so the sample is concatenated.
string s = "<b><a id=\"wt2500_WC_xc2500_GVB_drtl00_WQR_xt400_G\" "
         + "href=\"/WW/XZ/LinkToDetailsList.asp\">Details List Filers</a></b>";
Match m = oRE.Match(s);
if (m.Success)
    Console.WriteLine("Value: " + m.Groups["innerText"].Value);
 
Arne Vajhøj

What about a regex?

Something like:

"(?:<b><a id=[^>]*>)([^<]*)(?:</a></b>)"
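
To make that concrete, here is a minimal sketch of running the pattern above over a set of lines. The class and method names (LinkTextExtractor, Extract) and the second sample line are made up for illustration; only the pattern and the first sample line come from this thread.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LinkTextExtractor
{
    // Pattern from the post above. The two (?:...) groups are non-capturing,
    // so the link text ends up in group 1.
    private static readonly Regex LinkText =
        new Regex("(?:<b><a id=[^>]*>)([^<]*)(?:</a></b>)");

    // Hypothetical helper: collects the link text of every match in the lines.
    public static List<string> Extract(IEnumerable<string> lines)
    {
        var values = new List<string>();
        foreach (string line in lines)
        {
            foreach (Match m in LinkText.Matches(line))
            {
                values.Add(m.Groups[1].Value);
            }
        }
        return values;
    }

    public static void Main()
    {
        // Sample input; real data would more likely be read from a file.
        string[] lines =
        {
            "<b><a id=\"wt2500_WC_xc2500_GVB_drtl00_WQR_xt400_G\" "
                + "href=\"/WW/XZ/LinkToDetailsList.asp\">Details List Filers</a></b>",
            "<p>no link on this line</p>"
        };

        foreach (string value in Extract(lines))
        {
            Console.WriteLine(value);
        }
    }
}
```

Because [^>]* and [^<]* cannot cross a tag boundary, each match stays inside a single <a> element even when a line holds several links.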

Arne
 
John Rogers

Appreciate the responses, guys. I didn't try regex because it's not that
easy to use.

I have been looking at code that uses SubString() and similar methods, which
seems easier to work with.

Which is faster when parsing strings: Regex(), or regular parsing using
SubString() etc.?
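
For comparison, the Substring-style approach could look roughly like the sketch below. It assumes the begin and end markers quoted in the first post and that no attribute value contains a '>' character; the names SubstringParser and ExtractValue are hypothetical.

```csharp
using System;

public static class SubstringParser
{
    private const string Begin = "<b><a id=\"";
    private const string End = "</a></b>";

    // Returns the text between the '>' that closes the opening <a ...> tag
    // and the end marker, or null when the line does not contain both.
    public static string ExtractValue(string line)
    {
        int start = line.IndexOf(Begin, StringComparison.Ordinal);
        if (start < 0) return null;

        // The value starts right after the '>' that ends the <a ...> tag.
        int tagClose = line.IndexOf('>', start + Begin.Length);
        if (tagClose < 0) return null;

        int end = line.IndexOf(End, tagClose + 1, StringComparison.Ordinal);
        if (end < 0) return null;

        return line.Substring(tagClose + 1, end - tagClose - 1);
    }

    public static void Main()
    {
        string s = "<b><a id=\"wt2500_WC_xc2500_GVB_drtl00_WQR_xt400_G\" "
                 + "href=\"/WW/XZ/LinkToDetailsList.asp\">Details List Filers</a></b>";
        Console.WriteLine(ExtractValue(s));
    }
}
```

This only returns the first link on a line; handling several links per line would need a loop that restarts the search after each end marker.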


Thanks

John-
 
Misbah Arefin

If, as you mentioned in your OP, the input is huge (thousands of lines), then
a regular expression is the way to go.
BTW, the regex in Arne's post is more accurate: [^>]* and [^<]* stop at the
tag boundaries instead of matching greedily like .* does.
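
If you want to check the speed on your own data rather than take anyone's word for it, a rough timing harness along these lines may help. Everything here is an assumption for illustration: the ParseTiming class, the synthetic lines, and the choice of RegexOptions.Compiled, which generally costs more to construct but can match faster when the same Regex is reused many times.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class ParseTiming
{
    // Built once and reused for every line.
    private static readonly Regex LinkText = new Regex(
        "<b><a id=[^>]*>([^<]*)</a></b>", RegexOptions.Compiled);

    // Counts the lines that contain at least one matching link.
    public static int CountMatches(string[] lines)
    {
        int hits = 0;
        foreach (string line in lines)
        {
            if (LinkText.IsMatch(line)) hits++;
        }
        return hits;
    }

    public static void Main()
    {
        // Synthetic input standing in for the real file.
        var lines = new string[5000];
        for (int i = 0; i < lines.Length; i++)
        {
            lines[i] = "<b><a id=\"id" + i + "\" href=\"/page" + i
                     + ".asp\">Item " + i + "</a></b>";
        }

        Stopwatch sw = Stopwatch.StartNew();
        int hits = CountMatches(lines);
        sw.Stop();

        Console.WriteLine(hits + " matches in " + sw.ElapsedMilliseconds + " ms");
    }
}
```

A single run like this is only a rough indication; timings vary with the machine and the shape of the input.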
 
John Rogers

Yes, it is very fast. I had never used it before, and I am super surprised
that it took about one second to parse a few thousand lines. I just grabbed
a Regex() tutorial from CodeProject; I will read it tomorrow so I can start
using this from now on.

Thanks again for your help.

John-
 
Marc Gravell

Just for completeness: if your data is XML (such as XHTML), another
alternative is an XmlReader. It is also very quick, but geared towards XML.
If the scenario gets any more complicated, it might be worth
considering. However, if the regex does what you need, I'd stick
with it! Just be aware that regex can't handle every scenario without
turning into a monster (the simple case is, er, simple, but
HTML/XML can have some complicated scenarios, just like e-mail).
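
As a rough illustration of the XmlReader route, the sketch below reads the text of the first <a> element. It assumes each fragment is well-formed XML, as the sample line happens to be; typical loosely written HTML would make the reader throw an XmlException. The names XmlLinkReader and ReadLinkText are hypothetical.

```csharp
using System;
using System.IO;
using System.Xml;

public static class XmlLinkReader
{
    // Returns the text content of the first <a> element, or null if there is none.
    // ReadElementContentAsString expects the element to contain only text.
    public static string ReadLinkText(string fragment)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(fragment)))
        {
            if (reader.ReadToFollowing("a"))
            {
                return reader.ReadElementContentAsString();
            }
        }
        return null;
    }

    public static void Main()
    {
        string s = "<b><a id=\"wt2500_WC_xc2500_GVB_drtl00_WQR_xt400_G\" "
                 + "href=\"/WW/XZ/LinkToDetailsList.asp\">Details List Filers</a></b>";
        Console.WriteLine(ReadLinkText(s));
    }
}
```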

Marc
 
