Parsing a string


John Rogers

Can someone show me how to parse a string to find a specific value?

<b><a id="wt2500_WC_xc2500_GVB_drtl00_WQR_xt400_G"
href="/WW/XZ/LinkToDetailsList.asp">Details List Filers</a></b>

That is my string. I have thousands of lines to go through, and I am looking
to get back the following value: "Details List Filers"

These are unique in the string:

String Begin: <b><a id="
String End: </a></b>

Once I find a string that starts with the begin sequence, I need to parse the
rest of the string to get the value that I want. To be honest, I don't have a
clue what to do. Can someone provide a small example that will get me started?


Appreciate the help.


John-
 
Try using a regular expression to get the list of matching strings.
Try something like: <b><a id=.*>(?<innerText>.*)</a></b>


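using System;
using System.Text.RegularExpressions;

// The named group "innerText" captures the text between the tags.
Regex oRE = new Regex("<b><a id=.*>(?<innerText>.*)</a></b>");
// The literal is split with + because a regular C# string cannot span lines.
String s = "<b><a id=\"wt2500_WC_xc2500_GVB_drtl00_WQR_xt400_G\" " +
    "href=\"/WW/XZ/LinkToDetailsList.asp\">Details List Filers</a></b>";
Match m = oRE.Match(s);
if ( m.Success )
    Console.WriteLine("Value: " + m.Groups["innerText"].Value);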
 
What about a regex?

Something like:

"(?:<b><a id=[^>]*>)([^<]*)(?:</a></b>)"
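A minimal sketch of how that pattern might be applied to many lines. The class name, method name, and sample lines are hypothetical; only the pattern itself comes from this post. Group 1 is the only capturing group, so it holds the link text.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LinkTextExtractor
{
    // Pattern from the post above; the (?: ... ) groups do not capture,
    // so Groups[1] is the text between the opening and closing tags.
    private static readonly Regex LinkPattern =
        new Regex("(?:<b><a id=[^>]*>)([^<]*)(?:</a></b>)");

    // Collects the link text from every match on every line.
    public static List<string> Extract(IEnumerable<string> lines)
    {
        var results = new List<string>();
        foreach (string line in lines)
        {
            foreach (Match m in LinkPattern.Matches(line))
                results.Add(m.Groups[1].Value);
        }
        return results;
    }

    public static void Main()
    {
        // Hypothetical input: one matching line and one line without a link.
        var lines = new[]
        {
            "<b><a id=\"wt2500_WC_xc2500_GVB_drtl00_WQR_xt400_G\" " +
                "href=\"/WW/XZ/LinkToDetailsList.asp\">Details List Filers</a></b>",
            "<p>no link on this line</p>"
        };

        foreach (string text in Extract(lines))
            Console.WriteLine(text);   // prints "Details List Filers"
    }
}
```

Building the Regex once and reusing it for every line avoids re-parsing the pattern on each call.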

Arne
 
Appreciate the responses, guys. I didn't try regex because it's not that
easy to use.

I have been looking at code that uses Substring() and similar methods; it
seems easier to work with.
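For comparison, here is a rough sketch of what the IndexOf/Substring route could look like for the begin and end markers from the original question. The class and method names are made up for illustration, and it assumes no '>' character appears inside an attribute value.

```csharp
using System;

public static class SubstringParser
{
    private const string Begin = "<b><a id=\"";
    private const string End = "</a></b>";

    // Returns the link text, or null when the line lacks the expected markers.
    public static string ExtractLinkText(string line)
    {
        int start = line.IndexOf(Begin, StringComparison.Ordinal);
        if (start < 0) return null;

        // Search past the begin marker so the '>' of "<b>" is skipped;
        // the first '>' after it closes the opening <a ...> tag.
        int tagClose = line.IndexOf('>', start + Begin.Length);
        if (tagClose < 0) return null;

        int end = line.IndexOf(End, tagClose + 1, StringComparison.Ordinal);
        if (end < 0) return null;

        // Everything between the opening tag and the end marker.
        return line.Substring(tagClose + 1, end - tagClose - 1);
    }

    public static void Main()
    {
        string s = "<b><a id=\"wt2500_WC_xc2500_GVB_drtl00_WQR_xt400_G\" " +
            "href=\"/WW/XZ/LinkToDetailsList.asp\">Details List Filers</a></b>";
        Console.WriteLine(ExtractLinkText(s));   // prints "Details List Filers"
    }
}
```

This version only finds the first link on a line; handling several links would need a loop that restarts the search after each end marker.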

Which is faster when parsing strings: Regex(), or regular parsing using
Substring() etc.?


Thanks

John-
 
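If, as you mentioned in your OP, the input is huge (thousands of lines), then
a regular expression is the way to go.
BTW, the regex in Arne's post is more accurate: it uses [^>]* and [^<]*
instead of .*, so a match cannot run past the end of a tag when a line
contains more than one link.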
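A small sketch of the difference between the two patterns posted in this thread. The input line with two links is a made-up example, and the helper name is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class GreedyVersusBounded
{
    // Returns group 1 of every match. In both patterns below there is exactly
    // one capturing group (the named group counts as number 1 in the first).
    public static List<string> Captures(string pattern, string input)
    {
        var found = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern))
            found.Add(m.Groups[1].Value);
        return found;
    }

    public static void Main()
    {
        // Hypothetical line containing two links.
        string line =
            "<b><a id=\"a1\" href=\"/one.asp\">One</a></b> " +
            "<b><a id=\"a2\" href=\"/two.asp\">Two</a></b>";

        string greedy = "<b><a id=.*>(?<innerText>.*)</a></b>";
        string bounded = "(?:<b><a id=[^>]*>)([^<]*)(?:</a></b>)";

        // The greedy .* swallows the first link and only "Two" is captured.
        Console.WriteLine(string.Join(", ", Captures(greedy, line)));   // prints "Two"

        // The negated classes stop at tag boundaries, so both links are found.
        Console.WriteLine(string.Join(", ", Captures(bounded, line)));  // prints "One, Two"
    }
}
```

On lines with a single link, as in the original question, both patterns return the same text.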
 
Yes, it is very fast. I had never used it before, but I am super surprised
that it took about one second to parse a few thousand lines. I just grabbed
a Regex tutorial from CodeProject; I will read it tomorrow so I can start
using this from now on.

Thanks again for your help.

John-
 
Just for completeness: if your data is XML (such as XHTML), another
alternative is an XmlReader. It is also very quick, but geared towards XML.
If the scenario gets any more complicated, it might be worth
considering. However, if the regex does what you need, I'd stick
with it! Just be aware that regex can't handle every scenario without
turning into a monster (the simple case is, er, simple, but
HTML / XML can have some complicated scenarios, just like e-mail).
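As a rough illustration of the XmlReader route, here is a sketch that reads the text of the first <a> element. The class and method names are hypothetical, and it assumes the fragment is well-formed XML with a single root element, which real-world HTML often is not.

```csharp
using System;
using System.IO;
using System.Xml;

public static class XmlReaderSketch
{
    // Returns the text content of the first <a> element, or null if none exists.
    // Throws XmlException if the fragment is not well-formed XML.
    public static string ReadLinkText(string fragment)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(fragment)))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "a")
                    return reader.ReadElementContentAsString();
            }
        }
        return null;
    }

    public static void Main()
    {
        string s = "<b><a id=\"wt2500_WC_xc2500_GVB_drtl00_WQR_xt400_G\" " +
            "href=\"/WW/XZ/LinkToDetailsList.asp\">Details List Filers</a></b>";
        Console.WriteLine(ReadLinkText(s));   // prints "Details List Filers"
    }
}
```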

Marc
 
