efficient routine to parse a text string

Bill nguyen

I'm looking for a good routine to parse the following text pattern:

a:18: {i:0;i:408;i:1;i:409;i:2;i:410;i:3;i:411;i:4;i:413;i:5;i:414;}

a: = page_id
a:18 -> page_id = 18

The items within the curly brackets { } come in pairs:

i:0;i:408

the 1st group i:0;
i : section order within a page (in this case page with page_id = 18)
i:0 -> section_order = 0 (top of page)

the 2nd group i:408;

i: section_id
i:408 -> section_id = 408

Therefore, {i:0;i:408;} -> section_id 408 has section_order 0.
This is a weird way to arrange data. I'd rather assign a different letter to
section_id so that the above example can be rewritten as:
a:18{i:0;s:408;...} where s = section_id

By the way, if you're familiar with or care about the phpWebsite CMS (the LAMP
or Linux/Apache/MySQL/PHP world), this comes from the page master mod in their
open source app. I'm trying to penetrate their world with .NET :)


Your help regarding the parsing routine is greatly appreciated.

Bill
 
Guest

I'm looking for a good routine to parse the following text pattern:

a:18: {i:0;i:408;i:1;i:409;i:2;i:410;i:3;i:411;i:4;i:413;i:5;i:414;}


Have you taken a look at regular expressions? They would be perfect for
parsing your data :)
 
Bill nguyen

I think regex is what I need to look at. My question is still about an
efficient method to parse the data elements and insert them into a DataRow.

Thanks
Bill
 
Guest

I think regex is what I need to look at. My question is still
about an efficient method to parse the data elements and insert them
into a DataRow.


Write a regular expression to parse the string into two sections:

a:18: {i:0;i:408;i:1;i:409;i:2;i:410;i:3;i:411;i:4;i:413;i:5;i:414;}
^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
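
As a rough illustration of that split, here is a minimal sketch in Python (the idea is the same in any regex engine). The pattern and the names HEADER_BODY, split_record, page_id and body are made up for this example and are not from phpWebsite. In .NET the named-group syntax is `(?<name>...)` rather than Python's `(?P<name>...)`.

```python
import re

# Hypothetical pattern: "a:<page_id>:" followed by a brace-delimited body.
# The two named groups correspond to the two sections marked above.
HEADER_BODY = re.compile(r"a:(?P<page_id>\d+):\s*\{(?P<body>[^}]*)\}")


def split_record(text):
    """Return (page_id, body); raise ValueError if the text does not match."""
    m = HEADER_BODY.fullmatch(text.strip())
    if m is None:
        raise ValueError("unrecognized record: %r" % text)
    return int(m.group("page_id")), m.group("body")


page_id, body = split_record("a:18: {i:0;i:408;i:1;i:409;}")
print(page_id)  # 18
print(body)     # i:0;i:408;i:1;i:409;
```
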

A neat feature of .NET's regular expressions is that you can reference
components within a section. For example:

{i:0;i:408;i:1;i:409;i:2;i:410;i:3;i:411;i:4;i:413;i:5;i:414;}

is actually composed of subelements, so you can access each of these
subelements (called groups or named captures). More info is here:

http://www.regular-expressions.info/named.html

Once your string has been successfully parsed by the regular expression,
loop over the second section and retrieve each group, then stuff the
results into a DataTable and submit it to the database.

Your code should be pretty efficient. In summary:

1. A couple of lines to parse the data
2. Retrieve the matches from the regular expression
3. Loop over the second section's submatches (named captures) and fill in
your data object.
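
The three steps above could be sketched roughly as follows. This is Python rather than .NET, and everything named here (RECORD, PAIR, parse_sections, the dictionary keys) is an assumption for illustration. A plain list of dictionaries stands in for the DataTable rows, and the pairing of section_order with section_id follows the description in the original question.

```python
import re

# Step 1: one pattern for the whole record, one for each "i:<order>;i:<id>;" pair.
RECORD = re.compile(r"a:(?P<page_id>\d+):\s*\{(?P<body>[^}]*)\}")
PAIR = re.compile(r"i:(?P<section_order>\d+);i:(?P<section_id>\d+);")


def parse_sections(text):
    """Return one row (a dict) per section found in the record."""
    # Step 2: match the record and pull out the page id and the body.
    m = RECORD.fullmatch(text.strip())
    if m is None:
        raise ValueError("unrecognized record: %r" % text)
    page_id = int(m.group("page_id"))

    # Step 3: loop over the pairs in the body and build the rows.
    rows = []
    for pair in PAIR.finditer(m.group("body")):
        rows.append({
            "page_id": page_id,
            "section_order": int(pair.group("section_order")),
            "section_id": int(pair.group("section_id")),
        })
    return rows


sample = "a:18: {i:0;i:408;i:1;i:409;i:2;i:410;i:3;i:411;i:4;i:413;i:5;i:414;}"
rows = parse_sections(sample)
print(len(rows))  # 6
print(rows[0])    # {'page_id': 18, 'section_order': 0, 'section_id': 408}
```

In .NET the same loop would presumably go over the result of Regex.Matches and read each value through match.Groups["section_id"], adding a row to the DataTable on each pass.
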

Here is a general tutorial on regular expressions:

http://aspnet.4guysfromrolla.com/articles/022603-1.aspx
 
