How to split a string using a fixed-length, variable-text delimiter

Oliver Sturm

Hello,
Emailed a sample, thanks very much.

Replied by email. Just a quick summary of what was wrong with your
previous code.

Looking at the regex I had posted previously:

[ ]([A-Z][A-Z])([A-Z0-9]+)(\1)([A-Z0-9]+)(\1)([A-Z0-9]+)[ ]

This regex has a number of groups that I had added for test purposes. It
can be stripped down to this without changing what it matches:

[ ]([A-Z][A-Z])[A-Z0-9]+\1[A-Z0-9]+\1[A-Z0-9]+[ ]

It still has one capture group that is absolutely necessary to make the
back reference work. The Regex.Split method has the peculiar behaviour of
adding the result of the capture group itself to the string array it
returns, and there doesn't seem to be a way around that. So in the sample
program I sent you, I used the matching functionality of the Regex class
instead and picked out the pieces from the string "manually".
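
To illustrate the difference, here is a minimal sketch rather than the program that was emailed. The sample text, the class name SplitDemo and the helper SplitManually are made up for this example; only the regex comes from the post above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SplitDemo
{
    // The stripped-down regex from the post: one capture group plus back references.
    public const string Pattern = @"[ ]([A-Z][A-Z])[A-Z0-9]+\1[A-Z0-9]+\1[A-Z0-9]+[ ]";

    // Hypothetical helper: collects the text between delimiters using Regex.Matches,
    // so the captured two-letter prefix never ends up in the result.
    public static List<string> SplitManually(string text)
    {
        var pieces = new List<string>();
        int start = 0;
        foreach (Match match in Regex.Matches(text, Pattern))
        {
            // Everything from the end of the previous delimiter up to this one.
            pieces.Add(text.Substring(start, match.Index - start));
            start = match.Index + match.Length;
        }
        // Whatever follows the last delimiter.
        pieces.Add(text.Substring(start));
        return pieces;
    }

    public static void Main()
    {
        // Made-up sample data with two delimiters (prefixes AB and CD).
        string text = "first ABX1ABY2ABZ3 second CDQ7CDR8CDS9 third";

        // Regex.Split also returns the captured group: first|AB|second|CD|third
        Console.WriteLine(string.Join("|", Regex.Split(text, Pattern)));

        // The manual approach returns only the content: first|second|third
        Console.WriteLine(string.Join("|", SplitManually(text)));
    }
}
```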

All this is probably not the most efficient algorithm in the world -
including the idea of reading the whole 14MB file into a string - but I
wouldn't expect any big performance problems on a modern system... if
performance is important, there are certainly lots of optimizations that
can be done.


Oliver Sturm
 
garyusenet

You are a gentleman and a scholar sir, I'm going to spend a good couple
of days reading over the code in your email when it arrives - until I
become confident with these techniques.

Regex is very new to me, I would have been completely lost without your
help.

Many, many thanks again,

Gary-

 
Oliver Sturm

Well thank you - mail should be there, it was sent even before my previous
post.


Oliver Sturm
 
garyusenet

Thanks again Oliver, I'm just working through that code today. I
understand (at least at a very basic level) what most of the code is
doing, with the exception of the following line:

string content = i == matches.Count - 1 ?

Could you explain that line for me, please?

Thank you,

Gary-
 
Oliver Sturm

Hello,
The line you quoted actually continues:

string content = i == matches.Count - 1 ?
    text.Substring(match.Index + match.Length) :
    text.Substring(match.Index + match.Length, matches[i + 1].Index - match.Index - match.Length);


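Sorry I used this; it's not the most widely understood or liked
construct. It uses the conditional operator (?:), often called the
ternary operator: if the condition before the ? is true, the result is
the expression before the colon, otherwise the one after it. It's a
slightly shorter way of saying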

string content;
if (i == matches.Count - 1)
    content = text.Substring(match.Index + match.Length);
else
    content = text.Substring(match.Index + match.Length, matches[i + 1].Index - match.Index - match.Length);
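
For context, a loop around that expression might look roughly like the sketch below. This is a guess at the surrounding structure, not the emailed program: the class name TernaryDemo, the method ContentAfterEachMatch and the sample text are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TernaryDemo
{
    // Hypothetical helper: returns the text that follows each delimiter match.
    public static List<string> ContentAfterEachMatch(string text, string pattern)
    {
        MatchCollection matches = Regex.Matches(text, pattern);
        var contents = new List<string>();
        for (int i = 0; i < matches.Count; i++)
        {
            Match match = matches[i];
            int start = match.Index + match.Length; // first character after this delimiter

            // Last match: take everything to the end of the text.
            // Otherwise: take everything up to where the next delimiter begins.
            string content = i == matches.Count - 1
                ? text.Substring(start)
                : text.Substring(start, matches[i + 1].Index - start);

            contents.Add(content);
        }
        return contents;
    }

    public static void Main()
    {
        string pattern = @"[ ]([A-Z][A-Z])[A-Z0-9]+\1[A-Z0-9]+\1[A-Z0-9]+[ ]";
        // Made-up sample data; the text before the first delimiter is skipped here.
        string text = "header ABX1ABY2ABZ3 second CDQ7CDR8CDS9 third";

        foreach (string content in ContentAfterEachMatch(text, pattern))
            Console.WriteLine(content); // prints "second", then "third"
    }
}
```

The length argument matches[i + 1].Index - start is the same value as matches[i + 1].Index - match.Index - match.Length in the original line; it is only written with a local variable to keep the expression short.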


Oliver Sturm
 
