Tom said:
VMI wrote:
I'm parsing a comma-delimited record, but I want it to do something
different if some of the string is between "". How can I do this?
The Excel import does it correctly. I'm using String.Split().
Basically, this is what I want to do: use String.Split() on the
whole string UNLESS the string is in between double quotes. The part
of the string in between the "" will be ignored by String.Split.
I wrote a wee lexer that did this, somewhere earlier in the
newsgroup. I've found it, and here it is:
Tom / VMI:
If you don't mind a bit of reworking, this version of Tom's lexer
uses StringBuilder rather than String, and so will not create so many
intermediate strings on the heap that later have to be
garbage-collected. For large volumes of data it will make a
significant difference:
// Needs: using System.Collections.Generic; using System.Text;
public string[] GetStringParts ( string inputString )
{
    List<string> retVal = new List<string>();
    StringBuilder currentPart = new StringBuilder();
    bool withinQuotes = false;
    for ( int i = 0; i < inputString.Length; i++ )
    {
        char c = inputString[i];
        if (withinQuotes)
        {
            // Inside quotes: commas are ordinary characters.
            if (c == '"')
            {
                withinQuotes = false;
            }
            else
            {
                currentPart.Append(c);
            }
        }
        else
        {
            if (c == ',')
            {
                // An unquoted comma ends the current part.
                retVal.Add( currentPart.ToString().Trim() );
                currentPart.Length = 0;
            }
            else if ( c == '"' )
            {
                withinQuotes = true;
            }
            else
            {
                currentPart.Append(c);
            }
        }
    }
    // Add the final part, which has no trailing comma.
    retVal.Add( currentPart.ToString().Trim() );
    return retVal.ToArray();
}
This version also fixes a bug whereby the last item in the
comma-separated list wasn't being added to the return array.
Anyway, this is the kind of simple solution I was talking about: easy
to read, easy to maintain. It's also easy to add refinements like
backslash-escapes for quote characters, etc.
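For what it's worth, here is a rough sketch of how the backslash-escape refinement might look. It is only one way to do it, not something from the original lexer: the class name CsvLexer and the method name GetStringPartsWithEscapes are made up for illustration, and I'm assuming that a backslash only has special meaning inside double quotes, where it makes the next character literal.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical wrapper class, not part of the original post.
public static class CsvLexer
{
    // Same loop as GetStringParts, with one assumed extra rule: inside
    // double quotes, a backslash makes the next character literal, so
    // \" yields a quote and \\ yields a backslash.
    public static string[] GetStringPartsWithEscapes(string inputString)
    {
        List<string> retVal = new List<string>();
        StringBuilder currentPart = new StringBuilder();
        bool withinQuotes = false;

        for (int i = 0; i < inputString.Length; i++)
        {
            char c = inputString[i];
            if (withinQuotes)
            {
                if (c == '\\' && i + 1 < inputString.Length)
                {
                    // Skip the backslash and take the next character as-is.
                    i++;
                    currentPart.Append(inputString[i]);
                }
                else if (c == '"')
                {
                    withinQuotes = false;
                }
                else
                {
                    currentPart.Append(c);
                }
            }
            else if (c == ',')
            {
                // An unquoted comma ends the current part.
                retVal.Add(currentPart.ToString().Trim());
                currentPart.Length = 0;
            }
            else if (c == '"')
            {
                withinQuotes = true;
            }
            else
            {
                currentPart.Append(c);
            }
        }

        // Add the final part, which has no trailing comma.
        retVal.Add(currentPart.ToString().Trim());
        return retVal.ToArray();
    }
}
```

Given the input line a, "b, \"c\"", d this should return three parts: a, then b, "c" (with the inner quotes kept), then d. A backslash that is the very last character of the input is simply kept as a backslash.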