Handling nulls in tab-delimited string?

Vagabond Software

Apparently, the Split method handles consecutive tabs as a single delimiter. Does anyone have any suggestions for handling consecutive tabs?

I am reading in text files that contain lines of tab-delimited data. I was using string[] stringArray = lineOfText.Split('\t') to automatically populate an array used to populate the values in a new DataRow.

However, sometimes the lines of text contain null values. I can find these null values by opening the file in a text editor and using the arrow keys to locate consecutive tabs with, presumably, a null value between them.

Any help is greatly appreciated.

- VS
 
Jon Skeet [C# MVP]

Vagabond Software said:
Apparently, the Split method handles consecutive tabs as a single
delimiter.

String.Split doesn't. For instance:

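using System;

public class Test
{
    static void Main()
    {
        // Two consecutive tabs: an empty field sits between "a" and "b".
        string x = "a\t\tb";
        // Split on the tab character, as in the original question.
        string[] values = x.Split('\t');
        foreach (string value in values)
        {
            Console.WriteLine(value);
        }
    }
}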

prints "a" then a blank line (the empty value) then "b".
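
Since the goal in the original question is to populate a DataRow, one possible way to turn those empty fields into nulls is sketched below. This is only a sketch and not code from the thread: the TabRow class and ToRowValues method are hypothetical names, and mapping an empty string to DBNull.Value is an assumption about what "null" should mean for this data.

```csharp
using System;

public static class TabRow
{
    // Hypothetical helper: splits one tab-delimited line and maps empty
    // fields to DBNull.Value so they can be stored in a DataRow as nulls.
    public static object[] ToRowValues(string lineOfText)
    {
        // Split('\t') keeps empty entries, so consecutive tabs yield "".
        string[] fields = lineOfText.Split('\t');
        object[] values = new object[fields.Length];
        for (int i = 0; i < fields.Length; i++)
        {
            // Assumption: an empty field means "no value" in the source file.
            values[i] = fields[i].Length == 0 ? (object)DBNull.Value : fields[i];
        }
        return values;
    }
}
```

The resulting array could then be passed to table.Rows.Add(values), assuming the table has at least as many columns as the line has fields.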
 
Vagabond Software

Thanks for the reply. I discovered late last night that it was I who was mishandling the consecutive tabs and not the Split method.

- VS
 
Bob Grommes

As you've already discovered, Split() does work correctly for the situation
you mentioned; however one thing it doesn't work correctly for is
quote-delimited strings containing the field delimiter itself. For example,
in a comma-delimited file:

Joe,Schumuck,"123 Maple St, Apt 4",Anytown,CA,90000

This would be mis-parsed, because Split() is not designed to recognize that a
delimiter inside a quoted string is not a field separator.

You would have to write your own routine to handle situations like that.

I mention this in case quoted strings are possible in your input.
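
A routine of that kind might look roughly like the sketch below. It is not a complete CSV parser: the QuotedSplitter class is a hypothetical name, and treating a doubled quote ("") inside a quoted field as one literal quote is an assumed convention that your files may or may not follow.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class QuotedSplitter
{
    // Hypothetical helper: splits a line on the delimiter, except where the
    // delimiter appears inside double quotes. The surrounding quotes are
    // dropped from the returned fields.
    public static List<string> Split(string line, char delimiter)
    {
        List<string> fields = new List<string>();
        StringBuilder current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (c == '"')
            {
                if (inQuotes && i + 1 < line.Length && line[i + 1] == '"')
                {
                    current.Append('"'); // doubled quote inside a quoted field
                    i++;
                }
                else
                {
                    inQuotes = !inQuotes; // opening or closing quote
                }
            }
            else if (c == delimiter && !inQuotes)
            {
                fields.Add(current.ToString()); // end of field (may be empty)
                current.Length = 0;
            }
            else
            {
                current.Append(c);
            }
        }

        fields.Add(current.ToString()); // last field on the line
        return fields;
    }
}
```

With the sample line above, QuotedSplitter.Split(line, ',') would return six fields, the third being 123 Maple St, Apt 4 without the quotes. Empty fields between consecutive delimiters are still kept as empty strings.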

--Bob

 
Vagabond Software

I certainly appreciate the tip. Though I have not run into quoted data yet, there are literally thousands of the client's files I have not yet run through my parser. There's no telling what I might find.

- VS
 
