Is there an escape sequence for Space char? for space delimited tx

Rich

I need to read a large space-delimited text file. I can do this using a
StreamReader, except it takes twice as long as an OleDbDataAdapter (using the
following delimiters: tab, comma, | pipe). My problem is in using a
space as the delimiter for reading a space-delimited text file using OleDB.
Here are some sample connection strings for a tab-delimited or pipe-delimited
text file (both of which work fine):

string s1 = Application.StartupPath;

// Tab-delimited
connOle.ConnectionString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source="
    + s1 + ";Extended Properties=\"text;HDR=Yes;FMT=TabDelimited\"";

// Pipe-delimited
connOle.ConnectionString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source="
    + s1 + ";Extended Properties=\"text;HDR=Yes;FMT=Delimited(|)\"";

Note: OleDB also requires a schema.ini file to be placed in the same folder
as the text file to be read.

Save the following as schema.ini:

[fileName.extension]
ColNameHeader=true
CharacterSet=ANSI
Format=Delimited(|)

I have tried variations for space delimiting without success:

connOle.ConnectionString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source="
    + s1 + ";Extended Properties=\"text;HDR=Yes;FMT=Delimited(' ')\"";

// Here I try a hex escape sequence, which works with Console.WriteLine() but
// not with OleDB
connOle.ConnectionString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source="
    + s1 + ";Extended Properties=\"text;HDR=Yes;FMT=Delimited(\x20)\"";

Any suggestions would be greatly appreciated for an escape sequence for a
space delimiter.

Thanks,
Rich
 
Rich

Thanks. This does look promising. And you say this has performance like the
Jet method? How about the delimiter? Can I say

parser.ColumnDelimiter = " ".ToCharArray();

for a space delimiter?
 
Peter Duniho

Rich said:
> Thanks. This does look promising. And you say this has performance like the
> Jet method? How about the delimiter? Can I say
>
> parser.ColumnDelimiter = " ".ToCharArray();
>
> for a space delimiter?

I can't answer the parser-specific aspect (though, seems like that's
something you could just try), but if you can do the above, wouldn't you
prefer this instead:

parser.ColumnDelimiter = new char[] { ' ' };

?

Why create a whole string instance only to just turn around and then
create a new array based on it, when you can just create the array directly?

Pete
 
Jeff Johnson

> Thanks. This does look promising. And you say this has performance like
> the Jet method?

The performance is spectacular. I've never done any side-by-side
comparisons, but I know this thing is fast.
 
Rich

I started experimenting with this sample project. I noticed that a
StreamReader is being used here, along with the creation of a schema.ini file
like the Jet technique. It looks to me like the Jet technique wraps up all of
its coding into a one-liner, where the underlying Jet code is probably similar
to the code being used in this sample. And I guess the benefit of the code in
this sample is that it can be modified, whereas the Jet code can't.

The downside with this sample, for me, is the learning curve. I will have
to study this a bit. And then once I compile the class I would have to
reference it, adding a dependency to my project.

It looks like, for the time being, I will resign myself to my elementary usage
of StreamReader. The Jet technique would be nice because it is a one-liner,
but alas! It does not seem to support a space as a delimiter.
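
On the space delimiter itself: Microsoft's schema.ini documentation for the text driver gives the custom form as Format=Delimited(custom delimiter) and states that any character except the double quotation mark is allowed, including a blank. If that applies to the Jet 4.0 provider used here, the space would go in schema.ini as a literal blank between the parentheses, not as an escape sequence in the connection string. The fragment below is an untested sketch under that assumption; the section name is a placeholder for the real file name.

```ini
[fileName.extension]
ColNameHeader=True
CharacterSet=ANSI
Format=Delimited( )
```

The connection string would then stay as in the working tab and pipe samples, since the driver reads the per-file format from schema.ini.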
 
Jeff Johnson

> I started experimenting with this sample project. I noticed that a
> StreamReader is being used here, along with the creation of a schema.ini
> file like the Jet technique. It looks to me like the Jet technique wraps up
> all of its coding into a one-liner, where the underlying Jet code is
> probably similar to the code being used in this sample. And I guess the
> benefit of the code in this sample is that it can be modified, whereas the
> Jet code can't.
>
> The downside with this sample, for me, is the learning curve. I will have
> to study this a bit. And then once I compile the class I would have to
> reference it, adding a dependency to my project.
>
> It looks like, for the time being, I will resign myself to my elementary
> usage of StreamReader. The Jet technique would be nice because it is a
> one-liner, but alas! It does not seem to support a space as a delimiter.

...learning curve? It should be about as simple as

TextParserAdapter parser = new TextParserAdapter(@"<path>\SpaceDelimFile.txt");
parser.ColumnDelimiter = new char[] { ' ' };

DataTable dt = parser.GetDataTable();

And then you just work with the data in the DataTable like you would with
data from any other data source. Now I've made a lot of modifications to
that library over time, but I think the code I have right there should work
out-of-the-box.
 
Rich

Here is what I mean by learning curve:

> ...learning curve? It should be about as simple as
>
> TextParserAdapter parser = new TextParserAdapter(@"<path>\SpaceDelimFile.txt");
> parser.ColumnDelimiter = new char[] { ' ' };
>
> DataTable dt = parser.GetDataTable();

I compiled GenericParsing, added a reference to the GenericParsing library
to my project, and also added a using directive (using GenericParsing;).

But TextParserAdapter does not show up in IntelliSense, and if I
just type it, VS 2008 complains. Is it because GenericParsing is from
VS 2003? How do I implement this in my (VS 2008 C#) project?

Thanks
 
Jeff Johnson

> Here is what I mean by learning curve:
>
> > ...learning curve? It should be about as simple as
> >
> > TextParserAdapter parser = new TextParserAdapter(@"<path>\SpaceDelimFile.txt");
> > parser.ColumnDelimiter = new char[] { ' ' };
> >
> > DataTable dt = parser.GetDataTable();
>
> I compiled GenericParsing, added a reference to the GenericParsing library
> to my project, and also added a using directive (using GenericParsing;).
>
> But TextParserAdapter does not show up in IntelliSense, and if I
> just type it, VS 2008 complains. Is it because GenericParsing is from
> VS 2003? How do I implement this in my (VS 2008 C#) project?

Well, crap. It appears I didn't like "GenericParser" and renamed it
"TextParser". (I must have thought it was too "generic"....)

So use GenericParserAdapter instead and see if that works.
 
Rich

Well, I tried the following, but VS complained at the DataTable part. In the
demo project I did not see any GetDataTable() methods.

GenericParsing.GenericParser parser = new GenericParsing.GenericParser(s1);

DataTable dt = parser.GetDataTable();



 
Rich

OK. I am kind of a lamo, but I finally tried this, which sort of worked, up to
a point:

GenericParsing.GenericParserAdapter parser =
    new GenericParsing.GenericParserAdapter(s1);
parser.ColumnDelimiter = new char[] { ' ' };
DataTable dt = parser.GetDataTable();

Console.WriteLine(dt.Rows.Count.ToString());

dgrv1.DataSource = dt;

The DataGridView (dgrv1) complained that the column fill width could not
exceed 65535 (or some number like that).

I am sure that the parser read the text file (this text file only had 52,000
rows) because it took about 30 seconds to load, which is way quicker than
my StreamReader routine. But the Console.WriteLine above only reported 1
row for dt.Rows.Count.

I think I like the performance. Just getting it to work correctly is
a little bit challenging (for me).

 
Rich

Yay! I got it to work. It turns out that my text file with the 52,000 rows was
of this type:

"abc" "def" "ghi" "jkl"
"abc" "def" "ghi" "jkl"
"abc" "def" "ghi" "jkl"
....

with double quotes surrounding the text. It must have been generated with
VBA. Anyway, the StreamReader in my original routine reads the double
quotes OK; I was just doing a .Replace(..."\"","") for each piece of data.
Once I understand the workings of GenericParser I could probably add a
.Replace to it (somewhere).
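
The StreamReader route for a quoted, space-delimited file like the one above can be sketched roughly as follows. This is only an illustration, not how GenericParser or Jet does it: the names SpaceDelimitedReader, SplitLine and Read are made up for the example, runs of spaces are treated as a single delimiter, and double quotes only group text into one field (no escaped quotes inside a field).

```csharp
using System.Collections.Generic;
using System.Data;
using System.IO;
using System.Text;

// Hypothetical helper for illustration; not part of GenericParser or Jet.
public static class SpaceDelimitedReader
{
    // Splits one line on spaces. Double quotes group text into one field
    // and are dropped from the result; runs of spaces count as one delimiter.
    public static List<string> SplitLine(string line)
    {
        List<string> fields = new List<string>();
        StringBuilder current = new StringBuilder();
        bool inQuotes = false;
        bool fieldStarted = false;

        foreach (char c in line)
        {
            if (c == '"')
            {
                inQuotes = !inQuotes;
                fieldStarted = true; // "" still yields an empty field
            }
            else if (c == ' ' && !inQuotes)
            {
                if (fieldStarted)
                {
                    fields.Add(current.ToString());
                    current.Length = 0;
                    fieldStarted = false;
                }
            }
            else
            {
                current.Append(c);
                fieldStarted = true;
            }
        }

        if (fieldStarted)
        {
            fields.Add(current.ToString());
        }
        return fields;
    }

    // Reads every non-blank line into a DataTable of string columns.
    public static DataTable Read(TextReader reader, bool firstRowHasHeader)
    {
        DataTable table = new DataTable();
        bool headerPending = firstRowHasHeader;
        string line;

        while ((line = reader.ReadLine()) != null)
        {
            List<string> fields = SplitLine(line);
            if (fields.Count == 0) continue; // skip blank lines

            if (headerPending)
            {
                // Assumes header names are unique; duplicates would throw.
                foreach (string name in fields) table.Columns.Add(name, typeof(string));
                headerPending = false;
                continue;
            }

            // Add generic columns if this row is wider than the table so far.
            while (table.Columns.Count < fields.Count)
            {
                table.Columns.Add("Column" + (table.Columns.Count + 1), typeof(string));
            }
            table.Rows.Add(fields.ToArray());
        }
        return table;
    }
}
```

With a file on disk, the call would be something like using (StreamReader sr = new StreamReader(path)) { DataTable dt = SpaceDelimitedReader.Read(sr, false); } where path is the full path to the text file. Handling the quotes while splitting avoids a separate .Replace pass over every value and keeps a quoted value that contains a space in one column.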

 
Jeff Johnson

> Yay! I got it to work. It turns out that my text file with the 52,000 rows
> was of this type:
>
> "abc" "def" "ghi" "jkl"
> "abc" "def" "ghi" "jkl"
> "abc" "def" "ghi" "jkl"
> ...
>
> with double quotes surrounding the text. It must have been generated with
> VBA. Anyway, the StreamReader in my original routine reads the double
> quotes OK; I was just doing a .Replace(..."\"","") for each piece of data.
> Once I understand the workings of GenericParser I could probably add a
> .Replace to it (somewhere).

parser.TextQualifier = '"'; // <-- apostrophe quotation-mark apostrophe
 
