CSV files

Cat

I don't understand why there's no class included in the libraries for
reading CSV files. I've created my own CSV reader class which reads a CSV
file, generates a report and returns records etc. Although I'm proud of
having tackled the problem and produced code that works I worry that I could
have saved a lot of time if I could have just found that class in the
library which I'm convinced must work.

Does anyone have an explanation as to why there's no such class?

Cat
 
Nicholas Paldino [.NET/C# MVP]

Cat,

You can use the classes in the System.Data.OleDb namespace. Basically,
you will have to use a provider for OLEDB (I believe there is one for comma
delimited text, or any delimited text in general) which will do what you
want. Using that provider, you can query the data like you would any other
data and get the results in a DataReader, or a DataSet.

Hope this helps.
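
For anyone who has not used OleDb for this before, here is a minimal sketch of the approach described above. It assumes the Microsoft Jet 4.0 OLE DB provider and its text driver are installed (Windows only), and that the file has a header row. The class and method names (OleDbCsvSketch, BuildConnectionString, Load) are made up for illustration and are not part of the framework.

```csharp
using System;
using System.Data;
using System.Data.OleDb;
using System.IO;

static class OleDbCsvSketch
{
    // Builds a connection string for the Jet text driver (an assumption:
    // the provider must be installed). The data source is the folder that
    // holds the file, not the file itself.
    public static string BuildConnectionString(string folder, bool hasHeaderRow)
    {
        return "Provider=Microsoft.Jet.OLEDB.4.0;"
             + "Data Source=" + folder + ";"
             + "Extended Properties=\"text;HDR=" + (hasHeaderRow ? "Yes" : "No")
             + ";FMT=Delimited\"";
    }

    // Loads the whole file into a DataTable. The file name plays the role
    // of the table name in the query.
    public static DataTable Load(string csvPath, bool hasHeaderRow)
    {
        string folder = Path.GetDirectoryName(Path.GetFullPath(csvPath));
        string file = Path.GetFileName(csvPath);
        DataTable table = new DataTable();

        using (OleDbConnection connection =
                   new OleDbConnection(BuildConnectionString(folder, hasHeaderRow)))
        using (OleDbDataAdapter adapter =
                   new OleDbDataAdapter("SELECT * FROM [" + file + "]", connection))
        {
            // Fill opens and closes the connection itself.
            adapter.Fill(table);
        }
        return table;
    }
}
```

Calling OleDbCsvSketch.Load("C:\\myFile.csv", true) should then give a DataTable you can bind or loop over. As far as I know the driver guesses column types from the data, so check the resulting column types before relying on them.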
 
Konrad L. M. Rudolph

Cat said:
Does anyone have an explanation as to why there's no such class?

Perhaps because the format is not /that/ often used, and it's fairly
simple, so 30 lines of code are sufficient to parse it?
 
Cat

I wouldn't say the code is fairly simple, since there's a lot of error
checking to be done and decisions to be made depending on where quotation
marks and commas appear in the csv file. Still, I am very much a newbie to
C# and commercial programming in general so I may have a very different
opinion in a year or so.

As for csv files not being used so often - unfortunately due to the nature
of the company I work for I'll be dealing with files like that quite a lot
:(
 
Cat

Thanks for the advice. I still haven't quite got a handle on how to work
with OleDb, but I'll no doubt work through some of the MSDN tutorials in
the near future.

 
Mark Broadbent

Hehe LMAO.
Guess what, I did exactly the same thing. It was good coding practice anyhow,
and importantly it was to go through the jobserve.csv file (any guesses why?!)

Then one day I stumbled upon some code and thought "you plum!". The
following code came from the MCAD/MCSD Developing Windows-Based
Applications book from MS Press (don't buy it though; the dual VB.NET/C#
training is a pain in the proverbial). The most important line is the very
last one in the listing.

// This example assumes the existence of a text file named myFile.txt
// that contains an undetermined number of rows with seven entries
// in each row. Creates a new DataSet
DataSet myDataSet = new DataSet();
// Creates a new DataTable and adds it to the Tables collection
DataTable aTable = new DataTable("Table 1");
myDataSet.Tables.Add(aTable);
// Creates and names seven columns and adds them to Table 1
DataColumn aColumn;
for (int counter = 0; counter < 7; counter++)
{
    aColumn = new DataColumn("Column " + counter.ToString());
    myDataSet.Tables["Table 1"].Columns.Add(aColumn);
}
// Creates the StreamReader to read the file and a string variable to
// hold the output of the StreamReader
System.IO.StreamReader myReader = new System.IO.StreamReader("C:\\myFile.txt");
string myString;
// Checks to see if the Reader has reached the end of the stream
while (myReader.Peek() != -1)
{
    // Reads a line of data from the text file
    myString = myReader.ReadLine();
    // Uses the String.Split method to create an array of strings that
    // represents each entry in the line. That array is then added as
    // a new DataRow to Table 1
    myDataSet.Tables["Table 1"].Rows.Add(myString.Split(','));
}

--

--

Br,
Mark Broadbent
mcdba , mcse+i
=============
 
Cat

The problem with this code is that it wouldn't know how to deal with a line
like this:

"First Name, Last Name", "Address1, Address2, Address3, Code", Telephone number

- the code I've been writing has to take into account the fact that commas
can appear in field data and that not all fields will have surrounding
quotation marks. It has to check for errors too, like quotation marks
appearing mid field when the field data did or didn't start with a quotation
mark. Annoyingly, that makes the file reading code a bit more complex.
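
To make those rules concrete, here is a rough sketch of a character-by-character splitter for a single line. It is not Cat's class (which was never posted), and it assumes one particular dialect: comma separator, optional double quotes around a field, a doubled quote ("") for a literal quote inside a quoted field, and whitespace trimmed from unquoted fields. The names CsvLineSketch and SplitLine are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class CsvLineSketch
{
    // Splits one line into fields under the dialect assumed above.
    public static List<string> SplitLine(string line)
    {
        List<string> fields = new List<string>();
        StringBuilder field = new StringBuilder();
        bool inQuotes = false;   // currently inside a quoted field
        bool wasQuoted = false;  // the current field started with a quote
        bool afterQuote = false; // closing quote seen, waiting for a comma

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (inQuotes)
            {
                if (c != '"') { field.Append(c); }
                else if (i + 1 < line.Length && line[i + 1] == '"')
                {
                    field.Append('"'); // doubled quote becomes one literal quote
                    i++;
                }
                else { inQuotes = false; afterQuote = true; }
            }
            else if (c == ',')
            {
                fields.Add(wasQuoted ? field.ToString() : field.ToString().Trim());
                field.Length = 0;
                wasQuoted = false;
                afterQuote = false;
            }
            else if (afterQuote)
            {
                // Only spaces may sit between a closing quote and the comma.
                if (c != ' ')
                    throw new FormatException("Text after closing quote at column " + (i + 1));
            }
            else if (c == '"')
            {
                // A quote is only legal before any other field data.
                if (field.ToString().Trim().Length > 0)
                    throw new FormatException("Quote in the middle of an unquoted field at column " + (i + 1));
                field.Length = 0;
                inQuotes = true;
                wasQuoted = true;
            }
            else { field.Append(c); }
        }

        if (inQuotes) throw new FormatException("Unterminated quoted field");
        fields.Add(wasQuoted ? field.ToString() : field.ToString().Trim());
        return fields;
    }
}
```

Fed the example line above, this yields three fields, with the commas inside the quoted name and address kept as data. Quoted fields that span several lines are not handled, since the sketch only ever sees one line at a time.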

 
Mark Broadbent

I see. :(
You could potentially split on commas (refer to code sample) and, for each
string in the string array returned, split again on ", queuing it all up and
then combining it all together. Don't know how much error checking you'd need
but obviously it would still be a requirement.
OLEDB classes might be worth a look for this; however, it's probably easier to
stick with a custom class.

 
Daniel O'Connell [C# MVP]

Cat said:
The problem with this code is that it wouldn't know how to deal with a line
like this:

"First Name, Last Name", "Address1, Address2, Address3, Code", Telephone number

- the code I've been writing has to take into account the fact that commas
can appear in field data and that not all fields will have surrounding
quotation marks. It has to check for errors too, like quotation marks
appearing mid field when the field data did or didn't start with a quotation
mark. Annoyingly, that makes the file reading code a bit more complex.

And that may very well hit on the head a good part of why it's not
included (leaving aside, of course, that OleDb supports it already). One of
the most annoying things about CSV files is the lack of standardization.
Everything varies somewhat between files, from parsing expectations (quotes
required, quotes required only when containing commas, quotes not allowed)
down to any specific escape sequences your file may use so that quotes or
newlines may appear. Writing a generic reader that reads all of the
combinations correctly may not be possible in a way that makes the generic
reader worth the time. The last thing you want to do is provide a generic
reader that no one is happy with.
 
Konrad L. M. Rudolph

Mark said:
I see. :(
You could potentially split on commas (refer to code sample) and, for each
string in the string array returned, split again on ", queuing it all up and
then combining it all together. Don't know how much error checking you'd need
but obviously it would still be a requirement.

I don't want to interrupt ...

but has either of you considered using regex to parse CSV?
 
Mark Broadbent

No, but next time I need to parse a CSV I shall certainly look into it.
Cheers.

 
Cat

Konrad L. M. Rudolph said:
I don't want to interrupt ...

but has either of you considered using regex to parse CSV?

I did consider regex, but if you read the posting on this thread by Daniel
O'Connell, that might give you an idea of why I decided not to use it. When
field data can be surrounded by quotation marks or not (and CSV files can come
in different forms), regular expressions don't really work. (I think. Correct
me if I'm wrong.) I'd be interested to see a solution to the CSV problem using
regular expressions, however, so if you have any example code please post it.

Cat
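
Nobody in the thread posted a regex version, so here is one possible sketch rather than anyone's actual solution. It only copes with a single fixed dialect (comma separator, optional double quotes, "" as an escaped quote, spaces allowed around a quoted field), which is exactly the limitation raised above: a different dialect needs a different pattern. The names CsvRegexSketch, FieldPattern and SplitLine are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class CsvRegexSketch
{
    // Each match is one field together with the comma in front of it, so
    // SplitLine prepends a comma to the line before matching. A field is
    // either quoted (with "" as an escaped quote) or plain text that
    // contains neither quotes nor commas.
    static readonly Regex FieldPattern = new Regex(
        ",[ ]*(?:\"(?<quoted>(?:[^\"]|\"\")*)\"[ ]*|(?<plain>[^\",]*))(?=,|$)");

    public static List<string> SplitLine(string line)
    {
        string input = "," + line;
        List<string> fields = new List<string>();
        int expectedStart = 0;

        foreach (Match m in FieldPattern.Matches(input))
        {
            // A gap between matches means some text fit neither alternative,
            // for example a quote in the middle of a plain field.
            if (m.Index != expectedStart)
                throw new FormatException("Malformed field near column " + expectedStart);
            expectedStart = m.Index + m.Length;

            if (m.Groups["quoted"].Success)
                fields.Add(m.Groups["quoted"].Value.Replace("\"\"", "\""));
            else
                fields.Add(m.Groups["plain"].Value.Trim());
        }

        // Anything left over after the last match is also an error.
        if (expectedStart != input.Length)
            throw new FormatException("Malformed field near column " + expectedStart);
        return fields;
    }
}
```

The pattern alone would silently skip text it cannot match, which is why the code checks that the matches sit back to back and cover the whole line. That check is most of the error handling, so the regex does not save much over a hand-written loop.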
 
Cat

Yes, it's all becoming a bit clearer now. Regarding the OleDb support
(which, I'll admit, I know very little about): would I be correct in
assuming that the data gets imported and converted to whatever type seems
fitting? So, for example, if 23.2 is found in a field in the text file then
this is converted to a double and so on. Or am I barking up the wrong tree
completely?
 
Daniel O'Connell [C# MVP]

That sounds right, but I can't say for sure. I've never actually used OleDb
programmatically to read CSV files. I've just used it a few times in programs
that support OleDb connections to import data.
 
