read csv file to sql

G

Guest

I have a CSV file, like the one you can save in Excel.

I want to use C# .NET to read the file and place the data into a sql2k data
base. and then import another file that would update those same records.

Assume file1.csv file contains 5 fields separated by commas named a,b,c,d,e
and the sql data bse has 5 char fields named f1,f2,f3,f4,f5

I want field 'a' from the csv to be imported into field 'f1' of the sql db
and so on to 'e' goes to 'f5'.

Then I have file2.csv which has the same fields a thru e with data that may
or may not need updating to the db. Field 'a' of the csv file is unique for
that row. So I need to read file2.csv, check to see if the data in field 'a'
matches a row of data in field 'f1' of the db, if it finds a match, then
replace the db row with the new data from file2.csv. If no match then just
leave it alone.

Thanks You

Paul
 
K

Kevin Yu [MSFT]

Hi Paul,

First of all, I would like to confirm my understanding of your issue. From
your description, I understand that you need to import data from a CSV file
to SQL Server. If there is any misunderstanding, please feel free to let me
know.

The simplest way to achieve this goal, is to use SQL DTS. You can find
"Import and Export Data" from the Start menu, in SQL server folder. You can
customize the transfer process by setting the options in it.

If you need to do it through programming, you have to make the CSV file as
an ODBC data source first. We can add a new ODBC data source under
administrative tools/Data Sources(ODBC). Then we use an ODBC data provider
to get data from the CSV file into a DataSet, and update them to a SQL
server database using SQL data provider.

Since you need to f1 field to be unique, I suggest you make f1 field
primary key of the table. When transfering data, we can insert records one
by one in a loop. When the primary key constraint is violated, an exception
will be thrown, we just catch that exception and ignore it. Thus,
duplicated rows will be ignored.

HTH.

Kevin Yu
=======
"This posting is provided "AS IS" with no warranties, and confers no
rights."
 
G

Guest

OK, I am able to get the CSV file into a dataset like this:

string ConnectionString =
@"Driver={Microsoft Text Driver (*.txt; *.csv)};DBQ=c:\";
OdbcConnection conn = new OdbcConnection(ConnectionString);
conn.Open();
OdbcDataAdapter da = new OdbcDataAdapter("Select * FROM Test.txt", conn);
DataSet ds = new DataSet();
da.Fill(ds, "TheTable");
conn.Close();

So now what code do I use to get the dataset into an exsisting SQL table.

The primary field of the SQL table is not in the CSV file so I can't just
recreate the structure of the SQL table and do an update on the dataset.

Like you said I can loop through the dataset inserting each row into an SQL
table, while doing that I can check for the records I want to update.

Can you just show me the code you would use to loop through the dataset and
insert all the records of the dataset into the exsisting SQL table.
 
K

Kevin Yu [MSFT]

Hi Paul,

Since the f1 field cannot be the primary key field, we have to check for
existence first. Here I use a SELECT COUNT sql statement. If the record
exists, just ignore it. If it doesn't, insert it with INSERT statement.

Here's a sample code.

SqlCommand cmdCheck = new SqlCommand("SELECT COUNT(*) FROM Table1 Where
f1=@f1", cnn);
foreach(DataRow dr in ds.Tables["Table1"].Rows)
{
cmdCheck.Parameters.Clear();
SqlParameter p = cmdCheck.Parameters.Add("@f1", SqlDbType.NVarChar, 50);
p.Value = dr["a"];
int iCount = (int)cmdCheck.ExecuteScalar();
if(iCount == 0)
{
//run insert commands
}
}


Kevin Yu
=======
"This posting is provided "AS IS" with no warranties, and confers no
rights."
 
G

Guest

OK,

I has added sqlConn.open() but it is not working correctly.

Here is your code with a few of my mofifications, I run the querry directly
against the Users table with f1 being equal to 'paul' and it returns 1.

But when I step through the code the ExecuteScalar() line returns 0 when
p.value is 'paul', it should return 1.


SqlCommand cmdCheck = new SqlCommand("SELECT COUNT(*) FROM [User] where
First like @f1", sqlConn);

sqlConn.Open();

foreach(DataRow dr in dt.Rows)
{
cmdCheck.Parameters.Clear();
SqlParameter p = cmdCheck.Parameters.Add("@f1", SqlDbType.NVarChar, 50);
p.Value = dr["b"];
int iCount = (int)cmdCheck.ExecuteScalar();
if(iCount == 0)
{
//run insert commands
}
}
 
G

Guest

Ignore that last post, it works now. I was using 'like' to compare f1 because
that's what worked when running against the db, '=' did not work on the db.
but the way you had it with the '=' is what works inside the command. Why is
that?

Also, my connection string for getting the csv file:

string ConnectionString =
@"Driver={Microsoft Text Driver (*.txt; *.csv)};DBQ=c:\;Extended
Properties=""HDR=NO""";

The Extended properties HDR=NO does not work. set at yes or no, when I
assign the dataset to the source of a datagrid it always makes the first line
a header. Does the connection string look correct to you or is there
something missing.

Thanks



OK,

I has added sqlConn.open() but it is not working correctly.

Here is your code with a few of my mofifications, I run the querry directly
against the Users table with f1 being equal to 'paul' and it returns 1.

But when I step through the code the ExecuteScalar() line returns 0 when
p.value is 'paul', it should return 1.


SqlCommand cmdCheck = new SqlCommand("SELECT COUNT(*) FROM [User] where
First like @f1", sqlConn);

sqlConn.Open();

foreach(DataRow dr in dt.Rows)
{
cmdCheck.Parameters.Clear();
SqlParameter p = cmdCheck.Parameters.Add("@f1", SqlDbType.NVarChar, 50);
p.Value = dr["b"];
int iCount = (int)cmdCheck.ExecuteScalar();
if(iCount == 0)
{
//run insert commands
}
}

Kevin Yu said:
Hi Paul,

Since the f1 field cannot be the primary key field, we have to check for
existence first. Here I use a SELECT COUNT sql statement. If the record
exists, just ignore it. If it doesn't, insert it with INSERT statement.

Here's a sample code.

SqlCommand cmdCheck = new SqlCommand("SELECT COUNT(*) FROM Table1 Where
f1=@f1", cnn);
foreach(DataRow dr in ds.Tables["Table1"].Rows)
{
cmdCheck.Parameters.Clear();
SqlParameter p = cmdCheck.Parameters.Add("@f1", SqlDbType.NVarChar, 50);
p.Value = dr["a"];
int iCount = (int)cmdCheck.ExecuteScalar();
if(iCount == 0)
{
//run insert commands
}
}


Kevin Yu
=======
"This posting is provided "AS IS" with no warranties, and confers no
rights."
 
K

Kevin Yu [MSFT]

Hi Paul,

As far as I know, we can use '=' when comparing strings in SQL. I have
checked it in the Query Analyzer, and it works fine. Maybe the two strings
are not exact match or there might be something wrong with the collation
settings on SQL Server. For this issue I suggest you ask in the SQL Server
newsgroup.

You can also try to modify the connection string as the following:

string ConnectionString = @"Driver={Microsoft Text Driver (*.txt;
*.csv)};DBQ=c:\;HDR=YES;Extensions=asc,csv,tab,txt;HDR=YES;Persist Security
Info=False";

HTH.

Kevin Yu
=======
"This posting is provided "AS IS" with no warranties, and confers no
rights."
 
G

Guest

No matter which way I set HDR 'YES' or 'NO' the dataset displays the first
row of data in the header section of the grid.

I noticed HDR is located twice in the connection string, is that needed or
just a typo. I tried removing it from each section and I get the same results.
 
M

[MSFT]

Hello Paul,

I am reviewing this post. I also agree with Kevin's Connection string

"Driver={Microsoft Text Driver (*.txt;
*.csv)};DBQ=c:\;HDR=YES;Extensions=asc,csv,tab,txt;HDR=YES;Persist Security
Info=False";

Except the double "HDR=YES".

With this Connection string, it should be able to recognize the Column
Header

Can you post a sample .csv file you used? We can test with it to see what
happened.

Luke
 
G

Guest

My test.csv or test.txt file looks like this.

I change HDR= to YES or NO and the datagrid still shows the first row as
the header.

3,fred,Brown,Engineer
7,mike,fat,programmer
8,ken,small,janitor

I don't know how to upload the actual file in the newsgroup.
 
K

Kevin Yu [MSFT]

Hi Paul,

Sorry, it was my fault to add another HDR in the connection string. It has
to appear only once. I tested it on my machine, however, it works fine if I
set HDR=NO. So if that still doesn't work, I suggest you manually add a
header row for workaround. HTH.

Kevin Yu
=======
"This posting is provided "AS IS" with no warranties, and confers no
rights."
 
K

Kevin Yu [MSFT]

Hi Paul,

With my further research, we can add a schema.ini file in the same folder
as the csv file. Here is an example. Assume that the file name is a.csv.

[a.csv]
ColNameHeader=False
Format=CSVDelimited
MaxScanRows=0
CharacterSet=ANSI

We can change the ColNameHeader value to true to enable the column header,
and set it to false to disable it. HTH.

Kevin Yu
=======
"This posting is provided "AS IS" with no warranties, and confers no
rights."
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top