Import large text file to a MS Access database

M

mfrsousa

hi there,

i have a huge large text file (350.000 lines) that i want to import to
a MS Acccess Database, of course i don't want to use Access, but do it
with C#.

i already have tried the AddRow method or Insert, reading each line of
the text file, the problem of course is velocity, it would take more
than 4 hours to add all lines/records to the database.

any suggestion?


than you
 
R

Rad [Visual C# MVP]

hi there,

i have a huge large text file (350.000 lines) that i want to import to
a MS Acccess Database, of course i don't want to use Access, but do it
with C#.

i already have tried the AddRow method or Insert, reading each line of
the text file, the problem of course is velocity, it would take more
than 4 hours to add all lines/records to the database.

any suggestion?

than you

A bit more info would help ... do you want each line to go into its own
row in the database or all the lines into one field
 
M

Mark Rae

i have a huge large text file (350.000 lines) that i want to import to
a MS Acccess Database, of course i don't want to use Access, but do it
with C#.

i already have tried the AddRow method or Insert, reading each line of
the text file, the problem of course is velocity, it would take more
than 4 hours to add all lines/records to the database.

First of all, there is no such thing as an "Access database" per se -
Microsoft Access, like several other Micrsoft applications, uses the Jet
database format...

You say you don't want to use Access - is there any particular reason for
this...? I strongly suggest you open up the database in Access, open the
table in question, then import the text file using the Get External Data
functionality and time how long it takes - I'm willing to bet it will be
nothing like 4 hours...

Obviously if Access isn't installed, then your options are pretty much
limited to ADO.NET and the Jet OleDb provider. However, you might like to
consider using transactions to "batch" the imports into groups of, say
10,000 records - I've achieved massive speed gains with this technique in
the past...
 
R

rossum

hi there,

i have a huge large text file (350.000 lines) that i want to import to
a MS Acccess Database, of course i don't want to use Access, but do it
with C#.

i already have tried the AddRow method or Insert, reading each line of
the text file, the problem of course is velocity, it would take more
than 4 hours to add all lines/records to the database.

any suggestion?

For a bulk load of a database, such as you describe, it is often
better to switch off indexing on the affected tables before starting
the load and then rebuilding the indexes from scratch after the bulk
load has finished.

rossum
 
M

mfrsousa

rossum said:
For a bulk load of a database, such as you describe, it is often
better to switch off indexing on the affected tables before starting
the load and then rebuilding the indexes from scratch after the bulk
load has finished.

rossum

Right i was just reading this bulk thing (sorry my ignorance), the
problem is that i have tried something like this:


/************************************** CODE
*******************************/
string conn = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=|
DataDirectory|\\db.mdb";
OleDbConnection oleconn = new OleDbConnection(conn);
oleconn.Open();

string sqlstr = "BULK INSERT INTO tblExtData (ext_disc, ext_num,
ext_track, ext_artist) FROM 'c:\\data1.lst')";
OleDbCommand olecom = new OleDbCommand(sqlstr, oleconn);
olecom.ExecuteNonQuery();


/************************************** CODE
*******************************/


and i get this error

Invalid SQL statement; expected 'DELETE', 'INSERT', 'PROCEDURE',
'SELECT', or 'UPDATE'.


i think that the OleDB doesnt support the bulk command

could you point to some bulk reading or explain better please

thanks
 
M

mfrsousa

Mark said:
First of all, there is no such thing as an "Access database" per se -
Microsoft Access, like several other Micrsoft applications, uses the Jet
database format...

You say you don't want to use Access - is there any particular reason for
this...? I strongly suggest you open up the database in Access, open the
table in question, then import the text file using the Get External Data
functionality and time how long it takes - I'm willing to bet it will be
nothing like 4 hours...

Obviously if Access isn't installed, then your options are pretty much
limited to ADO.NET and the Jet OleDb provider. However, you might like to
consider using transactions to "batch" the imports into groups of, say
10,000 records - I've achieved massive speed gains with this technique in
the past...

Yes i'm developing an applicattion so i don't expect that everyone
will have the MAccess installed to make this operation. I will have a
look on the batch imports.

Thanks
 
R

rossum

Right i was just reading this bulk thing (sorry my ignorance), the
problem is that i have tried something like this:


/************************************** CODE
*******************************/
string conn = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=|
DataDirectory|\\db.mdb";
OleDbConnection oleconn = new OleDbConnection(conn);
oleconn.Open();

string sqlstr = "BULK INSERT INTO tblExtData (ext_disc, ext_num,
ext_track, ext_artist) FROM 'c:\\data1.lst')";
OleDbCommand olecom = new OleDbCommand(sqlstr, oleconn);
olecom.ExecuteNonQuery();


/************************************** CODE
*******************************/


and i get this error

Invalid SQL statement; expected 'DELETE', 'INSERT', 'PROCEDURE',
'SELECT', or 'UPDATE'.


i think that the OleDB doesnt support the bulk command

could you point to some bulk reading or explain better please

thanks
I do not know MS SQL, all my experience was with Oracle SQL some time
ago.

Basically the sequence is:

Delete all indexes for tables to be loaded
Load data into tables
Rebuild/create indexes for tables just loaded

If you load data into an indexed table then each line inserted into
the table also requires an update to the index(es) for that table. By
dropping all the indexing before a bulk load you do not have the
overhead of re-indexing during the run.

Rebuilding the indexes from scratch is faster than trying to keep them
up to date while loading 350,000 lines into a table.

HTH

rossum
 
R

RobinS

mfrsousa said:
Right i was just reading this bulk thing (sorry my ignorance), the
problem is that i have tried something like this:


/************************************** CODE
*******************************/
string conn = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=|
DataDirectory|\\db.mdb";
OleDbConnection oleconn = new OleDbConnection(conn);
oleconn.Open();

string sqlstr = "BULK INSERT INTO tblExtData (ext_disc, ext_num,
ext_track, ext_artist) FROM 'c:\\data1.lst')";
OleDbCommand olecom = new OleDbCommand(sqlstr, oleconn);
olecom.ExecuteNonQuery();


/************************************** CODE
*******************************/


and i get this error

Invalid SQL statement; expected 'DELETE', 'INSERT', 'PROCEDURE',
'SELECT', or 'UPDATE'.


i think that the OleDB doesnt support the bulk command

could you point to some bulk reading or explain better please

thanks

You are correct, I do not believe Access supports BULK Insert. SQLServer
does, but that's not much help to you.

Your choices are to try to figure out how to use OLE Automation to get
Access to import the file, or to write a line-by-line import. (First option
will be much faster). You might try posting your question to
microsoft.public.framework.adonet unless there's someone over there who can
help you who doesn't read this forum.

Robin S.
 
M

Mark Rae

If you load data into an indexed table then each line inserted into
the table also requires an update to the index(es) for that table. By
dropping all the indexing before a bulk load you do not have the
overhead of re-indexing during the run.

Or perform the writes within a transaction - indexes are only updated when
the transaction is committed...
 
M

mfrsousa

You are correct, I do not believe Access supports BULK Insert. SQLServer
does, but that's not much help to you.

Your choices are to try to figure out how to use OLE Automation to get
Access to import the file, or to write a line-by-line import. (First option
will be much faster). You might try posting your question to
microsoft.public.framework.adonet unless there's someone over there who can
help you who doesn't read this forum.

Robin S.

thanks
 
R

RobinS

Does Access support transactions? I didn't think it did. The real answer to
his question is "use SQLServer instead".

Robin S.
---------------------------------------------
 
C

Colin Stutley

Going back in time to ADO we created a link to the text file (via Odbc Text
driver) and then issued a INSERT/SELECT FROM to copy the data from the link
into database proper ... for example;

CREATE "ADOX.Catalog" w_Catalog NO-ERROR.
w_Catalog:ActiveConnection = "<connection-string-to-mdb-database>".
CREATE "ADOX.Table" w_Table.
w_Table:Name = "zzData".
w_Table:parentCatalog = w_Catalog.
w_Table:properties("Jet OLEDB:Link Datasource") =
"<path-in-which-test-file-is-located>".
w_Table:properties("Jet OLEDB:Remote Table Name") = "<name-of-test-file,
eg WSReport#txt>".
w_Table:properties("Jet OLEDB:Create Link") = True.
w_Table:properties("Jet OLEDB:Link Provider String")= "TEXT;HDR=No".
w_Catalog:Tables:APPEND(w_Table).
.... (and then )
CREATE "ADODB.Connection" w_DB NO-ERROR.
w_Sql = "INSERT INTO Booking (<fields>) SELECT <fields> FROM zzData WHERE
<filter-if-required>".
w_DB:EXECUTE(w_Sql, OUTPUT w_Cnt, 0).
w_Sql = 'DROP TABLE zzData'.
w_DB:EXECUTE(w_Sql, OUTPUT w_Cnt, 0).

The code is from a non .Net 4gl, so "some" conversion is required, I have
placed <> around most of the parameter elements - hopefully it will give you
some ideas. The down side is the use of the ADOX com for the schema link
(the "INSERT/SELECT FROM" statement you can do through managed code) - not
sure how much of an issue it will be though.
 
M

mfrsousa

Going back in time to ADO we created a link to the text file (via Odbc Text
driver) and then issued a INSERT/SELECT FROM to copy the data from the link
into database proper ... for example;

CREATE "ADOX.Catalog" w_Catalog NO-ERROR.
w_Catalog:ActiveConnection = "<connection-string-to-mdb-database>".
CREATE "ADOX.Table" w_Table.
w_Table:Name = "zzData".
w_Table:parentCatalog = w_Catalog.
w_Table:properties("Jet OLEDB:Link Datasource") =
"<path-in-which-test-file-is-located>".
w_Table:properties("Jet OLEDB:Remote Table Name") = "<name-of-test-file,
eg WSReport#txt>".
w_Table:properties("Jet OLEDB:Create Link") = True.
w_Table:properties("Jet OLEDB:Link Provider String")= "TEXT;HDR=No".
w_Catalog:Tables:APPEND(w_Table).
... (and then )
CREATE "ADODB.Connection" w_DB NO-ERROR.
w_Sql = "INSERT INTO Booking (<fields>) SELECT <fields> FROM zzData WHERE
<filter-if-required>".
w_DB:EXECUTE(w_Sql, OUTPUT w_Cnt, 0).
w_Sql = 'DROP TABLE zzData'.
w_DB:EXECUTE(w_Sql, OUTPUT w_Cnt, 0).

The code is from a non .Net 4gl, so "some" conversion is required, I have
placed <> around most of the parameter elements - hopefully it will give you
some ideas. The down side is the use of the ADOX com for the schema link
(the "INSERT/SELECT FROM" statement you can do through managed code) - not
sure how much of an issue it will be though.

Solution is here, thank you all for the help


http://groups.google.com/group/micr....adonet/browse_thread/thread/2a60421b4295430b
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top