Create Access database as fast as possible

A

Arnaki

Dear c-sharpers,

I'm trying to make a library (eventually to be used by Fortran-users
within our company) in which large access databases can be made. To be
as flexible as possible, I want to be able to create tables from this
library and fill the database column by column, using separate
routines for int/double/char-columns.

Important is that it should still work for large databases (i.e. 2
million records, 20 columns or maybe even more...).
First I tried to find the fastest way in general to fill a database
with multiple columns row by row. I tried SQL-statements using
OleDbCommands with a executenonquery. This is quite slow (45-60
minutes). Using an old-fashioned recordset works about 5 times faster.
Using a datatable/datarow[] works as well, and is a bit slower again,
but the size of the access-database is large and needs to be
compressed (for 2M records it fails, because the size it wants to
create exceeds 2G).

However, adding column by column I fail to manage. Below is my
"recordset" solution. In this case I already have a database with a
table containing all the columns. This should be the routine for
adding a column containing string-values. If the table doesn't contain
any records yet I create them with .addnew. This works fine and quite
fast. If records are already present (all columns were created as
nullable) I simply set the value and issue an update. This is also
works fine, but doing this record by record a .Movenext() is
necessary. This makes the size of the access database explode even
faster (updating only the second column already gives a size of xx for
300000 records).

Is there anyone that had a similar problem and found a solution for
this. Maybe adding the column afterwards is an idea or datatables/
oledbcommands provide a fast solution?

Thanks for the help in advance!

Cheers Arnaki

ADODB.RecordsetClass recordset = new
ADODB.RecordsetClass();
string sqlString = "select " + columnName + " from " +
tableName + " order by recNr asc";
recordset.Open(sqlString, catalog.ActiveConnection,
ADODB.CursorTypeEnum.adOpenKeyset,
ADODB.LockTypeEnum.adLockOptimistic, 0);

bool noRecords = false;
if (recordset.BOF && recordset.EOF)
{
noRecords = true;
}
else
{
noRecords = false;
recordset.MoveFirst();
}

if (noRecords)
{
for (int iRec = 0; iRec < stringValues.Length; ++iRec)
{
recordset.AddNew(columnName, stringValues[iRec]);
}
recordset.Update(System.Reflection.Missing.Value,
System.Reflection.Missing.Value);
}
else
{
for (int iRec = 0; iRec < stringValues.Length; ++iRec)
{
recordset.Fields[columnName].Value =
stringValues[iRec];
recordset.Update(columnName, stringValues[iRec]);
recordset.MoveNext();
}
}

recordset.Close();
 
P

Peter Bromberg [C# MVP]

Try putting all your "stuff" inside a transaction. If you really want
something faster than Access, try SQLite. Or, use SQLExpress, which is the
free version of SQL Server 2005.
-- Peter
Site: http://www.eggheadcafe.com
UnBlog: http://petesbloggerama.blogspot.com
MetaFinder: http://www.blogmetafinder.com


Arnaki said:
Dear c-sharpers,

I'm trying to make a library (eventually to be used by Fortran-users
within our company) in which large access databases can be made. To be
as flexible as possible, I want to be able to create tables from this
library and fill the database column by column, using separate
routines for int/double/char-columns.

Important is that it should still work for large databases (i.e. 2
million records, 20 columns or maybe even more...).
First I tried to find the fastest way in general to fill a database
with multiple columns row by row. I tried SQL-statements using
OleDbCommands with a executenonquery. This is quite slow (45-60
minutes). Using an old-fashioned recordset works about 5 times faster.
Using a datatable/datarow[] works as well, and is a bit slower again,
but the size of the access-database is large and needs to be
compressed (for 2M records it fails, because the size it wants to
create exceeds 2G).

However, adding column by column I fail to manage. Below is my
"recordset" solution. In this case I already have a database with a
table containing all the columns. This should be the routine for
adding a column containing string-values. If the table doesn't contain
any records yet I create them with .addnew. This works fine and quite
fast. If records are already present (all columns were created as
nullable) I simply set the value and issue an update. This is also
works fine, but doing this record by record a .Movenext() is
necessary. This makes the size of the access database explode even
faster (updating only the second column already gives a size of xx for
300000 records).

Is there anyone that had a similar problem and found a solution for
this. Maybe adding the column afterwards is an idea or datatables/
oledbcommands provide a fast solution?

Thanks for the help in advance!

Cheers Arnaki

ADODB.RecordsetClass recordset = new
ADODB.RecordsetClass();
string sqlString = "select " + columnName + " from " +
tableName + " order by recNr asc";
recordset.Open(sqlString, catalog.ActiveConnection,
ADODB.CursorTypeEnum.adOpenKeyset,
ADODB.LockTypeEnum.adLockOptimistic, 0);

bool noRecords = false;
if (recordset.BOF && recordset.EOF)
{
noRecords = true;
}
else
{
noRecords = false;
recordset.MoveFirst();
}

if (noRecords)
{
for (int iRec = 0; iRec < stringValues.Length; ++iRec)
{
recordset.AddNew(columnName, stringValues[iRec]);
}
recordset.Update(System.Reflection.Missing.Value,
System.Reflection.Missing.Value);
}
else
{
for (int iRec = 0; iRec < stringValues.Length; ++iRec)
{
recordset.Fields[columnName].Value =
stringValues[iRec];
recordset.Update(columnName, stringValues[iRec]);
recordset.MoveNext();
}
}

recordset.Close();
 
A

Arnaki

Dear Peter,

Thank you for your suggestion. SQL server is also a wannahave, but we
use Access a lot as well. I didn't try a transaction and am not
familiar with that, but I will give it a try.

Arnaki
 
M

Marc Gravell

One appealing aspect of SqlExpress is that it can (unless I am
mistaken) support SqlBulkCopy (akin to "bcp"), so you can throw data
into the tables at *much* greater speed than you can with regular
code; very useful if you are trying to fill a system with data.

Marc
 
M

Marc Gravell

I'd have thought that a transaction would slow things down... I could
be wrong, though...

Marc
 
A

Arnaki

I'd have thought that a transaction would slow things down... I could
be wrong, though...

Marc
Dear Marc,

I just tried a transaction, and it surely doesn't speed things up, so
you're probably right.
I basically want to use it for 1 array of data (int/double) per
column. In principle it would be fine to add the column the moment the
data is written to it, I don't want to use afterwards with all sorts
of update queries. I don't work with C# for a long time, but I have
the feeling it should not really matter, I guess once you add a
column to a database that already has rows, the rows will
automatically also exist for this column, i.e the cells are
immediately created?

Arnaki
 
M

Marc Gravell

Yes; although the default value will depend on the setup (probably
NULL unless you specify an explicit default). Although I'm not sure
what this has to do with the original problem... but what the heck ;-p

Marc
 
A

Arnaki

Yes; although the default value will depend on the setup (probably
NULL unless you specify an explicit default). Although I'm not sure
what this has to do with the original problem... but what the heck ;-p

Marc

My problem started with the recordset.movenext() yielding a
unnecessarily large access file. I hoped that by creation of a new
column it would be possible to avoid issuing this command and do like
with the first column: recordset.addnew().
I still prefer recordset above oledbcommand or/and dataadapters since
it is so much faster if you add all column values row by row at once.
But for the library I want to create it should be possible to do it
column by column somehow.
If I compress the access database afterwards by hand it works as well,
but than I'm limited by the temporary filesize that exceeds the 2 Gb.
Is there a way to compress it from the C# code after each movenext()
command without slowing down the code too much?

Arnaki
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top