A
Arnaki
Dear c-sharpers,
I'm trying to make a library (eventually to be used by Fortran-users
within our company) in which large access databases can be made. To be
as flexible as possible, I want to be able to create tables from this
library and fill the database column by column, using separate
routines for int/double/char-columns.
Important is that it should still work for large databases (i.e. 2
million records, 20 columns or maybe even more...).
First I tried to find the fastest way in general to fill a database
with multiple columns row by row. I tried SQL-statements using
OleDbCommands with a executenonquery. This is quite slow (45-60
minutes). Using an old-fashioned recordset works about 5 times faster.
Using a datatable/datarow[] works as well, and is a bit slower again,
but the size of the access-database is large and needs to be
compressed (for 2M records it fails, because the size it wants to
create exceeds 2G).
However, adding column by column I fail to manage. Below is my
"recordset" solution. In this case I already have a database with a
table containing all the columns. This should be the routine for
adding a column containing string-values. If the table doesn't contain
any records yet I create them with .addnew. This works fine and quite
fast. If records are already present (all columns were created as
nullable) I simply set the value and issue an update. This is also
works fine, but doing this record by record a .Movenext() is
necessary. This makes the size of the access database explode even
faster (updating only the second column already gives a size of xx for
300000 records).
Is there anyone that had a similar problem and found a solution for
this. Maybe adding the column afterwards is an idea or datatables/
oledbcommands provide a fast solution?
Thanks for the help in advance!
Cheers Arnaki
ADODB.RecordsetClass recordset = new
ADODB.RecordsetClass();
string sqlString = "select " + columnName + " from " +
tableName + " order by recNr asc";
recordset.Open(sqlString, catalog.ActiveConnection,
ADODB.CursorTypeEnum.adOpenKeyset,
ADODB.LockTypeEnum.adLockOptimistic, 0);
bool noRecords = false;
if (recordset.BOF && recordset.EOF)
{
noRecords = true;
}
else
{
noRecords = false;
recordset.MoveFirst();
}
if (noRecords)
{
for (int iRec = 0; iRec < stringValues.Length; ++iRec)
{
recordset.AddNew(columnName, stringValues[iRec]);
}
recordset.Update(System.Reflection.Missing.Value,
System.Reflection.Missing.Value);
}
else
{
for (int iRec = 0; iRec < stringValues.Length; ++iRec)
{
recordset.Fields[columnName].Value =
stringValues[iRec];
recordset.Update(columnName, stringValues[iRec]);
recordset.MoveNext();
}
}
recordset.Close();
I'm trying to make a library (eventually to be used by Fortran-users
within our company) in which large access databases can be made. To be
as flexible as possible, I want to be able to create tables from this
library and fill the database column by column, using separate
routines for int/double/char-columns.
Important is that it should still work for large databases (i.e. 2
million records, 20 columns or maybe even more...).
First I tried to find the fastest way in general to fill a database
with multiple columns row by row. I tried SQL-statements using
OleDbCommands with a executenonquery. This is quite slow (45-60
minutes). Using an old-fashioned recordset works about 5 times faster.
Using a datatable/datarow[] works as well, and is a bit slower again,
but the size of the access-database is large and needs to be
compressed (for 2M records it fails, because the size it wants to
create exceeds 2G).
However, adding column by column I fail to manage. Below is my
"recordset" solution. In this case I already have a database with a
table containing all the columns. This should be the routine for
adding a column containing string-values. If the table doesn't contain
any records yet I create them with .addnew. This works fine and quite
fast. If records are already present (all columns were created as
nullable) I simply set the value and issue an update. This is also
works fine, but doing this record by record a .Movenext() is
necessary. This makes the size of the access database explode even
faster (updating only the second column already gives a size of xx for
300000 records).
Is there anyone that had a similar problem and found a solution for
this. Maybe adding the column afterwards is an idea or datatables/
oledbcommands provide a fast solution?
Thanks for the help in advance!
Cheers Arnaki
ADODB.RecordsetClass recordset = new
ADODB.RecordsetClass();
string sqlString = "select " + columnName + " from " +
tableName + " order by recNr asc";
recordset.Open(sqlString, catalog.ActiveConnection,
ADODB.CursorTypeEnum.adOpenKeyset,
ADODB.LockTypeEnum.adLockOptimistic, 0);
bool noRecords = false;
if (recordset.BOF && recordset.EOF)
{
noRecords = true;
}
else
{
noRecords = false;
recordset.MoveFirst();
}
if (noRecords)
{
for (int iRec = 0; iRec < stringValues.Length; ++iRec)
{
recordset.AddNew(columnName, stringValues[iRec]);
}
recordset.Update(System.Reflection.Missing.Value,
System.Reflection.Missing.Value);
}
else
{
for (int iRec = 0; iRec < stringValues.Length; ++iRec)
{
recordset.Fields[columnName].Value =
stringValues[iRec];
recordset.Update(columnName, stringValues[iRec]);
recordset.MoveNext();
}
}
recordset.Close();