SqlCommand slow on INSERT


John Bailo

I have a C# program that loops through a table on a DB2 database.

On each iteration it assigns data to values in the SqlParameter
collection. The command text is an INSERT statement to a SQL Server
database, run with .ExecuteNonQuery(). I enclosed the loop in a
SqlTransaction and commit it at the end.

I timed the program and it inserts about 70 records a second...which I
think is sort of slow...so I set up some Debug.WriteLine calls to show
where the time was being spent.

The DataReader loop to the DB2 table is instantaneous. Almost 0s spent
getting each record. Same with assigning values.

The slow step is the actual execution of the SqlCommand. However, I
also ran a SQL Trace and monitored the execution of the statement on the
server. It took 0s to execute. The SqlCommand itself is adding an
extra 0.01s to 0.03s which can add up over the course of hundreds of
thousands of records.

So the only overhead is running .ExecuteNonQuery() on the SqlCommand
object(!) Is there any way to reduce or minimize this overhead, or a
setting that can affect performance?

I mean if my external source and target are running at 0s - my code
shouldn't be adding overhead to run a command!
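
A minimal sketch of the loop being described (the connection string, table,
column names, and the `db2Reader` variable are all illustrative placeholders,
not from the original post):

```csharp
using System.Data.SqlClient;

// Sketch of the parameterized INSERT loop inside one transaction.
using (SqlConnection conn = new SqlConnection(
    "Server=.;Database=Target;Integrated Security=true"))
{
    conn.Open();
    SqlTransaction tx = conn.BeginTransaction();
    SqlCommand cmd = new SqlCommand(
        "INSERT INTO TargetTable (Col1, Col2) VALUES (@p1, @p2)", conn, tx);
    cmd.Parameters.Add("@p1", System.Data.SqlDbType.Int);
    cmd.Parameters.Add("@p2", System.Data.SqlDbType.VarChar, 50);

    while (db2Reader.Read())   // db2Reader: the DB2-side DataReader (assumed)
    {
        cmd.Parameters["@p1"].Value = db2Reader.GetInt32(0);
        cmd.Parameters["@p2"].Value = db2Reader.GetString(1);
        cmd.ExecuteNonQuery();   // one round trip per row -- the slow step
    }
    tx.Commit();
}
```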
 

William (Bill) Vaughn

I have a Model A Ford that does not climb hills that well when towing my 65'
boat. What should I do?
Ah, ADO.NET (like any of the data access interfaces) is not designed to do
bulk inserts. While 2.0 is faster than ever (with batch mode), even the
fastest DAI INSERT loop is still orders of magnitude slower than bulk copy.
Using SqlBulkCopy or DTS I can move 500,000 rows in about 20 seconds.
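
For reference, the SqlBulkCopy approach Bill mentions looks roughly like this
(a 2.0 sketch; the connection string, destination table, and source reader are
assumptions):

```csharp
using System.Data.SqlClient;

// Stream rows from any IDataReader (e.g. the DB2 reader) into SQL Server
// over the bulk-copy TDS path instead of one INSERT per row.
using (SqlConnection conn = new SqlConnection(
    "Server=.;Database=Target;Integrated Security=true"))
{
    conn.Open();
    using (SqlBulkCopy bulk = new SqlBulkCopy(conn))
    {
        bulk.DestinationTableName = "TargetTable";
        bulk.BatchSize = 5000;           // rows sent per batch
        bulk.WriteToServer(db2Reader);   // db2Reader implements IDataReader
    }
}
```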

--
____________________________________
William (Bill) Vaughn
Author, Mentor, Consultant
Microsoft MVP
INETA Speaker
www.betav.com/blog/billva
www.betav.com
Please reply only to the newsgroup so that others can benefit.
This posting is provided "AS IS" with no warranties, and confers no rights.
__________________________________
 

Guest

I have a Model A Ford that does not climb hills that well when towing my 65'
boat. What should I do?

Put smaller tires on the Model A.

I'm not John, but thanks for the tip on SqlBulkCopy! Some of us don't get
very many chances to peek over the trenches to find the new jewels in 2.0.

-Mike
 

John Bailo

Ok, I'm curious.

Just why is BULK INSERT so fast? I mean, does it not go through the
SQL DBMS itself and somehow write directly to the .mdf file? What
is the mechanism it uses?

Also, running my code on a quad proc (SQL 2000, W2K) I am getting 5,000
records in 10-14s (roughly 350 to 500 records per second). Not too shabby!
 

John Bailo

One thing I found that seems to make a little difference is using the
.Prepare() method on my SqlCommand.

This seems to take a SqlCommand and turn it into a sproc, /on-the-fly/,
for SqlCommands that need to be reused.
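
In code, that's a one-time call before the loop (sketch; the command text and
parameter names are illustrative):

```csharp
// Prepare() sends the parameterized statement to the server once, so each
// subsequent ExecuteNonQuery() reuses the compiled plan instead of
// re-parsing the text. Variable-length parameters need explicit sizes
// before Prepare() is called.
SqlCommand cmd = new SqlCommand(
    "INSERT INTO TargetTable (Col1, Col2) VALUES (@p1, @p2)", conn, tx);
cmd.Parameters.Add("@p1", System.Data.SqlDbType.Int);
cmd.Parameters.Add("@p2", System.Data.SqlDbType.VarChar, 50);
cmd.Prepare();   // one-time server-side prepare

// ...loop: set .Value on each parameter, then call cmd.ExecuteNonQuery()...
```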
 

Brian Gideon

John said:
Ok, I'm curious.

Just why is BULK INSERT so fast? I mean, does it not go through the
SQL DBMS itself and somehow write directly to the .mdf file? What
is the mechanism it uses?

Well, I'm not a SQL Server expert or anything, but I believe bulk
inserts bypass the transaction log. That probably accounts for some of
the difference anyway.
 

John Bailo

Too bad I can't programmatically disable the transaction log or request
SQL Server not to log my inserts...
 

William (Bill) Vaughn

The bulk copy interface and its functionality have been in place since the
Sybase versions... a long time. In the Model A days we had to use the BCP
utility, but in SS7 and later we could use one of the SQL Server management
"object" libraries (SQLDMO, SQLSMO) to call bulk copy functionality. In
SS 2000 (or earlier?) it appeared as TSQL functions too.

Yes, Bulk Copy turns off the TL. It assumes that you're copying to a temp
table (or a permanent work table). Generally, these tables don't have
indexes or other constraints to slow down the import process. BCP also uses
special TDS stream packets to move the data. It's THE way to go when moving
data from/to the server. DTS leverages this technology by permitting you to
set up a scripted BCP operation that can transform (edit) the data as it's
moved from any data source with a provider or driver (.NET, OLE DB, ODBC,
text, tight-string-with-two-cans).
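
The TSQL flavor Bill mentions can be invoked from ADO.NET like this (a sketch;
the file path, terminators, and table name are hypothetical):

```csharp
// Running the T-SQL BULK INSERT statement from a SqlCommand.
// All names and options below are illustrative only.
SqlCommand bulkCmd = new SqlCommand(
    @"BULK INSERT TargetTable
      FROM 'C:\data\export.txt'
      WITH (FIELDTERMINATOR = '\t', ROWTERMINATOR = '\n', TABLOCK)", conn);
bulkCmd.CommandTimeout = 0;   // large loads can exceed the default 30 s
bulkCmd.ExecuteNonQuery();
```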

hth

 

John Bailo

The thing is, that answer really doesn't address my question.

As I mentioned, when I run the Trace utility, the duration of each
insert command is 0s.

It's the wait time to return before the SqlCommand and after the
SqlCommand.

So, to me, if I were to string together a bunch of INSERT statements
that represent the changing parameters of the INSERT, it should run
lightning fast.

My question still stands: why is there so much overhead in the
SqlCommand object? Or is it the transmission time to send the
command over the network? Or the conversion to the TDS protocol?

If either of those factors could be reduced, then I could send my
INSERTs through almost instantly.
 

William (Bill) Vaughn

It's the fundamental difference in the mechanism. First, each INSERT
statement is sent as text to the server, not as raw data. The SQL Server
compiler then needs to compile the INSERT statement(s) and generate a query
plan. Nope, this does not take long, but it takes time. It then logs the
operation to the TL (which can't be disabled) and then to the database. At
that point the constraints are checked, the indexes are built and any RI
checks are made.

In the case of BCP, the protocol (which is proprietary and subject to
change) opens a channel, sends the meta data (once), and the server starts
an agent that simply writes the inbound data stream (binary) to the rows in
the target table. It requires very little overhead, sidestepping the
per-INSERT work described above--90% of which can't otherwise be disabled.


 

Gabriel Magaña

It's the fundamental difference in the mechanism. First, each INSERT
statement is sent as text to the server, not as raw data. The SQL Server
compiler then needs to compile the INSERT statement(s) and generate a
query plan.

...Which then causes me to question: are you Prepare()-ing the insert
statement? I know you mentioned you were using parameters, but I think the
speed bonus does not occur if the statement is not prepared (because, as
Bill said, the insert statements have to be parsed each time).

At least with Oracle it makes a huge difference in speed.
 

John Bailo

Shouldn't using a .Prepare() method (which I do) eliminate that overhead?

Also, I enclose the loop in a SqlTransaction.

Doesn't that mean that all the log entries are written when the
transaction is committed?
 

Gabriel Magaña

Also, I enclose the loop in a SqlTransaction.
Doesn't that mean that all the log entries are written when the
transaction is committed?

I might be mistaken for SQL Server (I'm certified as an Oracle DBA, but I
only know SQLServer superficially), but what happens in Oracle (somewhat
simplified) is that the DB itself is updated during a transaction, and the
log keeps the original data in it. That way when you commit a transaction
there is very little cost in terms of IO, there is only a cost if you
rollback the transaction (which is assumed to happen very few times). I
would not be surprised at all if SQL Server worked this way too, since, if
the assumption holds that the ratio of transaction commits to rollbacks is
extremely high, transactions take up very little extra overhead.

The only problem is when a transaction involves a great many operations:
The log system gets bogged down since a very big part of the log is "vital"
and could be needed in case of a rollback. For the stuff I do normally, I
have noticed that committing transactions every couple hundred record
insertions is a good mid-point performance-wise. But that number is highly
dependent on the server hardware and also on the size of the records being
inserted. Inserting 100,000 records in a single transaction is just asking
for trouble, in other words.
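
The periodic-commit pattern Gabriel describes might be sketched like this (the
batch size, `cmd`, `conn`, and `db2Reader` are assumed from earlier context):

```csharp
const int BatchSize = 200;   // "a couple hundred" rows per transaction
int pending = 0;
SqlTransaction tx = conn.BeginTransaction();
cmd.Transaction = tx;

while (db2Reader.Read())
{
    // ...assign parameter values from the reader here...
    cmd.ExecuteNonQuery();
    if (++pending == BatchSize)
    {
        tx.Commit();                    // flush this batch's log activity
        tx = conn.BeginTransaction();   // start the next batch
        cmd.Transaction = tx;
        pending = 0;
    }
}
tx.Commit();                            // commit any remaining rows
```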
 

John Bailo

I guess I might start believing that the transaction log is the
additional overhead, except that I would expect that to be reflected in
the duration reported in the SQL trace.

But it isn't! The duration for each INSERT statement is 0 -- or less
than 1 millisecond! Wouldn't the time to write to the transaction log
be reflected in the duration for each INSERT?
 

Gabriel Magaña

I guess I might start believing that the transaction log is the additional
overhead, except that I would expect that to be reflected in the duration
reported in the SQL trace.
But it isn't! The duration for each INSERT statement is 0 -- or less
than 1 millisecond! Wouldn't the time to write to the transaction log be
reflected in the duration for each INSERT?

Writing to the transaction log at insert time is not the only I/O done on
the transaction log. There is another I/O done when you commit the
transaction, to tell the DB that the logs are OK to be recycled, since you
have just committed the transaction (and hence cannot roll back anymore).

You might try splitting into several smaller transactions (with their own
commits, of course) to see if you get any speed improvements.

But to answer your question, I do not know exactly how SQL trace works, so I
do not know if it includes updates to the log.

Bypassing the log is dangerous though... You would not be able to roll back
(so that means no transactions either), nor would you be able to recover
from data corruption should the DB crash during the insert operation. You
may as well use DBF files instead of SQL Server; inserts to those are
lightning fast. I have done this in the past where speed is absolutely of
the essence and data loss is no big deal... But there are lots of
downsides. I guess it's the same as using goto statements: only people who
fully understand what they are doing should be messing with it!
 

William (Bill) Vaughn

No. You still don't understand. BCP requires no compile overhead and uses
direct IO to the DBMS instead of "logical" writes. And no, we don't usually
write to a permanent table so the Transaction Log issues are not a factor.
When we're ready to post the data to the live tables we run a SP that edits
the data and does a proper INSERT.

 

John Bailo

Would writing to a #temp table, or to a table variable (and then
transferring the data to a regular table with INSERT...SELECT) be faster
than regular INSERTs to a table?
 

William (Bill) Vaughn

Basically, yes. We generally transfer to "temporary" tables in the database
(not to tempdb) and perform intelligent INSERT/UPDATE statements from there.
It's far faster.
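
The staging-table pattern Bill describes could be sketched as (table and
column names are hypothetical):

```csharp
// 1. Bulk-load into a plain work table with no indexes or constraints,
// 2. then move the rows in one set-based INSERT...SELECT.
using (SqlBulkCopy bulk = new SqlBulkCopy(conn))
{
    bulk.DestinationTableName = "Staging_Orders";   // permanent work table
    bulk.WriteToServer(db2Reader);                  // assumed IDataReader
}

SqlCommand move = new SqlCommand(
    @"INSERT INTO Orders (Col1, Col2)
      SELECT Col1, Col2 FROM Staging_Orders;
      TRUNCATE TABLE Staging_Orders;", conn);
move.ExecuteNonQuery();
```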

 

John Bailo

Hmmm...well, there's no reason I can't do that as an intermediate way in
my code.

Do I have to create the temporary table on the same command that I run
my INSERTs on?

Or can I just create two SqlCommand's and run them on the same
SqlConnection? Will the table remain in memory if I do that?

And then I'll need to do the transfer from the temp table to the table
after that.
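
For what it's worth, a #temp table created by one SqlCommand is visible to a
second SqlCommand as long as both run on the same open SqlConnection, since
the table is scoped to the session (sketch; all names are made up):

```csharp
// The #temp table lives for the duration of the connection's session,
// so separate commands on the same SqlConnection can share it.
new SqlCommand("CREATE TABLE #staging (Col1 int, Col2 varchar(50))", conn)
    .ExecuteNonQuery();

// ...run the parameterized INSERTs into #staging here...

new SqlCommand(
    "INSERT INTO TargetTable (Col1, Col2) SELECT Col1, Col2 FROM #staging",
    conn).ExecuteNonQuery();   // transfer; #staging drops when the connection closes
```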
 
