Optimizing SQL INSERTs from inside C#

  • Thread starter: Michael C

Michael C

I'm writing a program that, among other things, has to create and populate 7
tables on SQL Server. It reads in TAB delimited files, generates INSERT
statements and executes them. So far I've been able to optimize the program
by using the built-in StreamReader and string functions (Split, Replace,
etc.) in C#, SqlConnection and SqlCommand for connectivity, and I've tried
to optimize the SQL insert statements (there are over 100,000 of them) by
sending them 500 at a time to cut down on the communication overhead
associated with sending SQL commands over a network. I tried going up to
1,000 at a time, but the prep time on the local computer starts eating away
at my time savings. Is there anything else I can do to optimize and speed
up my SQL inserts? (Stored Procedures are not an option).
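For reference, the 500-at-a-time batching can be sketched like this: a minimal helper that joins individual INSERT statements into one command text per batch (modern C# collections are used here for brevity; on .NET 1.1 you would use ArrayList instead of the generic List):

```csharp
using System.Collections.Generic;
using System.Text;

class InsertBatcher
{
    // Group individual INSERT statements into batches of a given size,
    // joining each batch into one semicolon-separated command text.
    // One network round trip per batch instead of one per statement.
    public static List<string> BuildBatches(IEnumerable<string> statements, int batchSize)
    {
        var batches = new List<string>();
        var sb = new StringBuilder();
        int count = 0;
        foreach (string stmt in statements)
        {
            sb.Append(stmt).Append(";\n");
            if (++count == batchSize)
            {
                batches.Add(sb.ToString());
                sb.Length = 0;   // reset (StringBuilder.Clear() doesn't exist on .NET 1.1)
                count = 0;
            }
        }
        if (count > 0)
            batches.Add(sb.ToString());   // final partial batch
        return batches;
    }
}
```

Each returned batch is then executed with a single SqlCommand.ExecuteNonQuery(), so the round trips drop by a factor of the batch size.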

Thanks,
Michael C.
 
Michael,

Stored procedures are not an option, but you can prepare the statement
on the server in order to create a temporary stored procedure. This should
speed up your operations a little bit.
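In ADO.NET that means calling SqlCommand.Prepare on a parameterised command, which compiles the plan on the server once and reuses it for every execution. A sketch, assuming a hypothetical table Orders(Id int, Name varchar(50)); note that Prepare requires an explicit size on variable-length parameters:

```csharp
using System.Data;
using System.Data.SqlClient;

class PreparedInserter
{
    // conn is an open SqlConnection; rows holds parsed fields from the file.
    // Table and column names here are placeholders.
    public static void InsertRows(SqlConnection conn, string[][] rows)
    {
        SqlCommand cmd = new SqlCommand(
            "INSERT INTO Orders (Id, Name) VALUES (@id, @name)", conn);
        cmd.Parameters.Add("@id", SqlDbType.Int);
        cmd.Parameters.Add("@name", SqlDbType.VarChar, 50);
        cmd.Prepare();   // server compiles the statement once

        foreach (string[] row in rows)
        {
            cmd.Parameters["@id"].Value = int.Parse(row[0]);
            cmd.Parameters["@name"].Value = row[1];
            cmd.ExecuteNonQuery();
        }
    }
}
```

Wrapping the loop in a SqlTransaction and committing at the end can also help, since each statement no longer commits individually.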

Hope this helps.
 
BCP is *the* fastest way of getting large amounts of data into SQL Server.
If there are indexes on the table, drop them all first and then recreate
them again once the data's safely in.
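For instance (index and table names are hypothetical), on SQL Server 2000 that would look like:

```sql
-- Drop the nonclustered index before the bulk load...
DROP INDEX Orders.IX_Orders_CustomerName

-- ...load the data with bcp, then rebuild the index afterwards:
CREATE INDEX IX_Orders_CustomerName ON Orders (CustomerName)
```

Rebuilding once at the end is far cheaper than maintaining the index row by row during the load.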
 
I thought about BCP, but I'm not too familiar with it. Can I run BCP to
populate a SQL Server from a client machine? And can I run BCP from within
C#?

Thanks,
Michael C.
 
I thought about BCP, but I'm not too familiar with it. Can I run BCP to
populate a SQL Server from a client machine?

God yes - that's what it's intended for!

And can I run BCP from within
C#?

Not from *WITHIN* it as such - it's a separate EXE. But you can use the System.Diagnostics.Process class to shell it and wait till it's finished (using ProcessStartInfo to control such things as whether its console window is to be visible). This will make it transparent to your user that it's a separate program.
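A minimal sketch of shelling bcp that way (the argument string is whatever bcp invocation you settle on; bcp.exe is assumed to be reachable, otherwise pass its full path):

```csharp
using System.Diagnostics;

class BcpRunner
{
    // Launch bcp.exe with the given arguments, hide its console window,
    // and block until it finishes. Returns bcp's exit code (0 = success).
    public static int RunBcp(string arguments)
    {
        ProcessStartInfo psi = new ProcessStartInfo("bcp.exe", arguments);
        psi.UseShellExecute = false;   // needed so CreateNoWindow takes effect
        psi.CreateNoWindow = true;     // keep the console window invisible
        Process p = Process.Start(psi);
        p.WaitForExit();
        return p.ExitCode;
    }
}
```

Checking the exit code (and optionally redirecting standard output/error) lets your program report a failed load instead of silently continuing.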

The only drawback is you have to have the SQL Server client tools installed on the machine. But if you've not got per-processor licenses, you should be able to install the free MSDE as a workaround (I'm not sure whether bcp itself is redistributable on its own; probably not). You also need to make sure that you call it with the correct path
("%programfiles%\microsoft sql server\80\tools\binn"), otherwise you could inadvertently call the bcp of another database provider (such as Sybase, which has its own bcp), and the similarity could stump you for ages when it refuses to work correctly.
Play around with it at a command prompt until you're sure of the syntax:
be aware that you need to pass it a switch to tell it what the field delimiter is (usually "-t," for a comma, or "-t|" for a pipe, say), and you also need to pass "-c" for character mode so that a linefeed is the row terminator.
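For example, a typical invocation for a tab-delimited file might look like this (server, database, table and file names are placeholders; in character mode the field terminator actually defaults to tab and the row terminator to newline, so -t and -r are shown only for explicitness):

```bat
"%programfiles%\Microsoft SQL Server\80\Tools\Binn\bcp.exe" MyDb.dbo.MyTable in data.txt -S MYSERVER -T -c -t"\t" -r"\n"
```

Here "in" means load into the table, -S names the server, and -T uses a trusted (Windows) connection; use -U/-P instead for SQL authentication.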
 
The only drawback is you have to have the SQL Server client tools installed
on the machine. But if you've not got per-processor licenses, you should
be able to install the free MSDE as a workaround


Ahh, that's a deal-breaker. I'll be running this program on machines that
may or may not have MSDE, SQL Server or SQL Client tools installed; and I
may or may not have the clearance to install them on these machines. The
only guarantee I have about the configuration is that there's a MS SQL
Server available (either locally or on the network), they're running Win2K
or XP, and .NET Framework 1.1 will be installed.

Thanks,
Michael C.
 
Michael C said:
I'm writing a program that, among other things, has to create and populate 7
tables on SQL Server. It reads in TAB delimited files, generates INSERT
statements and executes them. So far I've been able to optimize the program
by using the built-in StreamReader and string functions (Split, Replace,
etc.) in C#, SqlConnection and SqlCommand for connectivity, and I've tried
to optimize the SQL insert statements (there are over 100,000 of them) by
sending them 500 at a time to cut down on the communication overhead

When you say you send them 500 at a time, what do you mean by that and how
do you achieve it? We have an in-house object-relational system, and while
reads are optimised (it's smart enough to combine deep tree reads into
joined selects), the best we've managed for inserts is to prepare the
statement if we're going to perform more than one.

Cheers,
Stu
 
Ah, shame. Using an SqlConnection will be the fastest then, unless you can
redistribute the DTS DLLs, which I'm not sure is allowed.
 