OleDbDataAdapter for csv/Excel

G

Guest

Hi Guys

I just want to let you know that my colleagues and I in the financial industry spend so much time trying to get data from csv/excel files into DataTables. Time and again, I find that OleDb provider fails us in so far as its interpretation of data type. Just today, I found that my "stable" file parser failed for a field. I predefined a DataTable and all columns so if there's any data type mismatch in the file, the user will know about it right away.

This morning, a file came in with a column with 60 rows that's all integer except the last 5 rows, where the data is floats

My DataTable defines this column to accept data of type double. OleDb sees the first 55 rows as int so it truncates all data values as it puts the data into my DataTable, causing miscalculations to happen later. Of course, int can fit into a column of type double, the problem goes unnoticed until the data is audited -- it's too late and too costly to fix then

Other than manually parsing these files (and there are many file formats in my organization), is there anything in the ADO.NET stack that can help with my *very common* situation

Thanks
Paul
 
P

Paul Clement

¤ Hi Guys,
¤
¤ I just want to let you know that my colleagues and I in the financial industry spend so much time trying to get data from csv/excel files into DataTables. Time and again, I find that OleDb provider fails us in so far as its interpretation of data type. Just today, I found that my "stable" file parser failed for a field. I predefined a DataTable and all columns so if there's any data type mismatch in the file, the user will know about it right away.
¤
¤ This morning, a file came in with a column with 60 rows that's all integer except the last 5 rows, where the data is floats.
¤
¤ My DataTable defines this column to accept data of type double. OleDb sees the first 55 rows as int so it truncates all data values as it puts the data into my DataTable, causing miscalculations to happen later. Of course, int can fit into a column of type double, the problem goes unnoticed until the data is audited -- it's too late and too costly to fix then.
¤
¤ Other than manually parsing these files (and there are many file formats in my organization), is there anything in the ADO.NET stack that can help with my *very common* situation?

Are you using schema.ini files for your Text files? If you don't, the ISAM driver attempts to
determine data types based upon row scanning. It isn't an exact science. For more info:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/odbc/htm/odbcjetsdk_98.asp

Excel can pose a different problem, especially when you have mixed mode (numeric and alphanumeric)
data in a column. If you encounter Null values where there should be data, add the IMEX argument to
your connection string:

ConnectionString = "Provider=Microsoft.Jet.OLEDB.4.0;" & _
"Data Source=e:\My Documents\Book20.xls;Extended Properties=""Excel 8.0;HDR=NO;IMEX=1;"""


Paul ~~~ (e-mail address removed)
Microsoft MVP (Visual Basic)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top