Problem reading text/numeric data from Excel

Scott M. Lyon

I've just discovered a bug in some code I wrote a little while ago, and I
need you guys' help to fix it.


My program imports data from a standard Excel Spreadsheet (just with
specific column headers). I used ODBC in my VB.NET program to read that
spreadsheet into a dataset, to make it easy to manipulate. The code I use to
read it is at the bottom of this posting.


The problem I'm having though, is that I have one column of data
(potentially a few others as well) that for one input file starts off as
alphabetic, and then for some rows is numeric.


Unfortunately, when I read it into the dataset, while the alphabetic rows
come in just fine, the numeric ones are showing up as System.DBNull.


How can I fix this, short of requiring the input file to have the cells
formatted explicitly to text?


The code I use to read is the following function:

' Requires: Imports System.Data and Imports System.Data.Odbc
Public Function ReadExcelDT(ByVal aFileName As String) As DataTable
    ' Open the Excel spreadsheet and, using ODBC, read it into a DataTable
    ' for later processing.
    Dim odbcConnectionString As String = _
        "Driver={Microsoft Excel Driver (*.xls)};DriverId=790;" & _
        "Dbq=" & aFileName & ";"
    Dim odbcConn As New OdbcConnection(odbcConnectionString)
    Dim odbcCmd As New OdbcCommand("SELECT * FROM [Sheet1$]", odbcConn)
    Dim odbcAdapter As New OdbcDataAdapter(odbcCmd)
    Dim Dt As New DataTable
    odbcConn.Open()
    Try
        odbcAdapter.Fill(Dt)
    Finally
        ' Close the connection even if Fill throws.
        odbcConn.Close()
    End Try
    Return Dt
End Function


The data (for the column in question) looks similar to this:

Unit
-----
ABC123 (returns ABC123 in the dataset as expected)
DEF456 (returns DEF456 as expected)
789123 (returns a System.DBNull)
456789 (returns a System.DBNull)


Any ideas?


Thanks!
-Scott
 
Jim Underwood

Not specifically an answer to your question, but maybe a workaround...

Instead of reading the excel spreadsheet using ODBC, try treating it as an
object and accessing the values in the individual cells. It sounds like
ODBC is assuming the first row of data defines the types for each field, and
eliminating ODBC should get you around this.

The problem is this means a significant rewrite of your approach.
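
For what it's worth, reading the cells through the Excel object model might look roughly like the sketch below. It assumes Excel is installed on the machine and that the project references the Microsoft.Office.Interop.Excel assembly. The module and function names (ExcelCellReader, ReadExcelCells, CellText) are made up for illustration, and every cell is read as a string, so numeric cells such as 789123 come back as text instead of DBNull.

```vbnet
Imports System.Data
Imports System.Runtime.InteropServices
Imports Excel = Microsoft.Office.Interop.Excel

Public Module ExcelCellReader
    ' Hypothetical helper: reads the used range of the first worksheet
    ' into a DataTable whose columns are all strings.
    Public Function ReadExcelCells(ByVal aFileName As String) As DataTable
        Dim dt As New DataTable
        Dim app As New Excel.Application
        Dim book As Excel.Workbook = Nothing
        Try
            book = app.Workbooks.Open(aFileName)
            Dim sheet As Excel.Worksheet = CType(book.Worksheets(1), Excel.Worksheet)
            Dim used As Excel.Range = sheet.UsedRange
            Dim rowCount As Integer = used.Rows.Count
            Dim colCount As Integer = used.Columns.Count

            ' Assumption: the first used row holds the column headers.
            For c As Integer = 1 To colCount
                dt.Columns.Add(CellText(used, 1, c), GetType(String))
            Next

            ' Remaining rows are data; indexes are relative to the used range.
            For r As Integer = 2 To rowCount
                Dim row As DataRow = dt.NewRow()
                For c As Integer = 1 To colCount
                    row(c - 1) = CellText(used, r, c)
                Next
                dt.Rows.Add(row)
            Next
        Finally
            ' Close without saving and shut Excel down, even on error.
            If Not book Is Nothing Then book.Close(False)
            app.Quit()
            Marshal.ReleaseComObject(app)
        End Try
        Return dt
    End Function

    ' Returns the cell's underlying value as text ("" for an empty cell).
    Private Function CellText(ByVal rng As Excel.Range, ByVal r As Integer, ByVal c As Integer) As String
        Dim cell As Excel.Range = CType(rng.Cells(r, c), Excel.Range)
        Dim v As Object = cell.Value2
        If v Is Nothing Then Return ""
        Return Convert.ToString(v)
    End Function
End Module
```

Going cell by cell through COM is slow on large sheets, and duplicate header names would make Columns.Add throw, so treat this as a starting point rather than a drop-in replacement.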
 
Scott M. Lyon

I was afraid of that... I should have known it was coming together just a
little TOO easily... ;)


Thanks!

 
Scott M. Lyon

Hey Jim,

It's official... I'm giving up on getting ODBC working for this problem, and
resigning myself to reading the spreadsheet "manually"...


The only thing is, I am not sure how to do that with an Excel Spreadsheet
(short of reading the file one byte at a time, and trying to figure out what
format Excel spreadsheets are saved in).

Is this something where I'd (somehow) create an Excel object to load the
data into, and that object would allow me access to individual cells?


How would I do all that?


Thanks!
-Scott


 
Charlie

There is actually a workaround for your problem.

Although accessing individual cells is safer, what you can do is modify
the registry setting that makes the Excel driver guess each column's data
type from the first 8 rows (that's the default setting on my machine)...

Here is the code in VB .NET that does this. The only problem with this
is that your application has to have permission to modify the
registry.

' Verify the Excel (Jet 4.0) type-guessing setting.
' Requires: Imports Microsoft.Win32
Dim regVersion As RegistryKey
Dim keyValue As String
' VB.NET strings do not escape backslashes, so single ones are used here.
keyValue = "Software\Microsoft\Jet\4.0\Engines\Excel"
regVersion = Registry.LocalMachine.OpenSubKey(keyValue, True)

Dim intVersion As Integer = 0
If (Not regVersion Is Nothing) Then
    ' Remember the original value so it can be restored later.
    intVersion = CInt(regVersion.GetValue("TypeGuessRows", 0))
    If intVersion <> 0 Then
        regVersion.SetValue("TypeGuessRows", 0)
    End If
    regVersion.Close()
End If

' You can set this value back to its original value (reopen the key
' first, since it was closed above) by doing
regVersion.SetValue("TypeGuessRows", intVersion)

after your operations...
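
Putting the set-and-restore steps together, a wrapper could look something like the sketch below. The names (TypeGuessRowsHelper, ExcelReader, ReadWithTypeGuessRowsZero) are hypothetical; the reader is passed in as a delegate so the existing ReadExcelDT function can be reused unchanged. I have not checked whether the driver rereads this registry value on every connection or only when it first loads, so test it against a real file.

```vbnet
Imports System.Data
Imports Microsoft.Win32

Public Module TypeGuessRowsHelper
    ' Hypothetical delegate type matching ReadExcelDT's signature.
    Public Delegate Function ExcelReader(ByVal fileName As String) As DataTable

    Private Const JetExcelKey As String = "Software\Microsoft\Jet\4.0\Engines\Excel"

    ' Sets TypeGuessRows to 0, runs the reader, then restores the old value.
    Public Function ReadWithTypeGuessRowsZero(ByVal aFileName As String, _
                                              ByVal reader As ExcelReader) As DataTable
        Dim key As RegistryKey = Registry.LocalMachine.OpenSubKey(JetExcelKey, True)
        If key Is Nothing Then
            ' Key not present: fall back to a plain read.
            Return reader(aFileName)
        End If

        ' Nothing here means the value did not exist before we touched it.
        Dim original As Object = key.GetValue("TypeGuessRows")
        Try
            key.SetValue("TypeGuessRows", 0, RegistryValueKind.DWord)
            Return reader(aFileName)
        Finally
            ' Restore the previous state even if the read throws.
            If original Is Nothing Then
                key.DeleteValue("TypeGuessRows", False)
            Else
                key.SetValue("TypeGuessRows", original, RegistryValueKind.DWord)
            End If
            key.Close()
        End Try
    End Function
End Module
```

It would be called as ReadWithTypeGuessRowsZero(path, AddressOf ReadExcelDT). The process still needs write access to HKEY_LOCAL_MACHINE, and the setting is machine-wide, so other programs reading Excel files at the same moment see the changed value too.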

By the way, does anyone know if setting TypeGuessRows to 0 will affect
anything (performance?)

Best regards,

Charlie Chang

 
