Remove 255 character limit from OLE DB .NET and Fix Data Corruptio

G

Guest

Make it possible to query any data from an Excel spreadsheet in exactly the
form it appears in the spreadsheet. The highest priority is to remove the
255 character limit from the size of fields being accessed in a query. I
suppose there may be some way to do this from SQLServer using a character
array, but if so, it is undocumented and probably not very developer
friendly. It is also necessary to remove the "feature" that makes
non-conforming numeric or text data disappear. The query should return all
text visible to the user in every row and column queried. The present OLE
DB interface causes data to disappear or even worse, appear in the wrong row.
This is a computational disaster and potentially a security problem.
Also, in the mean time, adding extra quote characters to tab delimited data
when saving as a text file further corrupts the data.
 
J

Jamie Collins

cfrphoto said:
Make it possible to query any data from an Excel spreadsheet in exactly the
form it appears in the spreadsheet. The highest priority is to remove the
255 character limit from the size of fields being accessed in a query. I
suppose there may be some way to do this from SQLServer using a character
array, but if so, it is undocumented and probably not very developer
friendly. It is also necessary to remove the "feature" that makes
non-conforming numeric or text data disappear.

How did you create your Excel column? If your data is being curtailed
at 255 characters, you probably defined it as VARCHAR(255). You require
MEMO. Read up on Jet's data types for Excel. Get you schema design
correct.

If you have 'non-conforming numeric or text data' appearing as null
values, then you have either defined your table's column incorrectly or
have values that don't fit the data type of the column, probably the
latter. Get you data into the correct columns.

If the design and/or data is not your own and you are trying to work
dynamically with what you're given, you registry values (e.g. what are
you using for TypeGuessRows?) and/or connection string properties (e.g.
which values for IMEX have you chosen?) may be wrong. For some further
hints see:

http://www.dicks-blog.com/archives/2004/06/03/external-data-mixed-data-types/

Have you tried querying a text file using a schema.ini file? If you
prefer this system, I suggest you base any future petitions on this
functionality.

Best of luck with your campaign,
Jamie.

--
 
G

Guest

I have no control over how the spreadsheet is created. The bug relating to
whether a field is text or numeric is the worst problem because changing the
cells to text does not have affect pre-existing data. This means that even
if I am able to modify the Excel spreadsheet, "TypeGuessRows" will fail.

Repeating, to keep it simple, I just want to be able to query the text in an
Excel spread sheet. In many cases, trying to maintian data types is
problematic. This comment applies to SQLServer, Access or any other
database when extracting data into a data warehouse. The safest approach is
to treat all data as text unless the data is extracted from a controlled
source where the data type in a column is known.

I do not agree with the comment about using "memo" data. This seems to be
an artifact of Access and older versions of SQLServer where the varchar limit
was 255. The varchar size limit in SQLServer is 4000 for Unicode and 8000
for single byte characters. Unfortunately the Visual Studio documentation
is rather weak in providing an functional overview before launching into some
example that avoids interesting or useful cases. Links to classes
referenced as properties are not always provided.

I suppose I could change the query parameters if they were documented. So
far, I have not found enough useful (high level) documentation. I have
tried various combinations of IMEX parameters, but there was no indication in
the documentation that Memo existed or had to be used for strings greater
that 255 characters. In that case, an exception should have been thrown.
What actually happened is that the string returned was from the a different
row. THIS IS A BUG.

Exporting to a .txt file defeats the purpose of having an automated
extraction procedure. Worse, Excel adds quotes to strings containing quotes
even in tab delimited mode. One of the Excel spreadsheets created using
"Paste Special" from an Access Database contains embedded new line
characters. Of course, the .txt file was unuseable. I was able to get
around this problem only by replacing all instances of <ctrl>J with a string
like \n that could be restored later.

Again, I would like to be able to query in a "type free" manner returning
every cell in the Excel spreadsheet as a string. Exporting to a file is not
really an object.
 
J

Jamie Collins

cfrphoto said:
So far, I have not found enough useful (high level) documentation.

The documentation is poor and you are sore. Hoping for a response or
action from Microsoft can only lead to more hurt. However, if you have
a specific issue you want to resolve or work around, you may have some
luck here.
I do not agree with the comment about using "memo" data. This seems to
be an artifact of Access and older versions of SQLServer where the
varchar limit was 255.

It's a common misconception that MS Access is the same as Jet. MS
Access *uses* Jet. Excel also uses Jet. Jet may be officially
'depreciated' by Microsoft but it is still the native SQL engine for
both Access2003 and Excel2003.

Tracking down the *fact* that Excel uses the memo data type for a text
column/value that exceeds 255 characters is a case of 'join the dots'
in the documentation and a bit of old-fashioned testing e.g. try this
one:

PRB: Transfer of Data from Jet 4.0LEDB Source Fails with Buffer
Overflow Error
http://support.microsoft.com/default.aspx?scid=kb;en-us;281517
"If any field looks like text and the length of data is more than 255
characters, the column is typed as a memo field."
I suppose I could change the query parameters if they
were documented.

I think all the issues here *are* documented, albeit in a succinct and
fragmented fashion e.g.

How To Use ADO with Excel Data from Visual Basic or VBA
http://support.microsoft.com/default.aspx?scid=kb;en-us;257819
"IMEX=1 in the Extended Properties section of the connection string ...
enforces the ImportMixedTypes=Text registry setting"
there was no indication in
the documentation that Memo existed or had to be used for strings
greater that 255 characters. In that case, an exception should have
been thrown.

I agree the way Jet guesses data types for Excel is unsatisfactory but
throwing an exception when text > 255 is detected would make the
process useless.

I think you need to accept the fact that memo does exist and work with
it. Otherwise, find another way to access the data e.g. automating an
instance of the Excel.Application object, working with the Biff8
format, etc.
What actually happened is that the string returned was from the a
different row. THIS IS A BUG.

This sounds like a bug but we've only got your word for it. For anyone,
even Microsoft, to investigate they need some steps to reproduce the
bug. Please post yours here.

Jamie.

--
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top