Reading .csv files containing the "-" character

S

stephane8

I'm able to open and read my .csv file. But when the character "-" is
in one of my fields (ex : 18802-002) I get a system.DBNull value ! Even
if I edit the file with notepad and put "18802-002" I still get a
DBNull value... I'm not able to figure this one out, any help would be
appreciated.

Here's my code :

Dim ConnectionString As String = "Provider=Microsoft.Jet.OLEDB.4.0;Data
Source=" & sPath & ";Extended Properties=""Text;HDR=NO;FMT=Delimited"""

Dim Conn As New System.Data.OleDb.OleDbConnection(ConnectionString)
Conn.Open()

Dim da As New System.Data.OleDb.OleDbDataAdapter("SELECT * FROM
" & sFile, Conn)
Dim ds As New DataSet("TextFiles")

da.Fill(ds, "XYZ")

Dim dt As DataTable = ds.Tables("XYZ")

For Each sRow As DataRow In dt.Rows
....
Next
 
T

tommaso.gastaldi

Hi stephane,

I have tried the same, by using the datareader and I get correct data
with
"18802-002" (no Null).

Could you post the text (a few records and 2-3 fields) so I can make a
trial on
the actual data which gives you problems?

-tom

(e-mail address removed) ha scritto:
 
T

tommaso.gastaldi

I also tried your code (OleDbDataAdapter + FILL) using:

"34";"4";"6"
"18802-002"; "5"; "7"
"xx"; "5"; "5"

works perfectly fine to me (no Null) (?).

-t

(e-mail address removed) ha scritto:
 
S

stephane8

Seems like my problem is in my .csv file. I've made some tests with
different values and found out that it was not related to the "-"
character.

This in my test file :
"1",yo,"3"
"4",5,"6"
"4564-564",564,"rger"

I get a DBNull value for 'yo', but if I change my file for :
"1",yo,"3"
"4","5","6"
"4564-564",564,"rger"

Now everything is fine ! Even if i didn't modified 'yo' who was
actually the wrong one.

What i'm not understanding is why it wouldn't work if I put quotes only
to "yo", then again I get a DBNull value
"1","yo","3"
"4",5,"6"
"4564-564",564,"rger"


Here's my real situation :

I have to import a big file (about 20 000 records) who is generated by
an other system. Here's a part of the file :

95912,"""HRD, Rubber feet, NLX chassis"" "
95913,"""HRD, NLX Rail Guide"" "
95914,"""HRD, 9400 front panel Clip"" "
95957,"""Dis, 8510 Colour 10.4 800x600 250 NIT"" "
95958,"""ASY, 8510 HDD 4.0GB"" "
95959,"""Asy, 8510 grounding Cables"" "
96375,Power Supply Low Temp
97063,"""Assy, Hard Disk Drive 6.4 Gb"" "
97172, Connector Rentention menchanism
97182,"""Assy, Intel Juneau Motherboard (NEW)"" "
97221,"""Assy,Intel PII 500 MHZ Processor"" "
18802-002 ,8045 Firmware (Ext temp)

16352-001 ,Assy Cable 7015 7 pin din to 9 pin Dsub
16365-001 ,Bottom Plastic 7015/7020
16849-001 ,"""25' antenna cable,RG213/U N-M to UHF-M (for 902
S"
17366-003 ,Power Cable Std Temp
17366-004 ,Power Cable Low Temp
17393-001 ,Modem 9600 Baud
17393-002 ,Modem 4800 Baud Conf.coat
17393-003 ,Modem9600 Baud Conf.coat
17483-003 ,Modem 9600 Baud for 8030/9011
17693-001 ,8025 Main logic bd

Everything goes fine until the 18802-002 line. Even if I put "18802-002
","8045 Firmware (Ext temp)" it doesn't work.

I'm not used to play with .csv files, if anybody could suggest me
something it would be really appreciated, thanx
 
T

tommaso.gastaldi

Try this:

-Make a text file called schema.ini

-Place it in the same directory of your delimited file, say
"NameOfYourFile.txt" (or.csv), with data

-Place the following text into schema.ini

[NameOfYourFile.txt]
ColNameHeader=False
Format=Delimited(,)
MaxScanRows=0


where in place of NameOfYourFile.txt you write the name of your
delimited text file

-Run again your program

let me know ...

-tom



PS
here in italy I am force to use (;)
 
T

tommaso.gastaldi

An alternative (for any delimited file)

start REGEDIT clicking non yous start > execute menu

find
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Text\MaxScanRows

right click on the entry > change and set 0 in place of the current
value (I have 25)

this will prevent from scanning the row to attempt to determine the
type. So it will use all string type.
(See ... you should also see "Majority Type" that is it would determine
the type based on most of the scanned values.)


Another solution would be setting the value programmatically when
necessary...

-tom
 
S

stephane8

Thanx Tom for your replies, I found an other way to do it. I'm using
the Regex object with a crazy pattern string ! Even if I have to read
each line of the file by myself it's gonna make the job for me.

Here's my code :

Private Shared Function ParseLine(ByVal oneLine As String) As String()
' Returns an array containing the values of the comma-separated
fields.

' This pattern actually recognizes the correct commas.
' The Regex.Split() command later gets text between the commas.
Dim pattern As String =
",(?=(?:[^""]*""[^""]*"")*(?![^""]*""))"
Dim r As System.Text.RegularExpressions.Regex = _
New System.Text.RegularExpressions.Regex(pattern)

Return r.Split(oneLine)
End Function

Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As
System.EventArgs) Handles Button1.Click

Dim op As New OpenFileDialog
op.ShowDialog()

Dim sLine As String
Dim s() As String

Dim oReader As New IO.StreamReader(op.FileName)

Do
sLine = oReader.ReadLine()

If sLine <> "" Then
s = ParseLine(sLine)
MsgBox(s(0))
End If

Loop Until sLine = ""

End Sub

Thanx again for the help !
-Stephane
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top