TextFieldParser - reading tab delimited file

al jones · Sep 21, 2006

I'm using TextFieldParser to read a data file which contains, for example:

Amondó Szegi Amondo Szegi
andré nossek André Nossek
© Characte Character

Note the vowels with diacritics and the copyright symbol - it is dropping
these (and other similar) characters, which apparently fall outside the
ASCII range.

The code is simple and looks like:
Using MyReader As New TextFieldParser(Application.StartupPath & "\designers.txt")
    MyReader.TextFieldType = FileIO.FieldType.Delimited
    MyReader.CommentTokens = New String() {"#"}
    MyReader.Delimiters = New String() {vbTab}
    MyReader.TrimWhiteSpace = True
    Dim currentRow As String()
    intElement = 0
    While Not MyReader.EndOfData
        Try
            currentRow = MyReader.ReadFields()
            ' Rows starting with "UNKNOWN" only set the fallback designer name.
            If Microsoft.VisualBasic.Left(currentRow(0), 7) = "UNKNOWN" Then
                strUnknownDesigner = currentRow(1)
                Continue While
            End If
            arDesigner(intElement, 0) = currentRow(0)
            arDesigner(intElement, 1) = currentRow(1)
            arDesignerCounter(intElement) = 0
            intElement += 1
        Catch ex As MalformedLineException
            MsgBox("Designer Line " & ex.Message & " is not valid and will be skipped.")
        End Try
    End While
End Using

I can't see any reason in the documentation for it dropping copyright or
the French and German (etc…) vowels with accents.

Comments or suggestions anyone??

Thanks //al

Andrew Morton · Sep 21, 2006

al said:
I'm using textfieldparser to read a data file. which contains, for
example:

Amondó Szegi Amondo Szegi
andré nossek André Nossek
© Characte Character

Note the vowels with diacriticals and the copyright symbol - it is
dropping these (and other similar) characters which fall outside
ascii range (apparently)

It appears to be an encoding problem where the file uses (I'm guessing)
ISO-8859-1 or maybe Windows-1252 whereas the .NET framework defaults to
Unicode. Does a TextFieldParser have a setting for that (or have a
.BaseClass that does)?

Or perhaps you can arrange for the file to be encoded with Unicode?

Andrew
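
On the question of whether TextFieldParser has a setting for the encoding: as far as I know, its constructor has an overload that takes a System.Text.Encoding as a second argument. The sketch below assumes the file really is Windows-1252 (code page 1252), which is only a guess in this thread, and uses a made-up module name and a path built to mirror the one in the question. On the .NET Framework, Encoding.GetEncoding(1252) should work directly; newer runtimes may need the code-page encoding provider registered first.

```vbnet
Imports System
Imports System.Text
Imports Microsoft.VisualBasic
Imports Microsoft.VisualBasic.FileIO

Module ReadDesigners
    Sub Main()
        ' Assumption: the file was saved as Windows-1252, not UTF-8.
        Dim fileEncoding As Encoding = Encoding.GetEncoding(1252)

        ' Hypothetical location, mirroring "designers.txt" next to the executable.
        Dim filePath As String = System.IO.Path.Combine( _
            AppDomain.CurrentDomain.BaseDirectory, "designers.txt")

        ' Pass the encoding explicitly instead of relying on the default.
        Using reader As New TextFieldParser(filePath, fileEncoding)
            reader.TextFieldType = FieldType.Delimited
            reader.CommentTokens = New String() {"#"}
            reader.Delimiters = New String() {vbTab}
            reader.TrimWhiteSpace = True

            While Not reader.EndOfData
                Dim fields As String() = reader.ReadFields()
                Console.WriteLine(String.Join(" | ", fields))
            End While
        End Using
    End Sub
End Module
```

If the accented characters come through with this change, the file is almost certainly not in the encoding the parser assumes by default. The alternative is to re-save the file as UTF-8 and leave the original code alone.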

al jones · Sep 21, 2006

Andrew Morton said:
It appears to be an encoding problem where the file uses (I'm guessing)
ISO-8859-1 or maybe Windows-1252 whereas the .NET framework defaults to
Unicode. Does a TextFieldParser have a setting for that (or have a
.BaseClass that does)?

Or perhaps you can arrange for the file to be encoded with Unicode?

Andrew

Possibly my confusion is from the fact that I maintain these files (there
are three of them) within VS 2005, so I would have expected them to be
Unicode. The characters exist within the files (the three line examples are
cut & paste from the file itself) so I don't understand why reading them
would literally eliminate the characters.

I've been over the TextFieldParser docs and see nothing that indicates that
it shouldn't take the data as presented.
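
One way to see how a character that is visibly in the file can go missing on read is to decode the same bytes two ways. The sketch below is only an illustration under the assumption that the file is Windows-1252: there, é is the single byte &HE9, which on its own is not a valid UTF-8 sequence, so a UTF-8 decoder cannot turn it back into é (it is typically replaced or lost). The module name and the byte array are made up for the demo.

```vbnet
Imports System
Imports System.Text
Imports Microsoft.VisualBasic

Module DecodeDemo
    Sub Main()
        ' "andré" as Windows-1252 bytes: a, n, d, r, then é as the single byte &HE9.
        Dim raw As Byte() = {&H61, &H6E, &H64, &H72, &HE9}

        ' The expected text, built with ChrW so it does not depend on
        ' how this source file itself is encoded.
        Dim expected As String = "andr" & ChrW(&HE9)

        ' Decode the same bytes with two different encodings.
        Dim asUtf8 As String = Encoding.UTF8.GetString(raw)
        Dim as1252 As String = Encoding.GetEncoding(1252).GetString(raw)

        Console.WriteLine("UTF-8 decode matches:        " & (asUtf8 = expected).ToString())
        Console.WriteLine("Windows-1252 decode matches: " & (as1252 = expected).ToString())
    End Sub
End Module
```

The Windows-1252 decode should match and the UTF-8 decode should not, which would be consistent with the accented vowels and © disappearing while the plain ASCII letters survive.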

Jeff Glatt · Sep 22, 2006

Try the OrchidGrid control, which can parse/import data from delimited files.
