Reading a Text File

kronecker

I am trying to delete repeated lines in a text file using the
following code:

Private Sub Read_TextFile()
    Dim objReader As StreamReader
    Dim strfull, strContents, strContentsold, strContentsnew As String
    objReader = New StreamReader("C:\answer.txt")
    'Clear the Text Box1
    TextBox1.Clear()

    strContentsold = ""
    strContentsnew = ""
    strContents = ""
    strfull = ""

    Do While Not objReader.EndOfStream
        strContentsold = strContentsnew
        strContentsnew = objReader.ReadLine

        If strContentsnew = strContentsold Then
            strContents = ""
        Else
            strContents = strContentsnew
        End If

        strfull += strContents
    Loop

    TextBox1.Text = strfull

    objReader.Close()
End Sub

The text will be stored in TextBox1.

However, it appears not to work! I was wondering if anybody had any
ideas. Here is an original text file as an example:

I assume that you wanted to know whether I can tell you about Wales.
I assume that you wanted to know whether I can tell you about Wales.
whether I can tell you about Wales
Wales is an administrative division in the UK.
Wales
an administrative division
the UK
Source: START KB
Source:
Go back to the START dialog window.
Go back to the START dialog window.
Go back to the START dialog window.
 
Cor Ligthert[MVP]

Hi,

It worked fine for me; however, the code is written in a rather
old-fashioned (VB6) style. Be aware that everything in .NET is an object, so
marking a name to say that something is a string is no longer done.

For appending strings the StringBuilder is more suitable, because every time
you append to a string a new, longer string is created. (Also be aware that
using + to join strings can give unwanted results in some cases. Use the
real concatenation operator & instead; that says more directly that the code
is about strings than prefixing everything with str.)

I changed your code a little bit.

\\\
Private Sub Read_TextFile()
    Dim Reader As IO.StreamReader
    Dim ContentsOld, ContentsNew As String
    Dim Contents As New System.Text.StringBuilder
    Reader = New IO.StreamReader("C:\test\answer.txt")
    TextBox1.Clear()
    ContentsOld = ""
    ContentsNew = ""
    Do While Not Reader.EndOfStream
        ' Remember the previous line before reading the next one.
        ContentsOld = ContentsNew
        ContentsNew = Reader.ReadLine
        ' Keep a line only when it differs from the line before it.
        If ContentsNew <> ContentsOld Then
            ' AppendLine keeps each line on its own row in the text box.
            Contents.AppendLine(ContentsNew)
        End If
    Loop
    TextBox1.Text = Contents.ToString
    Reader.Close()
End Sub
///
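
To illustrate the remark about + and &, here is a minimal sketch. The module and function names are made up for the example, and it assumes the file is compiled with Option Strict Off, which is the only setting where the difference shows up at run time.

```vbnet
Option Strict Off

Imports System

Module ConcatenationDemo
    ' & always concatenates, whatever the operand types are.
    Function JoinWithAmpersand(ByVal digits As String, ByVal number As Integer) As String
        Return digits & number
    End Function

    ' With Option Strict Off, + converts the string to a number and adds.
    ' With Option Strict On this line does not compile.
    Function JoinWithPlus(ByVal digits As String, ByVal number As Integer) As Object
        Return digits + number
    End Function

    Sub Main()
        Console.WriteLine(JoinWithAmpersand("1", 1)) ' the string "11"
        Console.WriteLine(JoinWithPlus("1", 1))      ' the number 2
    End Sub
End Module
```

If the string cannot be converted to a number, the + version throws an InvalidCastException instead, which is one more reason to prefer & when joining strings.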

Cor
 
falderals

You know, that's a great help. What if there were non-ASCII characters
in the file for some reason that are not visible? Then two lines
may look identical but differ. How do I delete them then?

K.
 
Cor Ligthert [MVP]

If it were my problem I would do it with this code (a, b and c are not
your field names):

Dim a As String = "1234"
Dim b As New System.Text.StringBuilder
For Each c As Char In a
    ' Keep only the characters whose code lies between the two limits.
    If AscW(c) > 30 AndAlso AscW(c) < 128 Then
        b.Append(c)
    End If
Next

In this case you should decide which character codes you want to keep
(I took 30 and 128 without checking; printable ASCII runs from 32 to 126).

Don't worry about whether this is quick enough: it is more than 100000 times
quicker than moving a textbox one pixel on screen.
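
As a rough sketch of how that filter could be used for the comparison itself, the two helpers below strip everything outside printable ASCII before comparing. KeepPrintableAscii and SameVisibleText are names invented for this example, and the 32 to 126 range is an assumption you may need to widen.

```vbnet
Imports System
Imports System.Text
Imports Microsoft.VisualBasic

Module PrintableFilter
    ' Hypothetical helper: keeps only printable ASCII characters (codes 32 to 126).
    Function KeepPrintableAscii(ByVal line As String) As String
        Dim kept As New StringBuilder()
        For Each c As Char In line
            Dim code As Integer = AscW(c)
            If code >= 32 AndAlso code <= 126 Then
                kept.Append(c)
            End If
        Next
        Return kept.ToString()
    End Function

    ' Hypothetical helper: True when two lines match once the filtered-out
    ' characters are ignored.
    Function SameVisibleText(ByVal first As String, ByVal second As String) As Boolean
        Return String.Equals(KeepPrintableAscii(first), KeepPrintableAscii(second), StringComparison.Ordinal)
    End Function
End Module
```

Note that this range also drops tabs and any accented or other non-ASCII letters, so two lines that differ only in such characters would be treated as equal.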

Cor
 
Stephany Young

So you are starting with:

I assume that you wanted to know whether I can tell you about Wales.
I assume that you wanted to know whether I can tell you about Wales.
whether I can tell you about Wales
Wales is an administrative division in the UK.
Wales
an administrative division
the UK
Source: START KB
Source:
Go back to the START dialog window.
Go back to the START dialog window.
Go back to the START dialog window.

and you want to end up with:

I assume that you wanted to know whether I can tell you about Wales.
whether I can tell you about Wales
Wales is an administrative division in the UK.
Wales
an administrative division
the UK
Source: START KB
Source:
Go back to the START dialog window.

or, do you want to end up with:

I assume that you wanted to know whether I can tell you about Wales.
Wales is an administrative division in the UK.
Source: START KB
Go back to the START dialog window.

If it is the former then it is as simple as:

' Needs Imports System.IO and System.Collections.Generic, and Option Infer On.
' Get the content of the file into a List(Of String).
Dim _list = New List(Of String)(File.ReadAllLines("C:\answer.txt"))

' Read the list from the bottom up but do not process the first line.
For _i = _list.Count - 1 To 1 Step -1
    ' If the line above (_i - 1) the line of interest (_i) has the same
    ' value then remove the line of interest.
    If _list(_i - 1) = _list(_i) Then _list.RemoveAt(_i)
Next

' Join the lines together with NewLines and put the result in the textbox.
TextBox1.Text = String.Join(Environment.NewLine, _list.ToArray())

If it is the latter, then the solution is more complex because you need to
consider parts of lines rather than whole lines. To do this you need to read
from the top down and when you remove a line you need to start a new pass
over the whole list.

' Get the content of the file into a List(Of String).
Dim _list = New List(Of String)(File.ReadAllLines("C:\answer.txt"))

Dim _removal = True

While _removal
    _removal = False
    ' Read the list from the top down starting from line 2.
    For _i = 1 To _list.Count - 1
        ' If the line above (_i - 1) the line of interest (_i) contains the
        ' value of interest then remove the line of interest and start a new pass.
        If _list(_i - 1).Contains(_list(_i)) Then
            _list.RemoveAt(_i)
            _removal = True
        End If
    Next
End While

' Join the lines together with NewLines and put the result in the textbox.
TextBox1.Text = String.Join(Environment.NewLine, _list.ToArray())

The first pass will remove the second occurrence of:
I assume that you wanted to know whether I can tell you about Wales.

The second pass will remove:
whether I can tell you about Wales

The third pass will remove:
Wales

The fourth pass will remove:
an administrative division

The fifth pass will remove:
the UK


The sixth pass will remove:
Source:

The seventh pass will remove the second occurrence of:
Go back to the START dialog window.

The eighth pass will remove the second occurrence of:
Go back to the START dialog window.
 
falderals

That's smart, Stephany. What are you doing for lunch tomorrow?

K.
 
kronecker

I get an error for this last method:

Index was out of range. Must be non-negative and less than the size of
the collection. Parameter name: index

pointing to this line here

If _list(_i - 1).Contains(_list(_i)) Then

I can't see how it can be out of range, since the loop index is

For _i = 1 To _list.Count - 1

which makes sense.


K.
 
Stephany Young

The test inside the For ... Next loop should read:

If _list(_i - 1).Contains(_list(_i)) Then
    _list.RemoveAt(_i)
    _removal = True
    Exit For
End If
Next

Otherwise the index out of range exception will certainly rear its ugly
head, because the loop's upper bound was worked out before the line was
removed.
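
For reference, the corrected loop can be put together as one self-contained routine. This is only a sketch: the module name and RemoveContainedLines are invented for the example, and it takes the lines as a parameter instead of reading C:\answer.txt directly.

```vbnet
Imports System
Imports System.Collections.Generic

Module ContainedLineFilter
    ' Hypothetical helper: removes every line whose text is contained
    ' in the line directly above it, one removal per pass.
    Function RemoveContainedLines(ByVal lines As IEnumerable(Of String)) As List(Of String)
        Dim result As New List(Of String)(lines)
        Dim removed As Boolean = True

        While removed
            removed = False
            For i As Integer = 1 To result.Count - 1
                If result(i - 1).Contains(result(i)) Then
                    result.RemoveAt(i)
                    removed = True
                    ' Leave the loop at once: its upper bound was computed
                    ' before the removal and now points past the end of the list.
                    Exit For
                End If
            Next
        End While

        Return result
    End Function
End Module
```

It could be called as TextBox1.Text = String.Join(Environment.NewLine, RemoveContainedLines(File.ReadAllLines("C:\answer.txt")).ToArray()). Be aware that Contains("") is always True, so any blank line after the first line is removed as well.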
 
kronecker

It's working on that example but now I have a harder one!


slightly smaller than Oregon
Population:
60,776,238 (July 2007 est.)
Population:
60,776,238 (July 2007 est.)
Population:
60,776,238 (July 2007 est.)
Population:
60,776,238 (July 2007 est.)
Population:
Population:
I have files like the one above. There can be up to 3 lines repeated.

K.
 
Rory Becker

Hello (e-mail address removed),

Not to nitpick too much, but is it not simpler to prevent the files from
being created in this haphazard fashion in the first place?
 
kronecker

I am looking into this too.

K.
 
