reading text file

mp

question: is the following the best way to get the text out of a text file?
I found it on msdn (except for the stringbuilder part)
I don't see other properties that would get the whole text out of the file
without having the while loop but it seems primitive to have to create a
stringbuilder to gather the contents? (and add back the newline that the
loop strips out)

using (StreamReader sr = File.OpenText(_fileName))
{
    String input;
    StringBuilder sb = new StringBuilder();

    while ((input = sr.ReadLine()) != null)
    {
        // append the line just read (not the reader itself) and restore the newline
        sb.Append(input + Environment.NewLine);
    }
    sr.Close();
    _origContents = sb.ToString();
}
 
mp

Peter Duniho said:
Not if all you want is a string containing all the text in the file. The
System.IO.File.ReadAllText() method is the simplest way to do that.

Also, in the code you posted, the statement "sr.Close()" is extraneous.
Putting the StreamReader object in a "using" block is sufficient (and
preferable).

Pete

thanks,
intuitively I assumed there'd be a ReadAllText() method somewhere, just
hadn't found it in searches on msdn/local help yet.
interesting how lacking many of the help examples are...

mark
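
For reference, a minimal sketch of the ReadAllText() approach Pete mentions. The class and method names (TextFileSketch, ReadWholeFile) are made up for illustration; only File.ReadAllText itself comes from the framework.

```csharp
using System.IO;

public static class TextFileSketch
{
    // Reads the entire file into one string. The file is opened and closed
    // inside the call, so no explicit Close() or using block is needed here.
    public static string ReadWholeFile(string fileName)
    {
        return File.ReadAllText(fileName);
    }
}
```

One difference from the ReadLine() loop: the text comes back with whatever line endings the file already has, and no extra newline is added after the last line.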
 
mp

actually, come to think of it, I do want the whole contents in one string
and also, I want to parse that string line by line, looking for specific
text (comments in lines of lisp code)
so in that case, i wonder if it's better to
option one
first get the whole string via .ReadAllText()
and then to get the individual lines via streamReader
....or option two
first get the whole string via .ReadAllText()
then for individual lines, i could use split on vbcrlf to get an array of
the individual lines to process
....or option three
use the stream reader as in my op and just build the whole string with the
stringbuilder
???
thanks for any input
mark
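
A rough sketch of options one and two, assuming the whole text is already in a string (for example from ReadAllText()). Option one is shown with a StringReader over that string rather than a second StreamReader over the file, which is a substitution on my part. The names LineSplitSketch, LinesViaReader and LinesViaSplit are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class LineSplitSketch
{
    // Option one in spirit: walk the in-memory string line by line.
    // StringReader.ReadLine() strips the line terminator, like StreamReader.
    public static List<string> LinesViaReader(string text)
    {
        var lines = new List<string>();
        using (var reader = new StringReader(text))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                lines.Add(line);
            }
        }
        return lines;
    }

    // Option two: split on line breaks. "\r\n" is listed first so a Windows
    // line ending is consumed as one separator, not as "\r" left over plus "\n".
    public static string[] LinesViaSplit(string text)
    {
        return text.Split(new[] { "\r\n", "\n" }, StringSplitOptions.None);
    }
}
```

The two are not quite equivalent: if the text ends with a line break, Split() returns one extra empty entry at the end, while the reader loop does not.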
 
mp

Peter Duniho said:
It all depends on what you need at the end.

If you actually want an array or list of strings, one entry per line, then
using StreamReader (as in your initial example) and adding each line to
your data structure (I'd start with a List<string>, use ToArray() later if
you want an array) would make the most sense.

If you simply want to do something special each time you find a line with
some specific characteristic, then StreamReader is still likely to fit
best, but of course you don't need to save the lines as you read them.

If you really do want a single string representing the whole file _and_
you want to detect specific kinds of text, then perhaps the best solution
is to use ReadAllText(), and then use the
System.Text.RegularExpressions.Regex class to process the string, looking
for whatever specific patterns you want.

Pete
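
The first two cases above might look roughly like this. It is only a sketch: the methods take a TextReader so they work with a StreamReader from File.OpenText() as well as with a StringReader, and the names (LineCollectorSketch, ReadLines, CountLeadingComments) and the "line starts with ;" test are assumptions chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class LineCollectorSketch
{
    // Case 1: keep every line, one list entry per line.
    // Call ToArray() on the result later if an array is needed.
    public static List<string> ReadLines(TextReader reader)
    {
        var lines = new List<string>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            lines.Add(line);
        }
        return lines;
    }

    // Case 2: act on lines with some characteristic without storing them.
    // Here the "characteristic" is a line whose first non-blank character is ';'.
    public static int CountLeadingComments(TextReader reader)
    {
        int count = 0;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.TrimStart().StartsWith(";", StringComparison.Ordinal))
            {
                count++;
            }
        }
        return count;
    }
}
```

With a file, the call would be wrapped as in the original post: using (StreamReader sr = File.OpenText(fileName)) { var lines = LineCollectorSketch.ReadLines(sr); }
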

Regex is probably exactly what i need, but i've read about it so many times
and been completely unable to grasp the inscrutable code to grab what i'm
looking for.

I'm parsing lisp code text files.
one objective is to strip out all the comments from the actual code

leading comments are easy to get rid of
if a line begins with the comment character (;) ignore the whole line

if it's at the end of a line i loop through the characters in the line
looking for ;

however, if that ; occurs inside a 'quote' (text enclosed in ""), it's not a
comment character, it's a literal
so i have to know if i'm inside a 'quote' (which can span multiple lines)

"this is a ; in a quote "
"this is
also
a ; in
a quote "

i'm using the term quote to indicate characters in between " and " (which is
a "string" data type in lisp terminology)
i have fairly complicated code that does all this with about 99% accuracy
but still a few odd 'strings' (characters between double quotes) will throw
it off. a 'string' starts with a double quote and ends with a double quote,
so i track open and close quotes to know if i'm inside a string...however
a string can also contain a double quote inside it if it's escaped with a \.
"this is a string with a \" inside it"

i can probably muck around some more with my code to fix that problem
but i know the way i'm doing it is a travesty of ugliness :)

i'm not sure regex can do all that but i suspect it could if i grasped all
the nuances of how to construct the regex string

mark
sorry for the long winded explanation but i don't know a simpler way to
describe what i'm trying to do
i can show the vb6 code that does this if desired, but doubt anyone wants to
see that ugliness
:)
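
For what it's worth, here is one regex pattern that seems to cover the cases described in this post. It is a sketch under assumptions, not a full Lisp tokenizer: it assumes comments start with ; and run to the end of the line, strings are enclosed in double quotes, and a backslash escapes the next character. It uses plain alternation (match a whole string OR a comment, keep the former, drop the latter). Block comments and character literals are not handled. The class and method names are hypothetical.

```csharp
using System.Text.RegularExpressions;

public static class RegexCommentSketch
{
    // First alternative: a complete double-quoted string, where \x is an
    // escaped character and anything except " or \ is ordinary content
    // (including line breaks, so strings may span lines).
    // Second alternative: a ; and everything up to the end of the line.
    private static readonly Regex StringOrComment = new Regex(
        @"""(?:\\.|[^""\\])*""|;[^\r\n]*",
        RegexOptions.Singleline);

    public static string Strip(string source)
    {
        // Strings are matched first so a ; inside them is consumed as part
        // of the string; strings are put back unchanged, comments dropped.
        return StringOrComment.Replace(
            source,
            m => m.Value[0] == '"' ? m.Value : string.Empty);
    }
}
```

A line that held only a comment is left behind as an empty line, and whitespace before a trailing comment is kept.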
 
mp

Peter Duniho said:
mp said:
Regex is probably exactly what i need, but i've read about it so many
times and been completely unable to grasp the inscrutable code to grab
what i'm looking for.

I'm parsing lisp code text files.
one objective is to strip out all the comments from the actual code

[...]
i'm not sure regex can do all that but i suspect it could if i grasped
all the nuances of how to construct the regex string

Regex can definitely handle the problem. But because of the quoting
issue, a working expression would have to include "balanced matching
groups" to exclude from consideration quoted text. That's a bit on the
advanced side of regex, and so while it's worthwhile maybe for you to
eventually explore that, in terms of just upgrading from VB6 to managed
code, maybe not the best thing to deal with at the moment.

Since your goal is to remove the comments altogether, that suggests to me
that using StreamReader to read one line at a time, while keeping some
state flags to track the string quoting across lines, is likely to be a
reasonably simple and workable solution. You can use a StringBuilder to
construct the modified text, sans comments, as you go through the file one
line at a time.

Pete

thanks Pete,
that's how i'm doing it now...just didn't know if there were a cleaner way
thanks for the input
mark
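
The line-at-a-time approach Pete describes could be sketched like this. The assumptions are the ones from the earlier post: ; starts a comment unless it is inside a double-quoted string, strings can span lines, and \ escapes the next character inside a string. Dropping lines that contain nothing but a comment is my reading of "ignore the whole line". The class name is hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

public static class LispCommentStripper
{
    public static string Strip(TextReader reader)
    {
        var output = new StringBuilder();
        bool inString = false;          // state flag carried across lines
        string line;

        while ((line = reader.ReadLine()) != null)
        {
            bool escaped = false;       // true right after a \ inside a string
            int keep = line.Length;     // number of characters to keep

            for (int i = 0; i < line.Length; i++)
            {
                char c = line[i];
                if (inString)
                {
                    if (escaped) escaped = false;          // escaped char, skip it
                    else if (c == '\\') escaped = true;    // next char is escaped
                    else if (c == '"') inString = false;   // closing quote
                }
                else if (c == '"')
                {
                    inString = true;                       // opening quote
                }
                else if (c == ';')
                {
                    keep = i;                              // comment starts here
                    break;
                }
            }

            string kept = line.Substring(0, keep);
            // Skip lines that held only a comment (and maybe leading blanks).
            bool commentOnly = keep < line.Length && kept.Trim().Length == 0;
            if (!commentOnly)
            {
                output.Append(kept).Append(Environment.NewLine);
            }
        }
        return output.ToString();
    }
}
```

With a file this would be called as using (StreamReader sr = File.OpenText(fileName)) { string code = LispCommentStripper.Strip(sr); }. Whitespace in front of a trailing comment is left in place.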
 
