Combine 2 lines in text file conditionally

d4

I have a VBScript (below) that I want to rewrite in C#, but I cannot get it
to work (unless there is a better way to do it).

The script will combine 2 lines if the (next) line contains a "+",
otherwise just write the line.



#### vbscript:

Do While Not InputFile.AtEndofStream

    Line = InputFile.ReadLine

    If Len(Line) = 0 Then
        Exit Do ' Exit at last line because it is empty
    End If

    If Mid(Line, 26, 1) = "+" Then
        out = out & Mid(Line, 27)
    Else
        If Len(Out) Then
            OutputFile.WriteLine(out)
        End If
        out = Left(Line, 15) & ";;" & RTrim(Mid(Line, 32, 8)) & ";;" & Mid(Line, 66)
    End If

Loop
OutputFile.WriteLine(out)

OutputFile.Close()
InputFile.Close()

#### Below is the C# version I cannot get to work...

private static void CreateText()
{
    string inFile = @"c:\Text.txt";
    string outFile = @"C:\OutText.txt";

    StreamReader sr = new StreamReader(inFile);
    StreamWriter sw = new StreamWriter(File.Create(outFile));

    while (!(sr.EndOfStream))
    {
        string Line = sr.ReadLine();
        if (Line.Length == 0) {
            goto exitDoLoopStatement0;
        }
        if (Line.Substring(26, 1) == "+")
        {
            string outL = outL + Line.Substring(27);
        } else
        {
            if (Line.Length > 0) {
                sw.WriteLine(outL);
            }
            outL = Line.Substring(0, 15) + ";" + Line.Substring(31, 8).TrimEnd() + ";" + Line.Substring(65);
        }
    }
    exitDoLoopStatement0: ;
    sw.WriteLine(outL);

    sr.Close();
    sw.Close();
}


### Text file: Text.txt

06.299 22:06:34 HBR10224W DUID=SERVER29: ABORTED BY CLIENT, FILE=C:\Program Files\Websense Reporter\LogServer\Cache\LogServer.state
06.299 22:06:36 HBR10224W DUID=SERVER29: ABORTED BY CLIENT, FILE=D:\Program Files\ISS\RealSecure SiteProtector\Site Database\Data\Re
06.299 22:06:36 HBR10224W+alSecureDB.mdf
06.299 22:06:36 HBR10224W DUID=SERVER29: ABORTED BY CLIENT, FILE=D:\Program Files\ISS\RealSecure SiteProtector\Site Database\Data\Re
06.299 22:06:36 HBR10224W+alSecureDBLog.ldf
06.299 22:06:52 HBR10224W DUID=SERVER1 : ABORTED BY CLIENT, FILE=C:\STATIC\straudit.log
06.299 22:06:52 HBR10224W DUID=SERVER1 : ABORTED BY CLIENT, FILE=C:\STATIC\strerror.log
06.299 22:06:52 HBR10224W DUID=SERVER1 : ABORTED BY CLIENT, FILE=C:\STATIC\trace
06.299 22:09:13 HBR10224W DUID=SERVER8 : ABORTED BY CLIENT, FILE=D:\perfdata_DB\perfdata_Data.MDF



Any help or suggestions would be greatly appreciated.
 
Jeroen

Hi,

First, please indicate what goes wrong (error? unexpected output?) so
that we don't have to compile and run the example right away.

Second, notice that you can use the 'break' statement where you use the
'goto', which to me really seems nicer (and I'm also sure that there are
people around here who can put forth better arguments :)).

Regards,
Jeroen
 
d4

Sorry. The error is in the line
"string outL = outL + Line.Substring(27);" with "Local variable outL
might be not initialized before accessing". Also, the subsequent uses of outL
get "Cannot resolve symbol 'outL'".
(If I try to compile, I get "The name 'outL' does not exist in the
current context".)
 
Tom Porterfield

d4 said:
Sorry. The error is in the line
"string outL = outL + Line.Substring(27);" with "Local variable outL
might be not initialized before accessing". Also, the subsequent outL
get "Cannot resolve symbol 'outL'"
(If I try to compile, I get "The name 'outL' does not exist in the
current context".

That is to be expected. Your variable outL is declared inside your second
"if" statement. As such it is scoped only to that if block. You cannot
access that variable once you have left the closing brace for the "if", not
even in the corresponding "else". Since you need to write outL once your
"while" loop has finished, you'll need to declare the variable at the same
place where you declare your sr and sw variables.

Note that I have not checked your code to ensure it is doing what you want
it to, but this should get you past the current compile error so you can
start debugging.
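
A minimal sketch of the scoping point above, not the poster's code: the accumulator is declared before the loop, so both branches and the code after the loop can see it. The names ScopeSketch and JoinMarked, and the idea of marking continuations with a leading '+', are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;

public static class ScopeSketch
{
    // Hypothetical helper: appends each part to one running string,
    // dropping a leading '+' marker where present.
    public static string JoinMarked(IEnumerable<string> parts)
    {
        // Declared outside the loop: visible in the 'if', in the 'else',
        // and after the closing brace of the loop.
        string result = "";

        foreach (string part in parts)
        {
            if (part.Length > 0 && part[0] == '+')
            {
                // Assign only. Repeating the type here ('string result = ...')
                // would try to declare a second, block-local variable.
                result = result + part.Substring(1);
            }
            else
            {
                result = result + part;
            }
        }

        return result; // still in scope here
    }
}
```

Calling ScopeSketch.JoinMarked with "abc", "+def", "ghi" returns "abcdefghi".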
 
Jeroen

d4 schreef:
Sorry. The error is in the line
"string outL = outL + Line.Substring(27);" with "Local variable outL
might be not initialized before accessing". Also, the subsequent outL
get "Cannot resolve symbol 'outL'"
(If I try to compile, I get "The name 'outL' does not exist in the
current context".


Hmm, I see a few other bits that might give you trouble as well, for
example the condition in the while loop. I don't think that's the usual
C# way to do it; you might want to check the examples that come with the
stream documentation.

Anyways, your current problem is caused by the fact that you did not
initialize the outL variable in the correct place. You should remove
the 'string' part from the error-line, and put a statement just
above/outside the while loop:

string outL = "";

This puts the variable in a larger scope, also solving the second
problem.

Good luck.
 
d4

Thanks. I moved it out but just get blank lines written to the outfile.
I'm sure my VB didn't translate well to C#, so is there some other way
to accomplish what I need to do in C#? That is: read a line; if the next
line contains a "+", concatenate it (since it is a continuation of the
filename); otherwise just write out the current line.

Thanks again for any help. (as you can tell I am new to c# and
programming in general).
 
Jason Gurtz

d4 said:
Thanks again for any help. (as you can tell I am new to c# and
programming in general).

Just some general hints that will make life easier (for you and those
that may replace you, and you in 6 months when you decide to modify your
code):

Put comments in your code. Even if they seem stupid. If nothing else,
it helps organize your algorithm.

goto statement: really *really* *BAD* idea. There is something like a 5
nines chance that some other coding structure/algorithm will work
instead and will be easier to read/debug. Hint, look at the "break" and
"continue" statements inside of loops.

C# has a really convenient "foreach" loop (I'd like to think a la perl!).
It is really nice when working with collections such as arrays, and it
works on strings too (a string can be walked char by char; more later).

Always pay attention to your variable's scope and always pre-declare
and pre-initialize them. Pretty much, the only good time to declare and
use a variable at the same time is the counter in a for loop.

e.g. for (int i=0; i<max; i++) {
//i exists only here and
// that's a good thing
}
//There's no i here, yay!

For others declare like this near the top of your main() or
constructor/initializer function/method:

int myInt = 0;
double myDouble = 0.0;
String myString = "";
char myChar = ' ';

Note space ----^

I hate to harp on formatting (and who knows if news reader software
mangled yours) but it's tough to even check the balancing and matching of
braces in your example. Braces are what clue you in to scope so it's
important. I won't preach to use a particular style but I will preach
to at least be consistent with your indentation and spacing. The
default VS.NET 2005 editor is reasonable; I would suggest staying with
how it formats things for you until you decide to use a more compact style.

------------------------------------------------

Notes on your specific problem (I come from a C++ and perl background so
keep that in mind):

A string can be treated much like a read-only array of chars (well ok,
there's more to it in managed code, and you won't see a null char at the
end as you would in C++, but bear with me here). Check it out in the
debugger! If you're doing simple slicing and dicing of strings, it can be
concise (and faster) to access the string by character index. Something like:

String Line = "My long annoying line nee+ding spliting\n";

if (Line[25] == '+') {
    //rest of line should be concatenated with prevLine
}
else {
    //Line should remain whole
}
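
A hedged sketch of the same character test with a length guard added, since indexing past the end of a short line throws IndexOutOfRangeException. The names ContinuationCheck, MarkerIndex, IsContinuation and ContinuationText are hypothetical; index 25 is the zero-based equivalent of Mid(Line, 26, 1) in the VBScript.

```csharp
using System;

public static class ContinuationCheck
{
    // Assumption: the '+' marker sits in column 26 (1-based), i.e. index 25.
    public const int MarkerIndex = 25;

    // True when the line is long enough and has '+' at the marker column.
    public static bool IsContinuation(string line)
    {
        return line != null
            && line.Length > MarkerIndex
            && line[MarkerIndex] == '+';
    }

    // Text after the marker, or "" when the line is not a continuation.
    public static string ContinuationText(string line)
    {
        return IsContinuation(line) ? line.Substring(MarkerIndex + 1) : "";
    }
}
```

With the sample log, ContinuationText("06.299 22:06:36 HBR10224W+alSecureDB.mdf") gives "alSecureDB.mdf", while an empty or short line simply reports false instead of throwing.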

To nab the end of the string with the '+' in it try this example

String[] splitString = new String[] {"", ""}; //Array of 2 empty Strings

myString = "My string with a+in the middle";
splitString = myString.Split(new char[1] {'+'});

//splitString[0] = "My string with a"
//splitString[1] = "in the middle"

Then you can do your concatenation:

Maybe try the Insert method here using StringBuilder:

StringBuilder SBoutL = new StringBuilder(outL);
SBoutL.Insert( (outL.Length-1), splitString[1]);

//And write the line...
sw.WriteLine(SBoutL);

You may encounter issues with line breaks; use the String.TrimStart and
TrimEnd methods (StringBuilder itself has no Trim methods) to remove
chars, in the same syntax as my Split example above. Use the \r\n escape
sequences (they're like vbCr & vbLf except you can use them right inside
string literals without bothersome concatenation) when testing for or
adding line endings.

Hmm that was longer than I planned ;)

~Jason

--
 
d4

Thanks for the tips Jason.
I still cannot get my head around the StringBuilder / split /
concatenate. Whatever I write is a mess (won't paste any bad code).
Could you please provide some pseudo code as to how it should look?
 
Jason Gurtz

d4 said:
Thanks for the tips Jason.
I still cannot get my head around the StringBuilder / split /
concatenate. Whatever I write is a mess (won't paste any bad code).
Could you please provide some pseudo code as to how it should look?

Well we basically have two separate string manipulation abstractions
that you will wrap in your logic:

#1 -- Split(): The simpler one
The Split method is a member of your basic String class. It splits a
string into substrings wherever the original string is delimited by some
char. In a more conventional sense you would have data in a flat file
where the fields on each line are delimited by commas or tab chars. Note
that there is also a Join() method that does the reverse, creating a
delimited String. Since Split() can look for any char that you define,
and it looks doubtful that your log file has actual data containing the +
char, it is convenient to just treat the + as a delimiter in order to get
at the text after it. What is probably making my example look hairy is
that the Split method takes an argument of type char[] (a character
array). The parameter is declared with params, so passing just a single
character also compiles:

splitResult = myDelimString.Split('+'); //the compiler wraps the char in a char[]

here's my example from my previous post, expanded into each step:

String delimString = "";              //Declare a string to hold the line
                                      // to be processed

String[] splitString = new String[2]; //Allocate an empty array of two
                                      // Strings to hold the result of the split

char[] delimChar = new char[1];       //Declare a character array that
                                      // holds just one char
delimChar[0] = '+';                   //Store a char in it

//This String would be a line from your log file
delimString = "My string with a+in the middle";

//Now do the split (note that the Split method returns an array of
// Strings); pass the whole char[] array as the delimiter list
splitString = delimString.Split(delimChar);

//splitString now holds the two substrings from delimString

OK, now that we've extracted the part to be appended we've got to do it.

#2 -- StringBuilder.Insert(): A little more complex because you're
dealing with embedded line endings and such. Think of the StringBuilder
class as a mutable String: you can append to it or insert into it in
place, which is faster than creating a new String for every change.
Unfortunately, there is no Split() or Join() in StringBuilder. Read
more here:
<http://msdn.microsoft.com/library/d...cpguide/html/cpconusingstringbuilderclass.asp>
<http://msdn.microsoft.com/library/d.../frlrfsystemtextstringbuildermemberstopic.asp>

There are a number of ways to concatenate strings, but the trouble is
that if there is a \n at the end of your first string and you just
append, you will have two lines instead of one. Yes, you could instead
pre-process your strings to remove all line-ending chars, but that's
more work ;) (If you read with StreamReader.ReadLine(), the line ending
is already stripped and a plain append is enough.) Otherwise
StringBuilder.Insert() lets us insert wherever we want, including in
front of the line ending.

So basically what we are doing is making a copy of the first line (the
one we are appending to) as a StringBuilder type. Then, we are invoking
the Insert method to insert the second string at the position one less
than the end of the string (in front of the \n). You will have to do
testing to determine if the line ending is \r\n and in that case insert
at two less than the end.

----------------

So what you should do is something like this (assumes first line never
has a plus):

Open file1 for reading
Open file2 for writing

Set boolWrite equals True
Read a line from file1 into line1
While NOT EOF
    Read a line into line2
    Test line2 for '+' at position n

    If '+' exists at n
        Split line2 into 2 pieces
        Insert second piece of line2 at one less than the end of line 1
        Write line1 to file2

        If NOT EOF
            Read a line from file1 into line1
        Else
            Set boolWrite equals False
            break
        End If
    Else
        write line1 to file2
        line1 = line2
    End If
End While

If boolWrite is True
    write line1 to file2
End If

Close file2
Close file1

Hopefully there are no logic errors there ;)

~Jason
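
One possible C# rendering of the idea in the pseudocode above, simplified to a single accumulator instead of the line1/line2 pair so that several continuation lines in a row are also handled. It works on lines already in memory; lines returned by ReadLine() carry no line terminator, so a plain Append is enough here. LineMerger, Merge and markerIndex are illustrative names, not from the thread.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class LineMerger
{
    // Merges each continuation line (one with '+' at markerIndex) into the
    // record before it. A continuation with no preceding record is kept
    // whole as its own record rather than dropped.
    public static List<string> Merge(IEnumerable<string> lines, int markerIndex)
    {
        List<string> merged = new List<string>();
        StringBuilder current = null; // record being assembled, if any

        foreach (string line in lines)
        {
            bool isContinuation =
                line.Length > markerIndex && line[markerIndex] == '+';

            if (isContinuation && current != null)
            {
                // Append only the text after the '+' marker.
                current.Append(line.Substring(markerIndex + 1));
            }
            else
            {
                if (current != null)
                {
                    merged.Add(current.ToString());
                }
                current = new StringBuilder(line);
            }
        }

        if (current != null)
        {
            merged.Add(current.ToString()); // flush the last record
        }

        return merged;
    }
}
```

For the log format in this thread the call would be LineMerger.Merge(lines, 25), since the '+' sits at zero-based index 25.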

--
 
Jon Skeet [C# MVP]

Jason Gurtz said:
Always pay attention to your variable's scope and always pre-declare
and pre-initialize them. Pretty much, the only good time to declare and
use a variable at the same time is the counter in a for loop.

e.g. for (int i=0; i<max; i++) {
//i exists only here and
// that's a good thing
}
//There's no i here, yay!

For others declare like this near the top of your main() or
constructor/initializer function/method:

Eek, no! Declare variables as late as you can, in general, giving them
the smallest possible scope. A variable's name may not even make sense
when reading it from the top if you introduce it too early - it's like
introducing a character in a book and then not mentioning them for
ages.
 
Jason Gurtz

Jon said:
Eek, no! Declare variables as late as you can, in general, giving them
the smallest possible scope.

Obviously, variables should be declared only within the functions they
are used in and should generally not be global or have too much scope.

I will argue (esp in the case of a beginner) that you should declare all
your vars in one place (within the function or method) so that it is
just one place to look instead of scattered about. Also, a small end of
line comment goes a long way to understanding what a var represents.

~Jason

--
 
Marc Gravell

All variable declarations in one place? I used to do this with VB6, but it
makes for a mess of code at the top, and hard to remember vars lower down.
In C#, I always declare and initialise as late as possible, so mid-method
declarations are perfectly normal. Likewise inline initialisation.

A comment to explain what it represents? In that case IMO you aren't naming
'em right. Perhaps a comment to explain any unusual nuance of usage or
behaviour... I know what you mean, though.

Marc
 
Jon Skeet [C# MVP]

Jason Gurtz said:
Obviously, variables should be declared only within the functions they
are used in and should generally not be global or have too much scope.

I'd argue that too much scope within a method is harmful too though.

Jason Gurtz said:
I will argue (esp in the case of a beginner) that you should declare all
your vars in one place (within the function or method) so that it is
just one place to look instead of scattered about. Also, a small end of
line comment goes a long way to understanding what a var represents.

So you'd rather have:

variable declaration

15 lines of code

use of variable


than

15 lines of code

declaration and first use
subsequent uses

?

Why make someone look back up to the top of a method for the
declaration rather than just to the previous line?
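
A small illustration of the second layout, with each variable declared at its first use. The class and method (DeclareLateSketch, LongestWordLength) are invented purely for this example.

```csharp
using System;

public static class DeclareLateSketch
{
    public static int LongestWordLength(string text)
    {
        if (string.IsNullOrEmpty(text))
        {
            return 0; // nothing below this line is needed for empty input
        }

        // Declared where it is first needed, not at the top of the method.
        string[] words = text.Split(' ');

        int longest = 0;
        foreach (string word in words) // 'word' exists only inside the loop
        {
            if (word.Length > longest)
            {
                longest = word.Length;
            }
        }

        return longest;
    }
}
```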
 
Jason Gurtz

Jon said:
Why make someone look back up to the top of a method for the
declaration rather than just to the previous line?

Yea, I see your point. That's still organized, unlike an awful lot of
code I see where there are no logical groupings at all and the
boilerplate is just scattered about willy nilly.

~Jason

--
 
d4

Managed to get it working with this (still working on comments :)


private static void CreateText()
{
    string inFile = @"c:\Text.txt";
    string outFile = @"C:\OutText.txt";

    StreamReader sr = new StreamReader(inFile);
    StreamWriter sw = new StreamWriter(File.Create(outFile));

    string Line = sr.ReadLine();

    while (Line != null)
    {
        sw.Write(Line);

        Line = sr.ReadLine();

        while ((Line != null) && (Line.IndexOf("+") > 0))
        {
            sw.Write(Line.Substring(Line.IndexOf("+") + 1));
            Line = sr.ReadLine();
        }

        sw.WriteLine();
    }

    sr.Close();
    sw.Close();
}

private static void AddSeparator()
{
    string inFile = @"c:\OutText.txt";
    string outFile = @"C:\Text2.txt";

    StreamReader sr = new StreamReader(inFile);
    StreamWriter sw = new StreamWriter(File.Create(outFile));

    string currentLine = sr.ReadLine();

    while (currentLine != null)
    {
        if (currentLine != null)
        {
            string a = currentLine.Substring(0, 15);
            string b = currentLine.Substring(31, 8).TrimEnd();
            string c = currentLine.Substring(65).Trim();
            string Line = a + ";" + b + ";" + c;
            sw.WriteLine(Line);
        }

        currentLine = sr.ReadLine();
    }

    sr.Close();
    sw.Close();
}
 
Jon Skeet [C# MVP]

d4 said:
Managed to get working with this (still working on comments :)

Just a few notes:

1) Use the "using" statement rather than explicitly calling Close on
the StreamReader and StreamWriter - otherwise if an exception is
thrown, your reader/writer won't get closed.

2) Your variable names aren't consistent in capitalisation.

3) Calling ReadLine() three times in the code makes it harder to
understand. Look for ways to make only one loop.
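
A hedged sketch of how the three notes could be combined; it is neither Jon's nor d4's code. It uses one loop with a single ReadLine call, camelCase locals, and "using" blocks in the file wrapper, and it joins and formats in one pass so no intermediate file is needed. The names LogConverter, FormatRecord, ConvertLines and ConvertFile are assumptions; the column offsets are taken from the code in this thread. Unlike the VBScript, it does not stop at the first empty line, and it writes a single ";" separator as the C# versions do.

```csharp
using System;
using System.IO;

public static class LogConverter
{
    private const int MarkerIndex = 25; // zero-based form of Mid(Line, 26, 1)
    private const int PathIndex = 65;   // zero-based form of Mid(Line, 66)

    // Builds "timestamp;server;path" from the fixed columns used in the thread.
    // Records too short for that layout are returned unchanged.
    public static string FormatRecord(string record)
    {
        if (record.Length < PathIndex)
        {
            return record;
        }

        string timestamp = record.Substring(0, 15);
        string server = record.Substring(31, 8).TrimEnd();
        string path = record.Substring(PathIndex).Trim();
        return timestamp + ";" + server + ";" + path;
    }

    // Single loop, single ReadLine call: a continuation line extends the
    // pending record, any other line flushes the pending record first.
    public static void ConvertLines(TextReader reader, TextWriter writer)
    {
        string pending = null;
        string line;

        while ((line = reader.ReadLine()) != null)
        {
            bool isContinuation =
                line.Length > MarkerIndex && line[MarkerIndex] == '+';

            if (isContinuation && pending != null)
            {
                pending += line.Substring(MarkerIndex + 1);
            }
            else
            {
                if (pending != null)
                {
                    writer.WriteLine(FormatRecord(pending));
                }
                pending = line;
            }
        }

        if (pending != null)
        {
            writer.WriteLine(FormatRecord(pending)); // last record
        }
    }

    // File wrapper: 'using' disposes both objects even if an exception occurs.
    public static void ConvertFile(string inFile, string outFile)
    {
        using (StreamReader reader = new StreamReader(inFile))
        using (StreamWriter writer = new StreamWriter(outFile))
        {
            ConvertLines(reader, writer);
        }
    }
}
```

Because ConvertLines takes TextReader and TextWriter, it can be tried with a StringReader and StringWriter before pointing ConvertFile at real files.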
 
