Downloading Unix \n text files: convert to \r\n (non-Unix)?

Zytan

I am downloading a file with \n newlines from a Unix system, and
storing it to a string. I want to convert it to \r\n newlines for
Windows. I know the StreamReader has an Encoding attribute, but this
isn't what I need. Should I do a String.Replace(), or is there a
better solution?

Zytan
 
Zytan

Should I do a String.Replace(), or is there a better solution?

String.Replace("\n", "\r\n") doesn't work. It doesn't change
anything.

Zytan
 
Arne Vajhøj

Zytan said:
String.Replace("\n", "\r\n") doesn't work. It doesn't change
anything.

It should.

string s = "A\nBB\nCCC";
Console.WriteLine(s.Length);
s = s.Replace("\n", "\r\n");
Console.WriteLine(s.Length);

prints 8 and 10 for me !

Arne
 
Arne Vajhøj

Arne said:
It should.

string s = "A\nBB\nCCC";
Console.WriteLine(s.Length);
s = s.Replace("\n", "\r\n");
Console.WriteLine(s.Length);

prints 8 and 10 for me !

But I think:

string s = "A\nBB\nCCC";
Console.WriteLine(s.Length);
s = Regex.Replace(s, "(?<!\r)\n", Environment.NewLine);
Console.WriteLine(s.Length);

is better code !

Arne
 
Zytan

string s = "A\nBB\nCCC";
Console.WriteLine(s.Length);
s = s.Replace("\n", "\r\n");
Console.WriteLine(s.Length);

prints 8 and 10 for me !

LOL! I wasn't storing the return value of s.Replace back into s! And this
after I had confirmed that your code works and printed my string out as
bytes to confirm that it is Unix-style. Haha.

Zytan
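
For anyone who hits the same snag: .NET strings are immutable, so Replace returns a new string and leaves the original untouched. A minimal sketch of the difference follows; the class and method names are made up for illustration and are not from the thread.

```csharp
using System;

// Hypothetical demo class, not code from the thread.
public static class ReplaceDemo
{
    // Calls Replace but discards the result, so the input comes back unchanged.
    public static string Discarded(string s)
    {
        s.Replace("\n", "\r\n"); // return value is ignored
        return s;
    }

    // Stores the result of Replace, which is all the fix amounts to.
    public static string Assigned(string s)
    {
        s = s.Replace("\n", "\r\n");
        return s;
    }
}
```

With the string "A\nBB\nCCC" from Arne's example, Discarded still has length 8, while Assigned has length 10.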
 
Zytan

But I think:
string s = "A\nBB\nCCC";
Console.WriteLine(s.Length);
s = Regex.Replace(s, "(?<!\r)\n", Environment.NewLine);
Console.WriteLine(s.Length);

is better code !

Thanks, Arne. So, this replaces both \r and \r\n with the proper
newline. So, this should work for any string! great!

Zytan
 
Arne Vajhøj

Zytan said:
Thanks, Arne. So, this replaces both \r and \r\n with the proper
newline. So, this should work for any string! great!

No.

It replaces \n with \r\n without replacing \r\n with \r\r\n.

Arne
 
Arne Vajhøj

Zytan said:
Thanks, Arne. So, this replaces both \r and \r\n with the proper
newline. So, this should work for any string! great!

If you need to convert \n to \r\n and \r to \r\n
but keep existing \r\n untouched you should use:

s = Regex.Replace(s, "((?<!\r)\n)|(\r(?!\n))", "\r\n");

Note that I switched back from Environment.NewLine to \r\n
again. After thinking about it, Environment.NewLine is not
really a good fit here.

Arne
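
The one-liner above can be wrapped in a small helper to make its behaviour easier to check. This is only a sketch: the pattern is Arne's, while the class and method names are hypothetical.

```csharp
using System.Text.RegularExpressions;

// Hypothetical wrapper around the pattern posted above.
public static class NewlineFix
{
    // A lone \n (no \r before it) or a lone \r (no \n after it) becomes \r\n.
    // An existing \r\n pair matches neither alternative and is left alone.
    public static string ToCrLf(string s)
    {
        return Regex.Replace(s, "((?<!\r)\n)|(\r(?!\n))", "\r\n");
    }
}
```

For example, NewlineFix.ToCrLf("a\nb\rc\r\nd") should give "a\r\nb\r\nc\r\nd": the Unix and old Mac line endings are converted and the Windows one is kept as it is.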
 
Zytan

No.

It replaces \n with \r\n without replacing \r\n with \r\r\n.

Well, \r\r\n wouldn't be proper, anyway, right, so it doesn't really
matter if you fix it or not, since it's already broken?

Your code ensures that only 0 or 1 \r is prefixed before \n, before
being replaced with Environment.NewLine, which is always the proper
carriage return for the system running the code. Right?

Zytan
 
Zytan

If you need to convert \n to \r\n and \r to \r\n
but keep existing \r\n untouched you should use:

No, I just need to convert unix style (\n) into the current system
style (Environment.NewLine), although being able to convert either
style (\n or \r\n) into the current system style is good, since then I
don't care about which system I downloaded the text from. I think
your original code does this, fine.

Thanks, Arne

Zytan
 
Peter Duniho

Zytan wrote: [...]
It replaces \n with \r\n without replacing \r\n with \r\r\n.

Well, \r\r\n wouldn't be proper, anyway, right, so it doesn't really
matter if you fix it or not, since it's already broken?

Your code ensures that only 0 or 1 \r is prefixed before \n, before
being replaced with Environment.NewLine, which is always the proper
carriage return for the system running the code. Right?

No. As he said, it replaces \n with \r\n without replacing \r\n with
\r\r\n. It ensures that exactly 0 \r is prefixed before \n before being
replaced, to phrase it in the same way you did.

You are right that \r\r\n would not be proper, which is why he wrote the
code to avoid creating instances of that.

Personally, I think it's overkill. If you're going to the trouble of
using the general purpose "Environment.NewLine" property as the replaced
text, then checking to avoid munging a specific character combination
(\r\n) seems pointless, given that in theory the point is that
"Environment.NewLine" might not actually be that specific character
combination. If nothing else, I would think it would make more sense to
design the code to replace *both* \n and \r\n with Environment.NewLine.

But ignoring that point, the code does do exactly what Arne says it does.

Pete
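
One way to read Pete's suggestion, replacing both \n and \r\n with Environment.NewLine, is sketched below. This is not code from the thread: the pattern and the names are assumptions, and a lone \r is deliberately left untouched because the suggestion does not mention it.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper illustrating the suggestion; not code from the thread.
public static class SystemNewline
{
    // "\r?\n" matches \r\n as one unit as well as a bare \n, so either
    // line ending is replaced by whatever the current platform uses.
    public static string Normalize(string s)
    {
        return Regex.Replace(s, "\r?\n", Environment.NewLine);
    }
}
```

Because the optional \r is consumed as part of the match, Windows-format input does not end up as \r\r\n, and the result follows the platform the code runs on.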
 
Zytan

No. As he said, it replaces \n with \r\n without replacing \r\n with
\r\r\n. It ensures that exactly 0 \r is prefixed before \n before being
replaced, to phrase it in the same way you did.

Ah, ok. Regex is foreign to me.

You are right that \r\r\n would not be proper, which is why he wrote the
code to avoid creating instances of that.

Right.

I would think it would make more sense to
design the code to replace *both* \n and \r\n with Environment.NewLine.

Yes, that's what I thought Arne was attempting to do (which is above
and beyond my original concern). And I agree this would be best.

Thanks, Pete.

Zytan
 
Arne Vajhøj

Zytan said:
Well, \r\r\n wouldn't be proper, anyway, right, so it doesn't really
matter if you fix it or not, since it's already broken?

It is not about \r\r\n as such. It is about what happens if the guys
on the Unix side suddenly decide to be nice and convert the file to
Windows format (\r\n) for you. If you then run your program with the
simple replace, you will get \r\r\n.

This variant avoids that, so the code can handle both
Unix and Windows format.

Arne
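
To make the scenario concrete, here is a small sketch comparing the simple replace with the lookbehind variant on input that is already in Windows format. The two expressions come from earlier in the thread; the class and method names are hypothetical.

```csharp
using System.Text.RegularExpressions;

// Hypothetical comparison of the two approaches discussed in the thread.
public static class DoubleCrDemo
{
    // Simple replace: every \n gains a \r, even if one is already there.
    public static string Naive(string s)
    {
        return s.Replace("\n", "\r\n");
    }

    // Lookbehind variant: only a \n that is not preceded by \r is replaced.
    public static string Guarded(string s)
    {
        return Regex.Replace(s, "(?<!\r)\n", "\r\n");
    }
}
```

Given "A\r\nB", Naive produces "A\r\r\nB", while Guarded returns the input unchanged. Given the Unix-format "A\nB", both produce "A\r\nB".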
 
Guest

Peter said:
Personally, I think it's overkill. If you're going to the trouble of
using the general purpose "Environment.NewLine" property as the replaced
text, then checking to avoid munging a specific character combination
(\r\n) seems pointless, given that in theory the point is that
"Environment.NewLine" might not actually be that specific character
combination.

Which is one of the reasons why I dropped it in the last version.

Arne
 
