String.Substring result when the string contains non-alphanumeric characters?

Bill

I just ran into a situation where string data from a mainframe contained a
couple of non-alphanumeric characters (hex CC and C8). I was parsing a field
that occurred after these unexpected characters and it appears the Substring
method was thrown off and returned a field two bytes off.

Does this data cause a problem with the Substring method?
 
Jon Skeet [C# MVP]

Bill said:
I just ran into a situation where string data from a mainframe contained a
couple of non-alphanumeric characters (hex CC and C8). I was parsing a field
that occurred after these unexpected characters and it appears the Substring
method was thrown off and returned a field two bytes off.

Does this data cause a problem with the Substring method?

Could you post a short but complete program which demonstrates the
problem?

See http://www.pobox.com/~skeet/csharp/complete.html for details of
what I mean by that.
 
Jon Skeet [C# MVP]

Bill said:
This test now appears to be an issue with the StreamReader dropping these
unexpected characters.

Ah - and that's almost certainly just because you haven't given it the
right encoding. You haven't specified an encoding, so it's using UTF-8,
which I don't believe is what you really wanted.

See http://www.pobox.com/~skeet/csharp/unicode.html for more
information about encodings and Unicode.

You probably want to use Encoding.Default in this case.
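
To illustrate the point about encodings, here is a minimal sketch rather than Bill's actual data. The six-byte sample record and the names EncodingDemo, Sample and Decode are hypothetical, and ISO-8859-1 is only an assumed stand-in for whatever single-byte code page the mainframe extract really uses. Hex CC followed by hex C8 is not a valid UTF-8 sequence, so a UTF-8 decoder cannot turn those bytes into the characters the sender intended, while a single-byte encoding maps every byte to exactly one character and keeps Substring offsets aligned with byte offsets.

```csharp
using System;
using System.Text;

public static class EncodingDemo
{
    // Hypothetical sample record: "AB", the two problem bytes, then "CD".
    public static readonly byte[] Sample = { 0x41, 0x42, 0xCC, 0xC8, 0x43, 0x44 };

    // Decodes the sample record with the given encoding.
    public static string Decode(Encoding encoding)
    {
        return encoding.GetString(Sample);
    }

    public static void Main()
    {
        // Assumption: ISO-8859-1 stands in for the file's real code page.
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        string singleByte = Decode(latin1);

        // One character per byte, so the field after the odd bytes is where
        // the record layout says it is.
        Console.WriteLine("Latin-1 length: " + singleByte.Length);
        Console.WriteLine("Latin-1 field:  " + singleByte.Substring(4, 2));

        // The default UTF-8 decoder does not throw on invalid bytes; what it
        // does with them (drop or replace) depends on the framework version.
        string utf8 = Decode(Encoding.UTF8);
        Console.WriteLine("UTF-8 length:   " + utf8.Length);

        // A strict UTF-8 decoder (throwOnInvalidBytes = true) reports the
        // problem instead of hiding it.
        try
        {
            new UTF8Encoding(false, true).GetString(Sample);
        }
        catch (DecoderFallbackException ex)
        {
            Console.WriteLine("Strict UTF-8 rejected the data: " + ex.Message);
        }
    }
}
```

Constructing the reader with a strict UTF8Encoding is one way to make this kind of mismatch fail loudly during testing instead of silently shifting fields.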
 
Bill

Jon Skeet said:
Ah - and that's almost certainly just because you haven't given it the
right encoding. You haven't specified an encoding, so it's using UTF-8,
which I don't believe is what you really wanted.

See http://www.pobox.com/~skeet/csharp/unicode.html for more
information about encodings and Unicode.

You probably want to use Encoding.Default in this case.

Yes... Encoding.Default solved the problem.

The following modified version of my original test example works:

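using System;
using System.IO;
using System.Text;

class Class1
{
    static void Main(string[] args)
    {
        string str;

        // No encoding is given for the writer, so StreamWriter writes UTF-8.
        using (StreamWriter sro = new StreamWriter("ProbDataOut.txt"))
        {
            // On the .NET Framework, Encoding.Default is the system ANSI code page,
            // so the hex CC and C8 bytes each decode to one character.
            using (StreamReader sri = new StreamReader("ProbDataIn.txt", Encoding.Default))
            {
                Console.WriteLine("---- TESTING FILE WITH PROBLEM CHARACTERS HEX 'CC' and 'C8' ----");
                while ((str = sri.ReadLine()) != null)
                {
                    Console.WriteLine("Length: " + str.Length);
                    // Field under test: one character at zero-based offset 184.
                    Console.WriteLine("Output: " + str.Substring(184, 1));
                    sro.WriteLine(str);
                }
            }
        }
    }
}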
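
A possible follow-up, offered as a sketch rather than as part of Bill's fix: the StreamWriter in the example above is constructed without an encoding, so it writes UTF-8 and the copy will not contain the original CC and C8 bytes. If the output has to match the input byte for byte, the same encoding can be passed to both the reader and the writer. The names RoundTripDemo, ExtractField and CopyLines are hypothetical, the temporary files hold made-up data, and ISO-8859-1 is again an assumed stand-in for the real code page.

```csharp
using System;
using System.IO;
using System.Text;

public static class RoundTripDemo
{
    // Returns the field at [start, start + length), or null when the line is
    // too short, instead of letting Substring throw.
    public static string ExtractField(string line, int start, int length)
    {
        if (start < 0 || length < 0 || line.Length < start + length)
        {
            return null;
        }
        return line.Substring(start, length);
    }

    // Copies a text file line by line, decoding and encoding with the same
    // encoding so characters map back to their original bytes.
    public static void CopyLines(string inputPath, string outputPath, Encoding encoding)
    {
        using (StreamReader reader = new StreamReader(inputPath, encoding))
        using (StreamWriter writer = new StreamWriter(outputPath, false, encoding))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                // WriteLine appends Environment.NewLine, so line endings may
                // differ from the input file.
                writer.WriteLine(line);
            }
        }
    }

    public static void Main()
    {
        // Assumption: ISO-8859-1 stands in for the file's real code page.
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");

        string input = Path.GetTempFileName();
        string output = Path.GetTempFileName();

        // Made-up record: 'A', the two problem bytes, 'B', then a line feed.
        File.WriteAllBytes(input, new byte[] { 0x41, 0xCC, 0xC8, 0x42, 0x0A });

        CopyLines(input, output, latin1);

        // Hex dump of the copy, for comparison with the input bytes.
        Console.WriteLine(BitConverter.ToString(File.ReadAllBytes(output)));

        File.Delete(input);
        File.Delete(output);
    }
}
```

ExtractField is only there to show one way of guarding a fixed offset such as Substring(184, 1) against short lines.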
 
