String.Substring result when the string contains non-alphanumeric characters?

Bill

I just ran into a situation where string data from a mainframe contained a
couple of non-alphanumeric characters (hex CC and C8). I was parsing a field
that occurred after these unexpected characters and it appears the Substring
method was thrown off and returned a field two bytes off.

Does this data cause a problem with the Substring method?
 
Could you post a short but complete program which demonstrates the
problem?

See http://www.pobox.com/~skeet/csharp/complete.html for details of
what I mean by that.
 
Bill said:
This test now appears to be an issue with the StreamReader dropping these
unexpected characters.

Ah - and that's almost certainly just because you haven't given it the
right encoding. You haven't specified an encoding, so it's using UTF-8,
which I don't believe is what you really wanted.

See http://www.pobox.com/~skeet/csharp/unicode.html for more
information about encodings and Unicode.

You probably want to use Encoding.Default in this case.
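To illustrate why the encoding matters, here is a minimal sketch that decodes the same four bytes through a StreamReader with two different encodings. The sample bytes, the `EncodingDemo` class and the `DecodedLength` helper are made up for this example, and ISO-8859-1 is used only as a stand-in for a single-byte code page (on .NET Framework, Encoding.Default is the system ANSI code page, but on .NET Core and later it is UTF-8, so naming the encoding explicitly may be safer there). The bytes CC and C8 are not a valid UTF-8 sequence, so a UTF-8 reader cannot return the original characters; how it handles them (dropping or replacing) depends on the runtime version.

```csharp
using System;
using System.IO;
using System.Text;

class EncodingDemo
{
    // Hypothetical helper: reads one line from the bytes using the given
    // encoding and returns the number of chars in the decoded string.
    public static int DecodedLength(byte[] data, Encoding enc)
    {
        using (var reader = new StreamReader(new MemoryStream(data), enc))
        {
            string line = reader.ReadLine();
            return line.Length;
        }
    }

    static void Main()
    {
        // Made-up record: 'A', the two bytes from the thread (0xCC, 0xC8), then 'B'.
        byte[] data = { 0x41, 0xCC, 0xC8, 0x42 };

        // ISO-8859-1 maps every byte to exactly one char, so character
        // offsets passed to Substring line up with byte offsets in the file.
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");

        Console.WriteLine("utf-8:      " + DecodedLength(data, Encoding.UTF8));
        Console.WriteLine("iso-8859-1: " + DecodedLength(data, latin1)); // 4 chars for 4 bytes
    }
}
```

With a single-byte encoding the decoded string always has one char per input byte, which is what fixed-offset Substring parsing relies on.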
 
Yes... Encoding.Default solved the problem.

The following modification to my original test example works:

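using System;
using System.IO;
using System.Text;

class Class1
{
    static void Main(string[] args)
    {
        string str;

        // Note: no encoding is given for the writer, so the output file is written as UTF-8.
        using (StreamWriter sro = new StreamWriter("ProbDataOut.txt"))
        {
            // Encoding.Default is the system's ANSI code page on .NET Framework,
            // so each input byte is decoded to exactly one character.
            using (StreamReader sri = new StreamReader("ProbDataIn.txt",
                Encoding.Default))
            {
                Console.WriteLine("---- TESTING FILE WITH PROBLEM CHARACTERS HEX 'CC' and 'C8' ----");
                while ((str = sri.ReadLine()) != null)
                {
                    Console.WriteLine("Length: " + str.Length);
                    // Field at offset 184; throws if the line has fewer than 185 characters.
                    Console.WriteLine("Output: " + str.Substring(184, 1));
                    sro.WriteLine(str);
                }
            }
        }
    }
}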
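One way to double-check that the unexpected characters actually survive the read is to print the numeric value of each char in a line. This is only a sketch: the `CodePointDump` class and `ToHex` helper are hypothetical names, and the sample string stands in for a line read from the file. It assumes the bytes were decoded with a single-byte encoding that maps CC and C8 to U+00CC and U+00C8, as ISO-8859-1 and Windows-1252 do.

```csharp
using System;
using System.Text;

static class CodePointDump
{
    // Hypothetical helper: formats each char of s as a 4-digit hex UTF-16 code unit.
    public static string ToHex(string s)
    {
        var sb = new StringBuilder();
        foreach (char c in s)
        {
            if (sb.Length > 0) sb.Append(' ');
            sb.Append(((int)c).ToString("X4"));
        }
        return sb.ToString();
    }

    static void Main()
    {
        // Stand-in for a decoded line containing the two problem characters.
        Console.WriteLine(ToHex("A\u00CC\u00C8B")); // prints: 0041 00CC 00C8 0042
    }
}
```

If the dump of a real line shows other values or fewer chars than expected at those positions, the reader's encoding is the first thing to check.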
 