Chr(0) in .NET causes string to return blank

Chad Dalton

This code worked perfectly fine in VB6 for converting a decimal length to
binary hex, but in .NET it only works when the iNum variable doesn't
contain a zero. If iNum is zero then I lose everything in the
string. Does anyone know how to make the Chr(0) function work exactly
as it did in VB6? If you pass 342,2 as the arguments then everything
works fine, but if you pass 54,2 then it returns blank. Any help would
be greatly appreciated!

' Requires "Imports System.Text" at the top of the file for StringBuilder.
Public Function SetBinaryLengthofMsg(ByRef iLen As Integer, ByRef iPosLen As Integer) As String

    ' iLen    = decimal length value to be converted to binary
    ' iPosLen = number of bytes in which to store binary length value (max bytes = 3)

    Dim iNum As Integer
    Dim iLoop As Integer
    Dim lTmpLenVal As Integer
    Dim sb As New StringBuilder

    '---set temporary decreasing value to length value
    lTmpLenVal = iLen

    '---loop backwards breaking out length value into bytes
    For iLoop = iPosLen To 1 Step -1

        Select Case iLoop
            Case 3
                iNum = lTmpLenVal \ 65536
                sb.Append(Chr(iNum))
                lTmpLenVal = lTmpLenVal Mod 65536
            Case 2
                iNum = lTmpLenVal \ 256
                sb.Append(Chr(iNum))
                lTmpLenVal = lTmpLenVal Mod 256
            Case 1
                iNum = lTmpLenVal
                sb.Append(Chr(iNum))
        End Select

    Next iLoop

    '---return length value in binary hex
    SetBinaryLengthofMsg = sb.ToString()

End Function
 
Jon Skeet [C# MVP]

Chad Dalton said:
The code worked perfectly fine in VB6 for converting a decimal length to
binary hex, but in .NET it only works when the iNum variable doesn't
contain a zero. If zero is passed in then I lose everything in the
string.

My guess is that you're actually creating the string properly, but it
isn't displaying in the debugger.

However, this is a really bad thing to be doing. Character data *isn't*
binary data, and shouldn't be treated as such. Instead of returning a
string, you should be returning a byte array, and then the problem
doesn't come up in the first place.
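
As a rough illustration of the byte-array approach, here is a minimal sketch in C (the language of the other low-level snippet in this thread). The function name set_binary_length and its error convention are hypothetical, not part of the original code. It writes the length most significant byte first, as the original function does, and a zero byte is stored like any other value.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write `len` into `out` as `n_bytes` bytes,
   most significant byte first (n_bytes must be 1, 2 or 3, matching
   the "max bytes = 3" limit of the original function).
   Returns 0 on success, -1 if the arguments are out of range. */
int set_binary_length(uint32_t len, size_t n_bytes, uint8_t *out)
{
    if (out == NULL || n_bytes < 1 || n_bytes > 3)
        return -1;
    if (len >= (1u << (8 * n_bytes)))   /* value does not fit */
        return -1;

    for (size_t i = 0; i < n_bytes; i++) {
        size_t shift = 8 * (n_bytes - 1 - i);
        out[i] = (uint8_t)((len >> shift) & 0xFFu);
    }
    return 0;
}
```

With the arguments from the question, 342,2 would give the bytes 0x01 0x56 and 54,2 would give 0x00 0x36. The leading zero is just another element of the array.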
 
Guest

Chad,

What you are running into here is the zero-terminated-string problem. This is not something you would ever have encountered in VB6, but it is a real issue in .NET that you have to pay attention to. Let me explain:

In most languages, APIs, etc., the end of a string is marked by a null (zero) character. For example, in C, when you allocate an array of chars (which is a string in C) you can put any value into that array:

char ch[5];
ch[0]='A';
ch[1]='B';
ch[2]='C';
ch[3]=0;
ch[4]='D';

Now, for the above code, in VB6 if you were to inspect the string you would see something like "ABC0D". In C, however, if you were to pass a pointer to the above array to a psz string function, the function would only see the string as "ABC". That is because in most lower-level languages the end of a string is represented by a NULL terminator (zero). They do this because it is annoying to have to tell every function you pass a char* to what the length of your string is.

See, there is a huge difference between a byte with a value of 0, and an ASCII "0". The actual value in the first case is 0x0 (decimal 0), the actual value in the second is 0x30 (decimal 48).
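
The C behaviour described above can be checked with a small sketch. The two function names are made up for illustration. The array holds the same five bytes as the earlier snippet.

```c
#include <stddef.h>
#include <string.h>

/* The same five bytes as the array above. */
static const char ch[5] = { 'A', 'B', 'C', 0, 'D' };

/* What a zero-terminated string function sees: strlen stops at ch[3]. */
size_t length_seen_by_c_string_functions(void)
{
    return strlen(ch);
}

/* What is actually stored: all five bytes, including the zero and 'D'. */
size_t bytes_actually_stored(void)
{
    return sizeof ch;
}
```

The first function returns 3 and the second returns 5, so the 'D' is still in memory even though C string functions never reach it. Note also that ch[3] holds the value 0, not the character '0' (0x30).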

This same issue exists in .NET.

To test it, do this:

dim strTemp as string

strTemp = "Hello World!" & chr(0) & " My name is Zaphod!"

If you look at this string in the debugger, you will not see what you would expect. All you will see is "Hello World!

Interestingly, please note that in the debugger, the trailing quotation mark is missing. This is one way that you can tell you have a zero-terminated string problem. In the command prompt, type ?strTemp. Again, you will see Hello World! but will not see the name of the much maligned Zaphod.

What is happening here is that, at the point where you insert the chr(0), you are indicating to the string manipulation routines of the CLR that the string has ended, even though you have more data allocated for the string object. (BTW: there may be some possible buffer overrun exploit scenarios that arise from this. I don't know; I haven't checked.)

To get around this, if you need to use an actual chr(0) (a real, true zero-valued byte), you are going to have to use a byte array, not a string.

There is absolutely no way around it.

If you are writing a communication protocol, for example for raw IP or RS232, that uses zeros, you are stuck with byte arrays.
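
For the receiving side of such a protocol, a minimal sketch in C might look like the following. The name read_binary_length is hypothetical, and the most-significant-byte-first order is an assumption carried over from the function in the original question.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read an `n_bytes`-long length prefix from the
   start of a received buffer, most significant byte first.
   Zero bytes are treated like any other value. */
uint32_t read_binary_length(const uint8_t *buf, size_t n_bytes)
{
    uint32_t len = 0;
    for (size_t i = 0; i < n_bytes; i++)
        len = (len << 8) | buf[i];
    return len;
}
```

A two-byte prefix of 0x00 0x36 reads back as 54, because the function works from an explicit byte count and never looks for a terminator.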
 
Jon Skeet [C# MVP]

Clayton Gulick said:
What is happening here, is that at the point where you insert the
chr(0) you are indicating to the string manipulation routines of the
CLR that the string has ended (BTW: there may be some possible buffer
overrun exploit scenarios that arise from this, I don't know I haven't
checked) even though you have more data allocated for the string
object.

No you haven't. The CLR won't get confused by null characters in
strings. Unmanaged code might, and the debugger might, but that's a
different matter.
 
Guest

I.e., the problem is not just with the debugger... the string itself gets corrupted and truncated at the point that you insert a null 0 into it... I know this for sure because it bit me in the behind when I was parsing raw bytes from IP... had the same problem Chad is having... took a lot of research to discover what was happening.

Due to the fact that the problem is happening, I think it is fair to say that... the problem is happening.
 
Jon Skeet [C# MVP]

Clayton Gulick said:
I.E. the problem is not just with the debugger... the string itself
gets corrupted and truncated at the point that you insert a null 0
into it...

No it doesn't. Just because the debugger is confused doesn't mean the
CLR is.

Clayton Gulick said:
I know this for sure because it bit me in the behind when I
was parsing raw bytes from IP... had the same problem Chad is
having... took a lot of research to discover what was happening.

It sounds like you've come to the wrong conclusions though.

Clayton Gulick said:
Due to the fact that the problem is happening, I think it is fair to
say that... the problem is happening.

Please give an example which *doesn't* rely on the debugger or any
other unmanaged code (message boxes and many other GUI components are
just wrappers round unmanaged code). Here's an example which shows that
the string is *not* truncated:

using System;

class Test
{
    static void Main()
    {
        string x = "Hello\0world";
        Console.WriteLine(x.Length);
        Console.WriteLine(x);
        foreach (char c in x)
        {
            Console.WriteLine((int)c);
        }
    }
}

There's no "corruption" going on. There's just data which some unmanaged
code (the message box) doesn't understand properly. If the data were
corrupted, it wouldn't all be printed out absolutely correctly, would
it? You could try some other things too, such as writing the string to
a stream (with a StreamWriter) - no problem.

Fundamentally a string is a sequence of characters, and it doesn't
matter (to the string) what those characters are. It may well matter to
other things such as the debugger and message boxes, but that's not the
same thing as the CLR itself having any problem with it.
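
One way to picture the difference is a string type that carries its own length, as a .NET string does, instead of relying on a terminator. The C sketch below is only an analogy with hypothetical names (counted_string, count_zero_chars). It is not a description of how the CLR is implemented.

```c
#include <stddef.h>

/* Hypothetical counted-string type: the length travels with the data,
   so no character value has a special meaning. */
struct counted_string {
    size_t len;
    const char *data;
};

/* Count the zero-valued characters in the string, walking all `len`
   characters rather than stopping at the first zero. */
size_t count_zero_chars(struct counted_string s)
{
    size_t zeros = 0;
    for (size_t i = 0; i < s.len; i++) {
        if (s.data[i] == '\0')
            zeros++;
    }
    return zeros;
}
```

For the eleven characters of "Hello\0world", this finds one zero character at index 5 and still reaches the final 'd'. Code that trusts the stored length sees the whole string, while code that scans for a terminator stops after "Hello".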

Having said all that, and as I said originally, you shouldn't try to
use a character data type such as string for something which is
fundamentally binary data.
 
