string / char[] question

Zach

Hello,

This might be a rather basic question, but I've tried a few things and
I can't really find a solution as elegant as what I'd like for this
problem. The situation is this - I have a file that's written to disk
in a binary format. Basically it's a bunch of records, one after the
other, where each record has the following format : 18 bytes, 6 bytes
which are all zero, 4 bytes which represent a UInt32. The first 18
bytes of each record are actually a non-unicode string with maximum
length 18. If my string is "hello", the bytes are 'h', 'e', 'l', 'l',
'o', followed by 13 zeros. If my string is abcdefghijklmnopqr, then
all 18 bytes of this field are used.

So now that I've explained the structure, the problem I'm encountering
is when I want to read this string field. I'm using a
System.IO.BinaryReader to read the file. I created it with
Encoding.ASCII and then called ReadChars(18), which returns a char[]
that is 18 characters long. But then how do I get this into a
System.String? I tried the string constructor that takes a char[]
parameter, but it doesn't seem to acknowledge the nulls that
are padded into the end of the array, and even if it did it wouldn't
matter since I need to handle the case where all 18 chars contain a
value. The code I'm using now is as follows:

char[] NameChars = Reader.ReadChars(18);
int i = 0;
// Find the first null; i ends up as the length of the real text.
for (i = 0; i < 18; ++i)
    if (NameChars[i] == '\0') break;
string Name = new string(NameChars, 0, i);

Reader.ReadBytes(6); //Skip the padding.

System.UInt32 Code = Reader.ReadUInt32();

That loop I coded in there just seems "bad", and I'd much prefer if
there was a way to accomplish the same thing using only standard
framework calls.

On a related but slightly different topic, is it correct behavior of
the string::string(char[]) constructor that it simply ignores any nulls
embedded in the array, and creates a string with embedded nulls? I
wasn't aware that .NET strings even supported embedded nulls at all,
although this could be due to my lack of experience with .NET

Thanks
 
Lau Lei Cheong

I think it's normal behaviour.

Because in C/C++ strings are usually null-terminated, it's understandable
to expect the string constructor to see the "null" character as the
end-of-string marker.
 
Zach

Actually, it seems to me that the string constructor does NOT see the
null character as an end-of-string marker. Example:

char[] c = new char[] {'a', 'a', 'a', 'a', 'a', '\0', '\0'};
string s = new string(c);

This code creates a string of length 7. Shouldn't it create a string
of length 5?
 
Mattias Sjögren

So now that I've explained the structure, the problem I'm encountering
is when I want to read this string field. I'm using a
System.IO.BinaryReader to read the file. I created it with
Encoding.ASCII and then called ReadChars(18), which returns me a char[]
that is 18 characters long. But then how to get this into a
System.String?

The easiest way would be to read the raw bytes, call Encoding.GetString
on them directly, and skip the ReadChars call.
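
A minimal sketch of that idea, assuming the 18-byte, zero-padded ASCII field described in the question. The class and method names (FixedFieldReader, ReadName) are made up for illustration. It reads the field as bytes, finds the first zero byte with Array.IndexOf, and decodes only the bytes before it, so a name that fills all 18 bytes is handled too:

```csharp
using System;
using System.IO;
using System.Text;

public static class FixedFieldReader
{
    // Hypothetical helper: reads the 18-byte name field of one record
    // and returns the text that precedes the first zero byte.
    public static string ReadName(BinaryReader reader)
    {
        const int FieldSize = 18;

        byte[] nameBytes = reader.ReadBytes(FieldSize);
        if (nameBytes.Length < FieldSize)
            throw new EndOfStreamException("Truncated name field.");

        // Array.IndexOf returns -1 when no zero byte is present,
        // which means the name uses the whole field.
        int end = Array.IndexOf(nameBytes, (byte)0);
        int length = end < 0 ? nameBytes.Length : end;

        return Encoding.ASCII.GetString(nameBytes, 0, length);
    }
}
```

The rest of the record would then be read as in the original code: Reader.ReadBytes(6) to skip the padding, followed by Reader.ReadUInt32().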

On a related but slightly different topic, is it correct behavior of
the string::string(char[]) constructor that it simply ignores any nulls
embedded in the array, and creates a string with embedded nulls?

Yes, embedded nulls are allowed in managed strings since the length is
stored separately. If you don't want them, you can remove trailing
nulls with String.TrimEnd('\0').
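
As a rough illustration (the NullPadding class and its method names are hypothetical), the sketch below shows two ways to drop the padding from an already-built string. TrimEnd('\0') removes only trailing nulls, while cutting at the first null with IndexOf also discards anything that follows it, which may matter if the padding is not guaranteed to be all zeros:

```csharp
using System;

public static class NullPadding
{
    // Removes trailing nulls only; nulls in the middle are kept.
    public static string TrimTrailingNulls(string s)
    {
        return s.TrimEnd('\0');
    }

    // Keeps only the text before the first null, C-string style.
    public static string UpToFirstNull(string s)
    {
        int end = s.IndexOf('\0');
        return end < 0 ? s : s.Substring(0, end);
    }
}
```

With the seven-element array from earlier in the thread, new string(c).Length is 7, and both helpers return the five-character string "aaaaa". For an input such as "ab\0cd\0" they differ: TrimTrailingNulls keeps the embedded null and returns five characters, while UpToFirstNull returns "ab".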


Mattias
 
