unsafe code and pointers

Tony Johansson

Hello!

In this piece of code both bytepek and shortpek work as expected, but
longpek and doublepek give unexpected results. Can somebody explain why
doublepek and longpek don't work the same way as bytepek and shortpek?

static unsafe void Main(string[] args)
{
    int x = 10;

    int* pX = &x;

    byte* bytepek = (byte*)pX; //works
    short* shortpek = (short*)pX; //works
    long* longpek = (long*)pX; //does not work
    double* doublepek = (double*)pX; //does not work

    Console.WriteLine("Address of x is 0x{0:X}, size is {1}, value is {2}",
        (uint)&x, sizeof(int), x);
    Console.WriteLine("Address of is 0x{0:X}, size is {1}, value is 0x{2:X}",
        (uint)&bytepek, sizeof(byte*), (uint)bytepek);

    Console.WriteLine("Address of is 0x{0:X}, size is {1}, value is 0x{2:X}",
        (uint)&shortpek, sizeof(short*), (uint)shortpek);

    Console.WriteLine("Address of is 0x{0:X}, size is {1}, value is 0x{2:X}",
        (uint)&longpek, sizeof(long*), (uint)longpek);

    Console.WriteLine("Address of is 0x{0:X}, size is {1}, value is 0x{2:X}",
        (uint)&doublepek, sizeof(double*), (uint)doublepek);

    Console.WriteLine(*bytepek);
    Console.WriteLine(*shortpek);
    Console.WriteLine(*doublepek);
    Console.WriteLine(*longpek);
}

//Tony
 
Jeff Johnson

In this piece of code both bytepek and shortpek work as expected, but
longpek and doublepek give unexpected results. Can somebody explain why
doublepek and longpek don't work the same way as bytepek and shortpek?

static unsafe void Main(string[] args)
{
    int x = 10;

    int* pX = &x;

    byte* bytepek = (byte*)pX; //works
    short* shortpek = (short*)pX; //works
    long* longpek = (long*)pX; //does not work
    double* doublepek = (double*)pX; //does not work

(It's kind of scary that someone even HAS to explain this to you given how
much you already know about C#, but I guess we all have to learn this for
the first time eventually.)

Longs and doubles occupy more bytes in memory than an int. An int is
contained in 4 bytes. In a little-endian system, your value of 10 will be
represented by these bytes (hex notation):

0A 00 00 00

When you point bytepek at the memory location occupied by x then bytepek
will see:

0A

When you point shortpek at the memory location occupied by x then shortpek
will see:

0A 00

Longs and doubles are stored in 8 bytes, so when you point those variables
at the memory location occupied by x then they will see:

0A 00 00 00 <plus 4 more bytes which could be ANYTHING>

In longpek's case, IF those next 4 bytes JUST HAPPEN TO BE 0's then longpek
will appear to hold 10, but it's not likely that will happen. Since doubles
use IEEE floating point layout, doublepek will never hold 10.
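
To make the "could be ANYTHING" part concrete, here is a minimal sketch that controls all 8 bytes instead of reading past x. The buffer, the filler value 0x11223344, and the class name are made up for illustration. It assumes a little-endian machine and a project compiled with unsafe code enabled.

```csharp
using System;

static class ReinterpretDemo
{
    static unsafe void Main()
    {
        // Eight bytes we fully control, so the 8-byte reads stay inside our own memory.
        byte* buffer = stackalloc byte[8];
        *(int*)buffer = 10;                 // 0A 00 00 00 on a little-endian machine
        *(int*)(buffer + 4) = 0x11223344;   // stand-in for "whatever follows x"

        Console.WriteLine(*buffer);                      // 10 (sees 0A)
        Console.WriteLine(*(short*)buffer);              // 10 (sees 0A 00)
        Console.WriteLine("0x{0:X16}", *(long*)buffer);  // 0x112233440000000A
        Console.WriteLine(*(double*)buffer);             // same 8 bytes read as IEEE 754, not 10
    }
}
```

In the original code the 4 bytes after x are not under your control, so the long and double values can differ from run to run.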
 
Jeff Johnson

Console.WriteLine("Address of is 0x{0:X}, size is {1}, value is 0x{2:X}",
(uint)&bytepek, sizeof(byte*), (uint)bytepek);

I just noticed something else that I think you're confused about. All
pointers are the same size. A pointer is simply a 32-bit integer (or 64-bit
in a 64-bit process) that holds a memory address. So there is no difference
in size between a pointer to a byte and a pointer to a long. The only size
difference is in the data stored at the memory location.

A byte pointer is 4 bytes long (8 in a 64-bit process). When it is
dereferenced, the underlying code will only look at the first byte at the
given memory address.

A long pointer is also 4 bytes long (8 in a 64-bit process). When it is
dereferenced, the underlying code will look at the 8 bytes starting at the
given memory address.

So all your sizeof(xxx*) calls will return exactly the same value regardless
of the data type of xxx.
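
A small sketch of this, assuming unsafe code is enabled. The class name and the scratch buffer are only for illustration. It compares the pointer sizes against IntPtr.Size rather than assuming 4 or 8, then uses pointer arithmetic to show where the pointed-to type does matter.

```csharp
using System;

static class PointerSizeDemo
{
    static unsafe void Main()
    {
        // Pointer size depends on the process (4 or 8 bytes), not on the pointed-to type.
        Console.WriteLine(sizeof(byte*) == IntPtr.Size);   // True
        Console.WriteLine(sizeof(long*) == IntPtr.Size);   // True
        Console.WriteLine(sizeof(double*) == IntPtr.Size); // True

        // The pointed-to type decides how far "+ 1" moves the address.
        byte* start = stackalloc byte[16];
        Console.WriteLine((byte*)((short*)start + 1) - start); // 2
        Console.WriteLine((byte*)((long*)start + 1) - start);  // 8
    }
}
```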
 
Tony Johansson

Jeff Johnson said:
I just noticed something else that I think you're confused about. All
pointers are the same size. A pointer is simply a 32-bit integer (or
64-bit in a 64-bit process) that holds a memory address. So there is no
difference in size between a pointer to a byte and a pointer to a long.
The only size difference is in the data stored at the memory location.

A byte pointer is 4 bytes long (8 in a 64-bit process). When it is
dereferenced, the underlying code will only look at the first byte at the
given memory address.

A long pointer is also 4 bytes long (8 in a 64-bit process). When it is
dereferenced, the underlying code will look at the 8 bytes starting at the
given memory address.

So all your sizeof(xxx*) calls will return exactly the same value
regardless of the data type of xxx.

Here I have a pointer to a double, and I also use a pointer to long to point
to the same double. At marks 1 and 2 it writes out 8, so both types use 8
bytes.
When I run this I get the wrong result when the long pointer is pointing to
the double.
I assume the reason is that a double is not stored as a whole number the way
a long is. Is that why I get the wrong result?

static unsafe void Main(string[] args)
{
    double z = 5;
    double* pZ = &z;
    Console.WriteLine(sizeof(long)); // (1) Here it displays 8
    Console.WriteLine(sizeof(double)); // (2) Here it displays 8
    long* longpek = (long*)pZ;
    double* doublepek = (double*)pZ;
    Console.WriteLine(*doublepek);
    Console.WriteLine(*longpek);
}
//Tony
 
Tony Johansson

Peter Duniho said:
You'll have to define "wrong result".

For sure, the bytes that compose a "long" value in memory use a completely
different format than those that compose a "double". A "long" is a plain
64-bit integer (in C#), while a "double" is a 64-bit IEEE floating point
format.

On the other hand, they are both just 8 bytes. You can interpret those 8
bytes however you like, and you'll get _something_. But just because they
both take 8 bytes to represent, that doesn't mean that the same 8 bytes
mean the same thing in either format.

The code you posted reinterprets the 8 bytes that are formatted to
represent, in the IEEE floating point format, the number 5. Instead of
interpreting them only in the format that they are intended to be, it also
reinterprets them as a plain integer.

The result is "correct" in the sense that you get to see the same 8 bytes
interpreted in two different ways. Both interpretations are potentially
correct, but of course only the floating point interpretation will match
the initialization of the variable, since that's the format that was used
to initialize the variable.
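
As a rough illustration of the two interpretations, the sketch below reads the bytes of the double 5 as a long and compares that with an ordinary numeric conversion. The class and variable names are invented for the example. BitConverter.DoubleToInt64Bits is used only as a cross-check on the pointer cast, and unsafe code is assumed to be enabled.

```csharp
using System;

static class DoubleBitsDemo
{
    static unsafe void Main()
    {
        double z = 5;

        // Reinterpretation: same 8 bytes, read as a plain 64-bit integer.
        long viaPointer = *(long*)&z;
        long viaBitConverter = BitConverter.DoubleToInt64Bits(z);

        Console.WriteLine("0x{0:X16}", viaPointer);        // 0x4014000000000000
        Console.WriteLine(viaPointer);                     // 4617315517961601024
        Console.WriteLine(viaPointer == viaBitConverter);  // True

        // Conversion: the value 5 is converted, producing different bytes.
        Console.WriteLine((long)z);                        // 5
    }
}
```

The large number is what *longpek prints in the posted code: the IEEE 754 bit pattern for 5.0 read as an integer.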

It is at this point that I believe you would be best served by halting
your C# work for a few days, and spending some time learning about
numerical data formats.

It is VERY important as a programmer to understand that as far as the
computer is concerned, _all_ of your data is just a bunch of bits,
typically organized in groups of 8 (i.e. bytes). The data has no meaning
whatsoever until you interpret it specifically in some format, and the
resulting meaning will depend on how _you_ the programmer tell the
computer to interpret it.
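
A short sketch of the same idea without unsafe code: four arbitrary bytes, chosen here purely for illustration, are read as text, as an integer, and as a float. The class name is hypothetical, and the integer shown assumes a little-endian machine.

```csharp
using System;
using System.Text;

static class SameBytesDemo
{
    static void Main()
    {
        byte[] bytes = { 0x41, 0x42, 0x43, 0x44 };

        // Same four bytes, three different interpretations.
        Console.WriteLine(Encoding.ASCII.GetString(bytes));  // ABCD
        Console.WriteLine(BitConverter.ToInt32(bytes, 0));   // 1145258561 on little-endian
        Console.WriteLine(BitConverter.ToSingle(bytes, 0));  // some unrelated float value
    }
}
```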

It's up to the programmer to tell the computer the correct interpretation,
and if you lie to the computer (as in your code example), you will get
some result that is not what one might expect or intend.

Until you understand the various formats for numerical data, and in
particular the most common ones used in a binary computer (i.e. based on a
sequence of bits, grouped as bytes), all of this will remain a great
puzzle to you. Once you learn that crucial, foundational piece of
information that all competent programmers already understand, everything
will be painfully obvious to you. The difference is like night and day.

Alternatively, just don't ever use unsafe code in C#, nor classes like
BitConverter, BinaryReader, or BinaryWriter (which all provide similar
byte-level interpretations of higher-level data formats).

Pete

I was quite sure it was as you have explained, but I asked the question to
be 100% sure.

//Tony
 
