Struct inside class

Sneil

Example:

namespace _111_
{
    public struct S
    {
        public int i1, i2;
    }

    public class C
    {
        public S s;
    }

    class Program
    {
        static void Main(string[] args)
        {
            C myClass = new C();
            myClass.s.i1 = 999;
            myClass.s.i2 = 888;
            // At this point some memory must have been assigned for
            // myClass.s.i1 and myClass.s.i2.
            // Question: where was this memory taken from? The stack or the heap?
        }
    }
}

I'm just not sure...
 
Barry Kelly

Sneil said:
Example:

namespace _111_
{
    public struct S
    {
        public int i1, i2;
    }

    public class C
    {
        public S s;
    }

    class Program
    {
        static void Main(string[] args)
        {
            C myClass = new C();

            // [Barry] At this point, memory is allocated for the struct s
            // inside the new instance of C.
            myClass.s.i1 = 999;
            myClass.s.i2 = 888;
            // At this point some memory must have been assigned for
            // myClass.s.i1 and myClass.s.i2.
            // Question: where was this memory taken from? The stack or the heap?
        }
    }
}

The memory for the struct s in this example is part of (i.e. fully
contained within) the heap-allocated object myClass.


-- Barry
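
A minimal sketch of the point above, reusing S and C from the original example. The class EmbeddedStructDemo and its two helper methods are hypothetical names added only for illustration. It does not prove where the memory lives, but it shows the observable consequence: the struct is part of the C instance, so a write made through the object reference is visible to the caller, while a struct passed by value is an independent copy.

```csharp
using System;

public struct S
{
    public int i1, i2;
}

public class C
{
    public S s;
}

// Hypothetical demo class, not part of the original thread.
public static class EmbeddedStructDemo
{
    // Writes through the reference: the struct is stored inside the
    // C instance, so the caller observes the change.
    public static void SetViaObject(C c)
    {
        c.s.i1 = 1;
    }

    // Receives a copy of the struct by value; the caller's data is untouched.
    public static void SetViaCopy(S copy)
    {
        copy.i1 = 2;
    }

    public static void Main()
    {
        C myClass = new C();
        myClass.s.i1 = 999;

        SetViaCopy(myClass.s);
        Console.WriteLine(myClass.s.i1); // 999: only the copy was modified

        SetViaObject(myClass);
        Console.WriteLine(myClass.s.i1); // 1: the field inside the object changed
    }
}
```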
 
Sneil

Barry said:
The memory for the struct s in this example is part of (i.e. fully
contained within) the heap-allocated object myClass.

OK, I thought the same. But in other words, myClass.s.i1 is already on
the heap, yes? Now, what is boxing? Boxing creates a special wrapped
copy of a value type on the heap. In our case i1 is ALREADY on the heap. So...
object o = myClass.s.i1; // _not_ boxing here?
???
 
Göran Andersson

Sneil said:
OK, I thought the same. But in other words, myClass.s.i1 is already on
the heap, yes? Now, what is boxing? Boxing creates a special wrapped
copy of a value type on the heap. In our case i1 is ALREADY on the heap. So...
object o = myClass.s.i1; // _not_ boxing here?
???

Yes, it's boxed. You are not storing the myClass.s.i1 variable in the
object, you are storing a copy of the value of the myClass.s.i1 variable.
 
Jon Skeet [C# MVP]

Göran Andersson said:
Yes, it's boxed. You are not storing the myClass.s.i1 variable in the
object, you are storing a copy of the value of the myClass.s.i1 variable.

No, there's no boxing going on there. Boxing is creating a separate
object on the heap for a value type. That's not happening here.
 
Göran Andersson

Jon said:
No, there's no boxing going on there. Boxing is creating a separate
object on the heap for a value type. That's not happening here.

Yes, it is.

myClass.s.i1 is an integer variable. The value copied from that integer
variable can not be stored in the variable o, as it is an object
reference, so a separate object has to be created on the heap where the
value can be stored, and the reference to that object is stored in the
variable o.
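
A small sketch of the behaviour described above, assuming the S and C types from the original example. BoxingCopyDemo and BoxField are hypothetical names used only for illustration. Because the box holds a copy of the value, changing the field afterwards does not change what o refers to.

```csharp
using System;

public struct S { public int i1, i2; }
public class C { public S s; }

// Hypothetical demo class, not part of the original thread.
public static class BoxingCopyDemo
{
    // The implicit conversion from int to object boxes a copy of the value.
    public static object BoxField(C c)
    {
        return c.s.i1;
    }

    public static void Main()
    {
        C myClass = new C();
        myClass.s.i1 = 999;

        object o = BoxField(myClass); // separate heap object holding 999
        myClass.s.i1 = 5;             // modifies the field, not the box

        Console.WriteLine(o);            // 999
        Console.WriteLine(myClass.s.i1); // 5
    }
}
```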
 
Michael D. Ober

You're missing the point.

All instance data inside a class is stored on the heap, within the
memory allocation for that object. The compiler computes the instance
data layout, just as it would for the local variables in a procedure,
and then references all the instance data relative to the instance
base address. This is very simple to code in assembler (MOV AX,
WORD PTR [DX], for example). Boxing is a runtime feature and is used only
when an object's type is unknown at compile time. When you put a struct
inside a class, the compiler knows the types at compile time, except when
you declare a structure component as a generic "Object".

Here's a simple set of rules for how storage is allocated:

Reference types, including anything derived from "Object": a pointer to the
heap, held in a stack frame, in global compiler-allocated storage, or in
another object on the heap.

Value types, including "struct": stored on the stack, in global
compiler-allocated storage, or relative to the base address of an object on
the heap.

Where this gets confusing is when you use a value type as part of the
instance data of a reference type. In this case, the value type is stored
on the heap, but inside the allocated space for the reference type instance.
When you use a reference type as instance data of a reference type, all that
is allocated inside the containing reference type instance is a pointer to
the instance data. This complexity is why C and C++ programs tend to have
memory leaks and why .NET requires a good garbage collector.

Note that I referenced "global compiler allocated storage". This is the
memory allocated by the compiler for globally accessible public variables as
well as static (VB Shared) variables associated with objects. On x86
platforms, this memory is referenced relative to the "DS" register.

In your original example,

namespace _111_
{
    public struct S
    {
        public int i1, i2;
    }

    public class C
    {
        public S s;
    }

    class Program
    {
        static void Main(string[] args)
        {
            C myClass = new C();
            myClass.s.i1 = 999;
            myClass.s.i2 = 888;
            // At this point some memory must have been assigned for
            // myClass.s.i1 and myClass.s.i2.
            // Question: where was this memory taken from? The stack or the heap?
        }
    }
}

The class C is allocated from the heap. Since struct S is a value type, it
is stored entirely in the class. The named components of S are also value
types, so they are stored inside S:

Offset  Value (Code Reference)  Comments
0000    i1                      base of class C, start of struct S; int i1 is
                                laid down first by the compiler
0004    i2                      int i2
0008                            next object starts here
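
The field offsets in the table can be checked with a short sketch. This is only an approximation under stated assumptions: Marshal.OffsetOf and Marshal.SizeOf report the marshaled (unmanaged) layout of S, which for a simple struct of two ints is expected to match the field order shown, but they say nothing about the object header that the runtime adds to an instance of C. LayoutDemo and OffsetOfField are hypothetical names.

```csharp
using System;
using System.Runtime.InteropServices;

public struct S { public int i1, i2; }

// Hypothetical demo class, not part of the original thread.
public static class LayoutDemo
{
    // Returns the byte offset of a named field within the marshaled layout of S.
    public static int OffsetOfField(string fieldName)
    {
        return Marshal.OffsetOf(typeof(S), fieldName).ToInt32();
    }

    public static void Main()
    {
        Console.WriteLine(OffsetOfField("i1"));         // 0
        Console.WriteLine(OffsetOfField("i2"));         // 4
        Console.WriteLine(Marshal.SizeOf(typeof(S)));   // 8
    }
}
```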


In assembler, the allocation and the two assignments look roughly like this:

    ; myClass = allocate(C)
    MOV EAX, 8                  ; C is 8 bytes long; GC_ALLOCATE will return
                                ; the offset in EAX as well
    CALL GC_ALLOCATE            ; have the garbage collector allocate 8 bytes
                                ; of storage; the GC will add object overhead,
                                ; but at negative offsets from the returned
                                ; address
    MOV [ESP], EAX              ; the variable myClass is on the stack at
                                ; offset 0 relative to the current stack frame
    MOV EAX, [ESP]              ; myClass's address is actually stored on the
                                ; stack; this instruction can be optimized out
                                ; in this case
    MOV DWORD PTR [EAX], 999    ; myClass.s.i1 = 999
    MOV DWORD PTR [EAX+4], 888  ; myClass.s.i2 = 888

First - I don't guarantee the syntax (I haven't written in x86 assembler in
several years), but this is close enough for discussion.
Second - GC_ALLOCATE returns an offset into the heap. This is a "magic"
number that the memory management subsystem handles for you. It is relative
to the value in register DS, which is set at application startup by the
memory manager. As you should be able to see from the assembler code, there
is no "boxing" of the variables. Boxing requires a rather expensive call to
determine an object's actual data type.

All reference types are allocated in a manner similar to the above example.
Value types are allocated by the compiler and don't require the call to
GC_ALLOCATE at run time.

Note the magic occurring in "GC_ALLOCATE": it allocates memory from the
heap and returns an offset into the heap that later code then uses to
reference the memory. The actual object size will be 8 bytes plus the
garbage collector's management buffer. The GC management buffer will be at
negative offsets from the returned address, which makes the rest of the
compiler easier to write. If GC_ALLOCATE can't allocate the requested
memory, it calls GC_COLLECT, which compacts the accessible heap memory and
resets the allocation pointer for GC_ALLOCATE, which then tries again. If
GC_ALLOCATE still can't allocate memory, it asks the OS to extend the heap.
The OS returns the new heap size to GC_ALLOCATE. If GC_ALLOCATE still can't
allocate the requested memory, it throws an Out of Memory exception.

Mike Ober.
 
Göran Andersson

No, you are missing the point.

This is the code in question:

object o = myClass.s.i1;

From the previous discussion, we know that i1 is a member variable of a
struct in a class, so it's stored on the heap.

The question was if boxing occurs or not, considering that the i1
variable is stored on the heap.

The answer is that boxing does occur, because it's not the i1 variable
that is stored in the object, but the value from the i1 variable. It
doesn't matter where the i1 variable is stored.

 
Carl Daniel [VC++ MVP]

Göran Andersson said:
No, you are missing the point.

This is the code in question:

object o = myClass.s.i1;

From the previous discussion, we know that i1 is a member variable of
a struct in a class, so it's stored on the heap.

The question was if boxing occurs or not, considering that the i1
variable is stored on the heap.

The answer is that boxing does occur, because it's not the i1 variable
that is stored in the object, but the value from the i1 variable. It
doesn't matter where the i1 variable is stored.

Absolutely correct. o is a reference to a boxed int that contains the same
value as the variable myClass.s.i1.

The only way to not incur boxing in a case like this is to hold an "interior
pointer" to myClass.s.i1. An ordinary object reference is not an interior
pointer - it always references a complete object.
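
As a rough illustration of the "interior pointer" idea: C# 7.0 and later (which postdates this thread) offers ref locals, which alias a variable in place rather than copying its value. The sketch below assumes the S and C types from the original example; InteriorRefDemo and WriteThroughAlias are hypothetical names. It contrasts an alias to the field with a boxed copy of its value.

```csharp
using System;

public struct S { public int i1, i2; }
public class C { public S s; }

// Hypothetical demo class, not part of the original thread.
public static class InteriorRefDemo
{
    // Writes to the field through a ref local and returns the field's value.
    public static int WriteThroughAlias(C c, int value)
    {
        ref int alias = ref c.s.i1; // alias to the field itself; no box created
        alias = value;
        return c.s.i1;
    }

    public static void Main()
    {
        C myClass = new C();
        myClass.s.i1 = 999;

        Console.WriteLine(WriteThroughAlias(myClass, 123)); // 123

        object o = myClass.s.i1; // boxes a copy of the current value (123)
        WriteThroughAlias(myClass, 456);
        Console.WriteLine(o);            // 123: the box is unaffected
        Console.WriteLine(myClass.s.i1); // 456
    }
}
```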

-cd
 
Jon Skeet [C# MVP]

Göran Andersson said:
Yes, it is.

myClass.s.i1 is an integer variable. The value copied from that integer
variable can not be stored in the variable o, as it is an object
reference

Ah, yes, I missed the storage in the variable o. Yes, this involves
boxing. The struct within a class bit is entirely irrelevant.
 
Sneil

Wow! That's an amazing answer, almost a small article. :) Many thanks
for it!

Michael D. Ober said:
It is relative to the value in register DS, which is set at
application startup by the memory manager. As you should be able to
see from the assembler code, there is no "boxing" of the variables.
Boxing requires a rather expensive call to determine an object's
actual data type.

Yes, you are absolutely right: up to this point there is no boxing. But my
second question was:
......
myClass.s.i1 = 999;
myClass.s.i2 = 888;
object o = myClass.s.i1; // << is there boxing _here_?
Now I am sure there IS. I can see it in Reflector:
......
L_001d: ldc.i4 888
L_0022: stfld int32 _111_.S::i2
L_0027: ldloc.0
L_0028: ldflda _111_.S _111_.C::s
L_002d: ldfld int32 _111_.S::i1
L_0032: box int32
L_0037: stloc.1
......
So even though i1 is already on the heap, its value cannot be stored
directly in the variable o, so boxing is inevitable.
 
Michael D. Ober

In the OP's original source, it isn't obvious that the OP is referring to a
generic object variable. In this case, boxing must be done. The runtime cannot
operate on a generic object variable without the boxing. I suspect that the
concrete object will still be laid out in the same manner by the compiler, but
that the compiler will have to generate additional calls into the memory
manager to box and unbox every reference to the object "o". That's why I
copied the OP's original sample.

Mike.

 
Michael D. Ober

I missed the second question as I couldn't get a clean download until this
morning. Sorry about that. As you have already discovered, boxing requires
additional code and an additional call into the memory manager. There will
be analogous code on the outbound side of the box as well. Boxing not only
takes additional code, but it also takes additional memory since the runtime
must store the metadata for the variable as well.

Mike.
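
A brief sketch of the "outbound side of the box", i.e. unboxing. UnboxDemo and its methods are hypothetical names used only for illustration. The box carries its exact type along with the value, so a cast back to int succeeds while a direct cast of the same box to long throws InvalidCastException.

```csharp
using System;

// Hypothetical demo class, not part of the original thread.
public static class UnboxDemo
{
    // Unboxing: copies the int back out of the heap object.
    public static int Unbox(object o)
    {
        return (int)o;
    }

    // A boxed int can only be unboxed as int; unboxing it as long fails.
    public static bool CanUnboxAsLong(object o)
    {
        try
        {
            long value = (long)o;
            return true;
        }
        catch (InvalidCastException)
        {
            return false;
        }
    }

    public static void Main()
    {
        object o = 999;                       // boxing
        Console.WriteLine(Unbox(o));          // 999
        Console.WriteLine(CanUnboxAsLong(o)); // False
    }
}
```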
 
Michael D. Ober

Actually, the struct inside a class is relevant. The compiler uses this
information to generate the metadata required by the boxing.

Mike.

Jon Skeet said:
Ah, yes, I missed the storage in the variable o. Yes, this involves
boxing. The struct within a class bit is entirely irrelevant.
 
Göran Andersson

No, it's not. It's just an integer value that is stored in the boxing
object. Where the value came from originally is totally irrelevant for
how the boxing is done.

If you compare these two statements:

object o = myClass.s.i1;

and

object p = 999;

The values that will be stored in the boxing objects will be identical,
and the boxing will be performed in exactly the same way.
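
The comparison above can be sketched as follows, assuming the S and C types from the original example. SameBoxDemo and its helper methods are hypothetical names. Both statements produce a boxed System.Int32 with the same value, but each box is its own heap object.

```csharp
using System;

public struct S { public int i1, i2; }
public class C { public S s; }

// Hypothetical demo class, not part of the original thread.
public static class SameBoxDemo
{
    public static object BoxFromField(C c)
    {
        return c.s.i1; // boxes the value read from the field
    }

    public static object BoxFromLiteral()
    {
        return 999; // boxes a constant
    }

    public static void Main()
    {
        C myClass = new C();
        myClass.s.i1 = 999;

        object o = BoxFromField(myClass);
        object p = BoxFromLiteral();

        Console.WriteLine(o.GetType());           // System.Int32
        Console.WriteLine(p.GetType());           // System.Int32
        Console.WriteLine(o.Equals(p));           // True: same type and value
        Console.WriteLine(ReferenceEquals(o, p)); // False: two separate boxes
    }
}
```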
 
Michael D. Ober

The relevance comes from the compiler itself having to know which data type
metadata to feed to the boxing routine. In the case of the struct, the
compiler must know the structure's contained datatypes or it can't box. In
the second case, the compiler also determines the datatype to give the
boxing routines. You are correct that it's not relevant at runtime, but it
is relevant at compile time.

Mike Ober.
 
Jon Skeet [C# MVP]

Michael D. Ober said:
Actually, the struct inside a class is relevant. The compiler uses this
information to generate the metadata required by the boxing.

The fact that the value originally came from inside a class is
irrelevant. The boxing just creates a boxed System.Int32, regardless of
the origin of the value. The type of the value and the evaluated value
are the only important things.
 
Jon Skeet [C# MVP]

Michael D. Ober said:
The relevance comes from the compiler itself having to know which data type
metadata to feed to the boxing routine. In the case of the struct, the
compiler must know the structure's contained datatypes or it can't box. In
the second case, the compiler also determines the datatype to give the
boxing routines. You are correct that it's not relevant at runtime, but it
is relevant at compile time.

Yes, it has to know the type - but that's true whatever you're doing.
Boxing a value from inside a struct which is inside a class is exactly
the same as boxing a value of the same type which is evaluated in a
different way.

The compiler is able to traverse the expression to work out the type
required, but that's orthogonal to boxing.
 
Michael D. Ober

Agreed.

Mike.

Jon Skeet said:
The fact that the value originally came from inside a class is
irrelevant. The boxing just creates a boxed System.Int32, regardless of
the origin of the value. The type of the value and the evaluated value
are the only important things.
 
