Right way of passing unmanaged structures between managed and native code?

  • Thread starter Vladimir Kouznetsov

Vladimir Kouznetsov

Hi group,

It seems there is no guarantee that the same unmanaged structure has the
same binary representation in managed and native code. How do I use such
structures in mixed-code assemblies? Is there some compiler magic behind the
curtains (IJW, some implicit marshalling, which could degrade performance)?
Are the data binary compatible in these scenarios? Or should I use
attributes to specify the layout explicitly (which, BTW, may depend on
compiler switches, making maintenance more complex)?

thanks,
v
 
Vladimir Kouznetsov

Was: Right way of passing unmanaged structures between managed and native
code?
Either the subject was not catchy enough (again) or people just kill-filter
me. I'll try once more.
It seems there is no guarantee that the same unmanaged structure has the
same binary representation in managed and native code. How do I use such
structures in mixed-code assemblies? Is there some compiler magic behind the
curtains (IJW, some implicit marshalling, which could degrade performance)?
Are the data binary compatible in these scenarios? Or should I use
attributes to specify the layout explicitly (which, BTW, may depend on
compiler switches, making maintenance more complex)?

thanks,
v
 
Ronald Laeremans [MSFT]

Hi Vladimir,

Unmanaged types are guaranteed to have the same representation regardless of
whether they are compiled with or without the /clr switch. What
documentation or experiment makes you believe otherwise?

Ronald Laeremans
Visual C++ team
 
Vladimir Kouznetsov

Thank you Ronald,

I found the following phrase in "Troubleshooting .NET Interoperability"
(http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dv_vstechart/html/vbtchtroubleshootingnetinteroperability.asp):
"Because Visual Studio .NET optimizes the way data is stored, the unmanaged
representation of the structure does not always match the managed
representation". From that I concluded that the alignment, the order of
members, or both can differ between managed and unmanaged code. If I'm
misinterpreting that, it should probably be clarified. Otherwise, if it is
incorrect, I'm happy to know that.

thanks,
v
 
Ronald Laeremans [MSFT]

This article is talking (in C++ .NET version 7.x terms) about the difference
in layout between:

    __gc struct Foo
    {
        SomeType SomeMember;
        ...
    };

or

    __value struct Foo
    {
        SomeType SomeMember;
    };

And

    struct Foo
    {
        SomeType SomeMember;
    };

And NOT about the latter definition compiled with versus without the
/clr switch.

If you use ildasm on the latter type you will see that it is emitted, in a
CLR sense, as an opaque (i.e. empty) type with explicit layout and explicit
size, with the compiler-generated code directly managing the layout. That
layout is indeed guaranteed to be the same in either case.
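
The guarantee described above can also be pinned down in source. The following is only a sketch: it assumes a C++11 or later compiler (static_assert postdates the 7.x compiler discussed here) and a 4-byte int, and the struct Foo and the helper name are hypothetical. Building the same file with and without /clr should pass the same checks if the native layout is unchanged.

```cpp
#include <cassert>
#include <cstddef>   // offsetof, std::size_t

// Hypothetical native struct, used only for illustration.
struct Foo
{
    int i;
    int j;
};

// Compile-time layout checks. They assume a 4-byte int, as on the
// usual Visual C++ targets; a mismatch stops the build.
static_assert(sizeof(int) == 4, "this sketch assumes a 4-byte int");
static_assert(offsetof(Foo, i) == 0, "i is expected at offset 0");
static_assert(offsetof(Foo, j) == 4, "j is expected at offset 4");
static_assert(sizeof(Foo) == 8, "Foo is expected to occupy 8 bytes");

// Hypothetical runtime helper, handy for logging the offset from
// both a native and a /clr build and comparing the results.
std::size_t offset_of_j()
{
    return offsetof(Foo, j);
}

// The same facts checked at run time.
void check_layout_at_runtime()
{
    assert(offset_of_j() == 4);
    assert(sizeof(Foo) == 8);
}
```

If either build disagreed about the layout, the static_assert lines would fail at compile time rather than corrupting data at run time.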

Ronald
 
Vladimir Kouznetsov

Thanks a lot Ronald,

That's a relief! I knew that a __value struct may not be binary compatible
with a struct, and judging by the context of the article that most probably
was the intended topic, though other people could be misled as I was. Is
there an explicit statement about this anywhere? It would be nice to know
that it's not going to change in the future.
BTW, why are you saying that the opaqueness of the type is the guarantee of
binary compatibility?

thanks,
v
 
Ronald Laeremans [MSFT]

Because the opaqueness means that the compiler-generated code uses offsets
directly instead of using any members by name (since there are none). That
is exactly what the native code does.

As an example, compiling the following sample:

    #using <mscorlib.dll>

    struct Foo
    {
        int i;
        int j;
    };

    int main()
    {
        Foo foo;
        foo.i = 10;
        foo.j = 11;
        return 0;
    }

leads to the following definition of Foo (from ildasm):

    .class public sequential ansi sealed Foo
           extends [mscorlib]System.ValueType
    {
      .pack 1
      .size 8
    } // end of class Foo

And the following code in main:
(my comments added)

    .method public static int32
            modopt([mscorlib]System.Runtime.CompilerServices.CallConvCdecl)
            main() cil managed
    {
      .vtentry 1 : 1
      // Code size 16 (0x10)
      .maxstack 2
      .locals init ([0] valuetype Foo foo)
      IL_0000: ldloca.s foo    // load the address of local variable foo
      IL_0002: ldc.i4.s 10     // load the constant 10
      IL_0004: stind.i4        // store it there (foo.i, at offset 0)
      IL_0005: ldloca.s foo    // load the address again
      IL_0007: ldc.i4.4        // load the offset of foo.j (4 bytes)
      IL_0008: add             // add it to get the address of foo.j
      IL_0009: ldc.i4.s 11     // what we need to store in it
      IL_000b: stind.i4        // store it
      IL_000c: ldc.i4.0        // 0 return value
      IL_000d: br.s IL_000f    // for debugging purposes
      IL_000f: ret
    } // end of method 'Global Functions'::main
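
For readers who do not read IL, roughly the same address arithmetic can be written by hand in C++. This is a sketch for illustration only, not what the compiler literally emits: the helper names are hypothetical, and std::memcpy stands in for the stind.i4 and matching load instructions so that the example stays within defined behaviour.

```cpp
#include <cassert>
#include <cstddef>   // offsetof
#include <cstring>   // std::memcpy

struct Foo
{
    int i;
    int j;
};

// Hypothetical helper: store into foo.j without naming the member at the
// point of the store, mirroring "ldloca.s foo; ldc.i4.4; add; stind.i4".
void store_j_by_offset(Foo& foo, int value)
{
    unsigned char* base = reinterpret_cast<unsigned char*>(&foo); // address of foo
    std::memcpy(base + offsetof(Foo, j), &value, sizeof value);   // base + offset, then store
}

// Hypothetical helper: the matching load through the same offset.
int load_j_by_offset(const Foo& foo)
{
    const unsigned char* base = reinterpret_cast<const unsigned char*>(&foo);
    int value = 0;
    std::memcpy(&value, base + offsetof(Foo, j), sizeof value);
    return value;
}

// Usage: the offset-based store is visible through the named member.
void demo_offset_access()
{
    Foo foo = {10, 0};
    store_j_by_offset(foo, 11);
    assert(foo.i == 10);
    assert(foo.j == 11);
    assert(load_j_by_offset(foo) == 11);
}
```

The point of the comparison is that nothing in either version asks a runtime where j lives; the offset is a constant fixed when the code is compiled.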

Ronald
 
Vladimir Kouznetsov

Thank you for the explanation Ronald!

I understand that the code uses offsets, but I fail to see how that means
the layout is the same. It is in the example, but one can still suspect that
it may not be in some cases. You probably know some compiler internals that
make you think it's always true, so I'll take your word for it.

thanks,
v

 
Ronald Laeremans [MSFT]

Because the code path that determines the layout and offsets is exactly the
same regardless of whether we generate native or managed code. The point I
was making is that, for native types, it is the compiler (c1xx.dll, c2.dll)
that does this. For managed types, by contrast, the CLR and its JIT own the
translation of type and member names into raw offsets.
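
Since the compiler decides the layout, the compiler's packing settings are what can change it, which is the maintenance concern raised at the start of the thread. A small sketch using #pragma pack follows; the struct and function names are hypothetical, and the comparison assumes a platform where int is aligned to more than one byte. The /Zp switch changes the default packing for a whole translation unit in a similar way.

```cpp
#include <cassert>
#include <cstddef>   // offsetof

// Default packing: the compiler pads after 'tag' so that 'value' is aligned.
struct NaturallyAligned
{
    char tag;
    int  value;
};

// Byte packing requested explicitly: no padding between members.
#pragma pack(push, 1)
struct BytePacked
{
    char tag;
    int  value;
};
#pragma pack(pop)

// Usage: the same member list ends up at different offsets.
void demo_packing()
{
    assert(offsetof(BytePacked, value) == 1);
    assert(sizeof(BytePacked) == 1 + sizeof(int));
    assert(offsetof(NaturallyAligned, value) == alignof(int));
}
```

Both sides of a native/managed boundary therefore need to see the struct definition under the same packing setting, just as two purely native modules would.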

Does that clarify it?

Ronald

 
Vladimir Kouznetsov

Thanks again Ronald,

Sorry, I must have bored you to death. I guess you are saying that the
current compilation technology guarantees that the layout is binary
compatible, and that there are no indications it may change in the future.
And all other potential vendors are aware of that, so if, for example, Intel
ever supports managed code generation, it would be a defect in their
implementation not to provide that while claiming that their compiler is
compatible with yours.

thank you very much!
v

 
Carl Daniel [VC++ MVP]

Not exactly.

Code compiled with VC++, regardless of whether it's compiled with /clr or
not, is guaranteed to use the same layout for native structs/classes.
There's no guarantee (and there never was) that any other vendor's compiler
will be binary compatible with VC++. Now, it just so happens that Intel's
compiler is binary compatible, since they went to pains to make it so, but
some other compiler might not be. Versions of the Visual C++ compiler have
always been highly compatible with one another, and there's no reason (that
I know of) to believe that will change any time soon, so it's likely that
future versions of VC++ will also be compatible with today's compiler in
terms of struct layout.

Note that another vendor's compiler generating .NET __value structs would
automatically be layout compatible, since the layout is handled by the
CLR/JIT and not by the compiler in the first place.
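
When data has to cross between compilers whose struct layouts are not known to match, one common workaround is to copy the fields to and from a byte buffer at agreed offsets instead of sharing the struct itself. The sketch below is an illustration under stated assumptions, not something proposed in this thread: the 8-byte exchange format and all names are hypothetical, and it uses the host byte order, so both sides must share endianness.

```cpp
#include <cassert>
#include <cstddef>   // std::size_t
#include <cstdint>   // std::int32_t
#include <cstring>   // std::memcpy

struct Foo
{
    std::int32_t i;
    std::int32_t j;
};

// Hypothetical exchange format: i at byte 0, j at byte 4, host byte order.
const std::size_t kWireSize = 8;

// Copy each field to its agreed offset; compiler padding never reaches the buffer.
void to_wire(const Foo& foo, unsigned char* out)
{
    std::memcpy(out,     &foo.i, sizeof foo.i);
    std::memcpy(out + 4, &foo.j, sizeof foo.j);
}

// Rebuild a Foo from the agreed offsets, whatever the local layout is.
Foo from_wire(const unsigned char* in)
{
    Foo foo = {};
    std::memcpy(&foo.i, in,     sizeof foo.i);
    std::memcpy(&foo.j, in + 4, sizeof foo.j);
    return foo;
}

// Usage: a round trip preserves both fields.
void demo_round_trip()
{
    unsigned char buffer[kWireSize];
    const Foo original = {10, 11};
    to_wire(original, buffer);
    const Foo copy = from_wire(buffer);
    assert(copy.i == 10);
    assert(copy.j == 11);
}
```

This costs a copy per field, so it only makes sense where layout compatibility cannot be relied on; within VC++ itself, per the above, the native struct can be shared directly.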

-cd

 
Vladimir Kouznetsov

Thank you for the explanation Carl,

I know that other compilers don't have to be binary compatible. I said that
it would be a defect if a vendor claimed that its compiler is binary
compatible with Visual C++ but didn't provide binary compatibility of
structures for both managed and native code.
The note about compatibility of __value structs was really useful, thank you.

thanks,
v

 
