Garbage collectable pinned arrays!

Ben Voigt [C++ MVP]

For example, I have an application which uses glVertexPointer. I
Interop does all you need to pin the buffer when the GC kicks off;
this way, the buffer is protected for the duration of the call, but
that doesn't mean it is pinned during the call. It doesn't have to be;
it only needs to get pinned when the GC runs!

The "duration of the call" isn't sufficient. The data has to stay at the
same address from glVertexPointer, when OpenGL stores the address, until
glDrawElements, when it reads from the arrays (and I have glColorPointer in
between, and there are about six other arrays that could be configured in an
exponential number of combinations, so "OpenGL should have combined all
access to the array into one single call" just doesn't cut it). So I *need*
a pinning pointer if I'm to use an array allocated using .NET. Furthermore,
since the Gen0/LOH distinction is an implementation detail, simply combining
all my buffers into a single large object would be asking for severe
breakage in a future version of .NET, not to mention failing code reviews
left and right.
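
To make the requirement concrete, here is a minimal C# sketch of what "pinned across several calls" looks like with the existing GCHandle API. The PinnedBuffer name is hypothetical, and the OpenGL calls are only mentioned in the note below, not bound; the sketch only shows that the address stays valid from construction until Dispose.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper: keeps a managed array at a fixed address until Dispose.
public sealed class PinnedBuffer<T> : IDisposable where T : struct
{
    private GCHandle _handle;

    // The managed array; it can be read and written type-safely as usual.
    public T[] Data { get; }

    public PinnedBuffer(int length)
    {
        Data = new T[length];
        // GCHandleType.Pinned stops the GC from relocating the array.
        // This throws for element types that are not blittable.
        _handle = GCHandle.Alloc(Data, GCHandleType.Pinned);
    }

    // Address of element 0; stays the same until Dispose is called.
    public IntPtr Address => _handle.AddrOfPinnedObject();

    public void Dispose()
    {
        if (_handle.IsAllocated)
            _handle.Free();
    }
}
```

In the scenario above, Address would be passed to glVertexPointer and the buffer disposed only after the last glDrawElements that reads from it. The cost is one GCHandle.Alloc per buffer rather than one per call.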
This won't change a bit; interop between managed and unmanaged is the
same whether you use C++/CLI or another managed language. All you
have is somewhat greater control (and responsibility) when using
C++/CLI, but whenever you need to pass managed "buffers" to unmanaged
code you need to watch for the GC.

In C++/CLI, I can request an immobile buffer by using "new" instead of
gcnew.

What's under discussion is having the same ability with other .NET
languages.
Horrible fragmentation? Ever looked at the native heap fragmentation
when using OVERLAPPED in unmanaged code?

Zilch. You allocate your buffers once and use them, reuse them, reuse them
again. At least that's what I do.

But if the runtime pinning (or fixing, or pointer tabling) is so efficient,
why does AllocateNativeOverlapped exist? Why not use the standard mechanism?
The very existence of that function as an internal CLR implementation is
proof of the OP's requirement for a new feature (though it doesn't prove
things are being copied or otherwise; actually, copying might be better than
allowing Gen0 to fragment).
 
Ben Voigt [C++ MVP]

Willy said:
As I said before, pinning is an action performed by the Interop layer
*when* the GC initiates a scan, and it's not done by means of a call to
GCHandle.Alloc. You only need to pin explicitly when you are
passing an object to unmanaged code and you need to keep the object
alive and at a fixed location after the PInvoke call has returned; during
the call, the PInvoke layer takes care of any pinning that is needed.

That's not an unusual case. I've already given two examples of APIs in
widespread use which require a buffer to stay in one position after the
initial function call which accepts the pointer.
Also, you keep ignoring my remark that the fact that addresses of
*Large* objects are fixed is a convenience of the current version of
the CLR, nothing stops MS from changing this.

Which is why the OP is asking for a keyword / MSIL flag that will let the
runtime know that the object is intended to be fixed for as long as it
lives. It would be an implementation detail whether the memory is allocated
from the LOH, OLE task allocator, etc, etc. Also I don't think that
sacrificing GC for such objects would necessarily be a big loss, they either
will live to the end of the process anyway, or they can be explicitly freed.

Hence my suggestion of T[] Marshal.AllocCoTaskMem<T>(int elementCount), T
required to be a "reference free" value type, meaning all members are
"reference free" value types.
It would be useful to request that a particular buffer not be
subject to relocation by the GC. Probably the easiest way to do
this would be to place it in the LOH. The OLE task allocator or
HGlobal allocator, both of which are already exposed by the Marshal
class in a typeless way, would be other options. It could be as
simple as adding a T[] Marshal.AllocCoTaskMem<T>(int elementCount)
overload.

But now you are allocating from the unmanaged heap (COM heap or CRT
heap or whatever). So now you will incur the costs of copying back
and forth, again this depends on the semantics, but might be a
solution when you need to pass large data chunks to unmanaged land.

Why? .NET could create a proper array descriptor storing the
metadata alongside and access it directly.

Where? outside of the GC heap? Who's going to "manage" these objects
then? As you may know, there are other cheap means to pass fixed
buffers to managed code.

The CLR has no trouble with data outside the GC heap, it just can't hold
references to objects inside the GC heap (because then objects could be
reachable but the GC wouldn't know that). For example, MSIL instructions
have no trouble accessing structures on the stack.
 
Jeroen Mostert

Ben said:
In C++/CLI, I can request an immobile buffer by using "new" instead of
gcnew.

What's under discussion is having the same ability with other .NET
languages.
Marshal.AllocHGlobal()? Gives you a nice clump of unmanaged memory to play
with. In unsafe mode, you can use this as a pointer in the good
old-fashioned C++ way. And you even have to deallocate it in the good
old-fashioned C++ way! Wrapping it in a class that will deallocate that
memory on finalization is an obvious move, as is making it Disposable.

Of course, any marshaling will have to be done explicitly and/or with
pointer casting (as long as you're being unsafe...)
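
A rough sketch of the wrapper described above, assuming only the documented Marshal members (AllocHGlobal, FreeHGlobal, ReadInt32, WriteInt32). The NativeBuffer name and its methods are hypothetical. It uses explicit marshaling instead of unsafe pointers, so it compiles without the unsafe switch.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper around a block of unmanaged (HGlobal) memory.
public sealed class NativeBuffer : IDisposable
{
    private IntPtr _ptr;

    public int SizeInBytes { get; }

    public NativeBuffer(int sizeInBytes)
    {
        _ptr = Marshal.AllocHGlobal(sizeInBytes);
        SizeInBytes = sizeInBytes;
    }

    // Fixed address that can be handed to unmanaged code; never moved by the GC.
    public IntPtr Pointer => _ptr;

    // Explicit marshaling: write an Int32 at a byte offset.
    public void WriteInt32(int byteOffset, int value)
    {
        CheckRange(byteOffset, sizeof(int));
        Marshal.WriteInt32(_ptr, byteOffset, value);
    }

    // Explicit marshaling: read an Int32 from a byte offset.
    public int ReadInt32(int byteOffset)
    {
        CheckRange(byteOffset, sizeof(int));
        return Marshal.ReadInt32(_ptr, byteOffset);
    }

    private void CheckRange(int byteOffset, int count)
    {
        if (_ptr == IntPtr.Zero)
            throw new ObjectDisposedException(nameof(NativeBuffer));
        if (byteOffset < 0 || byteOffset > SizeInBytes - count)
            throw new ArgumentOutOfRangeException(nameof(byteOffset));
    }

    public void Dispose()
    {
        Free();
        GC.SuppressFinalize(this);
    }

    // Safety net: release the memory if Dispose was never called.
    ~NativeBuffer()
    {
        Free();
    }

    private void Free()
    {
        if (_ptr != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_ptr);
            _ptr = IntPtr.Zero;
        }
    }
}
```

The trade-off is the one raised in this thread: the memory never moves, but it is not a T[], so every access goes through Marshal or a pointer cast.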
 
Ben Voigt [C++ MVP]

Jeroen said:
Marshal.AllocHGlobal()? Gives you a nice clump of unmanaged memory to
play with. In unsafe mode, you can use this as a pointer in the good
old-fashioned C++ way. And you even have to deallocate it in the good
old-fashioned C++ way! Wrapping it in a class that will deallocate
that memory on finalization is an obvious move, as is making it
Disposable.

Yeah but I want it to be treated type-safely inside .NET, if it needs an
extra dozen bytes for metadata to store the array size and type it wouldn't
especially bother me.

Hence my suggestion of a new overload

T[] Marshal.AllocCoTaskMem<T>(int elementCount)

or similar for AllocHGlobal
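
For comparison: the Marshal overload suggested here does not exist, but later runtimes (.NET 5 and newer, to my knowledge) offer something close in GC.AllocateArray<T>, which returns an ordinary garbage-collected T[] that the runtime does not relocate. The sketch below assumes such a runtime. The PinnedArrayDemo name is hypothetical, and taking the address with Marshal.UnsafeAddrOfPinnedArrayElement without a GCHandle is an assumption that relies on the array already being pinned.

```csharp
using System;
using System.Runtime.InteropServices;

public static class PinnedArrayDemo
{
    // Allocates a float[] that the GC will not move, and reports its address.
    // The array is still collected normally once it becomes unreachable.
    public static float[] AllocatePinned(int length, out IntPtr address)
    {
        // pinned: true asks the runtime for a non-moving allocation.
        // The element type must not contain object references.
        float[] data = GC.AllocateArray<float>(length, pinned: true);
        address = Marshal.UnsafeAddrOfPinnedArrayElement(data, 0);
        return data;
    }
}
```

The result is type-safe inside .NET and has a stable address for native code, with no explicit free.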

And I still don't understand why .NET considers use of a pointer unsafe...
only casts or pointer arithmetic can ever be unsafe, only the operations
which create a new pointer, and only if not bounds checked.
 
Jeroen Mostert

Ben said:
And I still don't understand why .NET considers use of a pointer unsafe...
only casts or pointer arithmetic can ever be unsafe, only the operations
which create a new pointer, and only if not bounds checked.
Which leaves... what useful operation could you do with pointers that
couldn't be done with references?

You *can* eliminate all unsafe operations (and there sure are a lot of
them), but when you've done that, what you get is managed code. Safe managed
pointers are references... OK, garbage collection is orthogonal to that, but
the discussion's already been there. Pointers as they are are not safe. You
could have references to unmoveable objects, but it would be more akin to a
C++ reference than a C++ pointer.
 
Lasse Vågsæther Karlsen

Atmapuri said:
Hi!


God bless you :) Finally a real man (!)
Atmapuri

Note that I agree with you that data that you only use for interop could
possibly have a mechanism available to avoid having this overhead for
each call.

But don't fall into the trap of posting one question or assumption and
then just dismissing everyone not agreeing with you as being wrong.

If your original post was that pinning data had overhead, I think this
thread would consist of two posts: yours and one other from a random
person saying just "Yes, it does".

And at this point, let me just reiterate one of the tips this thread has
given you multiple times: use unmanaged code for this part. Managed code
is not a cover-all solution; it has tradeoffs, in particular in interop.
 
Willy Denoyette [MVP]

Ben Voigt said:
That's not an unusual case. I've already given two examples of APIs in
widespread use which require a buffer to stay in one position after the
initial function call which accepts the pointer.

True, but if you need this, why is the cost of pinning so important?
The cost of GCHandle.Alloc is ~5500 cycles. That is a one-time cost to
pin a buffer that lives until the end of the process. If you do this early
in the process, you won't suffer from fragmentation of the gen0 heap, as this
object will end up in the gen2 heap anyway.

Which is why the OP is asking for a keyword / MSIL flag that will let the
runtime know that the object is intended to be fixed for as long as it
lives. It would be an implementation detail whether the memory is
allocated from the LOH, OLE task allocator, etc, etc. Also I don't think
that sacrificing GC for such objects would necessarily be a big loss, they
either will live to the end of the process anyway, or they can be
explicitly freed.

But this is what "fixed" is meant for. Sure, its scope is limited by its
containing function scope, but you can perfectly well pin an object across
several unmanaged function calls.
Consider the following code snippet:

...
internal class C
{
    internal int v;
    internal long l;
    internal byte[] ba;

    public C()
    {
        l = 123456789;
        ba = new byte[2000];
    }
}

void Foo()
{
    int[] ia = new int[100];
    C c = new C();
    c.ba[0] = 123;
    int x = 0;
    unsafe
    {
        fixed (int* ptri = ia)      // ptri becomes an "untracked pinned byref local"
        fixed (int* ptrv = &c.v)    // ptrv becomes an "untracked pinned byref local"
        fixed (byte* ptrba = c.ba)  // ptrba becomes an "untracked pinned byref local"
        {
            // call unmanaged code passing the pointers as arguments...
            ...
        } // end of fixed scope, "release" the pinned references
    }
}

The JIT compiler fills the "pointer table", which is part of the GCInfo
table, with references to objects that cannot be moved when the GC comes
along.
In the above sample that means that:
- the instance of the int[] pointed to by ia,
- the instance of an int pointed to by c.v, and
- the instance of the byte[] pointed to by c.ba cannot move.
Note that fixing c.v also fixes the location of the c instance.
This way of "fixing" objects in C# comes at nearly no cost.
If this doesn't suit your needs, then you will have to use the Marshal
class or GCHandle.Alloc, carefully considering its (fixed) costs. If these
costs are too high, then you have chosen the wrong tool for the job at hand,
I'm afraid.
Hence my suggestion of T[] Marshal.AllocCoTaskMem<T>(int elementCount), T
required to be a "reference free" value type, meaning all members are
"reference free" value types.

Why not post your suggestion(s) to the connect site? It makes little sense
to post them here, I guess.
It would be useful to request that a particular buffer not be
subject to relocation by the GC. Probably the easiest way to do
this would be to place it in the LOH. The OLE task allocator or
HGlobal allocator, both of which are already exposed by the Marshal
class in a typeless way, would be other options. It could be as
simple as adding a T[] Marshal.AllocCoTaskMem<T>(int elementCount)
overload.

But now you are allocating from the unmanaged heap (COM heap or CRT
heap or whatever). So now you will incur the costs of copying back
and forth, again this depends on the semantics, but might be a
solution when you need to pass large data chunks to unmanaged land.

Why? .NET could create a proper array descriptor storing the
metadata alongside and access it directly.

Where? outside of the GC heap? Who's going to "manage" these objects
then? As you may know, there are other cheap means to pass fixed
buffers to managed code.

The CLR has no trouble with data outside the GC heap, it just can't hold
references to objects inside the GC heap (because then objects could be
reachable but the GC wouldn't know that). For example, MSIL instructions
have no trouble accessing structures on the stack.

I have no idea what can/cannot be done; only the CLR team can answer such
questions. After all, they are the only ones that will have to implement
such changes, aren't they?

Willy.
 
Ben Voigt [C++ MVP]

Jeroen said:
Which leaves... what useful operation could you do with pointers that
couldn't be done with references?

Return a pointer to a single element of a member array.
 
Ben Voigt [C++ MVP]

That's not an unusual case. I've already given two examples of APIs
True, but if you need this, why is the cost of pinning so important?
The cost of GCHandle.Alloc is ~5500 cycles. That is a one-time
cost to pin a buffer that lives until the end of the process. If you
do this early in the process, you won't suffer from fragmentation of
the gen0 heap, as this object will end up in the gen2 heap anyway.

That's what I do now.

But doesn't the object need to be moved to end up in gen2 data space? Won't
the pinning reference prevent that?
But this is what "fixed" is meant for. Sure, its scope is limited
by its containing function scope, but you can perfectly well pin an
object across several unmanaged function calls.

But it needs an "unsafe" block, for no apparent reason.
 
Willy Denoyette [MVP]

Ben Voigt said:
That's what I do now.

But doesn't the object need to be moved to end up in gen2 data space?
Won't the pinning reference prevent that?

No, but it all depends on the sizes of the individual generations; that's why I
say pin your buffers early in the process and after a GC.Collect().
You can verify the location of your pinned buffer by attaching a native
debugger like ntsd or WinDbg.
Once attached, enter:
!dumpheap -type <yourPinnedType>
This returns the object's address (and some other info), something like:
....
Address MT Size
02721660 79131a68 1612
....

To find the generation where your object "lives", you'll have to compare
its address with the start addresses of the generations returned by:

!eeheap -gc
....
Number of GC Heaps: 1
generation 0 starts at 0x0272d230
generation 1 starts at 0x02721cac
generation 2 starts at 0x02721000
....

Here the object at address 02721660 falls in the Gen2 range.
An object that is pinned after a couple of MBs of other live objects has
little chance of ending up in Gen2, unless the GC can move the object to Gen2
at the moment of "pinning".
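
The comparison against the "!eeheap -gc" output can be written down as a small C# helper. This is only a sketch: GenerationLookup is a hypothetical name, and it assumes the single-segment layout shown in the dump above, with gen2 at the lowest addresses, then gen1, then gen0.

```csharp
using System;

public static class GenerationLookup
{
    // Reports which generation an object address falls into, given the
    // generation start addresses printed by "!eeheap -gc".
    // Assumes gen2Start < gen1Start < gen0Start, as in the dump above.
    public static int GenerationOf(ulong address, ulong gen0Start,
                                   ulong gen1Start, ulong gen2Start)
    {
        if (address >= gen0Start) return 0;
        if (address >= gen1Start) return 1;
        if (address >= gen2Start) return 2;
        throw new ArgumentOutOfRangeException(
            nameof(address), "Address lies below the start of gen2.");
    }
}
```

With the numbers from the dump, GenerationOf(0x02721660, 0x0272d230, 0x02721cac, 0x02721000) gives 2. From inside the process, GC.GetGeneration(obj) answers the same question without a debugger.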

But it needs an "unsafe" block, for no apparent reason.

"unsafe" blocks are needed when using pointers!
This produces non-verifiable code, so no surprises here, you are warned.....

Willy.
 
Ben Voigt [C++ MVP]

Willy said:
No, but it all depends on the sizes of the individual generations;
that's why I say pin your buffers early in the process and after a
GC.Collect(). You can verify the location of your pinned buffer by
attaching a native debugger like ntsd or WinDbg.
Once attached, enter:
!dumpheap -type <yourPinnedType>
This returns the object's address (and some other info), something
like:
...
Address MT Size
02721660 79131a68 1612
...

To find the generation where your object "lives", you'll have to
compare its address with the start addresses of the generations
returned by:
!eeheap -gc
...
Number of GC Heaps: 1
generation 0 starts at 0x0272d230
generation 1 starts at 0x02721cac
generation 2 starts at 0x02721000
...

Here the object at address 02721660 falls in the Gen2 range.
An object that is pinned after a couple of MBs of other live
objects has little chance of ending up in Gen2, unless the GC can move
the object to Gen2 at the moment of "pinning".

Ok, so changing the generation of an object is done by moving the generation
boundary in memory, not by actually changing the object's address. I guess
that makes good sense.
"unsafe" blocks are needed when using pointers!
This produces non-verifiable code, so no surprises here, you are
warned.....

GCHandle.Alloc and Marshal.UnsafeAddrOfPinnedArrayElement are verifiable and
need no unsafe block.
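
A short sketch of that verifiable route, using only GCHandle and Marshal with no unsafe block. The VerifiablePinning class and its method are hypothetical names for illustration.

```csharp
using System;
using System.Runtime.InteropServices;

public static class VerifiablePinning
{
    // Pins the array, reads one element back through its raw address,
    // then releases the pin. No "unsafe" context is required.
    public static int ReadThroughPointer(int[] data, int index)
    {
        GCHandle handle = GCHandle.Alloc(data, GCHandleType.Pinned);
        try
        {
            // Address of data[index]; valid only while the handle is allocated.
            IntPtr element = Marshal.UnsafeAddrOfPinnedArrayElement(data, index);
            return Marshal.ReadInt32(element);
        }
        finally
        {
            handle.Free();
        }
    }
}
```

In real interop code the IntPtr would be passed to the native function instead of being read back with Marshal.ReadInt32.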
 
Willy Denoyette [MVP]

Ben Voigt said:
Ok, so changing the generation of an object is done by moving the
generation boundary in memory, not by actually changing the object's
address. I guess that makes good sense.

That's correct. However, the generation pointers will not necessarily move
beyond a pinned object that is "too far away" from the base of the GC heap.
But again, we are talking about implementation details which may vary from
one release to another.

GCHandle.Alloc and Marshal.UnsafeAddrOfPinnedArrayElement are verifiable
and need no unsafe block.

True, but they are much slower (which was the subject of this thread). There
seems to be a problem with GCHandle.Alloc: this function takes ~450
instructions in its "normal" execution path when pinning an array, yet
execution takes ~5500 cycles on 32-bit. That means an average of about 12
cycles per instruction, which is way too much. We are currently analyzing
this and will open a case with MS.
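
Anyone who wants to check the order of magnitude on their own machine could start from a rough Stopwatch sketch like the one below. It measures the wall-clock time of a GCHandle.Alloc/Free pair in nanoseconds, not cycles or instruction counts, so it is not the method behind the figures above. PinCost is a hypothetical name.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.InteropServices;

public static class PinCost
{
    // Average wall-clock cost of one Alloc/Free pair, in nanoseconds.
    public static double AveragePinNanoseconds(int iterations)
    {
        byte[] buffer = new byte[2000];

        // Warm up once so JIT compilation is not part of the measurement.
        GCHandle warmup = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        warmup.Free();

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            GCHandle h = GCHandle.Alloc(buffer, GCHandleType.Pinned);
            h.Free();
        }
        sw.Stop();

        // Milliseconds to nanoseconds, divided by the number of pairs.
        return sw.Elapsed.TotalMilliseconds * 1e6 / iterations;
    }
}
```

Multiplying the result by the clock rate in GHz gives a very rough cycle estimate for the pair. Results will vary with runtime version and hardware.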

Willy.
 
