"new byte[132]" alignment on 16 bytes


Olaf Baeyens

I am porting some of my buffer class code from C++ to C#.
This C++ class allocates a block of memory using
m_pBuffer=new BYTE[...];

But since the class also supplies pointers to functions that use raw MMX
and SSE power, the starting pointer MUST be on a 16-byte memory
boundary.
In C++ I allocate more memory than needed, and in a second phase I search
for the address that starts on a 16 byte boundary. And I then use that new
memory address pointer.

Is there a way to make C# always align a "byte[] Buffer=new byte[132];"
dynamically to a 16-byte memory address?
[StructLayout(LayoutKind.???)] ???

I am not talking about the location of &Buffer but about the location of
&(new byte[132]).
It would be nice if I could decide to align parts of my code on 16
bytes and the rest on the normal 8 bytes, or on 32 bytes, depending on the
need, and not force all uses of Buffer to be aligned on 16 bytes.
 
What exactly do you mean by "C# to always align a....". C# doesn't align
anything. Byte arrays are allocated on the managed heap and this allocation
is done by the CLR; the application code can control neither the alignment
nor the location.
However, since your functions are unmanaged code, you have to copy
the byte arrays to unmanaged memory anyway, so nothing stops you from
implementing some form of custom marshalling.

Willy.
 
(note, each of these uses unsafe code as you will probably have to, unless
you use an API like VirtualAlloc that always returns a buffer on a page
boundary)
Well, I don't think I would be terribly keen on any method that requires you
to determine the location of a byte array, as arrays are generally moved
along with all other items in the managed heap. You can use fixed on the
array and take a pointer to it, looping through to find the first 16-byte
boundary, but that forces the GC to move objects around the pinned buffer
(not necessarily pleasant), and you have to keep the buffer fixed for as
long as you intend to use its pointer.
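
A minimal sketch of this first option, assuming the project is compiled with unsafe code enabled. The names PinnedAlignmentDemo, AlignUp and Run are hypothetical, and the array is over-allocated by 15 bytes so that an aligned start always fits. Instead of looping, it rounds the address up with a mask, which gives the same result for power-of-two alignments.

```csharp
using System;

public static class PinnedAlignmentDemo
{
    // Rounds an address up to the next multiple of 'alignment'.
    // Assumption: 'alignment' is a power of two.
    public static long AlignUp(long address, long alignment)
    {
        return (address + alignment - 1) & ~(alignment - 1);
    }

    public static unsafe bool Run()
    {
        const int size = 132;
        const int alignment = 16;

        // Over-allocate so that 'size' usable bytes remain after aligning.
        byte[] buffer = new byte[size + alignment - 1];

        fixed (byte* raw = buffer)
        {
            // 'aligned' is only meaningful while the array stays pinned,
            // i.e. inside this fixed block.
            byte* aligned = (byte*)AlignUp((long)raw, alignment);

            aligned[0] = 1;          // first usable byte
            aligned[size - 1] = 2;   // last usable byte

            return ((long)aligned % alignment) == 0
                && (aligned - raw) < alignment;
        }
    }
}
```

Once the fixed block ends, the GC is free to move the array again, so the aligned pointer must not be kept around.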

Option two: You could allocate unmanaged memory (via Win32 or the
Marshal.AllocHGlobal() method) and scan through that pointer, locating a
boundary. This is potentially an issue as it requires you to release the
buffer manually and can, again, cause GC memory layout problems, but at
least the buffer doesn't start out sitting in the middle of the managed
heap, although the heap could theoretically grow around it if you maintain
the buffer for too long.
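
Roughly what option two could look like. This is only a sketch: AlignedBuffer is a hypothetical wrapper, not a framework type, and it assumes the alignment is a power of two. It keeps the pointer returned by Marshal.AllocHGlobal separately, because that original pointer (not the aligned one) must be passed to Marshal.FreeHGlobal.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper: owns an over-allocated unmanaged block and exposes
// the first suitably aligned address inside it.
public sealed class AlignedBuffer : IDisposable
{
    private IntPtr raw;   // pointer returned by AllocHGlobal, needed to free

    public IntPtr Aligned { get; private set; }
    public int Size { get; private set; }

    public AlignedBuffer(int size, int alignment)
    {
        // Assumption: 'alignment' is a power of two (16 for SSE).
        raw = Marshal.AllocHGlobal(size + alignment - 1);
        long mask = alignment - 1;
        Aligned = new IntPtr((raw.ToInt64() + mask) & ~mask);
        Size = size;
    }

    public void Dispose()
    {
        if (raw != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(raw);   // free the original, not 'Aligned'
            raw = IntPtr.Zero;
            Aligned = IntPtr.Zero;
        }
    }
}
```

Usage would be along the lines of "using (AlignedBuffer b = new AlignedBuffer(132, 16)) { ... }", passing b.Aligned to the unmanaged function. The block never moves, so no pinning is involved.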

The third option: if your buffers are short-lived (within a single method
call and its children) and small, you could use stackalloc, which allocates
a memory buffer directly on the stack. Again you would have to walk the
pointer and find a 16-byte boundary, but in this case it shouldn't cause any
trouble with the managed heap or require extra cleanup steps, at the cost
of a larger stack frame and no release until the method ends.

stackalloc syntax example (all parts are required; stackalloc int[132] does
not stand alone):

int *buffer = stackalloc int[132];
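
Putting the third option together with the boundary search, a sketch might look like this. It assumes unsafe code is enabled; StackAllocDemo and Run are made-up names, and the buffer is over-allocated by 15 bytes so that 132 aligned bytes always fit.

```csharp
using System;

public static class StackAllocDemo
{
    public static unsafe bool Run()
    {
        const int size = 132;
        const int alignment = 16;   // assumed to be a power of two

        // Over-allocate so an aligned start always fits in the block.
        byte* raw = stackalloc byte[size + alignment - 1];

        // Round the address up to the next multiple of 'alignment'.
        byte* aligned =
            (byte*)(((long)raw + alignment - 1) & ~(long)(alignment - 1));

        // The contents of stackalloc memory are not guaranteed to be zero,
        // so initialise the usable part explicitly.
        for (int i = 0; i < size; i++)
        {
            aligned[i] = 0;
        }

        // The memory is released automatically when this method returns,
        // so 'aligned' must not escape the method.
        return ((long)aligned % alignment) == 0;
    }
}
```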
 
What exactly do you mean by "C# to always align a....". C# doesn't align
anything. Byte arrays are allocated on the managed heap and this allocation
is done by the CLR; the application code can control neither the alignment
nor the location.
However, since your functions are unmanaged code, you have to copy
the byte arrays to unmanaged memory anyway, so nothing stops you from
implementing some form of custom marshalling.

Well, I want to create a byte[] Buffer=new byte[xxx];
and pass this Buffer to an unmanaged C++ function that executes assembler
SSE instructions on this buffer without having to copy the memory block.
The SSE instruction set assumes that the memory block starts at an address
that is divisible by 16, or else it generates an exception.

Copying a memory block is no option since these are huge memory blocks. The
intention of using SSE is just to speed up calculations, but if I need to
copy the memory block the SSE would not be a solution in my case. Another
technique should then be used.

In code examples I see [StructLayout(LayoutKind.???)] being used as a way
to align a structure. But this aligns the buffer pointer, not the
actual memory address of the data.
So if it works for structures, then there might also be a way to align
byte arrays.

An alternative is that I still use unmanaged C++ to allocate aligned memory
blocks, and then pass this on to C#, but I would really prefer that C# can
do this instead.
 
Option two: You could allocate unmanaged memory (via Win32 or the
Marshal.AllocHGlobal() method) and scan through that pointer, locating a
boundary. This is potentially an issue as it requires you to release the
buffer manually and can, again, cause GC memory layout problems, but at
least the buffer doesn't start out sitting in the middle of the managed
heap, although the heap could theoretically grow around it if you maintain
the buffer for too long.

This is the thing I am doing in unmanaged C++ right now.
Allocate a memory block that is bigger, and then loop through the array
searching for the first memory address that is on a 16-byte boundary.
This address is then used as the starting point.

But my problem with C# is that the memory moves, so if I pin a memory block
to a fixed position, then the moment I release it, it might be moved and the
start index might not reside on a 16-byte boundary anymore. So the
alternative would be to keep it pinned down. But since each memory block
has a size of 16 MB, and I have many of them, this would not be a good
solution.

Right now I am looking into allocating memory in unmanaged C++ and then
passing it on to C#.

Maybe something for the wishlist of the next CLR? ;-)
 
Olaf Baeyens said:
This is the thing I am doing in unmanaged C++ right now.
Allocate a memory block that is bigger, and then loop through the array
searching for the first memory address that is on a 16-byte boundary.
This address is then used as the starting point.

But my problem with C# is that the memory moves, so if I pin a memory block
to a fixed position, then the moment I release it, it might be moved and the
start index might not reside on a 16-byte boundary anymore. So the
alternative would be to keep it pinned down. But since each memory block
has a size of 16 MB, and I have many of them, this would not be a good
solution.

Allocating with Marshal.AllocHGlobal will not result in a moving buffer; it
just isn't necessarily the best method to use.
Right now I am looking into allocating memory in unmanaged C++ and then
passing it on to C#.

That will probably work best. Again you may want to look into VirtualAlloc
and allocate pages instead of relying on a memory manager to give you a
block and locating a valid boundary yourself.

Large page support and even AWE may factor in if your systems support them
and your application design scales that high.
Maybe something for the wishlist of the next CLR? ;-)

I don't think so... perma-pinned, aligned memory goes pretty much against
the grain of the CLR. It is designed, after all, to run across
architectures (at least IA32, IA64, and AMD64), so embedding alignment is
potentially a bad idea. In your case your code is limited to machines that
support SSE, but in the general case such a feature would be much less
valuable and probably not worth the time and effort it would require of the
CLR team.

What I would consider more likely would be enhancements to the JIT and
perhaps math libraries/primitives to allow the more advanced number
crunching instructions to be generated and used without any special casing.
This would allow a more general enhancement, but I don't know if it is
really possible, and I suspect it wouldn't happen for a while yet.
 
Olaf,

See inline ***

Willy.

Olaf Baeyens said:
What exactly do you mean by "C# to always align a....". C# doesn't align
anything. Byte arrays are allocated on the managed heap and this allocation
is done by the CLR; the application code can control neither the alignment
nor the location.
However, since your functions are unmanaged code, you have to copy
the byte arrays to unmanaged memory anyway, so nothing stops you from
implementing some form of custom marshalling.

Well, I want to create a byte[] Buffer=new byte[xxx];
and pass this Buffer to an unmanaged C++ function that executes assembler
SSE instructions on this buffer without having to copy the memory block.
The SSE instruction set assumes that the memory block starts at an address
that is divisible by 16, or else it generates an exception.

Copying a memory block is no option since these are huge memory blocks. The
intention of using SSE is just to speed up calculations, but if I need to
copy the memory block the SSE would not be a solution in my case. Another
technique should then be used.
*** If copying is not an option, and pinning is not an option (it shouldn't
be), I would say managed code is not an option either.
You can't solve all problems using managed code; that's why you (think you)
needed assembly code to solve the SSE issue and that's why you will need
another memory allocator to solve this too.
In code examples I see [StructLayout(LayoutKind.???)] being used as a way
to align a structure. But this aligns the buffer pointer, not the
actual memory address of the data.
So if it works for structures, then there might also be a way to align
byte arrays.

*** StructLayout doesn't change anything on the managed heap; it's just a
directive for the marshaler to change the layout when marshaling to/from
unmanaged memory.

An alternative is that I still use unmanaged C++ to allocate aligned memory
blocks, and then pass this on to C#, but I would really prefer that C# can
do this instead.

*** This is the only option. And again, C# has nothing to do with this: C#
has no memory allocator. Allocation is a task for the CLR's heap manager,
and the CLR is the common runtime for all managed languages.
 
Hi Olaf,

I once had to deal with alignment issues; mine was that a memory block had
to be 4 KB aligned. Since 4 KB-aligned blocks are also 16-byte aligned (any
number divisible by 4096 is also divisible by 16), you can use VirtualAlloc
to achieve this (check the documentation). This is unmanaged memory, which
you still have to release manually (VirtualFree).

VirtualAlloc & VirtualFree can be found in "kernel32.dll".

Tom T.
 
*** StructLayout doesn't change anything on the managed heap; it's just a
directive for the marshaler to change the layout when marshaling to/from
unmanaged memory.

Actually it does. That's why you can use StructLayout to create a 'union'
of sorts in C#.
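
To illustrate the 'union' point, here is a small sketch. The type names FloatBits and UnionDemo are invented for the example. Both fields sit at offset 0, so they share the same four bytes. Note that this only controls field offsets relative to the start of the value; it says nothing about the absolute address at which the value ends up.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical example type: a float and a uint overlaid on the same bytes.
[StructLayout(LayoutKind.Explicit)]
public struct FloatBits
{
    [FieldOffset(0)] public float Value;
    [FieldOffset(0)] public uint Bits;   // shares the 4 bytes of Value
}

public static class UnionDemo
{
    // Returns the raw IEEE 754 bit pattern of a float via the overlay.
    public static uint BitsOf(float f)
    {
        FloatBits u = new FloatBits();
        u.Value = f;
        return u.Bits;
    }
}
```

For example, UnionDemo.BitsOf(1.0f) yields 0x3F800000, the single-precision bit pattern of 1.0.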
 
Hi,

Note that you can map these functions using P/Invoke, and call them from C#.
You don't have to use C++ for this.

Another thing is that you will get a raw pointer back from VirtualAlloc
(wrapped in an IntPtr). If you wish to fiddle with its contents, you will
have to use pointers. You can cast the raw pointer to a byte* to manipulate
the contents (in an unsafe code block).
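
A sketch of what the P/Invoke mapping might look like. It is Windows-only, and the class name PageMemory and its Allocate/Free helpers are made up for the example. The constants are the documented Win32 values for MEM_COMMIT, MEM_RESERVE, MEM_RELEASE and PAGE_READWRITE.

```csharp
using System;
using System.Runtime.InteropServices;

public static class PageMemory
{
    private const uint MEM_COMMIT = 0x1000;
    private const uint MEM_RESERVE = 0x2000;
    private const uint MEM_RELEASE = 0x8000;
    private const uint PAGE_READWRITE = 0x04;

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern IntPtr VirtualAlloc(
        IntPtr lpAddress, UIntPtr dwSize, uint flAllocationType, uint flProtect);

    [DllImport("kernel32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool VirtualFree(
        IntPtr lpAddress, UIntPtr dwSize, uint dwFreeType);

    // Reserves and commits 'size' bytes; the result starts on a page
    // boundary, so it is also 16-byte aligned.
    public static IntPtr Allocate(int size)
    {
        IntPtr p = VirtualAlloc(IntPtr.Zero, new UIntPtr((uint)size),
                                MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
        if (p == IntPtr.Zero)
        {
            throw new OutOfMemoryException(
                "VirtualAlloc failed, error " + Marshal.GetLastWin32Error());
        }
        return p;
    }

    public static void Free(IntPtr p)
    {
        // With MEM_RELEASE the size argument must be zero.
        if (!VirtualFree(p, UIntPtr.Zero, MEM_RELEASE))
        {
            throw new InvalidOperationException(
                "VirtualFree failed, error " + Marshal.GetLastWin32Error());
        }
    }
}
```

The returned IntPtr can be handed to the unmanaged function as-is, or cast with (byte*)p.ToPointer() inside an unsafe block. Since the GC knows nothing about this memory, wrapping Allocate/Free in a try/finally is the caller's job.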

HTH,
TT
 
I once had to deal with alignment issues; mine was that a memory block had
to be 4 KB aligned. Since 4 KB-aligned blocks are also 16-byte aligned (any
number divisible by 4096 is also divisible by 16), you can use VirtualAlloc
to achieve this (check the documentation). This is unmanaged memory, which
you still have to release manually (VirtualFree).

I knew about this one. I was hoping not to use WinAPI. ;-)

My plan at this moment is to create a managed C++ class that will have an
unmanaged pointer to memory that is allocated either with
VirtualAlloc/VirtualFree or with the code that I already have.
So for the SSE assembler related stuff, I could store the data there. For
all other functionality, I will create a pure Buffer class. I don't like
mixing managed/unmanaged too much.

But I thank everyone for the feedback.
 
Daniel,

Agreed, [StructLayout(LayoutKind.Explicit)] allows for a limited form of
offset control of the individual fields, but it won't let you align on
specific addresses (n*16) as the OP asked for, and it only applies to
blittable types.
For instance, this will compile and run (with alignment fix-up on x86, and
possibly fail on IA64 without the help of the CLR):

[StructLayout(LayoutKind.Explicit)]
public class MyStruct
{
    [FieldOffset(0)] public ushort us1;
    [FieldOffset(2)] public ushort us2;
    [FieldOffset(4)] public uint uint1;
    [FieldOffset(6)] public uint uint2; // notice the overlap with uint1
}

However, these will compile, but will throw a TypeLoadException.
[StructLayout(LayoutKind.Explicit)]
public class MyStruct
{
    // Non-object field that overlaps the object field 'ba' (non-blittable)
    [FieldOffset(0)] public ushort us1;
    [FieldOffset(2)] public ushort us2;
    [MarshalAs(UnmanagedType.ByValArray, SizeConst=132)]
    [FieldOffset(0)] public byte[] ba;
}

Same here:
[StructLayout(LayoutKind.Explicit)]
public class MyStruct
{
    [FieldOffset(0)] public ushort us1;
    [FieldOffset(2)] public ushort us2;
    [FieldOffset(6)] public string uint2; // Incorrectly aligned
}

Willy.
 