stackalloc vs. new byte?


Bob Eaton

I need to allocate a buffer to be used with some non-managed code to do text
processing. Until now, I'd been using "stackalloc" in order to avoid
fragmenting the memory too much as a result of using "new", but recently,
someone gave me a very long string to process and the following code threw
the System.StackOverflowException:

int nBufSize = sInput.Length * 10;
byte* lpInBuffer = stackalloc byte [ nBufSize ];

I tried to put try...catch around this, but it can't catch the exception
(presumably because the stack is already clobbered by that point).

Anyway, in order to accommodate a large buffer situation, I'd like to do
something like say, "if the requested size is larger than the amount of
space on the stack, then use 'new' instead."

First of all, is "the amount of space on the stack" even determinable at
run-time? If not, I don't mind hard-coding something like, 100000 (or some
such value), but I can't get it to compile...

I've tried to change it to something like this:

const int cnMaxStackSize = 100000;
byte* lpInBuffer = (nBufSize > cnMaxStackSize) ? stackalloc
byte[nBufSize] : new byte[nBufSize];

But that doesn't work because "stackalloc is only valid in local variable
initializers."

Is there anything I can do besides simply using 'new' instead?

Thanks,

Bob
 
Bob said:
I need to allocate a buffer to be used with some non-managed code to do text
processing. Until now, I'd been using "stackalloc" in order to avoid
fragmenting the memory too much as a result of using "new", but recently,
someone gave me a very long string to process and the following code threw
the System.StackOverflowException:

int nBufSize = sInput.Length * 10;
byte* lpInBuffer = stackalloc byte [ nBufSize ];

I tried to put try...catch around this, but it can't catch the exception
(presumably because the stack is already clobbered by that point).

I haven't tried what you're talking about, but I suspect the reason the
exception can't be caught is that the "stackalloc" keyword isn't so much
a statement that gets executed, but rather an indication as to how the
method should be compiled.

In other words, the exception occurs before the stack frame that
contains your try/catch has been constructed.

It seems to me that you could catch the exception by putting the
try/catch in the method that calls the method where the "stackalloc" use is.

Now, that said...I am not sure that your motivation here is
well-founded. I admit to not having specific C# knowledge about this
(specifically, I don't know where it puts the stack, but probably it's
the same as any other Win32 process, right?), but it seems to me that
you are as likely to run into memory fragmentation issues allocating on
the stack as from the heap.

Your stack can't be moved around to accommodate a need for more memory,
whereas when you allocate a new object on the heap, if there's a
contiguous space anywhere in the heap large enough, the allocation can
succeed.

So, you could have enough space in your virtual address space to
accommodate the allocation, but if your stack is running into some other
allocation below it, there's nothing that can be done. Your allocation
will fail, in spite of there being a large enough block of virtual
address space somewhere else.

In its own way, allocating data on the stack is potentially hazardous,
and IMHO simply shifts the potential problem from one possible scenario
to another.

I can't rule out that you would have problems with memory fragmentation,
especially if you are allocating a really HUGE number of buffers over and
over and they are ALL different sizes. But I don't see how allocating
on the stack is an improvement. It just changes the way you could fail.

If fragmentation does prove to be a problem in practice, I'd suggest
trying to make a determination as to a size that is large enough to
contain 95% of the sizes passed in, and then just allocate a persistent
buffer that large. Then, at the very least, you shouldn't have to worry
about that method itself fragmenting the address space.
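
The persistent-buffer idea above could look roughly like the following. This is only a sketch: the class name PersistentBuffer and its Acquire method are made up for illustration, and the "typical" size would have to come from measuring real inputs.

```csharp
using System;

// Hypothetical helper: keeps one long-lived buffer sized for the common case
// and falls back to a fresh allocation only for unusually large requests.
public sealed class PersistentBuffer
{
    private readonly byte[] shared;

    public PersistentBuffer(int typicalSize)
    {
        shared = new byte[typicalSize];
    }

    // Returns the shared buffer when it is large enough; otherwise a new,
    // temporary array. Not thread-safe: the shared buffer must not be used
    // by two callers at the same time.
    public byte[] Acquire(int requiredSize)
    {
        if (requiredSize < 0)
            throw new ArgumentOutOfRangeException(nameof(requiredSize));
        return requiredSize <= shared.Length ? shared : new byte[requiredSize];
    }
}
```

The returned array may be longer than requested, so the caller has to track the logical length itself, and it would still need to be pinned (for example with a fixed statement) before being handed to non-managed code.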

In the end though, if you are running into fragmentation _and_ you have
very large blocks of memory you want to allocate, you're always going to
be limited somehow. If you're allocating enough virtual address space
that the stack is being blocked, then you have some more basic memory
requirement issues that should probably be addressed in a more
fundamental way than just having your stack-allocating method fail more
gracefully. :)

Pete
 
Peter Duniho said:
Bob said:
I need to allocate a buffer to be used with some non-managed code to do text
processing. Until now, I'd been using "stackalloc" in order to avoid
fragmenting the memory too much as a result of using "new", but recently,
someone gave me a very long string to process and the following code threw
the System.StackOverflowException:

int nBufSize = sInput.Length * 10;
byte* lpInBuffer = stackalloc byte [ nBufSize ];

I tried to put try...catch around this, but it can't catch the exception
(presumably because the stack is already clobbered by that point).

I haven't tried what you're talking about, but I suspect the reason the
exception can't be caught is that the "stackalloc" keyword isn't so much a
statement that gets executed, but rather an indication as to how the method
should be compiled.

In other words, the exception occurs before the stack frame that contains
your try/catch has been constructed.

I'll give that a try.

Now, that said...I am not sure that your motivation here is well-founded.
I admit to not having specific C# knowledge about this (specifically, I
don't know where it puts the stack, but probably it's the same as any
other Win32 process, right?), but it seems to me that you are as likely to
run into memory fragmentation issues allocating on the stack as from the
heap.

I'm pretty sure that when you allocate memory "on the stack" it doesn't use
the 'stack space' as a source of allocation units (cf. the dynamic memory
pool used by malloc back in the c-runtime world or "new" in the C++/C#
world), but rather it just uses the stack itself for a static chunk of
memory... There's no 'freeing' or garbage collection of stackalloc
memory--or the potential for fragmentation--but rather you just pop the
stack when you return to the calling routine and you're back to where you
started from... just like when you call sub-routines, but rather than
pushing only the registers and return address of the caller, you also set
aside the stackalloc memory as well. And again, when you return to the
caller, that stack memory just gets unwound automatically.

Frankly, it's a clever way to avoid fragmentation... if your needs are
"modest" :-)

Thanks, though, for the other idea of putting the try...catch in the calling
routine... I'll give that a try.

Bob
 
Bob said:
[...]
Frankly, it's a clever way to avoid fragmentation... if your needs are
"modest" :-)

No disagreement there, especially with emphasis on "modest". My point
is that your entire process shares a single virtual address space. Even
though using the stack prevents fragmentation, it doesn't necessarily
get around other memory space limitations.

In particular, either the stack has a fixed limit in size, in which case
putting large things on it isn't a good idea, or the stack has a dynamic
limit in size, affected by what else is in the virtual address space in
your process. In this latter case, allocations in the heap and
especially fragmentation in the heap, can still easily affect how large
your stack can get.

Let's take an imaginary situation, in which the heap grows from the 0
offset of the process's available virtual address space and the stack
grows from the very end of that virtual address space.

Let's also assume that you have successfully fragmented the heap such
that no single contiguous block of address space large enough for the
desired allocation is available. In particular, this means that between
the last block used by the heap and the first block used by the stack,
there is not enough room for the desired allocation.

In this case, it doesn't matter where you try to allocate the block, it
won't succeed. The stack can't grow down far enough, and there's not
enough room left in the heap.

Now, let's assume that having failed the allocation, you have the
ability to go through and do some cleanup (for example, perhaps the
fragmenting blocks in the heap are eligible for garbage collection).

Let's also assume that within the heap, so many of the fragmenting
blocks can be freed that enough room can be made for the desired allocation.

Let's also assume that last fragmenting block, nearest the stack, could
not be freed.

In this case, a heap allocation will succeed while a stack allocation
will not.

Now, I realize that the example is over-simplistic, and I realize it may
not reflect the exact nature of memory layout for a .NET program. But
the underlying mechanisms are still valid and I think the example still
applies. The stack has a very restrictive nature, and it can fail in
fragile ways, whereas the heap is more dynamic and has the ability to
recover to some extent from fragmentation.

The stack, even though it itself doesn't wind up fragmented per se, is
still susceptible to fragmentation, so unless the only thing that's
likely to cause fragmentation is the thing you're allocating on the
stack, using the stack for that one thing doesn't necessarily buy you
much. Even worse, all it takes is a single poorly-placed heap
allocation for fragmentation to break stack allocations, whereas you
need to fragment a large portion of the entire virtual address space to
break heap allocations.

I will also point out another little gotcha: a very large object on the
stack incurs a significant performance hit, because of the way that
detection of stack problems on Windows works. In particular, Windows
has a stack observer that considers a non-contiguous access of an
element on the stack to be a problem. If you allocate something very
large on the stack, the potential exists for a reference to skip a page
of memory. Windows deals with this by touching all of the new pages
allocated in the address space to ensure that they've been committed and
don't cause the error-detection to be set off.

In your case this should be less of a problem, because one hopes you are
not creating such a large thing on the stack on every call. But it's
still something to consider.

Anyway, I admit there's some hand-waving there, but my experience has
been that allocating things on the stack is best left for relatively
small local variables. If you want to allow allocation of large things,
you need to pick a maximum size you'll handle, and always allocate
something that large; otherwise, your code may break well after you've
deployed it, at a time when you can't easily fix it.

Pete
 
Bob Eaton said:
[...]
Is there anything I can do besides simply using 'new' instead?

What makes you think that "new" fragments the heap? It's the task of the GC
to compact (that is, defragment) the heap, and it's really good at it as
long as your objects are smaller than about 85 KB (larger objects go to the
large object heap, which is not compacted by default).

Anyway, "stackalloc" allocates from the current thread's stack, which is (by
default) 1 MB in size on 32-bit Windows. On 64-bit Windows it depends on how
the program was compiled (platform:X64 or MSIL): MSIL and X86 reserve 1 MB,
while X64 reserves 4 MB of stack. The stack can never grow above this limit.
The OS raises an SE (structured exception) whenever you try to use the
last page of the stack, and at that point it's impossible to handle the
managed StackOverflowException because there is no stack space left to run
a handler.
So, before you start allocating large chunks of stack space you need to
inspect the amount of free stack space. The only way to do this is by calling
Win32's VirtualQuery API.

Following is a function that does exactly this:

using System;
using System.Runtime.InteropServices;
.....
// Returns true if the current thread's stack can hold 'bytes' more bytes
// while still leaving STACK_RESERVED_SPACE free.
public unsafe static bool CheckForStackSpace(long bytes) {
    int sizeOfPointer = sizeof(IntPtr);
    MEMORY_BASIC_INFORMATION stackInfo = new MEMORY_BASIC_INFORMATION();
    IntPtr currentAddr;
    // VirtualQuery rounds up to the next page, the stack grows down so we
    // must adjust. Assumes 4KB pages, not valid on IA64.
    if (sizeOfPointer == 4) // 32 bit code?
        currentAddr = new IntPtr((int)&stackInfo - 4096);
    else
        currentAddr = new IntPtr((long)&stackInfo - 4096);

    VirtualQuery(currentAddr, ref stackInfo,
        new UIntPtr((uint)sizeof(MEMORY_BASIC_INFORMATION)));

    // AllocationBase is the lowest address of the stack reservation.
    return (currentAddr.ToInt64() - stackInfo.AllocationBase.ToInt64()) >
        (bytes + STACK_RESERVED_SPACE);
}

// keep a minimum of 16 pages to handle overflow exceptions
private const long STACK_RESERVED_SPACE = 4096 * 16;

// The return value and dwLength are SIZE_T in the Win32 declaration,
// so they are pointer-sized here rather than int.
[DllImport("kernel32")]
private static extern UIntPtr VirtualQuery (
    IntPtr lpAddress,
    ref MEMORY_BASIC_INFORMATION lpBuffer,
    UIntPtr dwLength);

private struct MEMORY_BASIC_INFORMATION {
    internal IntPtr BaseAddress;
    internal IntPtr AllocationBase;
    internal uint AllocationProtect;
    internal IntPtr RegionSize;
    internal uint State;
    internal uint Protect;
    internal uint Type;
}
}
}
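
As for the "use the stack when it fits, otherwise use 'new'" fallback asked about in the original post: the compiler error quoted there comes from an older compiler. Assuming a C# 7.2 or later compiler and a runtime that provides Span<T> (both assumptions, not something this thread relies on), the choice can be written as a single conditional expression. The names BufferDemo, MaxStackBytes and FillAndSum are invented for this sketch, and the 1024-byte cut-off is an arbitrary conservative value, not a recommendation from the thread.

```csharp
using System;

public static class BufferDemo
{
    // Assumed cut-off for stack allocation; chosen small on purpose.
    private const int MaxStackBytes = 1024;

    // Fills a scratch buffer of the requested size with 1s and returns the
    // sum of its bytes. Small buffers live on the stack, large ones on the heap.
    public static int FillAndSum(int size)
    {
        // stackalloc is allowed inside a conditional expression when the
        // result is assigned to a Span<T> (C# 7.2 and later).
        Span<byte> buffer = size <= MaxStackBytes
            ? stackalloc byte[size]
            : new byte[size];

        for (int i = 0; i < buffer.Length; i++)
            buffer[i] = 1;

        int sum = 0;
        foreach (byte b in buffer)
            sum += b;
        return sum;
    }
}
```

A fixed threshold like this could in principle be replaced by a run-time check such as CheckForStackSpace above. To pass the buffer to non-managed code, the span would still have to be pinned with a fixed statement in an unsafe context.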
 
Now, that said...I am not sure that your motivation here is
well-founded. I admit to not having specific C# knowledge about this
(specifically, I don't know where it puts the stack, but probably it's
the same as any other Win32 process, right?), but it seems to me that
you are as likely to run into memory fragmentation issues allocating on
the stack as from the heap.

By its very nature, there can be no fragmentation on the stack.
The problem of the OP is that the stack has a fixed size that cannot
be changed at runtime (it's 1MB by default).

Kristof Verbiest
 
I've never heard of such a dynamic-sized stack in Windows. Do you have
any documentation on this that you could share? How could this be
configured?

There isn't. The stack space is reserved from virtual memory space when the
thread is created and never changes.

The actual physical memory corresponding to that address space is committed
and decommitted as needed, but the stack addresses are predetermined and do
not suffer from fragmentation.


However, the .NET heap uses a compacting garbage collector so it does not
get increasingly fragmented over time (fragmentation causes compaction).
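
Since the reservation is fixed when the thread is created, one way to get a bigger stack is to ask for it at that moment. The Thread constructor has an overload that takes a maximum stack size in bytes. The sketch below assumes a modern C# compiler (lambdas, Span<T>); BigStackDemo, RunWithStack and UseLargeStackBuffer are hypothetical names, and the 16 MB and 2 MB figures are only illustrative.

```csharp
using System;
using System.Threading;

public static class BigStackDemo
{
    // Runs 'work' on a dedicated thread whose stack reservation is
    // 'stackBytes' bytes, waits for it to finish, and returns its result.
    // Exceptions thrown by 'work' are not forwarded to the caller here.
    public static int RunWithStack(Func<int> work, int stackBytes)
    {
        int result = 0;
        Thread worker = new Thread(() => { result = work(); }, stackBytes);
        worker.Start();
        worker.Join();
        return result;
    }

    // Example workload: a 2 MB stack buffer, which is more than a default
    // 1 MB stack can hold.
    public static int UseLargeStackBuffer()
    {
        Span<byte> scratch = stackalloc byte[2 * 1024 * 1024];
        scratch[scratch.Length - 1] = 42;
        return scratch[scratch.Length - 1];
    }
}
```

A call such as BigStackDemo.RunWithStack(BigStackDemo.UseLargeStackBuffer, 16 * 1024 * 1024) runs the workload with a 16 MB reservation. The limit is still fixed for the lifetime of that thread, so this only moves the ceiling; it does not make the stack dynamic.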
 
I've never heard of such a dynamic-sized stack in Windows. Do you have
any documentation on this that you could share? How could this be
configured?

I never said that's what happens in Windows. I don't have the specifics
off the top of my head, and so was simply discussing the two
possibilities (static and dynamic).

You just happened to cut out the part of my post where I discussed a
static limit for the stack, which as Ben points out is how Windows
implements the stack.

My real point is that putting really big things on the stack can get you
into trouble just as fast as putting them in the heap, and maybe faster.

Pete
 