Unmanaged vs. managed.

Guest

Sorry to bring up a topic that is just flogging a dead horse... but...

On the topic of memory management...
I am doing some file parsing that has to be done as quickly as possible,
but what I have found hasn't been encouraging...

I found that the *quickest* way is to use a struct to retrieve fixed-length
packets out of the file, but structs get placed on the stack! In one of the
books I read, the stack is only about 1 MB, so I would rather it not go there.
(Not that there is even enough space to do what I want it to do.)
If I put all, say, 5000+ records on the heap individually, then the heap has
to work quite hard every time it does its housekeeping.

So finally, I see two "good" options remaining: allocate one block of memory
on the managed heap, or a block of memory on the unmanaged heap. From
what I have read, if I allocate a chunk on the managed heap and then want
to use pointers on it, I would definitely need to fix the allocated memory via
a fixed statement, or pin it via GCHandle. The other option is to use unmanaged memory.
I have also read that pinning hurts the process of garbage collection, but in
another book I read that big allocations typically don't get moved around.


My questions are as follows:
Q1)
Is using fixed the same thing as pinning an object? (Ignoring the fact
that the memory is released at the end of the fixed statement, while a
pinned handle is unpinned when I say it is.) Any good reasons to use pinning over
fixed?

Q2)
Is pinning all that bad for an object that "probably won't move" anyhow?

Q3)
Would using unmanaged memory be better in this case?
(Yes, I know that I would have to be careful to prevent leaks.)

Q4)
Does unmanaged memory ever move? (i.e. no fixing required?)

Q5)
Does the managed heap actually get derived from the unmanaged one?
If so, how much memory at a time? And is the managed heap
always contiguous from one end to the other?

Q6)
I heard that the generic class List<T> has some wonderful performance
gains, and is a lot easier on the heap than ArrayList. Mostly I have heard
that its performance gains are due to the lack of boxing and unboxing. Does
it allocate chunks of memory on the heap, preventing zillions of objects
floating around the managed heap too?

Q7)
"Bitmap data is allocated on the heap." Why?


I have heard so many conflicting stories about the unmanaged heap that I am
left in the dark, not knowing the "real" story.

Thank you for any and all your help on this topic.
 
Guest

When you state "but what I have found hasn't been encouraging" -- do you mean
you've created a solution and tested it, and you are unhappy with the
results, or that you've been doing some reading and you aren't happy with
what you've read?

If the latter, you might try doing what you need to do in plain old managed
code first; you may find the results are just fine.

-- Peter
Recursion: see Recursion
site: http://www.eggheadcafe.com
unBlog: http://petesbloggerama.blogspot.com
BlogMetaFinder: http://www.blogmetafinder.com
 
Jon Skeet [C# MVP]

TheMadHatter said:
Sorry to bring up a topic that is just flogging a dead horse... but...

On the topic of memory management...
I am doing some file parsing that has to be done as quickly as possible,
but what I have found hasn't been encouraging...

I found that the *quickest* way is to use a struct to retrieve fixed-length
packets out of the file, but structs get placed on the stack! In one of the
books I read, the stack is only about 1 MB, so I would rather it not go there.
(Not that there is even enough space to do what I want it to do.)
If I put all, say, 5000+ records on the heap individually, then the heap has
to work quite hard every time it does its housekeeping.

5000 records is hardly a lot. Before you go down micro-optimisation
routes, have you actually written the code in the simplest way and
found that it's definitely too slow? I would expect the IO to be by
*far* the slowest part, easily dwarfing GC times.
 
Chris Mullins [MVP - C#]

TheMadHatter said:
I am doing some file parsing that has to be done as quickly as possible,
but what I have found hasn't been encouraging...

I found that the *quickest* way is to use a struct to retrieve fixed-length
packets out of the file,

You're guilty of premature optimization. The time spent parsing and doing
memory management is going to be a tiny fraction of the time spent doing
disk seeks and reads.

This should be a 5 or 10 line program, and 99.99% of the time is going to be
disk I/O.

If you want to make things needlessly complex just for the fun of it, use
Async I/O (BeginRead, etc). That at least has a measure of legitimacy behind
it in terms of improving application performance. It'll also be more
educational in the long run than the other things you're thinking about...
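
For reference, the basic shape of that pattern looks something like this (the
file name and buffer size are just placeholders, not a recommendation):

using System;
using System.IO;

class AsyncReadSketch
{
    const int BufferSize = 1 << 20;   // 1 MB chunk; an arbitrary choice

    static void Main()
    {
        // FileOptions.Asynchronous asks the OS for overlapped I/O.
        using (FileStream fs = new FileStream("data.bin", FileMode.Open,
                   FileAccess.Read, FileShare.Read, BufferSize,
                   FileOptions.Asynchronous))
        {
            byte[] buffer = new byte[BufferSize];
            IAsyncResult ar = fs.BeginRead(buffer, 0, buffer.Length, null, null);

            // ... do other useful work while the read is in flight ...

            int bytesRead = fs.EndRead(ar);   // blocks until the read completes
            Console.WriteLine("Read {0} bytes", bytesRead);
        }
    }
}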
 
Guest

Err... perhaps those weren't the best choice of words...
Maybe "confusing" would have been a better word.

Garbage collection does have its place, and I am extremely happy with it.

The goal is micro-optimization though. Achieving the best of the best.

And yeah... the 5000-record example that I mentioned does
seem small, but it was only an example.

Actually, one part of the project requires parsing a few GB of data
from files, and yes, I know that the hard drive tends to be the bottleneck,
which it is, and I have implemented every trick I could find to speed
that up too. (RAID, native calls to the platform, 1 MB chunk reads at a time.)
I couldn't imagine tormenting the managed heap with that kind of
throughput. I would rather use structure pointers to a chunk of fixed memory.
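
To be concrete, this is roughly what I have in mind (the record layout is made
up, and it needs to be compiled with /unsafe):

using System;
using System.IO;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct Record                 // hypothetical 16-byte fixed-length packet
{
    public int Id;
    public int Flags;
    public long Timestamp;
}

class ParseSketch
{
    unsafe static void Main()
    {
        byte[] buffer = File.ReadAllBytes("packets.bin");  // one big managed block

        fixed (byte* p = buffer)                            // pinned while we use pointers
        {
            Record* rec = (Record*)p;
            int count = buffer.Length / sizeof(Record);
            for (int i = 0; i < count; i++)
            {
                // Read fields in place; no per-record heap allocation.
                Console.WriteLine(rec[i].Id);
            }
        }
    }
}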


I am still seeking answers to my questions, but...
Thank you for your response, though. :)
 
Chris Mullins [MVP - C#]

Honestly, if you're really worried about it, implement your stuff in standard
managed code using Async I/O. This will probably be very close to ideal
throughput based on your disk array.

Where should your optimizations end? At some point, you're going to start
saying:

- There are too many thread context switches. Must eliminate threads. Oh,
wait. I've got a dual proc box. I need 2 threads. No, it's a quad proc. Need
4 threads. With Affinity. And with a Realtime priority.

- Transitions from User Mode to Kernel Mode are very expensive. Write your
own Kernel Mode Driver.

- Not hitting the disk very efficiently. Need to reverse engineer the
scatter/gather algorithm in use in the RAID card. Need to make 100% sure
command queuing is involved.

- Hrm. Got 4 threads, but only two spindles. Need fewer threads!

- Hrm. Got 4 spindles but only 2 threads. Need more threads!

- Could enable disk mirroring, and read from multiple spindles at a time.

Ironically, each of these is probably a bigger "real" issue than the GC heap
thrash. (Don't do any of these... I'm well off into flights of fancy...)
 
Guest

Actually I did that too. :)

The parsing is actually only one part of the process
that is chewing up my CPU.

Peter told me something similar too.
And I am well aware of the hard drive issue.
The response I gave him applies here
too.

I am still seeking answers to my questions, but...
Thank you for your response, though. :)
 
Guest

Peter told me something similar too.
And I am well aware of the hard drive issue.
The response I gave him applies here
too.

I am still seeking answers to my questions, but...
Thank you for your response, though. :)
 
Jon Skeet [C# MVP]

TheMadHatter said:
Err... perhaps those weren't the best choice of words...
Maybe "confusing" would have been a better word.

Garbage collection does have its place, and I am extremely happy with it.

The goal is micro-optimization though. Achieving the best of the best.

Do you have a concrete goal in mind?

What's your current performance, and what do you need to achieve?

"The best of the best" is always expensive to achieve and usually hard
to maintain. Do you really need to make your code hard to understand
and maintain for the sake of a tiny performance boost?

Have you profiled the app to see how much of the processor time is
currently stuck in GC?
 
Guest

Actually, what I am trying to do is understand what is
under the hood, to figure out common pitfalls of
programming. For example, boxing and unboxing.
No newbie would understand why their app doesn't
perform as fast as they wanted it to without understanding
at least those "elementary" pitfalls.
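
A trivial example of the kind of pitfall I mean (the count is arbitrary):

using System.Collections;
using System.Collections.Generic;

class BoxingSketch
{
    static void Main()
    {
        ArrayList boxed = new ArrayList();
        for (int i = 0; i < 1000000; i++)
            boxed.Add(i);        // every int is boxed: one extra heap object per element

        List<int> unboxed = new List<int>();
        for (int i = 0; i < 1000000; i++)
            unboxed.Add(i);      // stored directly in the list's internal int[], no boxing
    }
}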

As they say, "processing power is cheap", but one shouldn't
use it carelessly. In more constrained applications, optimization
is nothing to laugh at. (Yes, of course, always in the right places.)

Once again, thanks for the response. :)
I don't suppose you have the answers to any of my original Qs?

Honestly, if you're really worried about it, implement your stuff in standard
managed code using Async I/O. This will probably be very close to ideal
throughput based on your disk array.

Already did that.
Where should your optimizations end? At some point, you're going to start
saying:
.....

Already down that path of madness. LOL
 
Guest

Chris is of the same mind on the topic as you,
so the answer I gave him applies here too.

Once again, thank you very much for your thoughts
on the matter. :)
My Qs remain unanswered. Do you have one or two
of the answers?
 
Chris Mullins [MVP - C#]

TheMadHatter said:
I don't suppose you have the answers to any of my original Qs?

To be honest, all of us responding know the answers to your questions.
Optimized I/O, Threading, GC, Pinning, Heap vs Stack, etc, are what I do all
day long. I suspect the others are in a similar situation. Peter, Jon, and I
all know this stuff pretty darn well.

I'm just avoiding answering the questions, due to a feeling that the answers
are going to lead you into more trouble, rather than to the solution you're
actually looking for. In the years I've been doing this, that seems to
happen... a lot.
 
Chris Mullins [MVP - C#]

TheMadHatter said:
My questions are as follows:
Q1)
Is using fixed the same thing as pinning an object? (Ignoring the fact
that the memory is released at the end of the fixed statement, while a
pinned handle is unpinned when I say it is.) Any good reasons to use pinning
over fixed?

I use pinning.

I do this as fixed requires unsafe code (if I remember right), and I
generally try to steer clear of that. I pin things if needed for Interop.
Even then, I almost never do it manually, and let it all happen
automatically.

There are other ways to pin (this is a good blog article. Read it!):
http://codebetter.com/blogs/gregyoung/archive/2007/08/09/pop-quiz-pinning.aspx
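
As a rough sketch of the two mechanisms side by side (the buffer and names are
made up; the fixed version needs /unsafe):

using System;
using System.Runtime.InteropServices;

class PinningSketch
{
    static void Main()
    {
        byte[] buffer = new byte[1024];

        // 1) fixed: scope-bound pinning, requires an unsafe context.
        unsafe
        {
            fixed (byte* p = buffer)
            {
                p[0] = 0x42;           // pointer is only valid inside this block
            }                          // buffer is automatically unpinned here
        }

        // 2) GCHandle: explicit pinning, no unsafe code, unpinned when *you* say so.
        GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            IntPtr address = handle.AddrOfPinnedObject();
            Marshal.WriteByte(address, 0, 0x42);
        }
        finally
        {
            handle.Free();             // forget this and the buffer stays pinned forever
        }
    }
}
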
Q2)
Is pinning all that bad for an object that "probably won't move" anyhow?

Yeah. All objects move. This leads to heap fragmentation.

There is, that I know of, very little support for "probably won't move". I
suppose you could argue for the LOH, but that's awfully implementation
specific.
Q3)
Would using unmanaged memory be better in this case?
(yes I know that I would have to be careful to prevent leaks.)

No. It's a pain in the ass. It's also slower in many cases. Allocations from
the managed heap are faster than those from an unmanaged heap. Really.

Allocate a big array and keep it around. Parse it up using ArraySegment.
This will prevent fragmentation, and your array will end up either in the
LOH or in Gen2. In either case, keep it alive for the lifetime of your app,
and GC won't try to collect it (at least not very often).

The biggest thing is the buffers you pass to the Async I/O method need to
come from here. These buffers get pinned while the operation is outstanding,
and this pinning will cause fragmentation. You want to make sure the pinned
buffers are all next to each other, preferably all from the same segment.
This means allocate them all at the same time, and keep them around for a
long time....
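
A rough sketch of that idea (the buffer and record sizes are placeholders, and
records that straddle a chunk boundary are ignored here):

using System;
using System.IO;

class BufferReuseSketch
{
    const int BufferSize = 1 << 20;   // one big, long-lived buffer (ends up in LOH / Gen2)
    const int RecordSize = 128;       // hypothetical fixed-length record

    static readonly byte[] ReadBuffer = new byte[BufferSize];

    static void Main()
    {
        using (FileStream fs = File.OpenRead("packets.bin"))
        {
            int bytesRead;
            while ((bytesRead = fs.Read(ReadBuffer, 0, ReadBuffer.Length)) > 0)
            {
                // Hand out views over the one buffer instead of allocating per record.
                for (int offset = 0; offset + RecordSize <= bytesRead; offset += RecordSize)
                {
                    ArraySegment<byte> segment = new ArraySegment<byte>(ReadBuffer, offset, RecordSize);
                    Parse(segment);
                }
            }
        }
    }

    static void Parse(ArraySegment<byte> record)
    {
        // e.g. read a field out of the record without copying it anywhere
        int id = BitConverter.ToInt32(record.Array, record.Offset);
    }
}
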
Q4)
Does unmanaged memory ever move? (i.e. no fixing required?)

EVER? Not sure. From a practical perspective, no. No fixing required. At
least not that I can think of.
Q5)
Does the managed heap actually get derived from the unmanaged one?

Yes. It's allocated out of the heap that is part of the process. There are
many heaps in .Net, and sometimes there are many managed heaps as well
(multi-core, server GC algorithm).
If so, how much memory at a time? And is the managed heap
always contiguous from one end to the other?

It's very complicated.

Depends on the number of processors, the CLR version (workstation / server),
the Windows version (x64, IA64, x86), processor cache sizes, number of
threads running, etc. If I remember right, it grows in increments of the
segment size, which ends up usually about 64k. Depending on configuration
though, you have multiple garbage collected heaps (1 per processor core),
and in each heap Gen0 is carved up by the thread count so that allocations
don't require exclusive access to the gen0 heap. It's also a dynamic value,
not a static one.
Q6)
I heard that the generic class List<T> has some wonderful performance
gains, and is a lot easier on the heap than ArrayList. Mostly I have heard
that its performance gains are due to the lack of boxing and unboxing. Does
it allocate chunks of memory on the heap, preventing zillions of objects
floating around the managed heap too?

List<T> is better than ArrayList in every respect.

Both ArrayList and List are based on arrays. Understanding the algorithm
behind how these arrays are sized is important if you care about the
performance.

If you're growing array sizes in a way that's unexpected to you, and done at
large scales, you'll have unexpected (and poor!) results.
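
For example (a sketch; the count is arbitrary), presizing avoids the repeated
grow-and-copy cycle:

using System.Collections.Generic;

class ListGrowthSketch
{
    static void Main()
    {
        const int count = 5000;

        // Default: the internal array starts small and doubles as it fills,
        // so large lists go through several allocate-and-copy steps.
        List<int> grown = new List<int>();
        for (int i = 0; i < count; i++)
            grown.Add(i);

        // If the size is known up front, ask for it once.
        List<int> presized = new List<int>(count);
        for (int i = 0; i < count; i++)
            presized.Add(i);
    }
}
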
Q7)
"Bitmap data is allocated on the heap." Why?

'cause it's huge? Where else would it go? It ain't gonna fit on the stack...
 
Chris Mullins [MVP - C#]

TheMadHatter said:
Q1)
Is using fixed the same thing as pinning an object? (Ignoring the fact
that the memory is released at the end of the fixed statement, while a
pinned handle is unpinned when I say it is.) Any good reasons to use pinning
over fixed?

I answered this incorrectly earlier. It turns out there are some big
differences between GCAllocated Pinned variables and fixed variables.

Details in Greg Young's blog at:
http://codebetter.com/blogs/gregyoung/archive/2007/08/16/pop-quiz-pins-answers.aspx

Greg goes into a fair bit of detail on the different types of pinning.
 
Guest

*Sigh*
Don't suppose you guys know any really good books on these matters?
I already have enough books to start a book store, but I usually
find that a book is good for one chapter, and it always leads to more
questions about the "middleware" in between.

Currently my knowledge is fairly okay in assembly language, and fairly
good in higher-level languages, but I am having issues connecting the dots
between the two. Half of them revolve around operating system "magic"
and the other half is what is buried in the higher-level languages.
I don't think disassembling code for answers is exactly the fastest
approach, nor is using the IL disassembler.

Perhaps I won't get any answers today, but I am hopeful that I may
be led to coding enlightenment and the reasons why.

Thanks anyhow. :(
 
Guest

Er... actually I meant the following: the Bitmap data is allocated on the
unmanaged heap rather than the managed one. Why?
 
Cor Ligthert[MVP]

Peter,
When you state "but what I have found hasn't been encouraging" -- do you mean
you've created a solution and tested it, and you are unhappy with the
results, or that you've been doing some reading and you aren't happy with
what you've read?

If the latter, you might try doing what you need to do in plain old managed
code first; you may find the results are just fine.

Why are you writing this? I like to answer questions like this, and then you
write it exactly (even as short, though probably in other words) as I would
have done.

:)

Cor
 
Jon Skeet [C# MVP]

Actually, what I am trying to do is understand what is
under the hood, to figure out common pitfalls of
programming. For example, boxing and unboxing.
No newbie would understand why their app doesn't
perform as fast as they wanted it to without understanding
at least those "elementary" pitfalls.

For me, the elementary pitfall here is premature optimization. I
can't remember *ever* running into an issue where the garbage
collector was the bottleneck. I'm not saying it doesn't happen - but
premature optimization happens *much* more often, and causes much more
pain. I've seen plenty of bugs introduced because a developer -
sometimes myself - has tried to squeeze an extra bit of unnecessary
speed out of some code which was already fast enough.
As they say, "processing power is cheap", but one shouldn't
use it carelessly. In more constrained applications, optimization
is nothing to laugh at.

Not when used appropriately. Optimizing at the cost of simple code for
the sake of a very small gain is usually the wrong thing to do even in
constrained environments though.
(yes of course, always in the right places.)

And is this particular case definitely "in the right place"? Do you
have profile data suggesting that you're actually "tormenting the
managed heap"? If garbage collection is a significant factor,
profiling should find that really quickly.

Do that *way* before you start thinking about "reserving" large chunks
of memory and then handling it all yourself.

Jon
 
Willy Denoyette [MVP]

Because Bitmaps are allocated by unmanaged code (GDI+), and unmanaged code
cannot allocate from the GC heap.
It's important to make a clear distinction between regular *process heaps*
and the *GC heap(s)*.
The regular heaps, like the "default process" heap and the "CRT" heap, all
use heap management APIs (like HeapAlloc, HeapFree, etc.) from the
*Heap Manager* (NTDLL), which in turn calls into the Virtual Memory
Manager (VMM) in order to allocate/deallocate memory from the process's
Virtual Address Space (VAS).
The GC heap is a "private" memory allocator, which bypasses the *Heap
Manager* to allocate directly from the VMM.
A managed application can allocate from both the "regular" or "unmanaged"
heaps (for instance via Marshal.AllocHGlobal) and from the "managed" (GC)
heap, while unmanaged code can allocate from all but the GC heap.
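
For what it's worth, a minimal sketch of a managed app allocating from the
unmanaged heap (the size is arbitrary):

using System;
using System.Runtime.InteropServices;

class UnmanagedAllocSketch
{
    static void Main()
    {
        // Allocated from the process (unmanaged) heap, not the GC heap;
        // the GC never moves it and never frees it for you.
        IntPtr block = Marshal.AllocHGlobal(4096);
        try
        {
            Marshal.WriteInt32(block, 0, 12345);          // write at offset 0
            int value = Marshal.ReadInt32(block, 0);      // read it back
            Console.WriteLine(value);
        }
        finally
        {
            Marshal.FreeHGlobal(block);                   // forget this and you leak
        }
    }
}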


Willy.
 
