Memory Limit for Visual Studio 2005???


Willy Denoyette [MVP]

Peter Olcott said:
Here are the final results:
Visual C++ 6.0 native code allocated a std::vector 50% larger than the largest
Generic.List that the .NET runtime could handle, and took about 11-fold (1100%) longer to
do this. This would tend to indicate extensive use of virtual memory, especially when this
next benchmark is considered.

Generic.List was only 65% faster than native code std::vector when the amount of memory
allocated was about 1/2 of total system memory. So it looks like the .NET run-time
achieves better performance at the expense of not using the virtual memory system.


.NET is not some kind of alien; it's just a thin layer on top of Win32. It uses the same system services as native code compiled with whatever compiler you can use on Windows. The CLR and GC are allocating memory from the heap (that is, from virtual memory) through the same calls as the C runtime library, and do you know why? Because the CLR uses the same C runtime, and there is no other way to allocate memory in Windows.
As we told you before, the process heap is fragmented from the start of the program; the way it's fragmented is determined by the modules loaded into the process space, so there might be a difference between different types of applications. Native C++ console applications don't have to load the CLR runtime and some of the FCL libraries, which means the heap is less fragmented than is the case with a C# console program, but a real-world C++ program also needs to load libraries, and these will fragment the heap just like in the case of .NET.

As I said (and others too), each time the List (or vector) overflows it must be extended; please refer to my previous post to know exactly what this is all about. To prevent this you have to pre-allocate the List or vector.

Running the following code won't throw an OOM when run on 32-bit Windows XP.

using System.Collections.Generic;

static void Main()
{
    List<byte> bList = new List<byte>(1600000000); // 1,600,000,000 bytes
    for (int i = 0; i < bList.Capacity; i++)
        bList.Add(12);
}

while this will throw...

bList = new List<byte>();
for (int i = 0; i < 600000000; i++) // 600,000,000 bytes
    bList.Add(12);

but 512,000,000 bytes will work...


Now back to C++; this will throw.

#include <vector>
#include <iostream>

int main()
{
    std::vector<unsigned char> *bList = new std::vector<unsigned char>;
    try {
        for (int i = 0; i < 700000000; i++)  // 700,000,000 bytes
            bList->push_back(12);
    }
    catch (const std::bad_alloc &e) {
        std::cout << "Exception raised: " << e.what() << '\n';
    }
}

while 640,000,000 bytes may work.

But what's the difference, 100MB? The point is that you can't allocate the full 2GB, and you need to pre-allocate whenever you are allocating such huge objects (> 512MB) on 32-bit systems; native code or managed code, it doesn't matter.
And don't let me get started about the performance implications of not doing so!

Willy.
 

Chris Mullins

Peter Olcott said:
There is no fragmentation, all of the memory allocation in both programs
is in an identical tight loop. Since the native code can allocate 50% more
memory than the .NET code, and the .NET code is essentially identical to
the native code, the difference must be in the .NET run-time.

That doesn't prove anything, as Willy has pointed out again and again.
Native code can't address the whole 4GB address space? In any case 2 GB
should be plenty until the 64-bit memory architecture becomes the norm.

No, it can't.

It gets the same 4GB address space as every other win32 application, and
it's limited to the lower half of that, just like every other Win32
application. This means, at most, you have 2GB to play with. For all Win32
apps.

I agree with whoever said that. Oh, wait. That was me! :)

Even if it works on your computer today in unmanaged C++, there will be
computers on which it won't work, and times on your computer on which it
won't work. You're really just being stubborn at this point.

Enough people (and some, like Willy and Jon, who are damn smart) have expressed this that it's time to open your eyes and agree they just might know what they're talking about.
 

Chris Mullins

Peter Olcott said:
My current plan is to offer a combined GUI Scripting Language
Mouse/Keyboard macro recorder that can be used to place custom optimized
user interfaces on top of existing systems.
In addition to this product custom development using this product will be
provided as a service.

I'm a software engineer. A darn good one. I know what GUI's are. I know a
dozen scripting languages. I've written mouse and keyboard drivers in
assembly. I've written countless macros. I spend tons of time optimizing
things.

I don't understand what it is you just said. I'm not just being obstinate
and stubborn either.

It's time to sit down with an A-tier marketing guy and get your message straight.

[Snip]


... at this point, I wish you the best of luck. I'm done responding to you, as you're not really willing to listen to what I or anyone else has said to you, and I'm just wasting my time.
 

Peter Olcott

Willy Denoyette said:
.NET is not some kind of alien; it's just a thin layer on top of Win32. It uses the same system services as native code compiled with whatever compiler you can use on Windows. The CLR and GC are allocating memory from the heap (that is, from virtual memory) through the same calls as the C runtime library, and do you know why? Because the CLR uses the same C runtime, and there is no other way to allocate memory in Windows.

Then what explains why two otherwise identical programs, both run on the same computer and OS (XP Pro), produce different results, with one able to allocate 50% more memory than the other? You claim that the systems are the same, yet empirical fact shows that they have different results. One cannot get different results from identical systems. The only essential difference is that the former is native code and the latter is .NET.

By the way, some experts have told me that some aspects of the .NET architecture are not merely wrappers around the pre-existing architecture but completely rewritten subsystems.
[Snip]
 

Peter Olcott

Chris Mullins said:
That doesn't prove anything, as Willy has pointed out again and again.


No, it can't.

It gets the same 4GB address space as every other win32 application, and it's
limited to the lower half of that, just like every other Win32 application.
This means, at most, you have 2GB to play with. For all Win32 apps.


I agree with whoever said that. Oh, wait. That was me! :)

Even if it works on your computer today in unmanaged C++, there will be
computers on which it won't work, and times on your computer on which it won't
work. You're really just being stubborn at this point.

Enough people (and some, like Willy and Jon, who are damn smart) have expressed this that it's time to open your eyes and agree they just might know what they're talking about.

And some of them are attempting to put forth the point that identical systems can have consistently different results. I never accept an analytical impossibility, no matter who the source is.
 

Peter Olcott

Chris Mullins said:
I'm a software engineer. A darn good one. I know what GUI's are. I know a
dozen scripting languages. I've written mouse and keyboard drivers in
assembly. I've written countless macros. I spend tons of time optimizing
things.

I don't understand what it is you just said. I'm not just being obstinate and
stubborn either.

No other technology can possibly produce a mouse macro recorder or GUI scripting
language that can always see where it needs to click the mouse.

If you know what GUI scripting languages are
http://en.wikipedia.org/wiki/Scripting_programming_language#GUI_Scripting
(an entirely different thing from scripting languages in general),

and you know what a mouse recorder is,
http://www.google.com/search?hl=en&lr=&safe=off&q=mouse+recorder
you should be able to get my point.

You will not be able to get my point until you know these exact terms as they
are precisely defined.
It's time to sit down with an A-tier marketing guy and get your message straight.

[Snip]


... at this point, I wish you the best of luck. I'm done responding to you, as
you're not really willing to listen to what I or anyone else has said to you
and I'm just wasting my time.
 

Peter Olcott

Chris Mullins said:
Peter Olcott said:
Here is what I have spent 18,000 hours on since 1999:
www.SeeScreen.com

I beat ya to it. :)

I looked through that earlier, when I was trying to figure out what on earth you needed to allocate 1GB of memory for.

As an aside, from one business owner to another, you really need to focus on the message there. I went through quite a bit of the site, and wasn't clear on how it could save me money. I own a computer software company (and act [most of the time] as Chief Architect), and we do LOTS of testing for our software. In that sense, I'm pretty close to the ideal customer. I realize it helps with testing, and allows testing to be easier, but in terms of what points of pain it is addressing, I really don't know.

As far as automated GUI testing goes, the process for this is very well defined,
especially for regression testing. There is much published material on automated
testing.
http://en.wikipedia.org/wiki/Test_automation
 

Peter Olcott

Chris Mullins said:
I'm a software engineer. A darn good one. I know what GUI's are. I know a
dozen scripting languages. I've written mouse and keyboard drivers in
assembly. I've written countless macros. I spend tons of time optimizing
things.

I don't understand what it is you just said. I'm not just being obstinate and
stubborn either.

The problem was my fault; I did not answer your prior post properly. You specifically asked about testing, and my answer did not address that point. Sorry.
It's time to sit down with an A-tier marketing guy and get your message straight.

[Snip]


... at this point, I wish you the best of luck. I'm done responding to you, as
you're not really willing to listen to what I or anyone else has said to you
and I'm just wasting my time.
 

Peter Olcott

Willy Denoyette said:
.NET is not some kind of alien; it's just a thin layer on top of Win32. It uses the same system services as native code compiled with whatever compiler you can use on Windows. The CLR and GC are allocating memory from the heap (that is, from virtual memory) through the same calls as the C runtime library.

I think that I just found the reason for the difference. Visual C++ 6.0's std::vector has a memory growth factor of 1.5, whereas Generic.List has been reported to have a memory growth factor of 2.0. The next reallocation of std::vector will fit into contiguous free RAM because it is only 50% larger.

The next allocation of Generic.List will not, because it is twice as big. Both of the prior allocations fit into actual RAM without the need for virtual memory. Although the next allocation of std::vector will fit into free RAM, it must write the current data to virtual memory to make room.

What it boils down to is that the single difference in the memory growth factor can entirely account for all of the differences in the results.
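One way to check the growth-factor claim empirically is to watch List<T>.Capacity as elements are added. The following is a minimal sketch (not from the original posts); the exact factors it prints depend on the runtime version in use:

using System;
using System.Collections.Generic;

class GrowthFactorProbe
{
    static void Main()
    {
        List<byte> list = new List<byte>();
        int lastCapacity = 0;

        // Add 100,000,000 bytes and report every time the backing array is reallocated.
        for (int i = 0; i < 100000000; i++)
        {
            list.Add(12);
            if (list.Capacity != lastCapacity)
            {
                double factor = lastCapacity == 0 ? 0.0 : (double)list.Capacity / lastCapacity;
                Console.WriteLine("Count = {0,10}  Capacity = {1,10}  growth factor = {2:F2}",
                                  list.Count, list.Capacity, factor);
                lastCapacity = list.Capacity;
            }
        }
    }
}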
[Snip]
 

Willy Denoyette [MVP]

Trimmed...
I think that I just found the reason for the difference. Visual C++ 6.0's std::vector has a memory growth factor of 1.5, whereas Generic.List has been reported to have a memory growth factor of 2.0. The next reallocation of std::vector will fit into contiguous free RAM because it is only 50% larger.

The next allocation of Generic.List will not, because it is twice as big. Both of the prior allocations fit into actual RAM without the need for virtual memory. Although the next allocation of std::vector will fit into free RAM, it must write the current data to virtual memory to make room.

You are getting close. The growth factor in C++ is implementation dependent and is not defined by the standard, so it may (and does) vary from implementation to implementation.
The growth factor for generic containers in .NET is not a fixed 2.0 factor but varies depending on the actual capacity of the container; that is, it starts with a factor of 2 for small containers and, once it has reached a threshold, it drops to 1.5. A growth factor of 2 is advantageous, in terms of performance (less GC pressure), for small containers that grow quickly, but is disadvantageous for large containers in terms of memory consumption.
But there is more: containers that are >85KB are allocated from the so-called Large Object Heap (LOH), and this one isn't compacted by the GC after a collection run, simply because it's too expensive to move these large objects around in memory. That means that you can end up with a fragmented LOH if you don't care about your allocation scheme.
Think about what's happening in this scenario:
thread T1 allocates a List<int>(), say L1, and starts filling the list with 1,000,000 ints;
while T1 fills L1, thread T2 allocates a List<double>(), say L2, and starts filling this list with 100,000 doubles.
This will result in a highly fragmented LOH, especially when one of the containers is long-lived. In this case you may even get OOM exceptions when allocating objects much smaller than the total free heap space. In such a scenario the only solution is to start with pre-allocated containers, say 250,000 for L1 and 25,000 for L2, to reduce the number of fragments if you don't know the exact "end size", but much better is to allocate the end size. Anyway, native or managed, you must be prepared to receive OOM exceptions, but more importantly you should try to prevent OOM when allocating large objects; pre-allocating is such a technique, and as a bonus it helps with performance.
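For illustration only, the interleaving scenario described above looks roughly like the sketch below (not from the original posts; the element counts match the scenario, and the pre-allocated list at the end shows the suggested fix):

using System.Collections.Generic;
using System.Threading;

class LohInterleavingSketch
{
    // Growing a list without a starting capacity forces repeated reallocations of
    // its backing array; with two threads doing it at once, the large arrays
    // interleave on the LOH.
    static void FillInts()
    {
        List<int> l1 = new List<int>();          // no pre-allocation
        for (int i = 0; i < 1000000; i++)
            l1.Add(i);
    }

    static void FillDoubles()
    {
        List<double> l2 = new List<double>();    // no pre-allocation
        for (int i = 0; i < 100000; i++)
            l2.Add(i);
    }

    static void Main()
    {
        Thread t1 = new Thread(FillInts);
        Thread t2 = new Thread(FillDoubles);
        t1.Start();
        t2.Start();
        t1.Join();
        t2.Join();

        // The pre-allocated version allocates its backing array exactly once.
        List<int> preallocated = new List<int>(1000000);
        for (int i = 0; i < 1000000; i++)
            preallocated.Add(i);
    }
}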
The native heap is never compacted by the C++ allocator, which means that native applications are more sensitive to fragmentation than managed applications; that's one of the many reasons the GC was invented.

Willy.
 

Peter Olcott

Willy Denoyette said:
Trimmed...


You are getting close. The growth factor in C++ is implementation dependent and is not defined by the standard, so it may (and does) vary from implementation to implementation.
The growth factor for generic containers in .NET is not a fixed 2.0 factor but varies depending on the actual capacity of the container; that is, it starts with a factor of 2 for small containers and, once it has reached a threshold, it drops to 1.5. A growth factor of 2 is advantageous, in terms of performance (less GC pressure), for small containers that grow quickly, but is disadvantageous for large containers in terms of memory consumption.
But there is more: containers that are >85KB are allocated from the so-called Large Object Heap (LOH), and this one isn't compacted by the GC after a collection run, simply because it's too expensive to move these large objects around in memory. That means that you can end up with a fragmented LOH if you don't care about your allocation scheme.
Think about what's happening in this scenario:
thread T1 allocates a List<int>(), say L1, and starts filling the list with 1,000,000 ints;
while T1 fills L1, thread T2 allocates a List<double>(), say L2, and starts filling this list with 100,000 doubles.
This will result in a highly fragmented LOH, especially when one of the containers is long-lived.

But aren't garbage collection and eliminating memory fragmentation one and the same thing?
 

Willy Denoyette [MVP]

Peter Olcott said:
But aren't garbage collection and eliminating memory fragmentation one and the same thing?

Yes, but "compactation" only applies to the Gen0, 1 and 2 heaps, that is for the heaps that
hold the objects smaller than 85KB, the LOH is collected but never compacted, it's simply to
expensive to compact the heap for such large objects. You really should pre-allocate large
objects as much as you can, if you don't, you probably will waste a lot more memory than if
you did.
Again take a look at the allocation scheme for a non pre-alloc'd List (well, to be exact
the underlying array) :

say the LOH starts at 0x02000000 (note that the small objects won't be allacoted from the
LOH)
and say that the first array (128kb bytes) in the LOH starts at :
0x02000000 array size = 12 + 131072= 131084 (object header + array values)
after the first four expansions, and supposed no other thread allocates in between, you will
have a LOH that looks like:

0x02000000 - 131084 bytes (1)
0x0202000c - 262156 bytes (2)
0x0204000c - 524300 bytes (3)
0x0208000c - 1048588 bytes (4)
0x0210000c - 2097164 bytes (5)
...
See: when expanding to (5), (1), (2) and (3) are no longer needed, but their sum is not large enough to accommodate the 2097164 bytes needed by (5).
Now, this goes on until you have a "free trailer" of, let's say, ~256MB followed by an array of 512MB which needs to be expanded to 1GB. That means you'll get an OOM, because the allocator can't find a contiguous free space of 1GB on top of the already occupied (but still needed) 512MB, and the 256MB area is too small anyway.
Note that the "free trailer" can be used by other threads to allocate objects from; the "old" array objects in this area are freed by the GC...

Willy.
 

Peter Olcott

Willy Denoyette said:
Yes, but "compactation" only applies to the Gen0, 1 and 2 heaps, that is for
the heaps that hold the objects smaller than 85KB, the LOH is collected but
never compacted, it's simply to expensive to compact the heap for such large
objects.

This should be a user-selectable option instead of mandatory. Making some tasks infeasible because the memory manager thinks that it is not fast enough is not the ideal solution.
You really should pre-allocate large objects as much as you can; if you don't, you will probably waste a lot more memory than if you did.

Yet I simply can't do that. The amount of memory that I need is completely
unpredictable. It can be anywhere from 25K to far more than the machine has.
 

Willy Denoyette [MVP]

Peter Olcott said:
This should be a user-selectable option instead of mandatory. Making some tasks infeasible because the memory manager thinks that it is not fast enough is not the ideal solution.


Yet I simply can't do that. The amount of memory that I need is completely unpredictable.
It can be anywhere from 25K to far more than the machine has.
This is a bad excuse: pre-allocate more than you reasonably need and trim the excess; what's the point, it's all virtual memory. But don't forget - you can't allocate a single object that is larger than 2GB anyway.
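A minimal sketch of that pre-allocate-and-trim idea (not from the original post; the 64MB starting capacity and the fill count are made-up figures, and List<T>.TrimExcess is the .NET 2.0 call that releases the unused capacity):

using System;
using System.Collections.Generic;

class PreallocateAndTrim
{
    static void Main()
    {
        // Reserve more capacity than the typical case is expected to need.
        List<byte> data = new List<byte>(64 * 1024 * 1024);

        // Stand-in for the real work: only a fraction of the capacity gets used.
        for (int i = 0; i < 200000; i++)
            data.Add(12);

        // Give the unused capacity back once the real size is known.
        data.TrimExcess();
        Console.WriteLine("Count = {0}, Capacity = {1}", data.Count, data.Capacity);
    }
}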
Note, this is my last reply; it makes no sense to continue this discussion. You simply don't understand 1) how the OS memory manager works and 2) how the GC works and what it takes to move objects around, and what the impact is on all running threads in the system, YES! IN THE SYSTEM, when some 500MB of data has to move from one location to another. One suggestion though: stay away from .NET and return to native code, you are simply not ready for it.

Willy.
 

Peter Olcott

Willy Denoyette said:
This is a bad excuse: pre-allocate more than you reasonably need and trim the excess; what's the point, it's all virtual memory. But don't forget - you can't allocate a single object that is larger than 2GB anyway.

I have no idea in advance how much memory I will need. It can be anywhere at all between 25K and several gigabytes. Should I follow your advice and always allocate at least one GB, or is it that your advice may not apply to my situation?

I could run through the whole process twice and then know how much memory I need in advance, yet this would double the time that my process takes. The amount of memory that I need depends upon a number of things that cannot possibly be predicted or even reasonably estimated in advance.
 

John J. Hughes II

No :) but then I never had a reason to.

If you don't know what you are going to need, but expect it to be very large more often than not, I would have a tendency to write the data to a temp file first. If it turns out to be small you can load the file into memory; otherwise, process it from disk.
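One possible shape for that spill-to-disk approach (a sketch, not from the original post; the 64MB threshold and the dummy producer loop are invented for the example):

using System;
using System.IO;

class SpillToDiskSketch
{
    const long InMemoryThreshold = 64 * 1024 * 1024;   // illustrative cut-off

    static void Main()
    {
        string tempPath = Path.GetTempFileName();

        // Write the incoming data to a temp file first.
        using (FileStream fs = File.OpenWrite(tempPath))
        {
            byte[] chunk = new byte[4096];
            for (int i = 0; i < 1000; i++)              // stand-in for the real producer
                fs.Write(chunk, 0, chunk.Length);
        }

        long length = new FileInfo(tempPath).Length;
        if (length <= InMemoryThreshold)
        {
            // Small enough: pull the whole thing into memory.
            byte[] all = File.ReadAllBytes(tempPath);
            Console.WriteLine("Loaded {0} bytes into memory", all.Length);
        }
        else
        {
            // Too big: process it from disk with a streaming read instead.
            Console.WriteLine("Processing {0} bytes from disk", length);
        }
        File.Delete(tempPath);
    }
}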

Regards,
John
 

Peter Olcott

John J. Hughes II said:
No :) but then I never had a reason to.

If you don't know what you are going to need, but expect it to be very large more often than not, I would have a tendency to write the data to a temp file first. If it turns out to be small you can load the file into memory; otherwise, process it from disk.

I expect to eventually have as many as millions of users, so I want to produce
the best possible solution. Since disk access is something like 1000-fold slower
than memory access, I don't want to take this kind of performance hit.
 

Peter Olcott

Willy Denoyette said:
This is a bad excuse: pre-allocate more than you reasonably need and trim the excess; what's the point, it's all virtual memory. But don't forget - you can't allocate a single object that is larger than 2GB anyway.

It turns out that your advice will work. I don't actually have to run through the whole process twice to determine my memory requirements. I can run through one part of the process once and store the results, then use these results in the next step.

This is a cleaner design because I can know in advance whether or not I am going to have any memory problems, and simply inform the user that there is not enough memory for the requested task, rather than having to deal with OUT_OF_MEMORY exception processing.
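For what it's worth, .NET 2.0 has a facility aimed at exactly that kind of up-front check: System.Runtime.MemoryFailPoint probes whether a given amount of memory is likely to be available before the work starts, and throws InsufficientMemoryException if it isn't. A minimal sketch (the 600MB figure and the helper method are invented for the example):

using System;
using System.Runtime;

class UpFrontMemoryCheck
{
    static void Main()
    {
        int requiredMegabytes = 600;   // example figure; the real value would come
                                       // from the first pass over the input

        try
        {
            // Probe for the requested amount before starting the expensive work.
            using (new MemoryFailPoint(requiredMegabytes))
            {
                RunMemoryHungryStep(requiredMegabytes);
            }
        }
        catch (InsufficientMemoryException)
        {
            Console.WriteLine("Not enough memory for the requested task.");
        }
    }

    static void RunMemoryHungryStep(int megabytes)
    {
        // Placeholder for the real processing.
        Console.WriteLine("Proceeding with ~{0} MB of allocations.", megabytes);
    }
}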
 

Lucian Wischik

Peter Olcott said:
I expect to eventually have as many as millions of users, so I want to produce
the best possible solution. Since disk access is something like 1000-fold slower
than memory access, I don't want to take this kind of performance hit.

Then you're in for a surprise! When you allocate something, it always
gets taken out of the virtual address space range, and the operating
system decides if and when it wants to page bits of your data to disk.
 

Peter Olcott

Lucian Wischik said:
Then you're in for a surprise! When you allocate something, it always
gets taken out of the virtual address space range, and the operating
system decides if and when it wants to page bits of your data to disk.

Yet it does not make that decision in an arbitrary and capricious manner; it only actually pages to disk when it needs to. Since I will most often need less than 200MB, and the minimum requirements for my application will be stated as 500MB, there is no sense in my always manually paging to disk when this paging is not necessary.

Since there was a way that I could determine the precise amount of my huge memory allocation in advance, this is the best way to go. It is worth all the extra effort specifically because it allows me to prevent OUT_OF_MEMORY exceptions instead of having to catch them and deal with them as they occur. Because I was able to accomplish this without duplicating the steps, it might even improve performance by eliminating memory reallocation. In any case the degradation to performance is minimal, if any: about seven seconds in the worst case.
 
