GC, Windows or Design problem?

  • Thread starter: Tamir Khason

Tamir Khason

I have a VERY BIG object (>2 GB) serialized, and I'm loading it into an object.
On the client's site there is a P4 3 GHz / 512 MB machine where the application
runs. The page file on that machine is 3 GB. But while loading the object from
disk, at ~400-500 MB of page-file usage the CPU counter drops to 0, memory
stays at the same value, and the application never finishes loading the file;
it just does nothing (IMHO).
What's the problem and how to solve it?

TNX
 
Hi,

It looks like a design problem.
There is no point in keeping 2 GB in memory at once.
How did you end up serializing that much data?
I understand that serialization makes the file bigger.

If you use the SOAP or XML formatter, consider the BINARY formatter instead.
But if it's already 2 GB of BINARY data, I don't think there is
a good solution to your problem.

If I missed something, don't hesitate to give feedback.

Cheers!
Marcin
 
Tamir Khason said:
I have a VERY BIG object (>2 GB) serialized, and I'm loading it into an
object. [...] What's the problem and how to solve it?

TNX


You can't simply create such a big object (and I wouldn't call this an object
in terms of OO design practice) on 32-bit Windows. The largest chunk of memory
you will ever have for a single object is 2 GB minus the space taken by the
runtime(s), the BCL, and your code, which comes to ~1.4 GB. Even that is a
theoretical maximum; the practical size can be much smaller depending on the
fragmentation level of the heap(s).

The question is how you ever managed to get such an "object" serialized. It
looks like you have a file on disk that you want to deserialize into an
object, but the 'object' is actually nothing more than a stream of bytes,
not really an object (a single unit of structured data). So I suggest you
treat the data as a file stream, probably consisting of smaller blocks of
structured data that can each be treated as an object.
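The "stream of smaller blocks" idea can be sketched like this. This is a
hypothetical `BlockFile` helper, not the poster's actual format: it assumes
each record is written with a 4-byte length prefix, which the original file
does not necessarily have.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class BlockFile
{
    // Append one record as a 4-byte length prefix followed by the payload.
    public static void WriteBlock(Stream s, byte[] payload)
    {
        var w = new BinaryWriter(s);
        w.Write(payload.Length);
        w.Write(payload);
        w.Flush();
    }

    // Stream the records back one at a time; only one block is ever
    // in memory, instead of the whole multi-gigabyte file.
    public static IEnumerable<byte[]> ReadBlocks(string path)
    {
        using (var fs = File.OpenRead(path))
        using (var r = new BinaryReader(fs))
        {
            while (fs.Position < fs.Length)
            {
                int len = r.ReadInt32();
                yield return r.ReadBytes(len);
            }
        }
    }
}
```

Because `ReadBlocks` is an iterator, the caller can `foreach` over the file
without the process ever holding more than one block at a time.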

Willy.
 
I think 32-bit Windows only gives 2 GB of address space to user mode and keeps
2 GB for the OS. You might have to look at using 64-bit Windows to do this.
 
My object is a CollectionBase collection of rather big structures, which was
serialized on the server (4x Xeon, 5 GB) and should be deserialized on the
client... So I cannot use it as a stream; I need an enumeration.
 
Tamir Khason said:
My object is a CollectionBase collection of rather big structures, which was
serialized on the server (4x Xeon, 5 GB) and should be deserialized on the
client... So I cannot use it as a stream; I need an enumeration.

How did you manage to 'serialize' such a big collection of big structures on
the server? Did you use C# (or any other managed code) to do this on the
server?
If the server runs 32-bit Windows (the number of CPUs is not relevant here),
it's simply not possible that this was one single collection.
IMO the data in the file is binary data (large structures) produced by a
non-.NET program; please correct me if I'm wrong. If that's the case, you
should just read the binary structures from disk and store the data in
separate objects. Note however that it won't be possible to read all the data
structures into memory at once, nor to store all the objects in a single
collection.

Willy.
 
How did you manage to 'serialize' such a big collection of big structures on
the server? Did you use C# (or any other managed code) to do this on the
server?

Yes, 100% managed C# code. The implementation is REALLY simple.

If the server runs 32-bit Windows (the number of CPUs is not relevant here)

It is!

it's simply not possible that this was one single collection.

But it works!
 
Tamir said:
I have a VERY BIG object (>2 GB) serialized, and I'm loading it into an
object.

Regarding the other postings in this thread, you might want to look for a way
to create proxy objects for your collection items, only loading the "big
parts" of the structures when you really need them.

Assuming this object is a collection of a certain number of objects, you'd
first create that number of proxy objects containing minimal information.
When working with each of these objects, you'd then load the complete object
from the serialized file and work with only one of these objects (structures)
at a time.

This might mean you'd have to change the original serialization of your
objects, giving you the possibility of not reading the whole file into
memory...

[...rest snipped...]
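A minimal sketch of that proxy idea (all names here are hypothetical, and it
assumes the per-item offsets and lengths are known, e.g. recorded while the
file was written):

```csharp
using System;
using System.IO;

// Lightweight stand-in for a heavy record: only the file location lives
// in memory; the payload is read from disk on demand.
sealed class RecordProxy
{
    private readonly string _path;
    private readonly long _offset;
    private readonly int _length;

    public RecordProxy(string path, long offset, int length)
    {
        _path = path;
        _offset = offset;
        _length = length;
    }

    // Loads just this record's bytes; deserialize them however you like.
    public byte[] Load()
    {
        using (var fs = File.OpenRead(_path))
        {
            fs.Seek(_offset, SeekOrigin.Begin);
            var buf = new byte[_length];
            int read = 0;
            while (read < _length)
                read += fs.Read(buf, read, _length - read);
            return buf;
        }
    }
}
```

A collection of a million such proxies costs a few tens of megabytes, while
the multi-gigabyte payloads stay on disk until a specific item is needed.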
 
Tamir Khason said:
Yes, 100% managed C# code
The implementation is REALLY simple.

It is!

But it works!

Are you trying to tell me that the object (the collection) is bigger than
2 GB in memory on the server, or is it the file that's bigger than 2 GB? I
guess it's the latter.
Note that the file can be larger than the object depending on the formatter
used; for instance, the XML serializer produces much larger files than the
binary formatter.
What formatter did you use to serialize?
How did you calculate/determine the object size? (You said object > 2 GB, and
I say such an object is not possible.) Anyway, you should never assume that
you will be able to deserialize such large objects (> 1 GB real object size)
on Win32, even if it was possible to serialize them.

Willy.
 
Are you trying to tell me that the object (the collection) is bigger than
2 GB (in memory) on the server?

Yes, this is a collection, but I do not know its size. It produces a ~2.4 GB
file using binary serialization with 100% managed C#.

What formatter did you use to serialize?

As mentioned earlier: binary.

How did you calculate/determine the object size? (You said object > 2 GB...)

I do not! I only know the size of the serialized object (the file).
 
Tamir Khason said:
Yes, this is a collection, but I do not know its size. It produces a ~2.4 GB
file using binary serialization with 100% managed C#. [...] I only know the
size of the serialized object (the file).

Well, as I said, don't assume that you will be able to deserialize such a
large object. It's quite simple: you only have 2 GB of virtual address space
on Win32. Part of this is taken by code (the runtime's and yours), part by
internal process/thread administration; the remainder is available for the
managed heap to store objects. Say your object (like all objects) needs
1.2 GB of object space for the deserialized collection; your program also
needs the data read from the persisted file before it can reconstruct the
object graph. That amount of memory (I don't know how large it is, but it's
certainly larger than the largest object in the collection) needed to hold
the file data (unmanaged heap) is simply not available, or the managed heap
cannot grow to accommodate the collection.
Now, why did it work on the server? Simply because you were able to create
the collection at that size; during serialization, the GC heap could shrink
while the object data was being persisted. That is not possible on the
deserializing side.
Simply put, your object is too large for 32 bit, sorry; you have to cut it
into pieces.
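"Cutting it into pieces" could look like the following sketch. The
`PiecewiseSerializer` name is made up: it persists each element of the
collection as its own BinaryFormatter payload (BinaryFormatter payloads are
self-delimiting, so they can be written back to back) and hands the client a
lazy enumeration, which also matches the "I need an enumeration" requirement.
Only one element is live at a time.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

static class PiecewiseSerializer
{
    // Serialize each element separately instead of the whole collection
    // at once, so no single object graph approaches the 2 GB limit.
    public static void Save<T>(IEnumerable<T> items, string path)
    {
        var formatter = new BinaryFormatter();
        using (var fs = File.Create(path))
            foreach (var item in items)
                formatter.Serialize(fs, item);   // one element per payload
    }

    // Deserialize lazily: each yield materializes exactly one element.
    public static IEnumerable<T> Load<T>(string path)
    {
        var formatter = new BinaryFormatter();
        using (var fs = File.OpenRead(path))
            while (fs.Position < fs.Length)
                yield return (T)formatter.Deserialize(fs);
    }
}
```

The client then writes `foreach (var item in PiecewiseSerializer.Load<BigStruct>(path))`
and processes the collection without ever holding it whole in memory.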

Willy.
 