Big memory overhead

  • Thread starter Thread starter Guest
  • Start date Start date
G

Guest

Hello,

I'm writing an app running both under .NET and Compact Frameword. I need to allocate lot of little mem blocks. They form a n-ary tree, i.e. pointing at each other. The mem alloc by the program is just prepostrous - it's about 4times more than I esimate these objects should take. On PC it's not such a big deal but on the PPC 32MB device it's quite a problem. Does anyone know it there's something to do about it? Or why it is happening?

tnhx
 
Mikolas said:
I'm writing an app running both under .NET and Compact Frameword. I
need to allocate lot of little mem blocks. They form a n-ary tree,
i.e. pointing at each other. The mem alloc by the program is just
prepostrous - it's about 4times more than I esimate these objects
should take. On PC it's not such a big deal but on the PPC 32MB
device it's quite a problem. Does anyone know it there's something to
do about it? Or why it is happening?

Could you give an example of the kind of object you're allocating, and
how you're estimating the required memory usage?
 
Mikolas said:
For example one of the classes these data members :
{
uint sectionID; // my estimate 4B
string sectionName; // dunno, but hope not much more than 2*length

ArrayList subsections;//my estimate 10B + capacity*4
ArrayList machines;//my estimate 10B + capacity*4
ArrayList graphicObjs;//my estimate 10B + capacity*4
IDMap graphicObjsToSubsections;//my estimate 10B + capacity*4
IDMap graphicObjsToMachines;//my estimate 10B + capacity*4

Okay. For one thing, there's 8 bytes overhead for the class itself.
Then for each reference, there's a 4 byte overhead. string takes
12 bytes + (length+1)*2 rounded up to a multiple of 4
Each ArrayList is going to be 24 + capacity*4 (I think). IDMap will
depend on the implementation, obviously.

So, how "out" your estimates are really depend on the size of strings
and the capacity of your ArrayLists. Could you give a sample object and
what size you're seeing, and we'll see if we can explain it?
 
Keep in mind that strings in C# are wide character strings, so there's you
minimum 2*length. The you need to add in data members like Length (which
I'm sure is not recaclulated everytime the Length property is accessed) and
the internal member to stroe the address of the string, plus a minimum of 2
bytes per character. Even in C++ string objects can be heavy.

The ArrayLists I don't know about. I guess it depends on the memory
management of the object and how it was coded(performance vs. size) Again
keep in mind internal members such as size, reference, and synconizer
objects. Also what are the lists storing? Capacity * 4 makes me assume
refernce types but if they are value types or structs it can be larger.

..02
jim
 
Jim H said:
Keep in mind that strings in C# are wide character strings, so there's you
minimum 2*length. The you need to add in data members like Length (which
I'm sure is not recaclulated everytime the Length property is accessed) and
the internal member to stroe the address of the string, plus a minimum of 2
bytes per character. Even in C++ string objects can be heavy.

There's no internal member to store the address of the string - strings
are special in that the size of the string object itself varies
according to the size. The data for the string is within the string
object itself (unlike Java, where String contains a char array
reference).
The ArrayLists I don't know about. I guess it depends on the memory
management of the object and how it was coded(performance vs. size) Again
keep in mind internal members such as size, reference, and synconizer
objects. Also what are the lists storing? Capacity * 4 makes me assume
refernce types but if they are value types or structs it can be larger.

No, the size of the ArrayList's data array will always be capacity*4
because it *can* only store references - value types end up getting
boxed (assuming we're not talking about v2 with generics).
 
Mikolas said:
I created a sample binary tree with 2^17 nodes like mentioned above,
when created the the mem usage in Task list rises by 56MB. This could
also be because there's still plenty of free mem. On PPC though, when
I create 2^16 nodes the mem usage rises by 13MB and there's almost
nothing left(from 32MB).

But what is the contents of the ArrayLists and strings in that case?
That's only 488 bytes per node, which doesn't sound very much to me -
you don't need to have very large ArrayLists to see that.
You can easily get similar results from
<code>
class MyTree
{
string name;
MyTree left, right;
MyTree(string name)
{
this.name = name;
}

public static MyTree CreateSample(string name, int depth)
{
if (depth <= 0)
return null;

MyTree mt = new MyTree(name);

mt.left = CreateSample(name + "0", depth - 1);
mt.right = CreateSample(name + "1", depth - 1);

return mt;
}


}
MyTree mt;
private void button1_Click(object sender, System.EventArgs e)
{
this.mt = MyTree.CreateSample("root", 20);

}
</code>

And what results did you get from that? (I haven't got time to run it
myself right now.)
 
Jon Skeet said:
No, the size of the ArrayList's data array will always be capacity*4
because it *can* only store references - value types end up getting
boxed (assuming we're not talking about v2 with generics).

Jon,

The exact size of the ArrayList's data array (Object[]) is actualy; 16 +
capacity*4.

Or:
0 Int32 Sync#
4 Int32 MT ptr. (method table ptr. Object[])
8 Int32 Component size
12 Int32 MT ptr. (Method table pointer of element types
(Object))
16 Int32 elem1
20 Int32 elem2
.....

Willy.
 
Did you start with an initial size for the ArrayList?
If you don't, your first AL (actually the items Object[]) will have an
initial size of 16 + 4*4bytes, when this array get full the CLR will expand
this array using a simple algorithm ((actual number of elements * component
size) * 2), so yes, it grows exponential.
See where this goes to?

Willy.



Mikolas said:
Jon Skeet said:
Okay. For one thing, there's 8 bytes overhead for the class itself.
Then for each reference, there's a 4 byte overhead. string takes
12 bytes + (length+1)*2 rounded up to a multiple of 4
Each ArrayList is going to be 24 + capacity*4 (I think). IDMap will
depend on the implementation, obviously.

So, how "out" your estimates are really depend on the size of strings
and the capacity of your ArrayLists. Could you give a sample object and
what size you're seeing, and we'll see if we can explain it?

I created a sample binary tree with 2^17 nodes like mentioned above, when
created the the mem usage in Task list rises by 56MB. This could also be
because there's still plenty of free mem.
On PPC though, when I create 2^16 nodes the mem usage rises by 13MB and
there's almost nothing left(from 32MB).

You can easily get similar results from
<code>
class MyTree
{
string name;
MyTree left, right;
MyTree(string name)
{
this.name = name;
}

public static MyTree CreateSample(string name, int depth)
{
if (depth <= 0)
return null;

MyTree mt = new MyTree(name);

mt.left = CreateSample(name + "0", depth - 1);
mt.right = CreateSample(name + "1", depth - 1);

return mt;
}


}
MyTree mt;
private void button1_Click(object sender, System.EventArgs e)
{
this.mt = MyTree.CreateSample("root", 20);

}
</code>



 
Willy Denoyette said:
No, the size of the ArrayList's data array will always be capacity*4
because it *can* only store references - value types end up getting
boxed (assuming we're not talking about v2 with generics).

The exact size of the ArrayList's data array (Object[]) is actualy; 16 +
capacity*4.

Or:
0 Int32 Sync#
4 Int32 MT ptr. (method table ptr. Object[])
8 Int32 Component size
12 Int32 MT ptr. (Method table pointer of element types
(Object))
16 Int32 elem1
20 Int32 elem2
....

Apologies, yes, I forgot about the normal overhead, length and type.
 
Willy Denoyette said:
Jon Skeet said:
No, the size of the ArrayList's data array will always be capacity*4
because it *can* only store references - value types end up getting
boxed (assuming we're not talking about v2 with generics).

Jon,

The exact size of the ArrayList's data array (Object[]) is actualy; 16 +
capacity*4.

Or:
0 Int32 Sync#
4 Int32 MT ptr. (method table ptr. Object[])
8 Int32 Component size
12 Int32 MT ptr. (Method table pointer of element types
(Object))
16 Int32 elem1
20 Int32 elem2
.....

Willy.


Is there any tool how to do "sizeof" then?

thnx
 
Mikolas said:
Is there any tool how to do "sizeof" then?

No, not really. I find the best way to judge is to create a lot of
objects and look at the memory difference. Change the number of objects
to make sure it's linear, otherwise you could get phased by overheads
etc.
 
A tool? Yes, the native debugger check
http://www.microsoft.com/whdc/devtools/debugging/default.mspx
and this article http://msdn.microsoft.com/msdnmag/issues/03/06/Bugslayer/
to get you started.


Willy.

Mikolas said:
Willy Denoyette said:
Jon Skeet said:
No, the size of the ArrayList's data array will always be capacity*4
because it *can* only store references - value types end up getting
boxed (assuming we're not talking about v2 with generics).

Jon,

The exact size of the ArrayList's data array (Object[]) is actualy; 16 +
capacity*4.

Or:
0 Int32 Sync#
4 Int32 MT ptr. (method table ptr. Object[])
8 Int32 Component size
12 Int32 MT ptr. (Method table pointer of element types
(Object))
16 Int32 elem1
20 Int32 elem2
.....

Willy.


Is there any tool how to do "sizeof" then?

thnx
 
Back
Top