garbage collection problem in large linked list

M

Mr. Mountain

I have a need to implement a linked list that will hold a large number of
items. I have done a little performance testing to compare the linked list
against the framework's standard hashtable.

So far the linked list meets my objectives in all ways -- except for one
problem that it exhibits. It has a garbage collection problem.

If I insert 4 million test objects into a hashtable and run tests
repeatedly, the garbage collector does its job as one would expect. I can
run as many repeated tests as I wish.

When I do the exact same test with the same 4 million test objects in the
linked list, the first test is fine. The next test results in an
out-of-memory exception because the garbage collector does not run. If I
call gc.collect between tests, there are no out-of-memory problems for the
linked list.

Why is it necessary to call gc.collect manually when the linked list is
used? I have never had to request a manual garbage collection in any other
situation.

What steps can I take to optimize the memory management when a large linked
list needs to be used? This is NOT a matter of roots pointing to objects.
The objects are all free to be collected after each test completes.

I'm really interested in knowing if the framework can perform collection as
efficiently (and automatically) for several million individually allocated
list nodes as it can for the contiguous memory block that underlies a
hashtable. If it can, what technique(s) do I need to use to make it happen?

My test code looks like this (I show the hashtable example, but the linked
list test driver is exactly the same):

private void btnTest_Click(object sender, System.EventArgs e)
{
DateTime start = DateTime.Now;
Hashtable ht = new Hashtable();

for (int i = startIndex; i < testSize; i++)
{
Hashtable innerHt = new Hashtable();
ht.Add(i, innerHt);

for (int j = startIndex; j < testSize; j++)
{
OrganizationMember om = new
OrganizationMember(j.ToString(), j.ToString(), "Test");
innerHt.Add(j, om);
}
}

for (int i = startIndex; i < testSize; i++)
{
Hashtable innerHt = ht as Hashtable;

for (int j = startIndex; j < testSize; j++)
{
OrganizationMember om = innerHt[j] as OrganizationMember;
}
}

lblSystemResults.Text = (DateTime.Now -
start).TotalMilliseconds.ToString();
}

private void btnClear_Click(object sender, System.EventArgs e)
{
System.GC.Collect();
}//btnClear_Click
 
R

Richard Blewett [DevelopMentor]

Two things:

1) Can you try
GC.Collect(0);
GC.Collect(1);
GC.Collect(2);
each in place of the GC.Collect and tell us what the result is in each of those situations.

2) you really need to show us the code for th linked list rather than the test harness as this is where the issue will be

Regards

Richard Blewett - DevelopMentor
http://www.dotnetconsult.co.uk/weblog
http://www.dotnetconsult.co.uk

I have a need to implement a linked list that will hold a large number of
items. I have done a little performance testing to compare the linked list
against the framework's standard hashtable.

So far the linked list meets my objectives in all ways -- except for one
problem that it exhibits. It has a garbage collection problem.

If I insert 4 million test objects into a hashtable and run tests
repeatedly, the garbage collector does its job as one would expect. I can
run as many repeated tests as I wish.

When I do the exact same test with the same 4 million test objects in the
linked list, the first test is fine. The next test results in an
out-of-memory exception because the garbage collector does not run. If I
call gc.collect between tests, there are no out-of-memory problems for the
linked list.

Why is it necessary to call gc.collect manually when the linked list is
used? I have never had to request a manual garbage collection in any other
situation.

What steps can I take to optimize the memory management when a large linked
list needs to be used? This is NOT a matter of roots pointing to objects.
The objects are all free to be collected after each test completes.

I'm really interested in knowing if the framework can perform collection as
efficiently (and automatically) for several million individually allocated
list nodes as it can for the contiguous memory block that underlies a
hashtable. If it can, what technique(s) do I need to use to make it happen?

My test code looks like this (I show the hashtable example, but the linked
list test driver is exactly the same):
 
M

Mr. Mountain

Richard,
I performed the generational test you suggested. It required a generation 1
collection at the minimum. A generation 0 collection did not help.

If there are no roots keeping objects alive, what factors would explain why
a linked list requires an explicit call to gc.collect while a hashtable does
not require such a call?

My linked list code is too long to post in its entirety and there are some
parts of it that I cannot post anyway (for the usual reasons). However, key
part, the node in C#, is shown below. This code, together with the test
harness should (I hope) show you what you need to see. If not, let me know
and I'll post other snippets.

public class LinkedList
{
private NodeType head;
[...]

private class NodeType
{

private Object info;
public Object Info
{
get
{
return info;
}
set
{
info = value;
}
}//Info

public NodeType Next;

}

}

Richard Blewett said:
Two things:

1) Can you try
GC.Collect(0);
GC.Collect(1);
GC.Collect(2);
each in place of the GC.Collect and tell us what the result is in each of those situations.

2) you really need to show us the code for th linked list rather than the
test harness as this is where the issue will be
 
R

Richard Blewett [DevelopMentor]

Just to get a bit more environmental information ...

Do you have any finalizers in your code that you haven't shown?

Regards

Richard Blewett - DevelopMentor
http://www.dotnetconsult.co.uk/weblog
http://www.dotnetconsult.co.uk

Richard,
I performed the generational test you suggested. It required a generation 1
collection at the minimum. A generation 0 collection did not help.

If there are no roots keeping objects alive, what factors would explain why
a linked list requires an explicit call to gc.collect while a hashtable does
not require such a call?

My linked list code is too long to post in its entirety and there are some
parts of it that I cannot post anyway (for the usual reasons). However, key
part, the node in C#, is shown below. This code, together with the test
harness should (I hope) show you what you need to see. If not, let me know
and I'll post other snippets.

public class LinkedList
{
private NodeType head;
[...]

private class NodeType
{

private Object info;
public Object Info
{
get
{
return info;
}
set
{
info = value;
}
}//Info

public NodeType Next;

}

}
 
F

Frank Hileman

Hello,

I posted some explanations and solutions in the performance newsgroup.

Regards,
Frank Hileman

check out VG.net: http://www.vgdotnet.com
Animated vector graphics system
Integrated Visual Studio .NET graphics editor
 
M

Mr. Mountain

No I don't have any finalizers in this code. I'm well aware of the issues
with finalizers and gc -- I should have mentioned that none of this code
uses finalizers.

Thanks for your interest.
 
M

Mr. Mountain

Thank you! I thought no one was going to reply over there, so after a day or
so of waiting, I posted here.

Your reply was helpful and I have posted a few follow-up questions in the
other newsgroup.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top