arrays = pointers?


Peter Duniho

Well, there is nothing more than a pointer there, there is no pointer to
a pointer, or a table of pointers to references. Just the pointer.

Of course there is. The "table" is inherent in the memory manager (aka
garbage collector). The garbage collector obviously has some mechanism by
which it finds and updates references when objects are relocated. This
mechanism is the "table" (whether it is a literal table, or simply some
other data structure that allows the GC to find the references is
irrelevant...the mechanism exists, regardless).
Yes, but that is part of the garbage collector. There is nothing extra
added to the reference.

Who said anything about "extra added"? The only thing at issue is whether
some mechanism beyond the simple pointer exists. And some mechanism
beyond the simple pointer does exist.

Pete
 

Peter Duniho

This would explain why C# is fast.

Except when garbage collecting, of course. With the application having
just a single pointer, that means that when the GC relocates something, it
needs to go around updating all of the pointers that refer to it. If the
application had to do the double-dereference, then the GC would be able to
just update a single pointer and be done.
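To make the contrast concrete, here's a minimal sketch in C# (entirely
hypothetical; this is not how either .NET or the old Mac OS actually
implements it) of a master-pointer table: the program holds an index, and
relocation touches exactly one slot:

class HandleTable
{
    private object[] _slots = new object[1024]; // the master pointers
    private int _next;

    public int Allocate(object obj)  // hand out a "handle" (an index)
    {
        _slots[_next] = obj;
        return _next++;
    }

    public object Deref(int handle)  // the double-dereference the app pays
    {
        return _slots[handle];
    }

    public void Relocate(int handle, object movedCopy)
    {
        _slots[handle] = movedCopy;  // the GC updates ONE slot and is done
    }
}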

I presume that the thought is that garbage collection can happen
infrequently enough, and at least to some extent in the background, that
speeding the common case execution nets a gain. Of course, it does also
create synchronization issues, since while an object is being relocated
and all those pointers are being updated, .NET needs to ensure that no
code using that reference tries to access it.

Performance might be better the way .NET does it, but certainly the old
Mac-style "handle" paradigm was much simpler, and required much less
memory overhead (and back then, with memory space always being at a
premium, reducing data structure and code size was always the highest
priority). :)
If it were double indirection, then I think we'd see it being slower
than it is.

Execution of one's own code would definitely be slower, I agree. But
garbage collection could be much faster. It's just a matter of optimizing
for the most effective case (one hopes that in this scenario, the .NET
designers got it right...I assume they did :) ).

Pete
 

Guest

Peter said:
Of course there is. The "table" is inherent in the memory manager (aka
garbage collector). The garbage collector obviously has some mechanism
by which it finds and updates references when objects are relocated.
This mechanism is the "table" (whether it is a literal table, or simply
some other data structure that allows the GC to find the references is
irrelevant...the mechanism exists, regardless).


Who said anything about "extra added"? The only thing at issue is
whether some mechanism beyond the simple pointer exists. And some
mechanism beyond the simple pointer does exist.

Pete

Yes, there is a mechanism, but that's all in the garbage collector, as I
am saying over and over. There is no mechanism in the application code
that handles a reference any differently from a pointer.

There is type information that the garbage collector uses, but that is
nothing special for references, that information exists for all
variables. The reference is just a pointer, there is nothing more than
the pointer, and the only thing that makes it a reference is how it's used.
 

Peter Duniho

Yes, there is a mechanism, but that's all in the garbage collector, as I
am saying over and over. There is no mechanism in the application code
that handles a reference any differently from a pointer.

I never said the mechanism had to be in the application. I am simply at a
loss as to why you would invest so much time rebutting a point that I
never actually made.
There is type information that the garbage collector uses, but that is
nothing special for references, that information exists for all
variables.

Yet, the mechanism by which the garbage collector *uses* this information
is entirely specific to the implementation of references. There's a bunch
of code in there that exists for the sole purpose of updating references
when the objects are relocated. It's disingenuous to claim that there's
"nothing extra".
The reference is just a pointer, there is nothing more than the pointer,
and the only thing that makes it a reference is how it's used.

There is plenty more than the pointer. The code in the garbage collector,
and the data it uses, are all "more than the pointer".

Pete
 

Guest

Peter said:
Except when garbage collecting, of course. With the application having
just a single pointer, that means that when the GC relocates something,
it needs to go around updating all of the pointers that refer to it. If
the application had to do the double-dereference, then the GC would be
able to just update a single pointer and be done.

I presume that the thought is that garbage collection can happen
infrequently enough, and at least to some extent in the background, that
speeding the common case execution nets a gain.

Yes, as the program code has to do nothing at all to manage the
references, updating a reference is perhaps twice as efficient in .NET
as in an environment that uses, for example, reference counting. If,
let's say, 5% of the code is doing this, then you save 5% of execution
time that can be spent on garbage collection instead, which of course
is far more than it normally needs.
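As a hedged sketch (the names RefCounted and Release are invented for
illustration), this is the bookkeeping that reference counting would add
to every assignment; in .NET the whole method body below collapses to the
single store target = source:

class RefCounted
{
    public int Count = 1;

    public static void Assign(ref RefCounted target, RefCounted source)
    {
        source.Count++;           // the new referent gains a reference
        if (--target.Count == 0)  // the old one may die right here, which is
            Release(target);      // why the timing of frees is unpredictable
        target = source;
    }

    static void Release(RefCounted dead)
    {
        // free the object (and release anything it references in turn)
    }
}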
Of course, it does also
create synchronization issues, since while an object is being relocated
and all those pointers are being updated, .NET needs to ensure that no
code using that reference tries to access it.

This is done by simply freezing the application during the garbage
collection. This is normally not a problem at all, as the same thing is
done all the time by Windows on a thread level for multitasking. The
only difference between the garbage collector doing it and the Windows
dispatcher doing it is that the collector freezes all the threads in the
application. So, when an application is running in a single thread,
there is no difference at all.
Performance might be better the way .NET does it, but certainly the old
Mac-style "handle" paradigm was much simpler, and required much less
memory overhead (and back then, with memory space always being at a
premium, reducing data structure and code size was always the highest
priority). :)

Well, how much memory overhead is there really? The garbage collected
model doesn't need a table of handles, and there is no reference counter
in each object. There is just type information for the variables, and
that can't add up to much.

With the garbage-collected model, all the memory management is done by
the garbage collector. Hopefully this means that as the garbage
collector has all the information, it is able to make better decisions
about the memory management.

In a reference counting model, each object would be freed at the instant
that the reference counter reaches zero (which means that clearing a
reference is very unpredictable timewise), while in the garbage
collected model an entire heap generation is cleared all at once.
Usually this means moving away the few objects that are still in use so
that the entire heap generation can be wiped clean.

As the most active heap generation is regularly cleaned out completely,
this vastly reduces memory fragmentation.
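The generational promotion is observable from C# with real APIs (forcing
collections like this is for demonstration only, and the exact output
depends on the GC, but on a typical CLR this prints 0, 1, 2):

object survivor = new object();
Console.WriteLine(GC.GetGeneration(survivor)); // 0: freshly allocated
GC.Collect();                                  // forced sweep; survivor is promoted
Console.WriteLine(GC.GetGeneration(survivor)); // 1
GC.Collect();
Console.WriteLine(GC.GetGeneration(survivor)); // 2: the long-lived generation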
Execution of one's own code would definitely be slower, I agree. But
garbage collection could be much faster. It's just a matter of
optimizing for the most effective case (one hopes that in this scenario,
the .NET designers got it right...I assume they did :) ).

Yes, probably. As the application code is running most of the time, and
the garbage collector only occasionally (from a computer point of view),
optimising the application code seems to be the obvious choice.

Even if the garbage collector has to do ten times the work to make up
for the lack of an object handle table or a reference reference table,
that should be nothing compared to making every reference operation in
the application code twice as fast.
 

Peter Duniho

Yes, as the program code has to do nothing at all to manage the
references, that means that updating a reference is perhaps twice as
efficient in .NET as in an environment that uses for example reference
counting.

It seems you are again responding to points I did not make. I never once
brought up the question of "reference counting", nor do I find it relevant
to the discussion I'm participating in.

The .NET method is nowhere near as efficient in *updating* as the old Mac
"handles" method. Using handles, a single value needs to be updated.
That's it. It's a few instructions at most. This is WAY faster than .NET
running through all of the data structures in the memory manager, finding
those that refer to a given object and modifying them.
[...]
Performance might be better the way .NET does it, but certainly the old
Mac-style "handle" paradigm was much simpler, and required much less
memory overhead (and back then, with memory space always being at a
premium, reducing data structure and code size was always the highest
priority). :)

Well, how much memory overhead is there really?

Above and beyond what .NET already requires? Practically none in the form
of data storage, as far as I know. But that only happens because .NET has
an enormous data overhead already. It's only because each and every
variable "knows" what it's doing that the garbage collector can avoid
having to store any additional data.

However, there is still the code required to do the actual garbage
collection. Even when one discounts the data storage requirements, the
code required to navigate the existing data and update the references is
far greater than the handle method used in the old Mac OS.
With the garbage collected model, all the memory management is done by
the garbage collector. Hopefully this means that as the garbage
collector has all the information, it is able to do better descisions
about the memory management.

I see at least two major advantages to a garbage collection paradigm:

* Lazy freeing of resources means that in many cases, memory
management overhead is lower during key performance-critical code paths

* Reduced address space fragmentation (i.e. the virtual memory
equivalent to the reduced RAM fragmentation that the old Mac OS "handle"
paradigm was so important for)

The latter, of course, has only recently been very important. When the
Win32 first came along, no application came close to using the full 2GB of
virtual address space, nor did any application allocate large enough
blocks of virtual address space to risk failing due to address space
fragmentation. Things have changed, of course (though, ironically, .NET
has only started to gain a foothold just as 64-bit Windows is nearing
mainstream status, eliminating the address space fragmentation issue
again :) ).

I see as a minor benefit the memory manager's ability to use object type
information for the purpose of memory management. Yes, the additional
information is useful, but I don't see it as a huge leap. We got along
fine without taking advantage of that information...applications would
generally implement their own layer on top of the Windows memory
management if they needed, for example, to keep objects of the same type
within a single block of virtual address space (one way to help avoid
fragmentation issues).
In a reference counting model, each object would be freed at the instant
that the reference counter reaches zero (which means that clearing a
reference is very unpredictable timewise), while in the garbage
collected model an entire heap generation is cleared all at once.
Usually this means moving away the few object that still are used so
that the entire heap generation can be wiped clean.

Not that I mentioned reference counting in the first place, but yes...I
agree that the .NET paradigm by its very design avoids some of the
pitfalls of reference counting (one major one being the classic problem of
the reference count being incorrect).
As the most active heap generation is regularly cleaned out completely,
this vastly reduces memory fragmentation.

In theory, garbage collection should practically eliminate virtual
address space fragmentation (the virtual memory manager handles the
avoidance of physical memory fragmentation). Only objects that have been
locked will present a problem, and one hopes that an application is wise
enough to minimize those. That said, I don't see that as relevant to the
question of reference counting (since you brought it up). That is, even a
garbage collection scheme based on reference counting can avoid
fragmentation in the same way. It's the garbage collection that is
important, not how memory objects are tracked.
[...]
Yes, probably. As the application code is running most of the time, and
the garbage collector only occasionally (from a computer point of view),
optimising the application code seems to be the obvious choice.

One hopes. The obvious counter-example is where the garbage collection,
though it happens infrequently, takes an inordinate amount of time.
Presumably this isn't the case in actual implementation, but the "most of
the time" versus "only occasionally" difference in no way guarantees a
desirable performance outcome.
Even if the garbage collector has to do ten times the work to make up
for the lack of an object handle table or a reference reference table,
that should be nothing compared to making every reference operation in
the application code twice as fast.

Well, I haven't measured, but I'd say the garbage collector has to work a
LOT harder than just ten times, at least compared to the old Mac OS
"handle" paradigm.

In the "handle" paradigm, the garbage collector is basically O(n); the
collector can simply scan through the handle table, coalescing the objects
and updating each handle once. In the .NET paradigm, the collector has to
either scan the entire collection of data for each object it wants to
move, or it has to maintain some sort of hash table or other fast-access
data structure in which it stores (at least temporarily) all of the
current references to each given object (which is essentially the
"reference reference table" method anyway). In the former case, that's
essentially an O(n^2) algorithm, which as you know is much slower than an
O(n) algorithm. In the latter case, the premise that the garbage
collector requires no extra data is invalidated (in other words, is
irrelevant to this discussion, since you have made the claim that no extra
data is required by the .NET scheme).

(The O() above obviously doesn't take into account the actual copying of
data when moving objects...that work is identical regardless of the
collection scheme, so I don't find it relevant).
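To illustrate the O(n) claim, here's a sketch of handle-table compaction
(Slot and the flat byte[] heap are invented stand-ins, not real Mac OS
structures):

class Slot { public bool InUse; public int Offset; public int Size; }

static void Compact(Slot[] table, byte[] heap)
{
    int dest = 0;
    foreach (var slot in table)       // one pass over the handle table
    {
        if (!slot.InUse) continue;
        Array.Copy(heap, slot.Offset, heap, dest, slot.Size); // slide object down
        slot.Offset = dest;           // update the ONE master pointer
        dest += slot.Size;
    }
}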

I don't actually know which scheme is used by .NET, but it's clear to me
that however it is done, garbage collection is a lot more costly under
.NET than it would be under a "handle" scheme. Likely way more than just
10x difference. Do we still come out ahead? Probably. Heck, I have to
assume so because I love using .NET and would hate to learn that it's
moved backwards in any way. :) But I can't say that it's an obvious
given.

Pete
 

Zytan

Even if the garbage collector has to do ten times the work to make up
for the lack of an object handle table or a reference reference table,
that should be nothing compared to making every reference operation in
the application code twice as fast.

And in most cases, I bet there is only 1 reference to the object, so
if the GC really needs to move things, which I imagine it doesn't very
often, there's only 1 32-bit pointer it needs to update.

You guys are very knowledgeable, and are really speaking the same
language. I think any argument between you is more about
definitions. I tend to think like Goran, in the way that, from the
program's perspective (or from the CPU's perspective), a pointer is
just a 32-bit address stating where an object is. Yes, there's "more"
to it on the GC side, but there's not "more" to it in terms of the CPU
saying "hey, here's a pointer, let's go to the address it has stored
to get my object". And in that manner, C# is FAST.

Yes, possibly, it is doubly-indirected, meaning that the GC need only
update ONE reference to it, rather than potentially many, but how
often does that occur, and how much quicker would that have to be
to overcome the slowdown of double indirection for every reference?

I do have faith that the C# designers, being so smart as they are, got
it right.

Thanks for the very informative discussion, guys. I am enjoying this.

Zytan
 

Zytan

I don't actually know which scheme is used by .NET, but it's clear to me
that however it is done, garbage collection is a lot more costly under
.NET than it would be under a "handle" scheme. Likely way more than just
10x difference. Do we still come out ahead? Probably. Heck, I have to
assume so because I love using .NET and would hate to learn that it's
moved backwards in any way. :) But I can't say that it's an obvious
given.

Pete, consider that it is optimized for the most gain as a whole.
Meaning, if most often, things are only referenced once or twice, then
it's more important to make that case fast than anything else, since
that's where the speed will be gained or lost. 10x to update a single
reference? Yes, I know that single reference must be stored somehow
that allows other references to be in there, as well, in some kind of
data structure, or hash, or something, but, still. And, as you said,
even if it was 10x as bad as the "handle" scheme, we're probably still
ahead.

Also, isn't it unfair to compare it to the "handle" scheme? Shouldn't
you be comparing it to whatever other alternatives C# could have
used? Or, are you claiming the "handle" scheme IS something C# could
have used (I may have missed that)?

Zytan
 

Willy Denoyette [MVP]

Zytan said:
And in most cases, I bet there is only 1 reference to the object, so
if the GC really needs to move things, which I imagine it doesn't very
often, there's only 1 32-bit pointer it needs to update.

[...]


Sorry, I don't want to sound rude, but you got it wrong: each object reference is held in a
program variable, and this variable can actually exist in a register, on the stack, in the
Finalizer list or in the Handle table.
N references can point to the same object in the GC heap; see sample [1].
The JIT helps the GC by updating a table (the GCInfo table) in which it stores the
liveness state of the variables holding object references at JIT compile time (per method).
Note that the JIT doesn't keep track of this for each machine instruction; only the points
that can possibly trigger a GC are kept in the GCInfo table.
All the GC has to do is inspect the GCInfo table and start walking the stack(s) and the
handle table to find the references to dead objects and reset those references (set them to
null). When done, it can start a compaction of the heap, thereby updating the live
references of the moved objects in the stack and the handle table. Note that I've left out
some details, but at large that's it.

[1]
class C {
    int i;
}
...
void Foo()
{
    C c = new C();
    C c1 = c;
    // (1)
    ...
}
Here at point (1), the stack (or registers) will hold (at least) two references to the same
instance of C.

Willy.
 

Zytan

Sorry, I don't want to sound rude, but you got it wrong,

Telling me I'm wrong is not rude, so don't worry :)
each object reference is held in a
program variable, and this variable can actually exist in a register, on the stack, in the
Finalizer list or in the Handle table.
Ok.

N references can point to the same object in the GC heap, see sample [1].
Yes.

The JIT helps the GC by updating a table (the GCInfo table) in which it stores the
liveness state of the variables holding object references at JIT compile time (per method).
Ok.

Note that the JIT doesn't keep track of this for each machine instruction; only the points
that can possibly trigger a GC are kept in the GCInfo table.
All the GC has to do is inspect the GCInfo table and start walking the stack(s) and the
handle table to find the references to dead objects and reset those references (set them to
null). When done, it can start a compaction of the heap, thereby updating the live
references of the moved objects in the stack and the handle table. Note that I've left out
some details, but at large that's it.

[1]
class C {
    int i;
}
...
void Foo()
{
    C c = new C();
    C c1 = c;
    // (1)
    ...
}

Here at point (1), the stack (or registers) will hold (at least) two references to the same
instance of C.

Yes.

Ok, while my explanation was very general, and as far as I can tell,
you explained the same thing, except in much more technical detail
(details that I didn't know). In general, this is what I suspected
was going on. Maybe I was unclear.

Thanks,

Zytan
 

Guest

Peter said:
I never said the mechanism had to be in the application. I am simply at
a loss as to why you would invest so much time rebutting a point that I
never actually made

Because I keep explaining that there is nothing more than a pointer in a
reference, and you keep saying that there is more.
Yet, the mechanism by which the garbage collector *uses* this
information is entirely specific to the implementation of references.
There's a bunch of code in there that exists for the sole purpose of
updating references when the objects are relocated. It's disingenuous to
claim that there's "nothing extra".

I have never said that there is no code for handling references. On the
contrary I have tried to explain this to you over and over.
There is plenty more than the pointer. The code in the garbage
collector, and the data it uses, are all "more than the pointer".

No, there is nothing more than a pointer in a reference. Nothing. Not a
single bit.

The code in the garbage collector is not stored inside every reference,
neither is all the data that the garbage collector uses.
 

Guest

Peter said:
It seems you are again responding to points I did not make. I never
once brought up the question of "reference counting", nor do I find it
relevant to the discussion I'm participating in.

I never said that you mentioned reference counting.

Do you really think that you are the only one that decides what the
discussion is about, and that no one else may mention anything that you
haven't brought up? ;)

I brought up the reference counting model as it's one of the major
alternatives to a garbage collection model.
The .NET method is nowhere near as efficient in *updating* as the old
Mac "handles" method. Using handles, a single value needs to be
updated. That's it. It's a few instructions at most. This is WAY
faster than .NET running through all of the data structures in the
memory manager, finding those that refer to a given object and modifying
them.

Yes, the garbage collection does far more work, but I was talking about
the work that the application code does.
[...]
Even if the garbage collector has to do ten times the work to make up
for the lack of an object handle table or a reference reference table,
that should be nothing compared to making every reference operation in
the application code twice as fast.

Well, I haven't measured, but I'd say the garbage collector has to work
a LOT harder than just ten times, at least compared to the old Mac OS
"handle" paradigm.

In the "handle" paradigm, the garbage collector is basically O(n); the
collector can simply scan through the handle table, coalescing the
objects and updating each handle once. In the .NET paradigm, the
collector has to either scan the entire collection of data for each
object it wants to move, or it has to maintain some sort of hash table
or other fast-access data structure in which it stores (at least
temporarily) all of the current references to each given object (which
is essentially the "reference reference table" method anyway).

The garbage collector actually does neither of what you claim that it
has to. ;)

It only scans the data that is reachable from the application, not all
the data. There is no reason to look for references in objects that the
application can't reach anyway.
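In sketch form (invented types, not the CLR's actual tracing code), a
textbook mark pass only ever visits what it can reach from the roots, so
dead objects are never even looked at:

class ObjNode { public List<ObjNode> References = new List<ObjNode>(); }

static void Mark(ObjNode root, HashSet<ObjNode> reached)
{
    if (root == null || !reached.Add(root))
        return;                       // null or already marked: stop here
    foreach (var child in root.References)
        Mark(child, reached);         // follow only reachable references
}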
 

Willy Denoyette [MVP]

Zytan said:
[...]
And in most cases, I bet there is only 1 reference to the object, so
if the GC really needs to move things, which I imagine it doesn't very
often, there's only 1 32-bit pointer it needs to update.
Above is what I called "wrong":
- there is at least one reference to an object, not only one; most of the time there are
even two or three root references, even if your application only holds a single reference
to a single instance. And as I said, a single instance can have multiple
application-created references.
- and the GC runs more often than some might expect (or had wished :)); it all depends on
your allocation scheme. Remember, this is not C++; almost everything is an object in .NET.
In a simple console application, thousands of objects have already been created and the GC
has already made a sweep before the CLR enters the main entry point.


Willy.
 

Peter Duniho

Because I keep explaining that there is nothing more than a pointer in a
reference, and you keep saying that there is more.

Huh? You can only rebut my post if you introduce new concepts to disagree
with, ones I never mentioned in the first place? That should tell you
something.
I have never said that there is no code for handling references. On the
contrary I have tried to explain this to you over and over.

You continue to claim that a reference is nothing more than a pointer.
But it is more than that.

If I believe your claim that you cannot consider any of the other things
surrounding that reference when considering whether it is more than a
pointer, then it's not even a pointer. It's just 32 bits. It's not a
pointer at all. It's just pure data, without any context whatsoever.

Conversely, once you start interpreting data as a specific kind of data,
you need to consider everything that interprets it. And a reference
definitely is not simply interpreted just as a pointer. It's a very
special kind of value that holds very special meaning beyond simply
describing a specific place in the virtual address space.

Likewise, using the logic you're using, an old Mac OS handle is "just a
pointer" also. After all, all the handle does is point to a place in
memory.
No, there is nothing more than a pointer in a reference. Nothing. Not a
single bit.

I don't think I need to reply to that statement one more time. You
already know what I'm going to write.
The code in the garbage collector is not stored inside every reference,
neither is all the data that the garbage collector uses.

I never said any of that was stored within a reference. Again, you
introduce entirely new concepts to disagree with, just for the sake of
rebuttal.

Pete
 

Peter Duniho

Pete, consider that it is optimized for the most gain as a whole.

One hopes (and I assume). However, it's not a given. If it turns out
that the memory scheme used by .NET is less efficient than other possible
schemes, well...let's just say that wouldn't be the first time some major
blunder was made when creating some computer architecture.

Most of the time, people get it right. But it's not a given that they
always do. I've been around long enough to know better than to just
assume that they have.
Meaning, if most often, things are only referenced once or twice, then
it's more important to make that case fast than anything else, since
that's where the speed will be gained or lost.

Huh? I think you have that backwards. If something is referenced only
once or twice, then it doesn't matter how long it takes to resolve that
reference. It's when something is referenced a huge number of times that
the cost to resolve the reference is significant.
10x to update a single
reference? Yes, i know that single reference must be stored somehow
that allows other references to be in there, as well, in some kind of
data structure, or hash, or something, but, still. And, as you said,
even if it was 10x as bad as the "handle" scheme, we're probably still
ahead.

Well, except that I suspect that the difference in the relocation
algorithm is greater than 10x.
Also, isn't it unfair to compare it to the "handle" scheme? Shouldn't
you be comparing it to whatever other alternative the C# could have
used? Or, are you claiming the "handle" scheme IS something C# could
have used (I may have missed that)?

.NET (C# is just the compiler) certainly could have used the exact same
handle scheme. There's nothing fundamentally wrong with it. The major
disadvantage it has, of course, is that it does complicate resolving
references to objects. And it would require the compiler to do more
work. It introduces a variety of extra complexities in the compiled code
(e.g., having to double-dereference all the time, having to remember to lock
the object when modifying the data, and the scheme wouldn't work nearly so
well in a true multitasking environment as it did in the cooperative
multitasking environment used in the old Mac OS), as a cost of simplifying
the garbage collection.
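As a hedged sketch of that extra ceremony (IHandleTable, Lock, Unlock and
Deref are invented here; the real Mac OS analogues were Handle, HLock and
HUnlock):

interface IHandleTable
{
    void Lock(int handle);     // pin the object so compaction won't move it
    void Unlock(int handle);   // unpin; it may move again afterwards
    object Deref(int handle);  // dereference #1: read the master pointer
}

static int NameLength(IHandleTable table, int handle)
{
    table.Lock(handle);
    try
    {
        var s = (string)table.Deref(handle); // every member access adds
        return s.Length;                     // dereference #2
    }
    finally
    {
        table.Unlock(handle);
    }
}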

Pete
 

Peter Duniho

I never said that you mentioned reference counting.

Do you really think that you are the only one that decides what the
discussion is about, and that noone else may mention anything that you
haven't brought up? ;)

When you are responding to points I've made (indicated by quoting my
post), then yes...it makes sense that you should limit your comments to
things that I've brought up. Or at least acknowledge that you are
introducing a new concept unrelated to the text to which you have
indicated you are commenting on.
[...]
The garbage collector actually does neither of what you claim that it
has to. ;)

I have not once claimed that the garbage collector does anything in
particular. I don't know anything about the inner workings of the garbage
collector other than what you and others have posted, and I'm making the
assumption that you and others know what you are talking about. Granted,
it's not always wise to make assumptions, but at this point it's all I've
got.
It only scans the data that is reachable from the application, not all
the data. There is no reason to look for references in objects that the
application can't reach anyway.

The garbage collector needs a map either of the objects allocated, or the
references in the application. Either way, it has to do a lot more work
than a handle-oriented scheme would (which does not need any information
at all about who is holding a reference to anything).

Pete
 

Guest

Peter said:
I have not once claimed that the garbage collector does anything in
particular. I don't know anything about the inner workings of the
garbage collector other than what you and others have posted, and I'm
making the assumption that you and others know what you are talking
about. Granted, it's not always wise to make assumptions, but at this
point it's all I've got.

Ok, then let me quote you:

"In the .NET paradigm, the collector has to either scan the entire
collection of data for each object it wants to move, or it has to
maintain some sort of hash table or other fast-access data structure in
which it stores (at least temporarily) all of the current references to
each given object (which is essentially the "reference reference table"
method anyway)."

There you claim that there are only two possible ways for the garbage
collector to work.

I agree with you that it's not always wise to make assumptions. When you
do, it might be a good idea to explain that it's just an assumption, so
that no one makes the mistake of thinking that you know what you are
talking about. ;)
 

Guest

Peter said:
Huh? You can only rebut my post if you introduce new concepts to
disagree with, ones I never mentioned in the first place? That should
tell you something.

What new concept? Are you confused about what you have said yourself also?
You continue to claim that a reference is nothing more than a pointer.
But it is.

No, it's not.
If I believe your claim that you cannot consider any of the other things
surrounding that reference when considering whether it is more than a
pointer, then it's not even a pointer. It's just 32 bits. It's not a
pointer at all. It's just pure data, without any context whatsoever.

The difference is that the application code uses a pointer as a pointer,
not just 32 bits of data. Do you really fail to see this difference?

The concept of a pointer is not handled only by the compiler and the
memory management, the way the concept of a reference is.
Conversely, once you start interpreting data as a specific kind of data,
you need to consider everything that interprets it. And a reference
definitely is not simply interpreted just as a pointer. It's a very
special kind of value that holds very special meaning beyond simply
describing a specific place in the virtual address space.

Yes, a reference truly is interpreted just as a pointer. There is
nothing in the application code that handles it differently from a
pointer in any way.
Likewise, using the logic you're using, an old Mac OS handle is "just a
pointer" also. After all, all the handle does is pointer to a place in
memory.

So you can take the handle and use it as a pointer? I thought that the
handle was a handle, not a pointer.
I don't think I need to reply to that statement one more time. You
already know what I'm going to write.

No, I don't really know what you are going to write. Are you going to
claim that there is more inside a reference than a pointer? That the
size of a reference is more than the size of a pointer?
I never said any of that was stored within a reference. Again, you
introduce entirely new concepts to disagree with, just for the sake of
rebuttal.

I am just trying to understand what it is that you find so hard to
understand about references. You keep saying that there is more than
just a pointer, and only mention the garbage collector code and the data
it uses. If you don't believe that the garbage collector code is inside
each reference, what is it that you think is there?
 

Zytan

Huh? I think you have that backwards. If something is referenced only
once or twice, then it doesn't matter how long it takes to resolve that
reference. It's when something is referenced a huge number of times that
the cost to resolve the reference is significant.

I am speaking about updating reference pointers when the GC moves the
object, not the run-time dereferencing of those pointers. So, I mean:
"only referenced once or twice" = the number of C# references to the
object is only 1 or 2. This occurs much more often than the number of
C# references to an object being >= 3. Thus, to optimize for speed, it
should primarily be concerned with how to implement changing
pointers (when the GC moves the data) for cases where only one or two
references to the object exist.
Well, except that I suspect that the difference in the relocation
algorithm is greater than 10x.

I highly doubt that whatever solution they used to update the two or so
pointers would take 10x longer than updating a single handle/pointer.
Possible, but very unlikely.

Considering how often GC causes this to happen is also an issue. If
it hardly ever happens, it really doesn't matter if it takes 100x as
long. As long as the code itself thinks pointers are just pointers
during the time that the GC isn't moving stuff around. THAT's what
happens most of the time of all these things, so that's what's
important for speed. This is why people say pointers are just
pointers. Because, to the code, they likely are. The GC just pauses
the universe to change things around once in a while, and then lets it
run, and the code doesn't know anything different. A pointer is just a
pointer. Hello speed. Everything else becomes much more irrelevant
for speed issues.

Zytan
 

Göran Andersson

Zytan said:
I highly doubt that whatever solution they used to update the two or so
pointers would take 10x longer than updating a single handle/pointer.
Possible, but very unlikely.

Updating the references is just looping through a list and writing the
new value to the reference, so that is really a minor part of the
garbage collection process.
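In sketch form (RefSlot is invented for illustration), the update phase is
nothing more than:

class RefSlot { public IntPtr Value; }

static void UpdateReferences(List<RefSlot> slots, IntPtr newAddress)
{
    foreach (var slot in slots)    // every slot found via the roots/GCInfo walk
        slot.Value = newAddress;   // write the object's new location
}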

It takes a bit more work to find the references in the first place, but
that is also quite a small amount of work compared to actually moving
the objects, which is the same regardless of how they are referenced.
Considering how often GC causes this to happen is also an issue. If
it hardly ever happens, it really doesn't matter if it takes 100x as
long.

Actually it happens quite often, but only to a small part of the objects
that are handled. :)

In a garbage collection, all objects in a heap generation that are still
in use have to be moved to the next generation. Most objects are
short-lived, so they will already be out of use at the time of their
first garbage collection sweep.
As long as the code itself thinks pointers are just pointers
during the time that the GC isn't moving stuff around. THAT's what
happens most of the time of all these things, so that's what's
important for speed. This is why people say pointers are just
pointers. Because, to the code, they likely are.

Not just likely. :)
 
