ArrayList vs. List<>

Z

Zytan

http://en.wikipedia.org/wiki/Strong_typing

"Programming language expert Benjamin C. Pierce ... has said: "I spent
a few weeks... trying to sort out the terminology of "strongly typed,"
"statically typed," "safe," etc., and found it amazingly difficult....
The usage of these terms is so various as to render them almost
useless.""

Wow.

Zytan
 
N

Nicholas Paldino [.NET/C# MVP]

When storing value types in an ArrayList, the boxing overhead makes the
ArrayList perform approximately 2x as slow. For reference types, a List<T>
is still faster, but only by about 5-10%.
 
J

Jon Harrop

Cor said:
"> Alvin, thanks, so List<T> is faster,

No it is in fact slower, a strongly typed type has always extra code.
However it is type safe and helps you to make more secure programs.

No. Static typing results in fewer run-time checks and faster code as well
as improved reliability.
 
P

Peter Duniho

Or if you need to have a list of objects which are not all of the same
type - that won't jive in a List<T> :)

As Goran points out, if all of the objects have the same base type, then
you just use List<T> where T is the least-derived type common to all of
the objects.

If all of the objects don't have the same base type, I would question
whether they really ought to be found in the same array. :)

And yes, I suppose you can always fall back to the Object base class, but
if that's all the objects have in common I think it would be unusual for
them to be collected together (one obvious exception being something that
involves itself specifically with the Object-ness of the objects, such as
the garbage collection for example).

Pete
 
M

Marc Scheuner

Oh, but it will. Just make a List<object>. :)

True - but then you've basically thrown away all the benefits of a
type-safe list :) I see your point though

Marc
 
J

Jon Davis

Cor Ligthert said:
"> Alvin, thanks, so List<T> is faster,

No it is in fact slower, a strongly typed type has always extra code.

Where did you come up with that idea? Strong typing enforces type checking,
but that's faster than unboxing. ArrayList is nothing but a List<object>
with different semantics. Returning a string that was boxed to System.Object
and then casted back to a string is slower than returning a string that was
never boxed.

Jon
 
F

Frans Bouma [C# MVP]

Jon said:
Where did you come up with that idea? Strong typing enforces type
checking, but that's faster than unboxing. ArrayList is nothing but a
List<object> with different semantics. Returning a string that was
boxed to System.Object and then casted back to a string is slower
than returning a string that was never boxed.

Strings aren't boxed, they're reference types under the hood.

Generics aren't always faster. A good example is the Dictionary<K,V>
class, which is slower than the Hashtable in many situations.

FB

--
------------------------------------------------------------------------
Lead developer of LLBLGen Pro, the productive O/R mapper for .NET
LLBLGen Pro website: http://www.llblgen.com
My .NET blog: http://weblogs.asp.net/fbouma
Microsoft MVP (C#)
------------------------------------------------------------------------
 
J

Jon Harrop

Frans said:
Strings aren't boxed

Can you elaborate on this? Presumably they end up getting passed as a
pointer to a char array?
Generics aren't always faster. A good example is the Dictionary<K,V>
class, which is slower than the Hashtable in many situations.

I just timed this and the generic Dictionary was much faster than the
non-generic Hashtable for double->double mappings (timings are best of 3):

Insert:
Hashtable 31.0s
Dictionary 4.90s

Fetch:
Hashtable 6.09s
Dictionary 3.20s

Can you cite some benchmark results where Hashtable is faster?
 
J

Jon Skeet [C# MVP]

Can you elaborate on this? Presumably they end up getting passed as a
pointer to a char array?

They end up getting passed in the same way as any other reference
type. The reference is passed by value.

Strings aren't just reference types "under the hood" - they're
reference types from every perspective. They happen to be immutable,
of course, but that's a separate matter.

Jon
 
J

Jon Harrop

Cor said:
"> Alvin, thanks, so List<T> is faster,

No it is in fact slower

Can you cite anything to back this up? All I can find is overwhelming
evidence to the contrary.
a strongly typed type has always extra code.

Are you referring to IL or native code? Can you cite a reference on this
because it doesn't make any sense to me. For one thing, surely it depends
entirely upon the compilers.
However it is type safe

Are they not both type safe?
If you use the arraylist only inside a method with one attribute and let
it go out of scoop afterwards, the arraylist is probably the fastest.

Why?
 
C

Christof Nordiek

Jon Harrop said:
Right. So in what sense are they not "boxed"?
They are "boxed" as any reference type always is "boxed" as they never occur
unboxed.
So while you *could* say, they are boxed, no boxing takes place, when
casting from string to object and no unboxing, but only typechecking takes
place, when casting from object to string.

Note, that the word "boxed" seldom to never is used in the way I used it
above but only in connection with value types, where boxing occurs.

Christof
 
J

Jon Skeet [C# MVP]

Right. So in what sense are they not "boxed"?

In the sense that boxing only applies to value types in the first
place. The process of boxing take a value type value, creates an
object on the heap containing that value, and returns a reference to
that object.

None of that happens when you're using a reference type to start with.

Jon
 
J

Jon Harrop

Jon said:
In the sense that boxing only applies to value types in the first
place. The process of boxing take a value type value, creates an
object on the heap containing that value, and returns a reference to
that object.

None of that happens when you're using a reference type to start with.

So C# programmers don't regard reference types as boxed. I didn't know that.

Frans' statement seems to have been about referential transparency then.
 
J

Jon Skeet [C# MVP]

Jon Harrop said:
So C# programmers don't regard reference types as boxed. I didn't
know that.

Well, it's not just a C# thing - boxing/unboxing is a CLR concept too.
(FWIW, the same terminology applies in Java - I've never heard of
reference-to-reference conversions being described as boxing before.)
Frans' statement seems to have been about referential transparency
then.

Do you mean Jon Davis' statement? The only statement I saw Frans making
was to say that strings aren't boxed because they're already reference
types.
 
J

Jon Harrop

Jon said:
Well, it's not just a C# thing - boxing/unboxing is a CLR concept too.
(FWIW, the same terminology applies in Java - I've never heard of
reference-to-reference conversions being described as boxing before.)

I'm from a functional programming background most recently and, to me,
boxing means referring to a block of memory by pointer and it is only used
in reference to the internal run-time representations of a language
implementation.

For example, if you write a complex number implementation in C# then (to me)
using a class results in boxing and using a struct avoids boxing, i.e. an
array of complex numbers is represented as a float array internally in the
case of a struct Complex, but an array of pointers to C-like structs in the
case of an object Complex.

Moreover, the ubiquity of referential transparency in functional programming
languages leads to the notion of boxing only being useful when discussing
performance: it makes no difference to the semantics whatsoever.

In the case of C# (and the CLR), the word "box" seems to have quite a
different meaning where (correct me if I'm wrong) some types have been
given special status as so-called "value" types (probably in the interests
of performance) that are not inherited from the almost-universal type
Object, and regaining the flexibility of a universal type by wrapping a
value type in an object is referred to as "boxing".

I've done a little C# and Java but most of my work is in OCaml, F# and C++.
This is all hidden in F# because all value types are immutable and
parametric polymorphism (generics) obviates the need for a universal type.
This does appear sometimes when you want to pass a parameter to a
dynamically-typed C# interface (e.g. Office automation) and you must
write "box 1" instead of "1" because the value type "int" does not unify
with the universal type "obj".
Do you mean Jon Davis' statement? The only statement I saw Frans making
was to say that strings aren't boxed because they're already reference
types.

Frans' statement really confused me because I understood the first half to
mean that strings are handled as an inline char array (unboxed) but the
second half means that strings are handled as a pointer to a char array.

In fact, I think he was saying that strings are immutable values passed by
reference. That's pretty much the simplest decent representation of a
string so it makes sense. If you wanted to handle huge strings with
variable-length chars then a more sophisticated data structure could be
swapped in transparently.
 
J

Jon Skeet [C# MVP]

Jon Harrop said:
I'm from a functional programming background most recently and, to me,
boxing means referring to a block of memory by pointer and it is only used
in reference to the internal run-time representations of a language
implementation.

Right. It's certainly nothing like that.
In the case of C# (and the CLR), the word "box" seems to have quite a
different meaning where (correct me if I'm wrong) some types have been
given special status as so-called "value" types (probably in the interests
of performance) that are not inherited from the almost-universal type
Object, and regaining the flexibility of a universal type by wrapping a
value type in an object is referred to as "boxing".

Well, it's not just about performance. See
http://pobox.com/~skeet/csharp/memory.html
and
http://pobox.com/~skeet/csharp/references.html
for more description - but there may be "gotchas" in there where the
same word is used for different meanings in functional code.
Frans' statement really confused me because I understood the first half to
mean that strings are handled as an inline char array (unboxed) but the
second half means that strings are handled as a pointer to a char array.

In fact, I think he was saying that strings are immutable values passed by
reference. That's pretty much the simplest decent representation of a
string so it makes sense. If you wanted to handle huge strings with
variable-length chars then a more sophisticated data structure could be
swapped in transparently.

Strings are indeed immutable, but they're not passed by reference in
the formal description of those semantics. Instead, the reference is
passed by value.

See http://pobox.com/~skeet/csharp/parameters.html for a rather more
detailed description.
 
B

Barry Kelly

Jon said:
I'm from a functional programming background most recently and, to me,
boxing means referring to a block of memory by pointer and it is only used
in reference to the internal run-time representations of a language
implementation.

Because functional languages typically use immutable values in a GC
environment, whether a type is implemented as a value type passed by
value or by reference, or as a heap-allocated reference type passed by
value, it does indeed matter little which representation is used, and
boxing there would be an implementation detail. Not so when values can
be arbitrarily modified - modifying a copy is quite different from
modifying a reference.

-- Barry
 
J

Jon Skeet [C# MVP]

Jon Harrop said:
Which brings me to another question: how is passing a reference by value not
pass-by-reference?


I'll check it out, thanks for the links.

The last link should answer your previous question - if it doesn't,
I'll explain in more detail (and update the article). Please let me
know either way!
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top