ArrayList without Boxing and UnBoxing

P

Peter Olcott

Arne Vajhøj said:
Both ArrayList and List<> has a constructor with an int argument
specifying initial capacity.

Arne

These can refer to any type of structure?
Which of these is closest to std::vector?
 
B

Bruce Wood

Peter said:
Can allocating an ordinary array, be done through the managed heap? If it can
not be done using the managed heap, then wouldn't allocating unmanaged memory be
considered a poor practice in terms of good .NET design?

Who said anything about unmanaged memory? I'm talking about using a
regular, boring managed array. Fixed-size arrays in .NET can be of any
type, including value types, and so required no boxing and unboxing in
order to store value types. There is no need to use unmanaged arrays.
There are two different facets to my problem, one requiring the array to
dynamically grow, and the other most time critical one, knows its size in
advance.

The use a dynamic structure for the first and a vanilla fixed-size
array for the second. Using a fixed-size array will certainly bring
speed benefits.
Even the one that is required to dynamically grow, must do this
relatively quickly, thus can not take the boxing / unboxing overhead. I wanted
to see if I could adapt my system to .NET using my current compiler, the answer
is no. The next level question is can my system be adapted to .NET at all, the
answer is yes if I use generics.

No, the answer is even better than that. The answer is that you can
achieve a high-speed dynamic structure in .NET 1.1, but you have to
program it yourself. Remember that ArrayList is just a wrapper around
an array that, when you attempt to add one element too many,
reallocates the array and copies the contents. You can write your own
ArrayList specific to your value type that will have no boxing /
unboxing overhead. The only thing that .NET 2.0 gives you is the
ability to use the dynamic data structures provided by the .NET
Framework without incurring boxing / unboxing overhead.
 
P

Peter Olcott

Bruce Wood said:
Who said anything about unmanaged memory? I'm talking about using a
regular, boring managed array. Fixed-size arrays in .NET can be of any
type, including value types, and so required no boxing and unboxing in
order to store value types. There is no need to use unmanaged arrays.


The use a dynamic structure for the first and a vanilla fixed-size
array for the second. Using a fixed-size array will certainly bring
speed benefits.


No, the answer is even better than that. The answer is that you can
achieve a high-speed dynamic structure in .NET 1.1, but you have to
program it yourself. Remember that ArrayList is just a wrapper around
an array that, when you attempt to add one element too many,
reallocates the array and copies the contents. You can write your own
ArrayList specific to your value type that will have no boxing /
unboxing overhead. The only thing that .NET 2.0 gives you is the
ability to use the dynamic data structures provided by the .NET
Framework without incurring boxing / unboxing overhead.

Actually it must be any array of struct of scalar types, 8-bit, 16-bit, and
32-bit integers, all unsigned.

struct MyType {
uint One;
ushort Two;
byte Three;
};

How do I go about creating the equivalent of a std::vector<MyType> that does not
ever have boxing and unboxing overhead?
 
?

=?ISO-8859-1?Q?Arne_Vajh=F8j?=

Peter said:
These can refer to any type of structure?
Which of these is closest to std::vector?

ArrayList uses object meaning that it can take anything
but do boxing and unboxing.

List<> uses whatever you use including struct (that is how
generics work).

Arne
 
?

=?ISO-8859-1?Q?Arne_Vajh=F8j?=

Peter said:
Can allocating an ordinary array, be done through the managed heap? If it can
not be done using the managed heap, then wouldn't allocating unmanaged memory be
considered a poor practice in terms of good .NET design?

A normal array in C# is in managed heap.

And there are nothing wrong with using normal arrays in C#
if you know the size.

Arne
 
?

=?ISO-8859-1?Q?Arne_Vajh=F8j?=

Peter said:
Actually it must be any array of struct of scalar types, 8-bit, 16-bit, and
32-bit integers, all unsigned.

struct MyType {
uint One;
ushort Two;
byte Three;
};

How do I go about creating the equivalent of a std::vector<MyType> that does not
ever have boxing and unboxing overhead?

If you know the size:

MyType[] myarray = new MyType[noelm];

If you don't know the size and are on .NET 2.0:

List<MyType> mylist = new List<MyType>();

Arne
 
P

Peter Olcott

Arne Vajhøj said:
Peter said:
Actually it must be any array of struct of scalar types, 8-bit, 16-bit, and
32-bit integers, all unsigned.

struct MyType {
uint One;
ushort Two;
byte Three;
};

How do I go about creating the equivalent of a std::vector<MyType> that does
not ever have boxing and unboxing overhead?

If you know the size:

MyType[] myarray = new MyType[noelm];

If you don't know the size and are on .NET 2.0:

List<MyType> mylist = new List<MyType>();

Arne

Bruce said that this could be done using .NET 1.1
 
?

=?ISO-8859-1?Q?Arne_Vajh=F8j?=

Peter said:
Arne Vajhøj said:
If you know the size:

MyType[] myarray = new MyType[noelm];

If you don't know the size and are on .NET 2.0:

List<MyType> mylist = new List<MyType>();

Arne

Bruce said that this could be done using .NET 1.1
you can achieve a high-speed dynamic structure in .NET 1.1, but you have to
program it yourself.

You can not make a fast generic one in 1.1, but you can make a fast
one specific for your type.

Arne
 
P

Peter Olcott

Arne Vajhøj said:
Peter said:
Arne Vajhøj said:
If you know the size:

MyType[] myarray = new MyType[noelm];

If you don't know the size and are on .NET 2.0:

List<MyType> mylist = new List<MyType>();

Arne

Bruce said that this could be done using .NET 1.1
you can achieve a high-speed dynamic structure in .NET 1.1, but you have to
program it yourself.

You can not make a fast generic one in 1.1, but you can make a fast
one specific for your type.

Arne

That is all that I need. How hard would this be? I already wrote a complete
std::vector from scratch, (for a C++ compiler lacking this capability) does it
require this same amount of programming?
 
?

=?ISO-8859-1?Q?Arne_Vajh=F8j?=

Peter said:
That is all that I need. How hard would this be? I already wrote a complete
std::vector from scratch, (for a C++ compiler lacking this capability) does it
require this same amount of programming?

Probably less.

If you only need a constructor + an Add method + an indexer
then it can not be many lines of code.

Arne
 
P

Peter Olcott

Arne Vajhøj said:
Probably less.

If you only need a constructor + an Add method + an indexer
then it can not be many lines of code.

Arne
It also must dynamically re-allocate when it needs to grow in size.
 
J

Jeff Louie

Just to be clear, I believe std::vector is finally available in C++/CLI.

Regards,
Jeff
 
B

Bruce Wood

Peter said:
It also must dynamically re-allocate when it needs to grow in size.

I do not vouch for the following: I wrote it off the top of my head, so
the syntax may not even be 100%, but to give you an idea:

public class MyValueTypeVector
{
private MyValueType[] _vector;
private int _upperBound;

public MyValudTypeVector() : this(10) { }

public MyValueTypeVector(int initialCapacity)
{
this._vector = new MyValueType[initialCapacity];
this._upperBound = 0;
}

public MyValueType this[int index]
{
get { return this._vector[index]; }
set { this._vector[index] = value; }
}

public int Length { get { return this._upperBound; } }

public void Add(MyValueType newMember)
{
if (this._upperBound >= this._vector.Length - 1)
{
MyValueType newVector[] = new
MyValueType[this._vector.Length * 2];
for (int i = 0; i < this._vector.Length; i++)
{
newVector = this._vector;
}
this._vector = newVector;
}
this._vector[this._upperBound] = newMember;
this._upperBound += 1;
}
}

Some notes:

1. Notice that the indexer does no bounds check. That is, it is
possible for client code to read/write "past the end of" the array"
(above the upperBounds limit) so long as it doesn't write outside the
physical capacity of the allocated array. This is for speed. Adding the
check will cost one additional comparison per reference / assignment,
and would look like this:

public MyValueType this[int index]
{
get
{
if (index >= this._upperBounds) { throw new
IndexOutOfBoundsException(...); }
return this._vector[index];
}
set
{
if (index >= this._upperBounds) { throw new
IndexOutOfBoundsException(...); }
this._vector[index] = value;
}
}

Even this omits the index < 0 check, as this will throw an exception on
the array access itself. Again, not the prettiest, but we're going for
speed, here.

2. The original indexer is sufficiently simple that the compiler /
JITter should in-line the calls, reducing indexed access to the
"vector" to a simple array access (one bounds comparison, one address
calculation, and one read or write) with no method call overhead. If it
turns out tha the JITter does not in-line calls, you could always make
_vector public and access it directly via myVector.Vector, which is
ugly but possible. You shouldn't have to do this, though, because the
compiler really should in-line such a simple indexer as the first one.
The second with the bounds check probably wouldn't be in-lined.

3. The class is _not_ thread-safe. In particular, the reallocation
operation is _not_ atomic and cannot be made atomic without slowing the
whole thing down tremendously by introducing locks around the indexer
getter and setter and the Add operation including reallocation. I
assume that you're not multi-threading.
 
P

Peter Olcott

It looks like you have found the best possible solution to every aspect of my
original question. I am assuming that generics are equivalent to C++ templates
so that this solution could be adapted to become a generic solution. I am
assuming by what you said that C# does not have the equivalent the [inline]
keyword.


Peter said:
It also must dynamically re-allocate when it needs to grow in size.

I do not vouch for the following: I wrote it off the top of my head, so
the syntax may not even be 100%, but to give you an idea:

public class MyValueTypeVector
{
private MyValueType[] _vector;
private int _upperBound;

public MyValudTypeVector() : this(10) { }

public MyValueTypeVector(int initialCapacity)
{
this._vector = new MyValueType[initialCapacity];
this._upperBound = 0;
}

public MyValueType this[int index]
{
get { return this._vector[index]; }
set { this._vector[index] = value; }
}

public int Length { get { return this._upperBound; } }

public void Add(MyValueType newMember)
{
if (this._upperBound >= this._vector.Length - 1)
{
MyValueType newVector[] = new
MyValueType[this._vector.Length * 2];
for (int i = 0; i < this._vector.Length; i++)
{
newVector = this._vector;
}
this._vector = newVector;
}
this._vector[this._upperBound] = newMember;
this._upperBound += 1;
}
}

Some notes:

1. Notice that the indexer does no bounds check. That is, it is
possible for client code to read/write "past the end of" the array"
(above the upperBounds limit) so long as it doesn't write outside the
physical capacity of the allocated array. This is for speed. Adding the
check will cost one additional comparison per reference / assignment,
and would look like this:

public MyValueType this[int index]
{
get
{
if (index >= this._upperBounds) { throw new
IndexOutOfBoundsException(...); }
return this._vector[index];
}
set
{
if (index >= this._upperBounds) { throw new
IndexOutOfBoundsException(...); }
this._vector[index] = value;
}
}

Even this omits the index < 0 check, as this will throw an exception on
the array access itself. Again, not the prettiest, but we're going for
speed, here.

2. The original indexer is sufficiently simple that the compiler /
JITter should in-line the calls, reducing indexed access to the
"vector" to a simple array access (one bounds comparison, one address
calculation, and one read or write) with no method call overhead. If it
turns out tha the JITter does not in-line calls, you could always make
_vector public and access it directly via myVector.Vector, which is
ugly but possible. You shouldn't have to do this, though, because the
compiler really should in-line such a simple indexer as the first one.
The second with the bounds check probably wouldn't be in-lined.

3. The class is _not_ thread-safe. In particular, the reallocation
operation is _not_ atomic and cannot be made atomic without slowing the
whole thing down tremendously by introducing locks around the indexer
getter and setter and the Add operation including reallocation. I
assume that you're not multi-threading.
 
P

Peter Olcott

public MyValueType this[uint index] // does this eliminate the possibility of
index < 0 ?
{
get { return this._vector[index]; }
set { this._vector[index] = value; }
}


Peter said:
It also must dynamically re-allocate when it needs to grow in size.

I do not vouch for the following: I wrote it off the top of my head, so
the syntax may not even be 100%, but to give you an idea:

public class MyValueTypeVector
{
private MyValueType[] _vector;
private int _upperBound;

public MyValudTypeVector() : this(10) { }

public MyValueTypeVector(int initialCapacity)
{
this._vector = new MyValueType[initialCapacity];
this._upperBound = 0;
}

public MyValueType this[int index]
{
get { return this._vector[index]; }
set { this._vector[index] = value; }
}

public int Length { get { return this._upperBound; } }

public void Add(MyValueType newMember)
{
if (this._upperBound >= this._vector.Length - 1)
{
MyValueType newVector[] = new
MyValueType[this._vector.Length * 2];
for (int i = 0; i < this._vector.Length; i++)
{
newVector = this._vector;
}
this._vector = newVector;
}
this._vector[this._upperBound] = newMember;
this._upperBound += 1;
}
}

Some notes:

1. Notice that the indexer does no bounds check. That is, it is
possible for client code to read/write "past the end of" the array"
(above the upperBounds limit) so long as it doesn't write outside the
physical capacity of the allocated array. This is for speed. Adding the
check will cost one additional comparison per reference / assignment,
and would look like this:

public MyValueType this[int index]
{
get
{
if (index >= this._upperBounds) { throw new
IndexOutOfBoundsException(...); }
return this._vector[index];
}
set
{
if (index >= this._upperBounds) { throw new
IndexOutOfBoundsException(...); }
this._vector[index] = value;
}
}

Even this omits the index < 0 check, as this will throw an exception on
the array access itself. Again, not the prettiest, but we're going for
speed, here.

2. The original indexer is sufficiently simple that the compiler /
JITter should in-line the calls, reducing indexed access to the
"vector" to a simple array access (one bounds comparison, one address
calculation, and one read or write) with no method call overhead. If it
turns out tha the JITter does not in-line calls, you could always make
_vector public and access it directly via myVector.Vector, which is
ugly but possible. You shouldn't have to do this, though, because the
compiler really should in-line such a simple indexer as the first one.
The second with the bounds check probably wouldn't be in-lined.

3. The class is _not_ thread-safe. In particular, the reallocation
operation is _not_ atomic and cannot be made atomic without slowing the
whole thing down tremendously by introducing locks around the indexer
getter and setter and the Add operation including reallocation. I
assume that you're not multi-threading.
 
B

Bruce Wood

Peter said:
public MyValueType this[uint index] // does this eliminate the possibility of
index < 0 ?
{
get { return this._vector[index]; }
set { this._vector[index] = value; }
}

Yes, but you will have to mark the method as non-CLS compliant because
some .NET languages don't support unsigned types. No biggie unless
you're planning on calling the code from another .NET language.

And no, C# has no "inline" indication. That is left entirely up to the
compiler and the JITter.

Yes, C# generics offer similar functionality to C++ templates, although
the former are compiler constructs while the latter are
text-substitution constructs, which introduces some subtle differences.
Nonetheless, you can think of them as the "same thing" for basic usage
purposes.
 
P

Peter Olcott

Bruce Wood said:
Peter said:
public MyValueType this[uint index] // does this eliminate the possibility
of
index < 0 ?
{
get { return this._vector[index]; }
set { this._vector[index] = value; }
}

Yes, but you will have to mark the method as non-CLS compliant because
some .NET languages don't support unsigned types. No biggie unless
you're planning on calling the code from another .NET language.

And no, C# has no "inline" indication. That is left entirely up to the
compiler and the JITter.

Yes, C# generics offer similar functionality to C++ templates, although
the former are compiler constructs while the latter are
text-substitution constructs, which introduces some subtle differences.
Nonetheless, you can think of them as the "same thing" for basic usage
purposes.

Now all that remains is to see if your get/set code mentioned above is placed
inline.
 
B

Bruce Wood

Peter said:
Bruce Wood said:
Peter said:
public MyValueType this[uint index] // does this eliminate the possibility
of
index < 0 ?
{
get { return this._vector[index]; }
set { this._vector[index] = value; }
}

Yes, but you will have to mark the method as non-CLS compliant because
some .NET languages don't support unsigned types. No biggie unless
you're planning on calling the code from another .NET language.

And no, C# has no "inline" indication. That is left entirely up to the
compiler and the JITter.

Yes, C# generics offer similar functionality to C++ templates, although
the former are compiler constructs while the latter are
text-substitution constructs, which introduces some subtle differences.
Nonetheless, you can think of them as the "same thing" for basic usage
purposes.

Now all that remains is to see if your get/set code mentioned above is placed
inline.

I know that the compiler inlines getters and setters that do nothing
but fetch and assign a corresponding field. I don't see why an indexer
would be any different.
 
P

Peter Olcott

Bruce Wood said:
Peter said:
Bruce Wood said:
Peter Olcott wrote:
public MyValueType this[uint index] // does this eliminate the
possibility
of
index < 0 ?
{
get { return this._vector[index]; }
set { this._vector[index] = value; }
}

Yes, but you will have to mark the method as non-CLS compliant because
some .NET languages don't support unsigned types. No biggie unless
you're planning on calling the code from another .NET language.

And no, C# has no "inline" indication. That is left entirely up to the
compiler and the JITter.

Yes, C# generics offer similar functionality to C++ templates, although
the former are compiler constructs while the latter are
text-substitution constructs, which introduces some subtle differences.
Nonetheless, you can think of them as the "same thing" for basic usage
purposes.

Now all that remains is to see if your get/set code mentioned above is placed
inline.

I know that the compiler inlines getters and setters that do nothing
but fetch and assign a corresponding field. I don't see why an indexer
would be any different.

Yes, but, this issue will remain unresolved until I know for sure. One of the
rules that I strictly enforce is never make any assumptions. How could the
answer to this question be empirically verified?
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top