Roll your own std::vector ???

P

Peter Olcott

I need std::vector like capability for several custom classes. I already
discussed this extensively in the thread named ArrayList without Boxing and
Unboxing. The solution was to simply create non-generic (non C++ template)
std::vector like capability for each of these custom classes. (Solution must
work in Visual Studio 2002).

Since I have already written one std::vector for a YeOlde C++ compiler (Borland
C++ 1.0) that had neither templates nor STL, I know how to do this. What I don't
know how to do is to directly re-allocate memory in the garbage collected C#.

I have written what I need in pseudocode, what is the correct C# syntax for
this?

int Size;
int Capacity;

bool AppendDataItem(DataItemType Data) {
if (Size == Capacity) {
(1) Capacity = Capacity * 2; // Or * 1.5
(2) Temp = MemoryPointer;
(3) MemoryPointer = Allocate(Capacity);
(4) Copy Data from Temp to MemoryPointer;
(5) DeAllocate(Temp);
(6) MemoryPointer[Size] = Data;
(7) Size++;
}
}
 
J

Joanna Carter [TeamB]

"Peter Olcott" <[email protected]> a écrit dans le message de [email protected]...

|I need std::vector like capability for several custom classes.

| I have written what I need in pseudocode, what is the correct C# syntax
for
| this?

In C#, should not normally deallocate memory manually as this slows down the
GC.

If you really can't or don't want to use the generic List<T> class, which
would be faster and easier, then try this code, I think it achieves what you
want :

public class Test
{
private int size = 0;

int[] ia = new int[4];

public void AppendDataItem(int data)
{
if (size == ia.Length)
Array.Resize(ref ia, ia.Length * 2);

ia[size++] = data;
}
}


{
Test t = new Test();

for (int i = 0; i < 10; i++ )
t.AppendDataItem(i);
}

Joanna
 
J

Joanna Carter [TeamB]

"Joanna Carter [TeamB]" <[email protected]> a écrit dans le message de
news: (e-mail address removed)...

| If you really can't or don't want to use the generic List<T> class, which
| would be faster and easier, then try this code, I think it achieves what
you
| want :

Sorry, I used array.Resize(...) which is only available in .NET 2.0; you
said you wanted .NET 1 compatibilty, thyerefore you need to use Copy or
CopyTo :

| public class Test
| {
| private int size = 0;
|
| int[] ia = new int[4];
|
| public void AppendDataItem(int data)
| {
if (size == ia.Length)
{
int[] temp = new int[ia.Length * 2];

Array.Copy(ia, temp, ia.Length);

ia = temp;
}

ia[size++] = data;
| }
| }

Joanna
 
P

Peter Olcott

Joanna Carter said:
"Joanna Carter [TeamB]" <[email protected]> a écrit dans le message de
news: (e-mail address removed)...

| If you really can't or don't want to use the generic List<T> class, which
| would be faster and easier, then try this code, I think it achieves what
you
| want :

Sorry, I used array.Resize(...) which is only available in .NET 2.0; you
said you wanted .NET 1 compatibilty, thyerefore you need to use Copy or
CopyTo :

| public class Test
| {
| private int size = 0;
|
| int[] ia = new int[4];

I am assuming that you are allocating four integers here, and that I could just
as easily allocate one.
|
| public void AppendDataItem(int data)
| {
if (size == ia.Length)
{
int[] temp = new int[ia.Length * 2];

Array.Copy(ia, temp, ia.Length);

If I understand this correctly, we could improve the performance a little using
this statement instead:
Array.Copy(ia, temp, size);
Am I correct? (Destination, Source, Length) ???
ia = temp;

Is this like a pointer assignment in C++ ???
}

ia[size++] = data;
| }
| }

Joanna
 
J

Joanna Carter [TeamB]

"Peter Olcott" <[email protected]> a écrit dans le message de [email protected]...

| I am assuming that you are allocating four integers here, and that I could
just
| as easily allocate one.

Yes, no problem; I was assuming a default starting size that you would
typically expect with something like ArrayList.

| If I understand this correctly, we could improve the performance a little
using
| this statement instead:
| Array.Copy(ia, temp, size);

Minimally, yes.

| Am I correct? (Destination, Source, Length) ???

No, (source, destination, length).

| Is this like a pointer assignment in C++ ???

..NET does not use explicit pointers or indirection, except in unsafe code.
Everything derives from System.Object and all "object holders" are
implicitly "pointers", although the address of the pointer or of the
contents of the pointer are deemed irrelevant. Objects just "are".

The only real difference you will see in assignment is that value types
(int, double, or any other struct) are assigned as a copy, whereas reference
types (proper classes) are assigned the "reference".

e.g.

struct S
{
public int x;
}

class C
{
public int x;
}

{
S i = new S();
i.x = 2; // i.x contains 2

S j = new S();
j = i; // j.x contains 2

i.x = 3; // j.x still contains 2

C o = new C();
o.x = 2; // o.x contains 2

C p = o; // p.x contains 2

o.x = 3; // p.x now also contains 3
}

.... and not a ->, & or * to be seen :)

Joanna
 
P

Peter Olcott

Joanna Carter said:
"Peter Olcott" <[email protected]> a écrit dans le message de [email protected]...

| I am assuming that you are allocating four integers here, and that I could
just
| as easily allocate one.

Yes, no problem; I was assuming a default starting size that you would
typically expect with something like ArrayList.

| If I understand this correctly, we could improve the performance a little
using
| this statement instead:
| Array.Copy(ia, temp, size);

Minimally, yes.

If the current allocation is 100 MB, and the current size is 50 MB, the
difference might not be so trivial.
| Am I correct? (Destination, Source, Length) ???

No, (source, destination, length).

| Is this like a pointer assignment in C++ ???

.NET does not use explicit pointers or indirection, except in unsafe code.
Everything derives from System.Object and all "object holders" are
implicitly "pointers", although the address of the pointer or of the
contents of the pointer are deemed irrelevant. Objects just "are".

So its sort of like the way that C/C++ treats arrays?
The only real difference you will see in assignment is that value types
(int, double, or any other struct) are assigned as a copy, whereas reference
types (proper classes) are assigned the "reference".

Array.Copy(ia, temp, ia.Length);
ia = temp;

So exactly what is going on under the covers with this last statement? It
certainly is not copying the whole array again, is it?

It has got to be some sort of copying of pointers doesn't it? Would it copy the
whole array again in the last statement if the underlying type was a value type
such as int?
 
J

Joanna Carter [TeamB]

"Peter Olcott" <[email protected]> a écrit dans le message de [email protected]...

| If the current allocation is 100 MB, and the current size is 50 MB, the
| difference might not be so trivial.

Assuming that the underlying implementation of the array holds a field that
holds the length of the array, then all that the Length property would do
would be to return the value held in that field. In which case the only
overhead would be the call through the property accessor rather than
straight to an externally held field.

| So its sort of like the way that C/C++ treats arrays?

It's a long time since I did C/C++ so I could not comment on that.

| Array.Copy(ia, temp, ia.Length);
| ia = temp;
|
| So exactly what is going on under the covers with this last statement? It
| certainly is not copying the whole array again, is it?

No, the first statement copies the items in the ia array to the first half
of the new temp array as expected. Then the second assignment is a simple
"reference" assignment where ia gets a reference to temp which will replave
the original array it was holding; the original array will be lost and then
be eligible for garbage collection.

Change an item in temp and ia would see the change; they are both "pointers"
to the same array.

| It has got to be some sort of copying of pointers doesn't it? Would it
copy the
| whole array again in the last statement if the underlying type was a value
type
| such as int?

As I said in my previous post, the contents of value types are copied on
assignment, but those of reference types are simply "pointed to" by the
second reference.

Joanna
 
J

Jon Skeet [C# MVP]

Peter Olcott said:
Array.Copy(ia, temp, ia.Length);
ia = temp;

So exactly what is going on under the covers with this last statement? It
certainly is not copying the whole array again, is it?

It's copying the reference, not the contents of the array.
It has got to be some sort of copying of pointers doesn't it? Would it copy the
whole array again in the last statement if the underlying type was a value type
such as int?

No. Arrays are always reference types, even if they're arrays of value
types. The difference is that if you have an array of a reference type
(eg String[]) each element of the array is a *reference* to a string,
not the string data itself. If you have an array of a value type
(eg int[]) then each element of the array is a value directly.
 
P

Peter Olcott

Joanna Carter said:
"Peter Olcott" <[email protected]> a écrit dans le message de [email protected]...

| If the current allocation is 100 MB, and the current size is 50 MB, the
| difference might not be so trivial.

Assuming that the underlying implementation of the array holds a field that
holds the length of the array, then all that the Length property would do
would be to return the value held in that field. In which case the only
overhead would be the call through the property accessor rather than
straight to an externally held field.
I saw where I goofed, I thought you were copying the capacity of the new array,
you are actually copying the capacity of the old array. Copying the capacity of
the new array would index out-of-bound on the old array. My mistake.
| So its sort of like the way that C/C++ treats arrays?

It's a long time since I did C/C++ so I could not comment on that.

The commonly understood basic principle where an array name is
one-and-the-same-thing as an array address.
| Array.Copy(ia, temp, ia.Length);
| ia = temp;
|
| So exactly what is going on under the covers with this last statement? It
| certainly is not copying the whole array again, is it?

No, the first statement copies the items in the ia array to the first half
of the new temp array as expected. Then the second assignment is a simple
"reference" assignment where ia gets a reference to temp which will replace
the original array it was holding; the original array will be lost and then
be eligible for garbage collection.

Change an item in temp and ia would see the change; they are both "pointers"
to the same array.

Yes that is what I was talking about, they are both pointers within the
underlying architecture.
| It has got to be some sort of copying of pointers doesn't it? Would it
copy the
| whole array again in the last statement if the underlying type was a value
type
| such as int?

As I said in my previous post, the contents of value types are copied on
assignment, but those of reference types are simply "pointed to" by the
second reference.

Array.Copy(ia, temp, ia.Length);
ia = temp;

So the last statement contains two reference types, even though the underlying
type of array element may be a value type or a reference type ???
 
P

Peter Olcott

Jon Skeet said:
Peter Olcott said:
Array.Copy(ia, temp, ia.Length);
ia = temp;

So exactly what is going on under the covers with this last statement? It
certainly is not copying the whole array again, is it?

It's copying the reference, not the contents of the array.
It has got to be some sort of copying of pointers doesn't it? Would it copy
the
whole array again in the last statement if the underlying type was a value
type
such as int?

No. Arrays are always reference types, even if they're arrays of value
types. The difference is that if you have an array of a reference type
(eg String[]) each element of the array is a *reference* to a string,
not the string data itself. If you have an array of a value type
(eg int[]) then each element of the array is a value directly.

So it look like if I want to avoid the huge overhead of boxing and unboxing that
value types have, I must comprise all of my (hand-rolled std::vector like)
arrays of value type which can include elemental types such as int, char,
double, and also composite types such as struct, but not composite types such as
class. Is this right ???
 
J

Joanna Carter [TeamB]

"Peter Olcott" <[email protected]> a écrit dans le message de [email protected]...

| The commonly understood basic principle where an array name is
| one-and-the-same-thing as an array address.

In that case, that would appear to be the principle for all .NET types. The
variable that you see *is* the object that the variable points to. There is
no indirection on the surface, even though the underlying framework may
involve such.

I think the thing that you are having trouble with is this lack of
indirection; when it comes to C#, you need to try to forget about ->, & and
* and simply think of variables as being one and the same as the object
which they hold. For the sake of your comprehension, forget about pointers
and addresses :)

| Yes that is what I was talking about, they are both pointers within the
| underlying architecture.

Maybe but that should not concern you; forget pointers :)).

| > As I said in my previous post, the contents of value types are copied on
| > assignment, but those of reference types are simply "pointed to" by the
| > second reference.
|
| Array.Copy(ia, temp, ia.Length);
| ia = temp;
|
| So the last statement contains two reference types, even though the
underlying
| type of array element may be a value type or a reference type ???

Array is a reference type, regardless of the type that it holds. See Jon's
post as well.

Joanna
 
P

Peter Olcott

Joanna Carter said:
"Peter Olcott" <[email protected]> a écrit dans le message de [email protected]...

| The commonly understood basic principle where an array name is
| one-and-the-same-thing as an array address.

In that case, that would appear to be the principle for all .NET types. The
variable that you see *is* the object that the variable points to. There is
no indirection on the surface, even though the underlying framework may
involve such.

I think the thing that you are having trouble with is this lack of
indirection; when it comes to C#, you need to try to forget about ->, & and
* and simply think of variables as being one and the same as the object
which they hold. For the sake of your comprehension, forget about pointers
and addresses :)

One can't forget about this completely otherwise ones makes the mistake of
taking a shallow copy to be one-and-the-same-thing as a deep copy. I think that
..NET may have simplified this somewhat in some ways, I am currently not sure of
exactly how they did this. The way that this problem is typically simplified is
to always provide all of the overhead of a deep copy just in case that is what
was wanted.
| Yes that is what I was talking about, they are both pointers within the
| underlying architecture.

Maybe but that should not concern you; forget pointers :)).

Yet then one must still wonder about the shallow versus deep copy problem, and
avoiding the overhead of the deep copy, when it is not needed. The best solution
at this point in time for applications programming might be to simply always do
a deep copy, and make doing a shallow copy syntactically impossible. There are
cases on systems programming where this would be unacceptable.
 
J

Joanna Carter [TeamB]

"Peter Olcott" <[email protected]> a écrit dans le message de [email protected]...

| So it look like if I want to avoid the huge overhead of boxing and
unboxing that
| value types have, I must comprise all of my (hand-rolled std::vector like)
| arrays of value type which can include elemental types such as int, char,
| double, and also composite types such as struct, but not composite types
such as
| class. Is this right ???

Or you can simply use List<T> generic class under .NET 2.0 as this provides
a native typesafe, dynamic collection that doesn't use any boxing or
unboxing.

WYSIWYG. It says it is a list of a cetain type, it *is* a list of a certain
type - no casting required.

Why do you want to write your own version of something that already exists
and does the job, possibly better than your attempt ?

{
List<int> intList = new List<int>();

intList.Add(123); // compiles

intList.Add("123"); // will not compile

...
}

Or if you want to have your own generic Vector class, then do something like
this :

public class Vector<T> : IEnumerable<T>
{
private List<T> items = new List<T>();

public int Size
{
get { return items.Count; }
}

public bool Empty
{
get { return items.Count == 0; }
}

public void PushBack(T item)
{
items.Add(item);
}

public T this[int index]
{
get { return items[index]; }
}

.... etc

#region IEnumerable members

public IEnumerator GetEnumerator()
{
return items.GetEnumerator();
}

#endregion

#region IEnumerable<T> members

public IEnumerator<T> GetEnumerator()
{
return items.GetEnumerator();
}

#endregion
}

Joanna
 
J

Joanna Carter [TeamB]

"Peter Olcott" <[email protected]> a écrit dans le message de [email protected]...

| One can't forget about this completely otherwise ones makes the mistake of
| taking a shallow copy to be one-and-the-same-thing as a deep copy. I think
that
| .NET may have simplified this somewhat in some ways, I am currently not
sure of
| exactly how they did this. The way that this problem is typically
simplified is
| to always provide all of the overhead of a deep copy just in case that is
what
| was wanted.

You don't need to think about pointers and addresses to separate shallow and
deep copy semantics.

All types support shallow cloning by means of the derived protected
System.Object.MemberwiseClone() method which simply copies the contents of
instance fields following the semantics of the field type.

| Yet then one must still wonder about the shallow versus deep copy problem,
and
| avoiding the overhead of the deep copy, when it is not needed. The best
solution
| at this point in time for applications programming might be to simply
always do
| a deep copy, and make doing a shallow copy syntactically impossible. There
are
| cases on systems programming where this would be unacceptable.

Deep cloning is usually only available if the type supports ICloneable, but
is still at the discretion of the implementer as to how deep that cloning
goes.

Joanna
 
P

Peter Olcott

Joanna Carter said:
"Peter Olcott" <[email protected]> a écrit dans le message de [email protected]...

| So it look like if I want to avoid the huge overhead of boxing and
unboxing that
| value types have, I must comprise all of my (hand-rolled std::vector like)
| arrays of value type which can include elemental types such as int, char,
| double, and also composite types such as struct, but not composite types
such as
| class. Is this right ???

Or you can simply use List<T> generic class under .NET 2.0 as this provides
a native typesafe, dynamic collection that doesn't use any boxing or
unboxing.

Of even better use the actual std::vector itself that is now available with the
currently released version of visual studio, that way I further still reduce my
learning curve. There are two reason why I don't want to upgrade yet, (1) Price,
(2) The exams refer to the older version.
WYSIWYG. It says it is a list of a cetain type, it *is* a list of a certain
type - no casting required.

Why do you want to write your own version of something that already exists
and does the job, possibly better than your attempt ?

{
List<int> intList = new List<int>();

intList.Add(123); // compiles

intList.Add("123"); // will not compile

...
}

Or if you want to have your own generic Vector class, then do something like
this :

public class Vector<T> : IEnumerable<T>
{
private List<T> items = new List<T>();

public int Size
{
get { return items.Count; }
}

public bool Empty
{
get { return items.Count == 0; }
}

public void PushBack(T item)
{
items.Add(item);
}

public T this[int index]
{
get { return items[index]; }
}

... etc

#region IEnumerable members

public IEnumerator GetEnumerator()
{
return items.GetEnumerator();
}

#endregion

#region IEnumerable<T> members

public IEnumerator<T> GetEnumerator()
{
return items.GetEnumerator();
}

#endregion
}

Joanna
 
J

Joanna Carter [TeamB]

"Peter Olcott" <[email protected]> a écrit dans le message de [email protected]...

| Of even better use the actual std::vector itself that is now available
with the
| currently released version of visual studio, that way I further still
reduce my
| learning curve. There are two reason why I don't want to upgrade yet, (1)
Price,
| (2) The exams refer to the older version.

I think you will find that List<T> should give you most of the functionality
you need. (1)You can get a copy of Visual Studio 2005 Express for free (2)
are you more interested in writing code and achieving results or passing
exams ? :)

Joanna
 
S

Samuel R. Neff

Are you doing all this work because you've profiled your app or done
tests to confirm that boxing/unboxing is an unacceptable performance
hit or is all this extra work and overhead just because you think
boxing is going to be a problem?

Sam
 
P

Peter Olcott

Systems programming is entirely different than applications programming. When
you are amortizing development costs over millions of users, you just don't test
something to see if its good enough. In this case you shoot for the ballpark of
as good as possible, right from the very beginning.

Samuel R. Neff said:
Are you doing all this work because you've profiled your app or done
tests to confirm that boxing/unboxing is an unacceptable performance
hit or is all this extra work and overhead just because you think
boxing is going to be a problem?

Sam

------------------------------------------------------------
We're hiring! B-Line Medical is seeking Mid/Sr. .NET
Developers for exciting positions in medical product
development in MD/DC. Work with a variety of technologies
in a relaxed team environment. See ads on Dice.com.



I need std::vector like capability for several custom classes. I already
discussed this extensively in the thread named ArrayList without Boxing and
Unboxing. The solution was to simply create non-generic (non C++ template)
std::vector like capability for each of these custom classes. (Solution must
work in Visual Studio 2002).

Since I have already written one std::vector for a YeOlde C++ compiler
(Borland
C++ 1.0) that had neither templates nor STL, I know how to do this. What I
don't
know how to do is to directly re-allocate memory in the garbage collected C#.

I have written what I need in pseudocode, what is the correct C# syntax for
this?

int Size;
int Capacity;

bool AppendDataItem(DataItemType Data) {
if (Size == Capacity) {
(1) Capacity = Capacity * 2; // Or * 1.5
(2) Temp = MemoryPointer;
(3) MemoryPointer = Allocate(Capacity);
(4) Copy Data from Temp to MemoryPointer;
(5) DeAllocate(Temp);
(6) MemoryPointer[Size] = Data;
(7) Size++;
}
}
 
P

Peter Olcott

Joanna Carter said:
"Peter Olcott" <[email protected]> a écrit dans le message de [email protected]...

| Of even better use the actual std::vector itself that is now available
with the
| currently released version of visual studio, that way I further still
reduce my
| learning curve. There are two reason why I don't want to upgrade yet, (1)
Price,
| (2) The exams refer to the older version.

I think you will find that List<T> should give you most of the functionality
you need. (1)You can get a copy of Visual Studio 2005 Express for free (2)
are you more interested in writing code and achieving results or passing
exams ? :)

Both simultaneously without any tradeoffs if possible.
(1) Where can I get this free copy?
(2) This free copy probably is not licensed for commercial use, and even if it
is, would most likely be missing the code optimizers.
 
J

Joanna Carter [TeamB]

"Peter Olcott" <[email protected]> a écrit dans le message de [email protected]...

| Systems programming is entirely different than applications programming.
When
| you are amortizing development costs over millions of users, you just
don't test
| something to see if its good enough. In this case you shoot for the
ballpark of
| as good as possible, right from the very beginning.

There is a well known anti-pattern called "Premature Optimisation", beware
of it, prove that something is too slow before wasting time and effort
trying to speed it up.

Joanna
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top