Array.Resize or List<> or some other data structure

Trecius

Hello, Newsgroupians:

I have a quick optimization question for you all. I have a stream from
which I am reading bytes. At times the stream contains a small number of
bytes, such as 50 or so, and at other times it can contain as many as
10,000,000 bytes. In reality, I do not know the maximum number of bytes.

In my function, I am going to Read() the byte stream using a buffer. Now,
is it better to read into a buffer and dump the buffer into a List<byte>,
maybe using AddRange(), or should I Array.Resize the buffer to grow it by a
specific size every time?

Code for List<byte>

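List<byte> lstBytes = new List<byte>();
byte[] buffer = new byte[2048];
int bytesRead;

// Stream.Read returns 0 at the end of the stream, never -1.
while ((bytesRead = stream.Read(buffer, 0, buffer.Length)) > 0)
{
    // Add only the bytes actually read; Read may return fewer than requested.
    lstBytes.AddRange(new ArraySegment<byte>(buffer, 0, bytesRead));
}
return lstBytes.ToArray();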



Code for resizing array:

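byte[] buffer = new byte[2048];
int total = 0, bytesRead;
// Read into the unused tail; Stream.Read returns 0 (not -1) at end of stream.
while ((bytesRead = stream.Read(buffer, total, buffer.Length - total)) > 0)
{
    total += bytesRead;
    if (total == buffer.Length)
        Array.Resize(ref buffer, buffer.Length + 2048); // array is full, grow it
}
Array.Resize(ref buffer, total); // trim the unused tail
return buffer;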



So which way should I use? Should I dump it into a list every time, or
should I resize the array every time? Is there another way you would
recommend? Thank you all for your help and suggestions.


Trecius
 
Trecius

My stream isn't a file. :(

Family Tree Mike said:
Are you aware there is:

byte [] bytes = File.ReadAllBytes("file.bin");

?

 
Family Tree Mike

Then will Stream.Length work to initially size the array?

 
Rudy Velthuis

Family Tree Mike said:
Then will Stream.Length work to initially size the array?

If his stream reads bytes from, say, a port, I guess Length is not
known before all bytes are read.
 
Trecius

In fact, it is a port. :)

 
Rudy Velthuis

Peter said:
If by "port", you mean a NetworkStream retrieved from a Socket
instance, then Rudy is correct...the Length property cannot be
determined and in fact will always throw a NotSupportedException.

I actually meant a physical port, like a USB port with some kind of
lab device attached, but the kind of port you meant has the same
problem. You simply can't know the amount of data to expect.

After all, data can be read from so many sources. <g>
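
As a small illustration of the point about Length, here is a sketch of how code could check whether a length is available before relying on it. The class and method names are made up for this example; the only assumption about the framework is that a stream reporting CanSeek == false (as NetworkStream always does) may throw NotSupportedException from Length.

```csharp
using System.IO;

static class StreamSizing
{
    // Hypothetical helper: returns the number of bytes left in the stream
    // when that can be known up front, or -1 when it cannot.
    public static long TryGetRemainingLength(Stream stream)
    {
        // Non-seekable streams (sockets, pipes, device ports) generally do
        // not support Length, so test CanSeek instead of catching exceptions.
        return stream.CanSeek ? stream.Length - stream.Position : -1;
    }
}
```

When the helper returns -1, the reading code has to fall back to one of the grow-as-you-go strategies discussed in this thread.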
 
Trecius

Thank you, Mr. Duniho. I will use your suggestion. It seems like it will
work perfectly for my needs. Thank you again.

Trecius

Peter Duniho said:
[...]
So which way should I use? Should I dump it into a list every time, or
should I resize the array every time? Is there another way you would
recommend? Thank you all for your help and suggestions.

The two approaches you're asking about are basically equivalent. The
List<T> class uses an array internally, and will do effectively the same
operation as Array.Resize(). The only real difference between the two is
that List<T> doubles the size of its storage each time it runs out of
room, so that you need to resize fewer and fewer times as the data gets
larger. Of course, you could always use that strategy when using
Array.Resize() as well, if that was important.
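
To illustrate the doubling strategy applied to Array.Resize(), here is a minimal sketch. The class name, method name, and initialSize parameter are invented for this example and are not from the original post; it assumes initialSize is greater than zero.

```csharp
using System;
using System.IO;

static class DoublingReader
{
    // Reads the stream to its end, doubling the array whenever it fills up.
    public static byte[] ReadAll(Stream stream, int initialSize = 2048)
    {
        byte[] buffer = new byte[initialSize];
        int total = 0;   // number of valid bytes stored in buffer so far
        int bytesRead;

        // Stream.Read returns 0 once the end of the stream is reached.
        while ((bytesRead = stream.Read(buffer, total, buffer.Length - total)) > 0)
        {
            total += bytesRead;
            if (total == buffer.Length)
            {
                // Double rather than add a fixed 2048 bytes, so the number
                // of resize-and-copy steps grows only logarithmically.
                Array.Resize(ref buffer, buffer.Length * 2);
            }
        }

        // Trim the unused tail so the result holds exactly the bytes read.
        Array.Resize(ref buffer, total);
        return buffer;
    }
}
```

With a fixed 2048-byte increment, a 10,000,000-byte stream needs several thousand resizes; with doubling from 2048 bytes it needs about thirteen.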

Personally, I wouldn't use either. I would make every effort to
process the bytes as they are read, so that they never have to be all in
memory at once. That's the ideal solution, as it avoids the whole
business of having to buffer an arbitrarily large amount of data
altogether.
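
As a rough sketch of processing bytes as they are read, the loop below hands each chunk to a callback and then reuses the same buffer. ChunkProcessor, ProcessAll, and the handleChunk callback are names made up for this example; what "processing" means depends entirely on the application.

```csharp
using System;
using System.IO;

static class ChunkProcessor
{
    // Calls handleChunk(buffer, count) for every chunk as it arrives.
    // Only one 2048-byte buffer is allocated, however long the stream is.
    public static void ProcessAll(Stream stream, Action<byte[], int> handleChunk)
    {
        byte[] buffer = new byte[2048];
        int bytesRead;

        // Stream.Read returns 0 at the end of the stream.
        while ((bytesRead = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            // Only the first bytesRead bytes of buffer are valid. The
            // callback must not keep a reference to buffer, because the
            // next Read overwrites it.
            handleChunk(buffer, bytesRead);
        }
    }
}
```

A caller could, for example, pass a lambda that adds each byte to a running checksum, so the full data never sits in memory.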

If you can't process the bytes as they are read, but instead need to store
them all up first, I would use a MemoryStream, and write to the
MemoryStream as the bytes come in. Then when you're done, you can use the
MemoryStream.ToArray() method to get the byte array representing the data.
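
A minimal sketch of the MemoryStream approach might look like the following. The class and method names are illustrative only, and the 2048-byte buffer size is simply carried over from the question.

```csharp
using System.IO;

static class MemoryStreamReader
{
    // Copies everything from the source stream into a MemoryStream, then
    // returns the accumulated bytes as a single array.
    public static byte[] ReadAll(Stream stream)
    {
        byte[] buffer = new byte[2048];
        using (MemoryStream ms = new MemoryStream())
        {
            int bytesRead;

            // Stream.Read returns 0 at the end of the stream.
            while ((bytesRead = stream.Read(buffer, 0, buffer.Length)) > 0)
            {
                // Write only the bytes actually read in this pass.
                ms.Write(buffer, 0, bytesRead);
            }

            // ToArray copies exactly the bytes written, with no padding.
            return ms.ToArray();
        }
    }
}
```

On .NET Framework 4 and later, Stream.CopyTo(ms) can replace the explicit loop.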

I believe that MemoryStream uses the same double-and-copy algorithm as
List<T>, so if that wound up being a performance liability, I would switch
to allocating individual buffers and storing them in a List<byte[]>. That
is, rather than resizing a single byte[] over and over, just allocate a
new byte[] when you've run out of room in your current byte[], storing a
reference to each byte[] in the List<byte[]>.
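
The List<byte[]> idea could be sketched roughly as below. ChunkedReader and ChunkSize are names chosen for this example. The final step that joins the chunks into one array is an addition for the sake of a complete example; code that can consume the chunks directly would skip it.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class ChunkedReader
{
    const int ChunkSize = 2048;

    public static byte[] ReadAll(Stream stream)
    {
        List<byte[]> fullChunks = new List<byte[]>();
        byte[] current = new byte[ChunkSize];
        int used = 0;    // bytes filled in the current chunk
        int total = 0;   // bytes read overall
        int bytesRead;

        while ((bytesRead = stream.Read(current, used, current.Length - used)) > 0)
        {
            used += bytesRead;
            total += bytesRead;
            if (used == current.Length)
            {
                // The chunk is full: keep a reference to it and start a
                // fresh one. Nothing already read is ever copied here.
                fullChunks.Add(current);
                current = new byte[ChunkSize];
                used = 0;
            }
        }

        // Join the chunks with a single copy pass at the end.
        byte[] result = new byte[total];
        int offset = 0;
        foreach (byte[] chunk in fullChunks)
        {
            Buffer.BlockCopy(chunk, 0, result, offset, chunk.Length);
            offset += chunk.Length;
        }
        Buffer.BlockCopy(current, 0, result, offset, used);
        return result;
    }
}
```

Each byte is copied at most once after being read, compared with the repeated copying that a double-and-copy buffer performs as it grows.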

One more alternative would be to have the i/o code use individual byte[]
instances only, and hand those off to a different thread that deals with
writing them to a MemoryStream. In terms of performance, this would
probably be somewhere in between using a List<byte[]> to store individual
buffers and just always writing to a MemoryStream.

With this alternative, you could either use a double- or triple-buffering
scheme where you have two or three such buffers that are used in rotation,
or you could just allocate a new buffer as needed, letting the used ones
be garbage collected after they've been copied to the MemoryStream. The
former has the advantage of not causing a lot of repeated allocations and
collections, at the cost of added complexity and the possibility that the
i/o thread has to wait for a buffer to become available.
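
Here is one possible sketch of the hand-off idea, using the simpler "allocate a new buffer as needed" variant. It assumes .NET Framework 4.5 or later for BlockingCollection<T> and Task.Run, which postdate this thread; the class name and the queue capacity of 8 are arbitrary choices for the example.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

static class HandOffReader
{
    public static byte[] ReadAll(Stream stream)
    {
        // The bounded queue keeps the reader from running far ahead of the writer.
        using (var queue = new BlockingCollection<ArraySegment<byte>>(8))
        using (var ms = new MemoryStream())
        {
            // Consumer: the only code that touches the MemoryStream until
            // the reading loop below has finished.
            Task writer = Task.Run(() =>
            {
                foreach (ArraySegment<byte> chunk in queue.GetConsumingEnumerable())
                {
                    ms.Write(chunk.Array, chunk.Offset, chunk.Count);
                }
            });

            try
            {
                while (true)
                {
                    // A fresh buffer per read; used buffers become garbage
                    // once the writer has copied them.
                    byte[] buffer = new byte[2048];
                    int bytesRead = stream.Read(buffer, 0, buffer.Length);
                    if (bytesRead == 0) break;   // end of stream
                    queue.Add(new ArraySegment<byte>(buffer, 0, bytesRead));
                }
            }
            finally
            {
                // Tell the writer no more chunks are coming, then wait for
                // it to drain the queue before the stream is disposed.
                queue.CompleteAdding();
                writer.Wait();
            }

            return ms.ToArray();
        }
    }
}
```

A rotating double- or triple-buffer scheme would replace the per-read allocation with a second queue that returns emptied buffers to the reader.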

Personally, if you have to buffer all the data, I would start with writing
to a MemoryStream. It is by far the simplest approach, and may well
perform adequately for your needs. Only if I ran into some specific
performance issue would I then start exploring some of these other
options. They are reasonably straightforward to code, but would certainly
obfuscate the core purpose of the code, and any complication of the code
should be avoided unless absolutely necessary.

Pete
 
