BinaryReader.ReadBytes issue

 
 
Guest
Posts: n/a
 
      24th Feb 2004
Hi,

I am trying to optimize the reading of a huge binary file into a byte[]...

I am doing the following..



byte[] ba = new byte[br.BaseStream.Length];

ba = br.ReadBytes((int)br.BaseStream.Length);



The problem is that BinaryReader.ReadBytes(...) only takes an int, whereas
BinaryReader.BaseStream.Length is a long. Why isn't there a ReadBytes that
takes a long?

Chances are I won't reach this problem, but the problem will be there
nonetheless.




 
 
 
 
 
Jon Skeet [C# MVP]
Guest
Posts: n/a
 
      24th Feb 2004
<<.>> wrote:
> I am trying to optimize the reading of a huge binary file into a byte[]...
>
> I am doing the following..
>
> byte[] ba = new byte[br.BaseStream.Length];
>
> ba = br.ReadBytes((int)br.BaseStream.Length);


Why are you allocating an array and then immediately turning it into
garbage?

> The problem is that BinaryReader.ReadBytes(...) only takes an int, whereas
> BinaryReader.BaseStream.Length is a long. Why isn't there a ReadBytes that
> takes a long?
>
> Chances are I won't reach this problem, but the problem will be there
> nonetheless.


It's certainly not ideal, but I would expect that if you actually had a
file larger than 2 GB, you wouldn't want to be reading it all in in a
single call anyway.

--
Jon Skeet - <(E-Mail Removed)>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
 
 
Marcin Grzębski
Guest
Posts: n/a
 
      24th Feb 2004
Hi,

> I am trying to optimize the reading of a huge binary file into a byte[]...


Reading huge files into a byte[] is not an optimisation.


> byte[] ba = new byte[br.BaseStream.Length];


You don't need to initialize "ba" when you are using
br.ReadBytes(...)

> ba = br.ReadBytes((int)br.BaseStream.Length);


It is bad practice to read a stream of unknown size in a
single call.

> The problem is that BinaryReader.ReadBytes(...) only takes an int, whereas
> BinaryReader.BaseStream.Length is a long. Why isn't there a ReadBytes that
> takes a long?


Because there is mostly no need to fill memory with huge files
(files greater than 2 GB).
Think over your design and your needs. If you really want to
read huge files, then you can easily run out of memory!

Marcin
 
 
Guest
Posts: n/a
 
      24th Feb 2004

"Jon Skeet [C# MVP]" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> <<.>> wrote:
> > I am trying to optimize the reading of a huge binary file into a byte[]...
> >
> > I am doing the following..
> >
> > byte[] ba = new byte[br.BaseStream.Length];
> >
> > ba = br.ReadBytes((int)br.BaseStream.Length);

>
> Why are you allocating an array and then immediately turning it into
> garbage?


Why? Does ReadBytes reallocate it, or copy into it?

>
> > The problem is that BinaryReader.ReadBytes(...) only takes an int, whereas
> > BinaryReader.BaseStream.Length is a long. Why isn't there a ReadBytes that
> > takes a long?
> >
> > Chances are I won't reach this problem, but the problem will be there
> > nonetheless.

>
> It's certainly not ideal, but I would expect that if you actually had a
> file larger than 2 GB, you wouldn't want to be reading it all in in a
> single call anyway.


That's the extreme case; I won't be anywhere near that range.





 
 
Guest
Posts: n/a
 
      24th Feb 2004
I now do the following... to just be safe on the casting limit.

byte[] ba = new byte[br.BaseStream.Length]; // are you saying this should be null before and let .ReadBytes allocate it (if it does that?)

if (br.BaseStream.Length <= int.MaxValue)
{
    // we are within the casting limits so we can use the optimized method of reading
    ba = br.ReadBytes((int)br.BaseStream.Length);
}
else
{
    // we are outside the limits (rare) so we can use the normal way of reading (slower)
    ArrayList b = new ArrayList();
    byte readByte = 0x00;
    while (br.BaseStream.Position < br.BaseStream.Length)
    {
        readByte = br.ReadByte();
        b.Add(readByte);
    }

    ba = (byte[])b.ToArray(typeof(byte));
}




"Jon Skeet [C# MVP]" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> <<.>> wrote:
> > I am trying to optimize the reading of a huge binary file into a byte[]...
> >
> > I am doing the following..
> >
> > byte[] ba = new byte[br.BaseStream.Length];
> >
> > ba = br.ReadBytes((int)br.BaseStream.Length);
>
> Why are you allocating an array and then immediately turning it into
> garbage?
>
> > The problem is that BinaryReader.ReadBytes(...) only takes an int, whereas
> > BinaryReader.BaseStream.Length is a long. Why isn't there a ReadBytes that
> > takes a long?
> >
> > Chances are I won't reach this problem, but the problem will be there
> > nonetheless.
>
> It's certainly not ideal, but I would expect that if you actually had a
> file larger than 2 GB, you wouldn't want to be reading it all in in a
> single call anyway.
>
> --
> Jon Skeet - <(E-Mail Removed)>
> http://www.pobox.com/~skeet
> If replying to the group, please do not mail me too



 
 
Guest
Posts: n/a
 
      24th Feb 2004
I need it as a byte[] internally; it's being read.


"Marcin Grzębski" <(E-Mail Removed)> wrote in message
news:c1fi48$egc$(E-Mail Removed)...
> Hi,
>
> > I am trying to optimize the reading of a huge binary file into a byte[]...
>
> Reading huge files into a byte[] is not an optimisation.
>
>
> > byte[] ba = new byte[br.BaseStream.Length];
>
> You don't need to initialize "ba" when you are using
> br.ReadBytes(...)
>
> > ba = br.ReadBytes((int)br.BaseStream.Length);
>
> It is bad practice to read a stream of unknown size in a
> single call.
>
> > The problem is that BinaryReader.ReadBytes(...) only takes an int, whereas
> > BinaryReader.BaseStream.Length is a long. Why isn't there a ReadBytes that
> > takes a long?
>
> Because there is mostly no need to fill memory with huge files
> (files greater than 2 GB).
> Think over your design and your needs. If you really want to
> read huge files, then you can easily run out of memory!
>
> Marcin



Strange that you say it's not an optimization; it sure runs faster. I guess
you know better than the runtime.


 
 
Marcin Grzębski
Guest
Posts: n/a
 
      24th Feb 2004
.. wrote:

> I need it as a byte[] internally, its being read.


I see.
But can't you keep it as a collection of byte[] buffers?
e.g. as an *ArrayList* of byte[] elements with length = 4096

Then you can access those buffers as *ArrayList* items.
Of course buffer length can be set to other value.
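
The suggestion above could be sketched roughly as follows. This is only an illustration, not code from the thread: the class and method names (ChunkedReader, ReadChunks) are made up, and it uses the generic List<byte[]> from later framework versions instead of the ArrayList mentioned above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class ChunkedReader
{
    // Hypothetical helper: reads the stream into a list of byte[] buffers,
    // each at most chunkSize bytes long (the last one may be shorter).
    public static List<byte[]> ReadChunks(Stream stream, int chunkSize)
    {
        if (chunkSize <= 0)
            throw new ArgumentOutOfRangeException("chunkSize");

        List<byte[]> chunks = new List<byte[]>();

        // The reader is deliberately not disposed here, so the caller
        // keeps ownership of the underlying stream.
        BinaryReader br = new BinaryReader(stream);

        while (true)
        {
            // ReadBytes returns fewer bytes than requested near the end of
            // the stream, and an empty array once the end has been reached.
            byte[] chunk = br.ReadBytes(chunkSize);
            if (chunk.Length == 0)
                break;
            chunks.Add(chunk);
        }

        return chunks;
    }
}
```

Calling ChunkedReader.ReadChunks(stream, 4096) then gives buffers that can be indexed like list items, at the cost of no longer having one contiguous byte[].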

Marcin
 
 
Guest
Posts: n/a
 
      24th Feb 2004

"Marcin Grzębski" <(E-Mail Removed)> wrote in message
news:c1fi48$egc$(E-Mail Removed)...
> Hi,
>
> > I am trying to optimize the reading of a huge binary file into a byte[]...
>
> Reading huge files into a byte[] is not an optimisation.
>
>
> > byte[] ba = new byte[br.BaseStream.Length];
>
> You don't need to initialize "ba" when you are using
> br.ReadBytes(...)


Fine, it's byte[] ba = null then.

>
> > ba = br.ReadBytes((int)br.BaseStream.Length);

>
> It is bad practice to read a stream of unknown size in a
> single call.




Hello, EARTH. BaseStream.Length IS THE SIZE and therefore KNOWN. What do you
propose then, genius boy? I need the entire file in memory in a byte[], so how
else would you do it, brainiac Mr. Mensa?



> > The problem is that BinaryReader.ReadBytes(...) only takes an int, whereas
> > BinaryReader.BaseStream.Length is a long. Why isn't there a ReadBytes that
> > takes a long?
>
> Because there is mostly no need to fill memory with huge files
> (files greater than 2 GB).
> Think over your design and your needs. If you really want to
> read huge files, then you can easily run out of memory!
>


All I need to do is get the file into memory for another part; that other
part I don't give a rat's a.rse about, it's not my problem.

I am talking about an average of 300 K files.


> Marcin



 
 
Jon Skeet [C# MVP]
Guest
Posts: n/a
 
      24th Feb 2004
<<.>> wrote:
> > > byte[] ba = new byte[br.BaseStream.Length];
> > >
> > > ba = br.ReadBytes((int)br.BaseStream.Length);

> >
> > Why are you allocating an array and then immediately turning it into
> > garbage?

>
> Why does ReadBytes reallocate it or copy into it?


Well look at the call - you're not telling it where to read to, it's
returning a reference to a new array.

> > It's certainly not ideal, but I would expect that if you actually had a
> > file larger than 2 GB, you wouldn't want to be reading it all in in a
> > single call anyway.

>
> Thats the extreme which I wont be anywhere near that range.


In which case, it's fine
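
To make the point about the return value concrete, here is a small sketch (the class and method names are invented for illustration). It pre-allocates an array the way the original snippet does, then checks whether ReadBytes handed back that same array or a different one.

```csharp
using System;
using System.IO;

static class ReadBytesDemo
{
    // Hypothetical demo: returns true when the array returned by ReadBytes
    // is a different object from the one allocated beforehand.
    public static bool PreallocatedArrayIsDiscarded()
    {
        byte[] data = { 1, 2, 3, 4 };

        using (BinaryReader br = new BinaryReader(new MemoryStream(data)))
        {
            // Same pattern as in the question: allocate first...
            byte[] ba = new byte[br.BaseStream.Length];
            byte[] preallocated = ba;

            // ...then overwrite the reference with what ReadBytes returns.
            ba = br.ReadBytes((int)br.BaseStream.Length);

            // ReadBytes has no way to see the first array; it allocates its own.
            return !ReferenceEquals(preallocated, ba);
        }
    }
}
```

Since the first array is never used, declaring the variable and assigning it directly from ReadBytes is enough.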

--
Jon Skeet - <(E-Mail Removed)>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
 
 
Jon Skeet [C# MVP]
Guest
Posts: n/a
 
      24th Feb 2004
<<.>> wrote:
> I now do the following... to just be safe on the casting limit.
>
> byte[] ba = new byte[br.BaseStream.Length]; // are you saying this should be null before and let .ReadBytes allocate it (if it does that?)


I'm saying you don't need to assign a value to it at all, as you assign
the value when you've done the read.

> if (br.BaseStream.Length <= int.MaxValue)
> {
>     // we are within the casting limits so we can use the optimized method of reading
>     ba = br.ReadBytes((int)br.BaseStream.Length);
> }
> else
> {
>     // we are outside the limits (rare) so we can use the normal way of reading (slower)
>     ArrayList b = new ArrayList();
>     byte readByte = 0x00;
>     while (br.BaseStream.Position < br.BaseStream.Length)
>     {
>         readByte = br.ReadByte();
>         b.Add(readByte);
>     }
>
>     ba = (byte[])b.ToArray(typeof(byte));
> }


No, that second bit isn't a good idea. If you've got a file of over
2 GB, you most certainly *don't* want to create an ArrayList where each
element is a byte read from the file. It would take at least 12 times
the file size - so you'd end up with a memory usage of *at least* 24 GB.
Not pretty.

Do you really want to create an array that is the size of the whole
file, if it's more than 2 GB? I would expect any sane use of such a file
to be either something which can discard the bytes as it reads and
processes them, or something which seeks around within the file. A
safer bet is probably to throw an exception.
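
A minimal sketch of the "throw an exception" approach might look like the following. The names (FileLoader, ReadAllBytes) and the choice of IOException are assumptions made for the example, not something prescribed in this thread.

```csharp
using System;
using System.IO;

static class FileLoader
{
    // Hypothetical helper: reads everything from the reader's current
    // position to the end of the stream into a single byte[], and refuses
    // outright when the remaining length does not fit in an int.
    public static byte[] ReadAllBytes(BinaryReader br)
    {
        long remaining = br.BaseStream.Length - br.BaseStream.Position;

        if (remaining > int.MaxValue)
        {
            throw new IOException(
                "Stream is too large to read into a single byte[] (" +
                remaining + " bytes).");
        }

        // The cast is safe here because of the check above. ReadBytes
        // allocates and returns the array itself.
        return br.ReadBytes((int)remaining);
    }
}
```

This keeps the fast single-call path for normal files and fails loudly for the rare oversized one, instead of falling back to a byte-by-byte loop.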

--
Jon Skeet - <(E-Mail Removed)>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
 