What is PeekChar() ???

Guest · Dec 5, 2004

I was trying to use BinaryReader.PeekChar to "peek" at the next byte. That
works fine unless the byte value is above 127. Is there anything I can do
about it?

thnx,
mikolas

Alexander Muylaert · Dec 6, 2004

Don't you have PeekByte? PeekChar will read a unicode char, this can be
more than 1 byte.

Kind regards

Alexander

Morten Wennevik · Dec 6, 2004

Hi Mikolas,

BinaryReader's default constructor uses UTF-8 encoding, which matches ASCII
for byte values 0 to 127 but uses multi-byte sequences for characters above
127 (i.e. it reads two or more bytes).

Use the encoding used in the text. The line below creates a reader with the
default Windows code page on your machine.
BinaryReader br = new BinaryReader(stream, Encoding.Default);
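
To make the point about UTF-8 concrete, here is a minimal sketch that counts how many bytes UTF-8 needs for a character at or below 127 versus one above it. The class and method names (Utf8WidthDemo, Utf8ByteCount) are made up for illustration and are not part of the thread.

```csharp
using System;
using System.Text;

public static class Utf8WidthDemo
{
    // Hypothetical helper: number of bytes UTF-8 uses to encode one character.
    public static int Utf8ByteCount(char c)
    {
        return Encoding.UTF8.GetByteCount(new[] { c });
    }

    public static void Main()
    {
        Console.WriteLine(Utf8ByteCount('A'));      // U+0041, prints 1
        Console.WriteLine(Utf8ByteCount('\u00E9')); // U+00E9 (e with acute), prints 2
    }
}
```

This is why a lone byte above 127 does not decode to a character in UTF-8: such bytes only occur as part of a multi-byte sequence.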

Guest · Dec 6, 2004

Morten Wennevik said:
Hi Mikolas,

BinaryReader's default constructor uses UTF-8 encoding, which matches ASCII
for byte values 0 to 127 but uses multi-byte sequences for characters above
127 (i.e. it reads two or more bytes).

Use the encoding used in the text. The line below creates a reader with the
default Windows code page on your machine.
BinaryReader br = new BinaryReader(stream, Encoding.Default);

I admit I don't know much about all those different encodings, but .Default
sounds a bit dangerous since I'm writing code that should be running on
different devices. So should I use UnicodeEncoding and then mask the desired
byte?

thanks,
mikolas

Morten Wennevik · Dec 7, 2004

Mikolas said:
I admit I don't know much about all those different encodings, but .Default
sounds a bit dangerous since I'm writing code that should be running on
different devices. So should I use UnicodeEncoding and then mask the desired
byte?

Hard to say. If the text could be in different encodings you would
probably benefit from using Unicode or UTF8. If you only need the
standard default extended English ascii you could get that specific
encoding with
Encoding e = Encoding.GetEncoding("Windows-1252");
Encoding e = Encoding.GetEncoding("ISO-8859-1");
I believe the code tables are the same for these two code pages. There is
also "US ASCII" listed under western code pages.

You can read more about Unicode and UTF8 on this page
http://www.pobox.com/~skeet/csharp/unicode.html

Jon Skeet [C# MVP] · Dec 7, 2004

Morten Wennevik said:
Hard to say. If the text could be in different encodings you would
probably benefit from using Unicode or UTF8. If you only need the
standard default extended English ascii you could get that specific
encoding with
Encoding e = Encoding.GetEncoding("Windows-1252");
Encoding e = Encoding.GetEncoding("ISO-8859-1");
I believe the code tables are the same for these two code pages.

No, they're not - they differ between 128 and 159.

Morten Wennevik · Dec 7, 2004

No, they're not - they differ between 128 and 159.

I see. Which is considered the standard western codepage? Or is the
standard something that differs between europe and usa?

Jon Skeet [C# MVP] · Dec 7, 2004

Morten Wennevik said:
I see. Which is considered the standard western codepage? Or is the
standard something that differs between europe and usa?

Encoding.Default is Windows 1252, I believe.

Morten Wennevik · Dec 7, 2004

Encoding.Default is Windows 1252, I believe.

Well, on this system, Norwegian Windows 98, Encoding.Default is
iso-8859-1. I believe Encoding.Default varies with whatever codepage the
OS uses.

Jon Skeet [C# MVP] · Dec 7, 2004

Morten Wennevik said:
Well, on this system, Norwegian Windows 98, Encoding.Default is
iso-8859-1.

Interesting. I thought 1252 was the default throughout Western
Europe... Which property did you use to display iso-8859-1 though? The
BodyName property for 1252 returns iso-8859-1, but the others return
Windows-1252.

I believe Encoding.Default varies with whatever codepage the
OS uses.

It certainly does.

Morten Wennevik · Dec 7, 2004

Interesting. I thought 1252 was the default throughout Western
Europe... Which property did you use to display iso-8859-1 though? The
BodyName property for 1252 returns iso-8859-1, but the others return
Windows-1252.

It certainly does.

Interesting, I only checked the BodyName and indeed the other values are
1252/windows 1252, but if BodyName reports the codename for mail agents
one would think the two are identical. Odd.

Guest · Dec 8, 2004

Well, this is strange. Obviously the encoding thing's pretty complex, so do
I really have to deal with all this? All I want is to peek one byte ahead ...
I remember, like 10 years ago when I started learning programming, things like
that were possible (even here in the Czech Rep.) - what is it that happened in
the meantime?

Many Thanks,
mikolas

Guest · Dec 8, 2004

Alexander Muylaert said:
Don't you have PeekByte?

This would certainly be great, and I wouldn't have had to spend an hour
debugging, but I haven't seen it anywhere...

mikolas

Morten Wennevik · Dec 8, 2004

Well, this is strange. Obviously the encoding thing's pretty complex, so do
I really have to deal with all this? All I want is to peek one byte ahead ...
I remember, like 10 years ago when I started learning programming, things like
that were possible (even here in the Czech Rep.) - what is it that happened in
the meantime?

Many Thanks,
mikolas

Well, if you aren't going to read the bytes as characters just use the
overloaded constructor for the BinaryReader.

new BinaryReader(stream, Encoding.Default);

This should force a character to be considered 8 bits in size so PeekChar
should return a single byte.

Guest · Dec 8, 2004

Morten Wennevik said:
Well, if you aren't going to read the bytes as characters just use the
overloaded constructor for the BinaryReader.

new BinaryReader(stream, Encoding.Default);

This should force a character to be considered 8 bits in size so PeekChar
should return a single byte.

I'm sorry but I still don't get it. How do you know that? My SW should also
be running on WinCE and as far as I know there's only unicode.
This is just so weird - why there's PeekChar and not PeekByte, one would
expect to use binary reader for binary reading....

thanks,
mikolas

Jon Skeet [C# MVP] · Dec 8, 2004

Mikolas said:
I'm sorry but I still don't get it. How do you know that? My SW should also
be running on WinCE and as far as I know there's only unicode.
This is just so weird - why there's PeekChar and not PeekByte, one would
expect to use binary reader for binary reading....

I know, it's very strange that there isn't a PeekByte.

Rather than trust to Encoding.Default, you could use Encoding.ASCII or
Encoding.GetEncoding(28591), the latter of which is ISO-8859-1.
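
As a rough illustration of the ISO-8859-1 suggestion, the sketch below builds a BinaryReader over an in-memory stream with code page 28591 and peeks at a byte above 127. ISO-8859-1 maps each byte value directly to the character with the same code point, so PeekChar should hand back the byte value unchanged. The class and method names (Latin1PeekDemo, PeekWithLatin1) are hypothetical, and a MemoryStream stands in for whatever stream the real code uses.

```csharp
using System;
using System.IO;
using System.Text;

public static class Latin1PeekDemo
{
    // Hypothetical helper: peek at the first byte of 'data' through a
    // BinaryReader that decodes characters as ISO-8859-1 (code page 28591).
    public static int PeekWithLatin1(byte[] data)
    {
        Encoding latin1 = Encoding.GetEncoding(28591);
        using (var reader = new BinaryReader(new MemoryStream(data), latin1))
        {
            // PeekChar returns -1 when no characters are available.
            return reader.PeekChar();
        }
    }

    public static void Main()
    {
        Console.WriteLine(PeekWithLatin1(new byte[] { 200, 65 })); // prints 200
    }
}
```

Note that this only works as a byte peek because the encoding is single-byte; with a multi-byte encoding PeekChar may consume more than one byte per character.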

Morten Wennevik · Dec 10, 2004

Mikolas,

Hm, I mistakenly thought Encoding.Default would always use an 8-bit
encoding, but a simple test proved me wrong. I would use
Encoding.GetEncoding("ISO-8859-1") or any of the encodings you know are
8-bit. You should not however use Encoding.ASCII as that is only 7-bit
and will not preserve any values above 127.

Jon Skeet [C# MVP] · Dec 10, 2004

Morten Wennevik said:
Hm, I mistakenly thought Encoding.Default would always use an 8-bit
encoding, but a simple test proved me wrong. I would use
Encoding.GetEncoding("ISO-8859-1") or any of the encodings you know are
8-bit. You should not however use Encoding.ASCII as that is only 7-bit
and will not preserve any values above 127.

I don't see why that causes an issue if the encoding is only being used
for PeekChar, to detect the end of the file. If it's *not* only being
used for PeekChar, then the encoding of the actual text in the file is
what should be used, whether that's ASCII, ISO-8859-1 or whatever.
Detecting the end of the file might then become significantly more
difficult if the desired encoding is multi-byte.

Morten Wennevik · Dec 10, 2004

I don't see why that causes an issue if the encoding is only being used
for PeekChar, to detect the end of the file. If it's *not* only being
used for PeekChar, then the encoding of the actual text in the file is
what should be used, whether that's ASCII, ISO-8859-1 or whatever.
Detecting the end of the file might then become significantly more
difficult if the desired encoding is multi-byte.

Well, the original message didn't say anything about only reading the end
of file, nor using it for reading text. PeekChar with Encoding.ASCII will
read 7 bits so a byte value of say 130 would be returned as 63, ie not
what you would want.
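
The 63 mentioned above is the character code of '?', which the ASCII decoder substitutes for any byte it cannot represent. A small sketch, decoding directly with Encoding.ASCII rather than going through PeekChar (the names AsciiFallbackDemo and DecodeAsAscii are invented for this example):

```csharp
using System;
using System.Text;

public static class AsciiFallbackDemo
{
    // Hypothetical helper: decode a single byte as ASCII and return the
    // resulting character code.
    public static int DecodeAsAscii(byte value)
    {
        char[] chars = Encoding.ASCII.GetChars(new[] { value });
        return chars[0];
    }

    public static void Main()
    {
        Console.WriteLine(DecodeAsAscii(65));  // prints 65 ('A')
        Console.WriteLine(DecodeAsAscii(130)); // prints 63 ('?')
    }
}
```

So the original byte value is lost, which is why ASCII is a poor choice when the goal is to inspect raw bytes.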

Jon Skeet [C# MVP] · Dec 10, 2004

Morten Wennevik said:
Well, the original message didn't say anything about only reading the end
of file, nor using it for reading text. PeekChar with Encoding.ASCII will
read 7 bits so a byte value of say 130 would be returned as 63, ie not
what you would want.

Oops - I was getting confused with a different thread asking why there
isn't a PeekByte method...
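
Since the thread keeps coming back to the missing PeekByte, here is one possible sketch of such a helper that sidesteps encodings entirely: read one byte from the underlying stream, then move the position back. PeekByte is not a BinaryReader method; the class PeekByteSketch and its method are hypothetical, and the approach assumes a seekable stream. Touching BaseStream directly while also using the reader can be risky in general, so treat this as a starting point rather than a vetted implementation.

```csharp
using System;
using System.IO;

public static class PeekByteSketch
{
    // Hypothetical helper: returns the next byte (0-255) without consuming it,
    // or -1 at the end of the stream. Requires a seekable underlying stream.
    public static int PeekByte(BinaryReader reader)
    {
        Stream stream = reader.BaseStream;
        if (!stream.CanSeek)
        {
            throw new NotSupportedException("PeekByte needs a seekable stream.");
        }

        long position = stream.Position;
        int value = stream.ReadByte(); // -1 if nothing is left
        stream.Position = position;    // rewind so the byte is still unread
        return value;
    }

    public static void Main()
    {
        using (var reader = new BinaryReader(new MemoryStream(new byte[] { 200, 65 })))
        {
            Console.WriteLine(PeekByte(reader));     // prints 200
            Console.WriteLine(reader.ReadByte());    // prints 200, the byte was not consumed
        }
    }
}
```

For non-seekable streams (network streams, for example) this will not work, and a small one-byte lookahead buffer would be needed instead.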
