Seek() doesn't move pointer to exact location in file containing Arabic Te

Guest

My program looks up the header of the input barcode in an index file, gets
the record position stored next to that barcode header, and then moves the
file pointer of the products file to that record.

My products file contains records as follows:

9N-F1T0153|002002820327|Data Switch|EA|00030900|00000
36-EPSON7753|010343600003|حنمبر طابعه 500/570|EA|00001200|00000
ER-270019|013388270019|بقتال الشارع|EA|00019900|00000

The index file for this products file contains the start position of each
barcode's record, in the format: BarcodeHeader (space) RecordStartPosition

00200282 0
01034360 55
01338827 121

For the first two records the file pointer works fine and I can get the whole
record with ReadLine(). But from the 3rd record onwards I get only part of the
record, not all of it. I realized that this is due to the Arabic text (Unicode
characters) in the second record, so the file pointer does not land at the
start of the record.

If the file doesn't contain Arabic text (Unicode characters), everything works
fine.

Code Snippet:

// Clear the reader's buffer, then move to the position taken from the index file.
this.sr_FmtdFile.DiscardBufferedData();
this.sr_FmtdFile.BaseStream.Seek(index, SeekOrigin.Begin);

while ((record = this.sr_FmtdFile.ReadLine()) != null)
{
    // Code to get desired record.
}

Please help me handle Arabic text (Unicode characters) when moving the file
pointer.

Arif.
 
Nicholas Paldino [.NET/C# MVP]

Arif,

If you have a combination of Unicode and ASCII characters in the file,
then you will not be able to use a reader. As a matter of fact, I don't see
how you will be able to tell whether the characters are Unicode or ASCII unless
only that specific field is Unicode and it is predictable when this is so.

Regardless, you will not be able to use a stream reader for this. You
will have to go through the file and get the bytes yourself, converting them
when appropriate.

Or, you should just write the whole file in Unicode characters to begin
with, and then you can read the lines as you were doing before.

Hope this helps.
 
Jon Skeet [C# MVP]

Arif said:
My programs searches the header of input barcode in index file. Get the
record position next to Barcode header. Then moves the file pointer of
products file to reach that record.

My products file contains records as follows:

9N-F1T0153|002002820327|Data Switch|EA|00030900|00000
36-EPSON7753|010343600003|حنمبر طابعه 500/570|EA|00001200|00000
ER-270019|013388270019|بقتال الشارع|EA|00019900|00000

And how *exactly* are those records encoded? That's the important bit -
if you can work out the encoding of the file, that would help a lot.

See http://www.pobox.com/~skeet/csharp/unicode.html for more about
encodings.

while index file for this products file contains the indexes of each barcode:
format: BarcodeHeader (space) RecordStartPosition

00200282 0
01034360 55
01338827 121

How has that been calculated? If it's done on the number of characters
(rather than the number of bytes) then it won't work with
variable-length encodings (such as UTF-8).
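
To illustrate the point about characters versus bytes, here is a minimal sketch that compares the two counts for the second record from the question. It assumes the file is UTF-8, which the thread never confirms; the class and method names are made up for the example.

```csharp
using System;
using System.Text;

static class CharVsByteCount
{
    // Number of bytes the text occupies when encoded as UTF-8 (assumed encoding).
    public static int Utf8Bytes(string text)
    {
        return Encoding.UTF8.GetByteCount(text);
    }

    static void Main()
    {
        // Second record from the question.
        string record = "36-EPSON7753|010343600003|حنمبر طابعه 500/570|EA|00001200|00000";

        int charCount = record.Length;       // what a character-based index counts
        int byteCount = Utf8Bytes(record);   // what Stream.Seek actually needs

        // Each Arabic letter takes two bytes in UTF-8, so byteCount is larger.
        Console.WriteLine("chars: " + charCount + ", bytes: " + byteCount);
    }
}
```

Every offset stored after such a record is too small if it was computed from character counts, which would explain why only the records following the Arabic text are read incorrectly.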

Jon
 
Guest

Many thanks, Jon.

I was calculating the positions by number of characters. I have now tested
with the number of bytes and it works fine.

Thanks again, Jon, for understanding my problem and giving me an idea for the
solution.
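
For anyone reading later, a byte-based index along these lines might look roughly like the sketch below. It is not the code actually used in this thread: it assumes UTF-8 without a byte order mark, CRLF line endings, and that the barcode header is the first 8 characters of the second field. The names `ByteOffsetIndex`, `Build`, and `ReadRecordAt` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class ByteOffsetIndex
{
    // Maps each barcode header to the byte offset of its record's first byte.
    // Assumes every record is followed by newLine and the file has no BOM.
    public static Dictionary<string, long> Build(
        IEnumerable<string> records, Encoding encoding, string newLine)
    {
        var index = new Dictionary<string, long>();
        long offset = 0;
        foreach (string record in records)
        {
            string barcode = record.Split('|')[1];
            index[barcode.Substring(0, 8)] = offset;
            // Advance by encoded bytes, not by characters.
            offset += encoding.GetByteCount(record) + encoding.GetByteCount(newLine);
        }
        return index;
    }

    // Seeks to a byte offset and reads one line with a fresh reader,
    // so no stale buffered data is involved.
    public static string ReadRecordAt(Stream stream, long offset, Encoding encoding)
    {
        stream.Seek(offset, SeekOrigin.Begin);
        var reader = new StreamReader(stream, encoding);
        return reader.ReadLine();
    }

    static void Main()
    {
        var encoding = new UTF8Encoding(false); // UTF-8 without BOM (assumption)
        string newLine = "\r\n";                // CRLF (assumption)
        string[] records =
        {
            "9N-F1T0153|002002820327|Data Switch|EA|00030900|00000",
            "36-EPSON7753|010343600003|حنمبر طابعه 500/570|EA|00001200|00000",
            "ER-270019|013388270019|بقتال الشارع|EA|00019900|00000"
        };

        // An in-memory stand-in for the products file.
        byte[] data = encoding.GetBytes(string.Join(newLine, records) + newLine);
        var index = Build(records, encoding, newLine);

        using (var stream = new MemoryStream(data))
        {
            Console.WriteLine(ReadRecordAt(stream, index["01338827"], encoding));
        }
    }
}
```

Creating a new StreamReader after each seek avoids the need for DiscardBufferedData(); when reusing one reader, as in the original snippet, that call is still required after changing the stream position.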

Arif.
 
Guest

Thanks, Nicholas.

I am sure I made a mistake in describing my problem. I confused you by
mentioning a combination of ASCII and Unicode characters because at that time
I was also confused about the problem myself. Sorry for that.
But Jon Skeet understood my problem exactly. He gave me the idea to index by
number of bytes, whereas I had been indexing by number of characters.

Arif.
