Byte to Chr - not correctly translated!!

Guest

I have a String, outputhex, which contains Unicode text encoded as hex.
Example: "test" = 7400650073007400

I convert each pair of characters in the hex string (outputhex) into a byte
and then into a Char, but this is the point where something goes wrong.
(Removing the byte order mark would also be very helpful.)

"Schließen" turns into "Schlie鿃攀渀" and so on.



Here is the code:

Dim intIndex As Short
Dim j As Integer
Dim ch As Char

Dim file As System.IO.StreamWriter
file = My.Computer.FileSystem.OpenTextFileWriter("test.uni", True)

For intIndex = 1 To Len(outputhex) Step 2
j = CByte("&H" & Mid(outputhex, intIndex, 2))
ch = Convert.ToChar(j)
file.Write(ch)
Next

file.Close()




Thanks for the help, it's urgent.
 
Charles Appel

kenny said:
I have a String outputhex which consists of unicodetext translated into
hex.
example: test = 7400650073007400

now i translate each two characters of the hexstring (outputhex) into byte
and then into chars. but this is the point where something is wrong...
(having the byteorder mark removed would also be very good)

"Schließen" gets to "Schlie???" and so on...

Correct me if I'm wrong, but aren't characters in VB.NET
two bytes in length rather than one?
 
Brian Henry

yes they are unicode... you need to translate the text format to accept
standard ASCII
 
Cor Ligthert [MVP]

Brian,
yes they are unicode... you need to translate the text format to accept
standard ASCII

I assume you mean some kind of extended ASCII.
Standard ASCII is 7 bits, whereas the ß (a German character) is probably in
kenny's code page.

Cor
 
Joseph S.

Here is the code:
For intIndex = 1 To Len(outputhex) Step 2
j = CByte("&H" & Mid(outputhex, intIndex, 2))
ch = Convert.ToChar(j)
file.Write(ch)
Next
Hi,
I have a related problem: getting an MD5 string.

There are a few MD5 databases, e.g. http://md5.rednoize.com, which return
the MD5 hashes of your input strings. Can this be done by some type
conversion?
I tried the following but cannot get the type conversion right:

Public Function MD5String(vData As String) As String
Dim DBytes(vData.Length) As Byte, i As Integer
For i = 0 To vData.Length - 1
DBytes(i) = Convert.ToByte(Convert.ToChar(vData.Substring(i, 1)))
Next
Dim md5 As New MD5CryptoServiceProvider()
Dim HBytes As Byte() = md5.computeHash(DBytes)
Dim retstr As String
retstr = ""
For i = 0 To HBytes.Length - 1
Dim zero as integer = 0
'retstr &= Convert.ToString(Convert.ToChar(HBytes(i)))
retstr &= Convert.toString(zero Or HBytes(i))
Next
Return retstr
End Function

This should be simple, but I've spent a lot of time going through the SDK
docs to no avail.

TIA,
JS
 
Guest

Hello kenny,

Your string appears to hold UTF-16 encoded characters, but your code converts it one byte at a time, as if it were UTF-8.
Your code should look like this:

Dim intIndex As Integer
Dim b() As Byte
Dim s As String
Dim file As System.IO.StreamWriter

' Two hex digits per byte; ReDim takes the highest index, so subtract 1.
ReDim b(Len(outputhex) \ 2 - 1)
For intIndex = 1 To Len(outputhex) Step 2
b(intIndex \ 2) = Convert.ToByte(Mid(outputhex, intIndex, 2), 16)
Next
' Decode the bytes as UTF-16 (little-endian).
s = System.Text.Encoding.Unicode.GetString(b)

file = My.Computer.FileSystem.OpenTextFileWriter("test.uni", True)
file.Write(s)
file.Close()

Regards.
 
Joseph S.

found this:
http://groups.google.com/group/micr...es.vb/browse_thread/thread/272413e9094151a8/#

I found a Hex function, which converts a Byte, Integer, Short, Long,
or Object to a hexadecimal string:
ms-help://MS.NETFrameworkSDKv1.1/vblr7net/html/vafctHex.htm

However, common MD5 databases give a different result:
"standard" --> c00f0c4675b91fb8b918e4079a0b1bac

VB.NET code:
"standard" --> 92172E213F3179EB749BE8FF3D4C2B8F

What could be wrong?

My code is:
Public Function MD5String(vData As String) As String
Dim DBytes(vData.Length) As Byte, i As Integer
For i = 0 To vData.Length - 1
DBytes(i) = Convert.ToByte(Convert.ToChar(vData.Substring(i, 1)))
Next
Dim md5 As New MD5CryptoServiceProvider()
Dim HBytes As Byte() = md5.computeHash(DBytes)
Dim retstr As String
retstr = ""
For i = 0 To HBytes.Length - 1
retstr &= Hex(HBytes(i))
Next
Return retstr
End Function


TIA,
JS
 
Guest

No, it is UTF-8... but there are also other characters, not only Unicode. Is
there a way to write them to a file independently of the encoding? I tried
BinaryWriter, but it also seems to need an encoding specified to write
correctly. I just want to write arbitrary data to a file.
 
Guest

There is no such thing as writing a file independently of the encoding. A file holds binary data. How that data represents text characters or anything else depends on the encoding you select.
What exactly do you want to do? Write and read text in a file? Serialize objects? Write and read binary data?

By the way, OpenTextFileWriter defaults to ASCII encoding. You may use OpenTextFileWriter("test.uni", True, System.Text.Encoding.Unicode) to be able to store all characters.

Regards.
 
Guest

If I use Unicode encoding then all characters are displayed correctly, but
there are too many null bytes between them.
Example:
normal: S c h l i e ß e n

modified: S c h l i e ß e n
 
Guest

Are you really sure that the bytes represent text that is encoded as
UTF-8? The example you show is clearly UTF-16.
 
Guest

Hmmm, yes... you are right! But I still don't know how to get rid of all these
null bytes...
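
To see where the null bytes come from, here is a minimal sketch (not taken from any poster's code). It encodes the same word as UTF-16 and as UTF-8 and compares the bytes. The module name EncodingBytesSketch and the helper ToHexPairs are made up for illustration; Encoding.Unicode, Encoding.UTF8 and BitConverter.ToString are standard .NET calls.

```vbnet
Imports System
Imports System.Text

Module EncodingBytesSketch
    ' Hypothetical helper: formats a byte array as space-separated hex pairs.
    Function ToHexPairs(ByVal data As Byte()) As String
        Return BitConverter.ToString(data).Replace("-", " ")
    End Function

    Sub Main()
        ' "Schließen", with the ß built from its code point (U+00DF)
        ' so the sketch does not depend on the source file's encoding.
        Dim word As String = "Schlie" & Convert.ToChar(&HDF) & "en"

        ' UTF-16 LE: two bytes per character here, so each ASCII letter carries a 00 byte.
        Dim utf16Bytes As Byte() = Encoding.Unicode.GetBytes(word)
        ' UTF-8: one byte per ASCII letter, two bytes for the ß.
        Dim utf8Bytes As Byte() = Encoding.UTF8.GetBytes(word)

        Console.WriteLine(utf16Bytes.Length)   ' 18
        Console.WriteLine(utf8Bytes.Length)    ' 10
        Console.WriteLine(ToHexPairs(Encoding.Unicode.GetBytes("te")))   ' 74 00 65 00

        ' Decoding with the same encoding that produced the bytes restores the text.
        Console.WriteLine(Encoding.Unicode.GetString(utf16Bytes) = word) ' True
    End Sub
End Module
```

The 00 bytes are part of the UTF-16 representation, not stray data. If the file must not contain them, the text has to be written with a different encoding (UTF-8, for example) and read back with that same encoding.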
 
Mattias Sjögren

Dim DBytes(vData.Length) As Byte, i As Integer
^^^^^^^^^^^^

Should be vData.Length - 1, or your buffer will be one byte too long
which I guess affects the computed hash.

For i = 0 To vData.Length - 1
DBytes(i) = Convert.ToByte(Convert.ToChar(vData.Substring(i, 1)))
Next

If you expect the input to be all ASCII you can use the
System.Text.ASCIIEncoding to convert the input string to a byte array.

For i = 0 To HBytes.Length - 1
retstr &= Hex(HBytes(i))
Next

You probably want to zero pad the returned string to ensure that each
byte gets represented by two characters.


Mattias
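
Putting those three suggestions together, a sketch could look like the following. It is only an illustration under one assumption: that the MD5 databases hash the UTF-8 bytes of the input, which are identical to the ASCII bytes for plain ASCII text. MD5.Create, Encoding.UTF8.GetBytes and Byte.ToString("x2") are standard .NET calls; the module name and the variable md5Hasher are made up here.

```vbnet
Imports System
Imports System.Security.Cryptography
Imports System.Text

Module Md5Sketch
    ' Hypothetical rewrite of MD5String.
    ' Assumption: the reference hashes are computed over the UTF-8 bytes of the text.
    Function MD5String(ByVal vData As String) As String
        ' The encoder returns an array of exactly the right length,
        ' so there is no extra trailing zero byte to disturb the hash.
        Dim inputBytes As Byte() = Encoding.UTF8.GetBytes(vData)

        Dim sb As New StringBuilder()
        Using md5Hasher As MD5 = MD5.Create()
            For Each b As Byte In md5Hasher.ComputeHash(inputBytes)
                ' "x2" gives two lowercase hex digits, zero padded (e.g. 0A -> "0a").
                sb.Append(b.ToString("x2"))
            Next
        End Using
        Return sb.ToString()
    End Function

    Sub Main()
        Console.WriteLine(MD5String(""))      ' d41d8cd98f00b204e9800998ecf8427e
        Console.WriteLine(MD5String("abc"))   ' 900150983cd24fb0d6963f7d28e17f72
    End Sub
End Module
```

The two values in Main are the well-known MD5 test vectors for the empty string and "abc", which makes them a convenient check before comparing against an online database.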
 
Guest

It seems that you're writing using UTF-16 and reading using UTF-8 (or UTF-7 or ASCII). Review your code and use the same encoding whenever you access the same data.

Regards.
 
Guest

As the data is UTF16 you don't need to decode it. Just convert the
values in the string to char values.

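string test = "7400650073007400";

// Each character is four hex digits: low byte first, then high byte (UTF-16 little-endian).
char[] data = new char[test.Length / 4];
for (int i = 0; i < data.Length; i++) {
// Convert.ToInt32 with base 16 also handles the digits A-F.
int low = Convert.ToInt32(test.Substring(i * 4, 2), 16);
int high = Convert.ToInt32(test.Substring(i * 4 + 2, 2), 16);
data[i] = (char)(low | (high << 8));
}

string result = new String(data);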

[Disclaimer: untested code]
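
For the VB.NET side of the thread, a rough equivalent might look like the sketch below. It also drops a leading byte order mark, which kenny asked about in the first post. It assumes the hex string is UTF-16 little-endian, as in the example 7400650073007400; the names HexDecodeSketch, DecodeUtf16Hex and hexText are made up for illustration.

```vbnet
Imports System
Imports System.Text

Module HexDecodeSketch
    ' Hypothetical helper: turns a hex string such as "7400650073007400"
    ' into the text it encodes. Assumption: UTF-16 little-endian byte order.
    Function DecodeUtf16Hex(ByVal hexText As String) As String
        ' Two hex digits per byte; VB array bounds are the highest index, hence - 1.
        Dim bytes(hexText.Length \ 2 - 1) As Byte
        For i As Integer = 0 To bytes.Length - 1
            bytes(i) = Convert.ToByte(hexText.Substring(i * 2, 2), 16)
        Next

        Dim decoded As String = Encoding.Unicode.GetString(bytes)

        ' The bytes FF FE decode to U+FEFF, the byte order mark; drop it if present.
        If decoded.Length > 0 AndAlso decoded(0) = Convert.ToChar(&HFEFF) Then
            decoded = decoded.Substring(1)
        End If
        Return decoded
    End Function

    Sub Main()
        Console.WriteLine(DecodeUtf16Hex("7400650073007400"))       ' test
        Console.WriteLine(DecodeUtf16Hex("FFFE7400650073007400"))   ' test
    End Sub
End Module
```

The returned String can then be written with whatever encoding the target file should have, as long as it is read back with the same one.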
 
Joseph S.

Hi, thanks for the clarifications, it's now working.
Mattias said:
Dim DBytes(vData.Length) As Byte, i As Integer
^^^^^^^^^^^^
Should be vData.Length - 1, or your buffer will be one byte too long
which I guess affects the computed hash.

I have a book which says that
Dim a(30) as Integer
has 30 objects from a(0) to a(29).

However, I tested it and found that
Dim a(30) as Integer
has _31_ objects from a(0) to a(30)

Is there some kind of system-level setting or is it a mistake in the
book?
Or am I missing something obvious?
You probably want to zero pad the returned string to ensure that each
byte gets represented by two characters.
Yes, I had to zero pad the hex strings.

Regards,
JS
 
Göran Andersson

Joseph said:
Hi, thanks for the clarifications, its now working.


I have a book which says that
Dim a(30) as Integer
has 30 objects from a(0) to a(29).

However, I tested it and found that
Dim a(30) as Integer
has _31_ objects from a(0) to a(30)

Is there some kind of system-level setting or is it a mistake in the
book?
Or am I missing something obvious?

It's definitely a mistake in the book.

In VB you specify the highest index to use in the array, not the number
of items. This is different from most other languages, where you specify
the number of items instead.

C# example:

int[] a = new int[30]; // creates an array with 30 items.
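
The VB.NET counterpart can be checked with a few lines. This is a small sketch rather than anything from the book or the posts above; the module name ArrayBoundsSketch and the variables word and exact are made up for illustration.

```vbnet
Imports System

Module ArrayBoundsSketch
    Sub Main()
        ' In VB the number in parentheses is the highest index, not the item count.
        Dim a(30) As Integer
        Console.WriteLine(a.Length)             ' 31
        Console.WriteLine(a.GetUpperBound(0))   ' 30

        ' To get exactly one element per character, subtract 1 from the length.
        Dim word As String = "standard"
        Dim exact(word.Length - 1) As Byte
        Console.WriteLine(exact.Length)         ' 8
    End Sub
End Module
```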
 
C-Services Holland b.v.

Joseph said:
Hi, thanks for the clarifications, its now working.



I have a book which says that
Dim a(30) as Integer
has 30 objects from a(0) to a(29).

However, I tested it and found that
Dim a(30) as Integer
has _31_ objects from a(0) to a(30)

Is there some kind of system-level setting or is it a mistake in the
book?
Or am I missing something obvious?




The book is wrong... unless it states that it uses Option Base 1 (I don't
know if that still exists in .NET, but it works in VB5/6).
Option Base 1 makes the first index 1 instead of 0.
 
