MD5 Hash - Converting Result From Byte[] Back To String

dlarock

I wrote the following to do an MD5 hash. However, I have a problem (I
think) with the conversion from the Byte[] MD5 hash back to string.

Watching this through the debugger it appears as if the MD5 is
computing the right Byte[] for the hash when compared to other MD5 hash
generators online. However, when I attempt to convert it back to a string
using the line
String outputData = textConverter.GetString( result ) ;

I essentially get garbage. Any thoughts on what I am doing wrong?
Does the MD5 hash not generate a UTF8 byte array?

Amy.

UTF8Encoding textConverter = new UTF8Encoding();
String inputData = "" ;

Console.Write( "Enter a string: " ) ;
inputData = Console.ReadLine() ;

MD5 md5 = new MD5CryptoServiceProvider();

byte[] result = md5.ComputeHash( textConverter.GetBytes( inputData ) );
String outputData = textConverter.GetString( result ) ;

Console.WriteLine( "MD5 Hash: " + outputData ) ;
Console.ReadLine() ;
 
The Crow

string outputData = Convert.ToBase64String(result);

byte[] backResult = Convert.FromBase64String(outputData);
 
dlarock

Would you mind expanding on this? All you did was convert the byte
array to a base64 string and then back into a byte[]?

Amy
 
Jon Skeet [C# MVP]

> I wrote the following to do an MD5 hash. However, I have a problem (I
> think) with the conversion from the Byte[] MD5 hash back to string.
>
> Watching this through the debugger it appears as if the MD5 is
> computing the right Byte[] for the hash when compared to other MD5 hash
> generators online. However, when I attempt to convert it back to a string
> using the line
> String outputData = textConverter.GetString( result ) ;
>
> I essentially get garbage. Any thoughts on what I am doing wrong?
> Does the MD5 hash not generate a UTF8 byte array?

No. It generates the MD5 bytes. Why would it decide to encode the bytes
in UTF-8? If it were converting the data to text, it would just return
it as a string.

The result of hashing is arbitrary binary data, and it should be
treated as such. Base64 encoding it is a fine way to get it into a text
form without the risk of any loss of data.
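
As a rough illustration of the "no loss of data" point, the sketch below hashes a string and checks that the Base64 text decodes back to the identical bytes. The class and method names (`Base64RoundTripDemo`, `Md5Of`, `SurvivesBase64`) are made up for this example, and `MD5.Create()` is used in place of `MD5CryptoServiceProvider` purely as a convenience.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper class, for illustration only.
static class Base64RoundTripDemo
{
    // Hash the UTF-8 bytes of the input; the result is 16 arbitrary bytes.
    public static byte[] Md5Of(string text)
    {
        using (MD5 md5 = MD5.Create())
        {
            return md5.ComputeHash(Encoding.UTF8.GetBytes(text));
        }
    }

    // Encode to Base64 text, decode again, and compare with the original bytes.
    public static bool SurvivesBase64(byte[] data)
    {
        string text = Convert.ToBase64String(data);
        byte[] back = Convert.FromBase64String(text);
        return back.SequenceEqual(data);
    }

    static void Main()
    {
        byte[] hash = Md5Of("C#");
        Console.WriteLine("Base64 text: " + Convert.ToBase64String(hash));
        Console.WriteLine("Round trip intact: " + SurvivesBase64(hash));
    }
}
```

The round trip should hold for any byte array, not just hashes, which is why Base64 is a common choice when binary data has to travel through a text-only channel.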
 
dlarock

Jon,

Thank you for your reply, but I seem to be missing something. So far
what I understand is that when I convert it to a MD5 hash it generates
a byte array of the MD5 hash. I should be able to convert that byte
array into a string to view it

for example: MD5 hashing the string "C#" is
"d7efa19fbe7d3972fd5adb6024223d74".

Now it was suggested to convert the MD5 byte array into a base64 string
which I can do, but that is a base64 string of the hash and not really
what I am looking for as I am looking for the MD5 hash in a string
format (like above and not in a base64 format)?

Playing around I found that
BitConverter.ToString( result ) ;
works in doing what I need done - but why doesn't
outputData = textConverter.GetString( result ) ;?

Thanks
Amy
 
Jon Skeet [C# MVP]

> Thank you for your reply, but I seem to be missing something. So far
> what I understand is that when I convert it to a MD5 hash it generates
> a byte array of the MD5 hash.

Well, the MD5 hash *is* a byte array. It's binary data - it's not a
string.

> I should be able to convert that byte
> array into a string to view it
>
> for example: MD5 hashing the string "C#" is
> "d7efa19fbe7d3972fd5adb6024223d74".

No, it's the binary data which that string is the hex-encoded version
of. Just as you could load up a JPEG into a hex editor and post the
hex-encoded data, that doesn't mean the data is inherently text.

> Now it was suggested to convert the MD5 byte array into a base64 string
> which I can do, but that is a base64 string of the hash and not really
> what I am looking for as I am looking for the MD5 hash in a string
> format (like above and not in a base64 format)?

Base64 *is* a string format. It happens to be in base64 instead of
base16 (which is what you posted above), that's all.

> Playing around I found that
> BitConverter.ToString( result ) ;
> works in doing what I need done - but why doesn't
> outputData = textConverter.GetString( result ) ;?

Because that's treating the arbitrary binary data as if it were the
results of applying the UTF-8 encoding to text.
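
To make that concrete, here is a small sketch of what a UTF-8 decoder does with bytes that were never produced by encoding text. The class and method names are invented for the example. By default `UTF8Encoding` quietly substitutes U+FFFD for invalid sequences; constructing it with `throwOnInvalidBytes` set to true makes it throw instead.

```csharp
using System;
using System.Text;

// Hypothetical helper class, for illustration only.
static class Utf8DecodeDemo
{
    // Default behaviour: invalid byte sequences are replaced with U+FFFD.
    public static string DecodeLenient(byte[] data)
    {
        return new UTF8Encoding().GetString(data);
    }

    // Strict decoder: reports whether the bytes form valid UTF-8 at all.
    public static bool IsValidUtf8(byte[] data)
    {
        UTF8Encoding strict = new UTF8Encoding(false, true);
        try
        {
            strict.GetString(data);
            return true;
        }
        catch (DecoderFallbackException)
        {
            return false;
        }
    }

    static void Main()
    {
        byte[] realText = { 0x43, 0x23 };   // the UTF-8 encoding of "C#"
        byte[] hashLike = { 0xD7, 0xEF };   // 0xD7 expects a continuation byte; 0xEF is not one

        Console.WriteLine(DecodeLenient(realText));  // C#
        Console.WriteLine(IsValidUtf8(realText));    // True
        Console.WriteLine(IsValidUtf8(hashLike));    // False
    }
}
```

Whatever a lenient decoder returns for such bytes cannot be turned back into the original hash, which is the "garbage" you are seeing.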

I suspect you have some fundamental misunderstandings about what
encodings are about. See
http://www.pobox.com/~skeet/csharp/unicode.html for a bit of an
introduction.
 
Reginald Blue

> Thank you for your reply, but I seem to be missing something. So far
> what I understand is that when I convert it to a MD5 hash it generates
> a byte array of the MD5 hash. I should be able to convert that byte
> array into a string to view it
>
> for example: MD5 hashing the string "C#" is
> "d7efa19fbe7d3972fd5adb6024223d74".

No, actually, the MD5 hash is the 128-bit binary data which, when
converted to a hexadecimal display, is the string you indicate. Basically,
the hash (any hash) just generates some kind of binary data through a set of
mathematical operations on the underlying data that's being examined. For
example, it doesn't actually look at the text "C#" but at the binary
representation of that string, so a different encoding of "C#" (for example,
UTF-16) will yield a different MD5 hash result.
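
A quick way to see the encoding dependence is to hash the same text under two encodings and compare. This is only a sketch; `EncodingMattersDemo` and `Md5Hex` are names invented for the example.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper class, for illustration only.
static class EncodingMattersDemo
{
    // Encode the text with the given encoding, hash the bytes, and show them as hex.
    public static string Md5Hex(string text, Encoding encoding)
    {
        using (MD5 md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(encoding.GetBytes(text));
            return BitConverter.ToString(hash);
        }
    }

    static void Main()
    {
        // Same characters, different bytes going into the hash.
        Console.WriteLine("UTF-8:  " + Md5Hex("C#", Encoding.UTF8));
        Console.WriteLine("UTF-16: " + Md5Hex("C#", Encoding.Unicode));
    }
}
```

In UTF-8 the input is the two bytes 43 23, while `Encoding.Unicode` (little-endian UTF-16) produces 43 00 23 00, so the two hashes should not match.
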

> Now it was suggested to convert the MD5 byte array into a base64
> string which I can do, but that is a base64 string of the hash and
> not really what I am looking for as I am looking for the MD5 hash in
> a string format (like above and not in a base64 format)?

Actually, the base64 transformation also turns a set of binary data into a
string representation. It uses a different alphabet than the hexadecimal
string you indicate above, so the text looks different, but it encodes
exactly the same bytes.

> Playing around I found that
> BitConverter.ToString( result ) ;
> works in doing what I need done - but why doesn't
> outputData = textConverter.GetString( result ) ;?

Because Text Converter attempts to transform a set of binary data into a
textual representation assuming a specific character encoding. For example,
in ASCII, the 'A' character is (off the top of my head) 65 in decimal, and
41 in hexadecimal. So, conversely, the "39" in your string above would
convert back to the '9' character, at least with the ASCII set (or a bunch
of other sets which also map similarly), while a byte like "d7" has no
ASCII character at all. So, thus, you get garbage in your translation.

To put it a different way, what you want is the hexadecimal representation
of your binary (byte) data, not the text representation of your binary data.
Thus, BitConverter does what you want (and Base64 gives you a different but
equally lossless string), but Text Converter does not.
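
For what it's worth, here is one way the original program might be reworked to print the familiar lowercase hex form. It is a sketch rather than the only approach: `HexDemo`, `ToLowerHex` and `Md5Hex` are invented names, and `MD5.Create()` stands in for `MD5CryptoServiceProvider`. `BitConverter.ToString` returns uppercase pairs separated by dashes, so the dashes are stripped and the result lowercased.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper class, for illustration only.
static class HexDemo
{
    // "0F-A0" from BitConverter becomes "0fa0".
    public static string ToLowerHex(byte[] data)
    {
        return BitConverter.ToString(data).Replace("-", "").ToLowerInvariant();
    }

    // Hash the UTF-8 bytes of the text and return the hash as a hex string.
    public static string Md5Hex(string text)
    {
        using (MD5 md5 = MD5.Create())
        {
            return ToLowerHex(md5.ComputeHash(Encoding.UTF8.GetBytes(text)));
        }
    }

    static void Main()
    {
        Console.Write("Enter a string: ");
        string inputData = Console.ReadLine() ?? "";
        Console.WriteLine("MD5 Hash: " + Md5Hex(inputData));
    }
}
```

The output is always 32 hex characters, two per byte of the 16-byte hash, and should line up with what the online MD5 generators show as long as they also hash the UTF-8 bytes of the input.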

Hope that helps.

--
Reginald Blue
"I have always wished that my computer would be as easy to use as my
telephone. My wish has come true. I no longer know how to use my
telephone."
- Bjarne Stroustrup (originator of C++) [quoted at the 2003
International Conference on Intelligent User Interfaces]
 
