Hash Code?

Joseph Lee

Dear All,

Is there a function in C# that will hash a large byte array into an int
value?

I am searching for a hash function that will convert data of any size (in my
case the input is a 20-byte array) to a value in a range of my choosing.

If there is no such thing, I would need some idea of how to program this in C#.

In Java a byte array can be dumped into a BigInteger object and then reduced
to a smaller value by dividing it, but this is a bad way and it is not like
real hashing...

Thanks

Joey
 
Justin Rogers

Look in System.Security.Cryptography, at HashAlgorithm and its derived
implementations.
You'll note that each hashing algorithm produces output of a different length,
so you'll have to convert the small byte array hash value that you get into
some other form for storage.
I often turn hash values into hex strings for storage in the database.
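
As a rough illustration of the suggestion above, here is a minimal sketch that hashes a byte array with SHA-1 and turns the 20-byte result into a hex string. The class and method names (HashToHex, Sha1Hex) are made up for this example; SHA1.Create, ComputeHash and BitConverter.ToString are the standard framework calls.

```csharp
using System;
using System.Security.Cryptography;

public static class HashToHex
{
    // Hashes the input with SHA-1 and returns the 20-byte digest
    // as a 40-character uppercase hex string.
    public static string Sha1Hex(byte[] data)
    {
        using (SHA1 sha1 = SHA1.Create())
        {
            byte[] hash = sha1.ComputeHash(data);
            // BitConverter.ToString yields "A9-99-3E-..."; drop the dashes.
            return BitConverter.ToString(hash).Replace("-", "");
        }
    }
}
```

Any other HashAlgorithm subclass could be swapped in the same way; only the length of the resulting string changes.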
 
Joseph Lee

Thanks..

By the way, is it possible to set (override) the output size of the hash
algorithms?
For example, SHA-1 gives 20 bytes and that cannot be changed, but I might need
only 12 bytes. Is that possible?

Thanks again

Joey
 
Daniel O'Connell [C# MVP]

Joseph Lee said:
Thanks..

By the way, is it possible to set (override) the output size of the hash
algorithms?
For example, SHA-1 gives 20 bytes and that cannot be changed, but I might need
only 12 bytes. Is that possible?

No, you should either use a hash provider that has the proper size (I don't
think any are 12 bytes) or discard some of the bytes returned so that it gets
down to your size.
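
Discarding bytes can be sketched as below. This is only an illustration under the assumption that keeping the leading bytes is acceptable for the purpose at hand; the names TruncatedHash and Sha1Truncated are hypothetical.

```csharp
using System;
using System.Security.Cryptography;

public static class TruncatedHash
{
    // Computes SHA-1 (20 bytes) and keeps only the first `length` bytes.
    public static byte[] Sha1Truncated(byte[] data, int length)
    {
        using (SHA1 sha1 = SHA1.Create())
        {
            byte[] full = sha1.ComputeHash(data);
            if (length < 1 || length > full.Length)
                throw new ArgumentOutOfRangeException(nameof(length));

            byte[] result = new byte[length];
            Array.Copy(full, result, length); // copy the leading bytes only
            return result;
        }
    }
}
```

Calling TruncatedHash.Sha1Truncated(data, 12) would give the 12 bytes asked about. A shorter value means collisions become more likely, which matters if the hash is used for duplicate checking or security.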

What exactly are you trying to achieve here, anyway? If it's for duplicate
checking or for security/cryptography reasons then it's probably best to keep
all the bytes, but if you are trying to convert it down simply for an
identifier you might be able to find a simple algorithm to do what you want,
or safely use the TripleDES keyed hash (MACTripleDES), which returns an
8-byte hash.
 
Horatiu Ripa

It's trivial to write your own hash algorithm, so why not just do it from
scratch? Then you can have whatever output size you want.
 
Joseph Lee

Thanks for helping me :)

Daniel O'Connell said:
What exactly are you trying to achieve here, anyway? If it's for duplicate
checking or for security/cryptography reasons then it's probably best to keep
all the bytes, but if you are trying to convert it down simply for an
identifier you might be able to find a simple algorithm to do what you want
or safely use the TripleDES keyed hash which returns an 8-byte hash.

Before I continue, I would like to explain that I am trying to write a
research paper comparing different search methods on encrypted data.
I have looked at papers by Song, Wagner, Perrig
http://citeseer.ist.psu.edu/song00practical.html
and others...

One of the papers I am looking at is by Eu Jin Goh
http://crypto.stanford.edu/~eujin/papers/secureindex/ -> page
http://crypto.stanford.edu/~eujin/papers/secureindex/2003nov-encsearch.pdf ->
the slides

So I am trying to implement this method of searching on encrypted data.

I will try to summarize and simplify how the method is implemented:
1> A keyword is hashed using a hash method; the author suggests HMAC-SHA1.
2> The output value determines a location in an array, and the value at
that location is changed to flag an entry. (The author does not define
how the array is created in code, just that it acts like a database of
single-bit values, 0 or 1, marking each location as flagged or not.)
3> This is done for every keyword there is, filling up the array.
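
The three steps as summarized here could be sketched roughly as follows. This is a single-hash simplification of the summary, not the paper's construction. The class name KeywordFlags, the array size, and the choice of reducing the MAC by taking its first four bytes modulo the array length are all assumptions made for illustration.

```csharp
using System;
using System.Collections;
using System.Security.Cryptography;
using System.Text;

public sealed class KeywordFlags
{
    private readonly BitArray bits;   // one bit per location, all 0 initially
    private readonly byte[] key;      // secret key for HMAC-SHA1

    public KeywordFlags(int size, byte[] key)
    {
        if (size < 1) throw new ArgumentOutOfRangeException(nameof(size));
        bits = new BitArray(size);
        this.key = key;
    }

    // Step 1: HMAC-SHA1 the keyword. Step 2: reduce the 20-byte MAC to an index.
    private int IndexOf(string keyword)
    {
        using (HMACSHA1 hmac = new HMACSHA1(key))
        {
            byte[] mac = hmac.ComputeHash(Encoding.UTF8.GetBytes(keyword));
            // Assumption: use the first 4 bytes (in the machine's byte order)
            // modulo the array length.
            uint value = BitConverter.ToUInt32(mac, 0);
            return (int)(value % (uint)bits.Length);
        }
    }

    // Step 3 is calling Add for every keyword.
    public void Add(string keyword) { bits[IndexOf(keyword)] = true; }

    // True means "possibly present": different keywords can share a location.
    public bool MightContain(string keyword) { return bits[IndexOf(keyword)]; }
}
```

Because many keywords can map to the same location, a set bit only says a keyword may be present; the array size controls how often such false positives occur.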

So my problem is that the output value is 20 bytes (HMAC-SHA1).
I don't think an array that I define can be of that size. Am I wrong
here?

So I need a way to either
reduce the 20-byte output to a smaller value, so that I can define
array[value] and flag that location in the array,
or represent a huge array with a location for every possible 20-byte value,
i.e. 2^(20*8) = 2^160 locations.

Any other way of implementing this is greatly appreciated.
 
Jon Skeet [C# MVP]

Unless you're willing to use really large amounts of memory for this
array, I would suggest a different technique - the same technique used
in normal hashtables.

Break your sparse array into "buckets". The more entries you have, the
more buckets you have. Each bucket represents part of the array - each
bucket is usually the same size, and they're set at regular intervals.
Within each bucket, you have a straight list of entries. Within a
hashtable each entry would be the hash code, the key and the value.

When you are asked to look up a key (or a hashcode) you find the right
bucket (which is very quick, as it's basically just a division
operation) and then you walk down the list of entries in that bucket.
The more buckets you have, the faster it is to find an entry but the
more memory is taken.
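
A bare-bones version of the bucket idea might look like the sketch below. It stores only the reduced hash values (no key or value), picks the bucket with a remainder rather than by dividing the range into intervals, and the name BucketedSet is invented for this example.

```csharp
using System;
using System.Collections.Generic;

public sealed class BucketedSet
{
    private readonly List<ulong>[] buckets;

    public BucketedSet(int bucketCount)
    {
        if (bucketCount < 1) throw new ArgumentOutOfRangeException(nameof(bucketCount));
        buckets = new List<ulong>[bucketCount];
        for (int i = 0; i < bucketCount; i++)
            buckets[i] = new List<ulong>();
    }

    // Finding the bucket is a single remainder operation.
    private List<ulong> BucketFor(ulong hash)
    {
        return buckets[(int)(hash % (ulong)buckets.Length)];
    }

    // Flags a hash value as present (ignores duplicates).
    public void Add(ulong hash)
    {
        List<ulong> bucket = BucketFor(hash);
        if (!bucket.Contains(hash)) bucket.Add(hash);
    }

    // Walks the short list inside one bucket.
    public bool Contains(ulong hash)
    {
        return BucketFor(hash).Contains(hash);
    }
}
```

Only the hash values actually added take up memory, so the full range of possible values never has to be allocated.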

I would suggest cutting the 20 byte hash down to 8 bytes to start with,
as you can then use a long to represent it in a simple way. You can
always keep the "full" hash in the list entries, if you want.
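
Cutting the hash down to a long could be done along these lines. This is a sketch that assumes the first 8 bytes are the ones kept and packs them big-endian, so the result does not depend on the machine's byte order; HashReducer is a hypothetical name.

```csharp
using System;

public static class HashReducer
{
    // Packs the first 8 bytes of a hash into a long, most significant byte first.
    public static long ToInt64(byte[] hash)
    {
        if (hash == null || hash.Length < 8)
            throw new ArgumentException("Need at least 8 bytes.", nameof(hash));

        long value = 0;
        for (int i = 0; i < 8; i++)
            value = (value << 8) | hash[i];
        return value;
    }
}
```

The result can be negative when the top bit of the first byte is set, so convert it to ulong (or mask the sign bit) before using it to pick a bucket or an array index.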
 
