Generating a unique hash of a known length

Omatase

I need to generate an account number, the account number needs to be
unique. Currently I am generating it with the following fields:
ClientReferenceNumber (guaranteed to be unique in the client's system)
and Source (unique in my system).

The ClientReferenceNumber can be anywhere between 6 and 10 digits long
and the Source can also be differing lengths (it's a 5 character
string for our first client) but I want the hash algorithm to generate
a number of equal length each time. I don't know if this is something
I can only guarantee with a pad statement, or if I can just build that
in to the algorithm. I also need the generated hash to be unique,
which should be simple given that the input is guaranteed unique. It
is also a requirement that the hash not resemble the original input.

My questions are:

What is the best way to do this?

I have this code

long hash = clientReferenceNumber + client.ToString().GetHashCode();

I am pretty sure that will be guaranteed unique, but I don't know
exactly what GetHashCode() is doing to my string. I also don't know
what length ranges I should expect to get back.

Thanks
 
Omatase

Correction, I am using this code:

long hash = 0;
hash += clientReferenceNumber.ToString().GetHashCode();
hash += client.ToString().GetHashCode();

The number ends up being smaller.
 
Peter Duniho

Omatase said:
I need to generate an account number, the account number needs to be
unique. Currently I am generating it with the following fields:
ClientReferenceNumber (guaranteed to be unique in the client's system)
and Source (unique in my system).

The ClientReferenceNumber can be anywhere between 6 and 10 digits long
and the Source can also be differing lengths (it's a 5 character
string for our first client) but I want the hash algorithm to generate
a number of equal length each time. I don't know if this is something
I can only guarantee with a pad statement, or if I can just build that
in to the algorithm. I also need the generated hash to be unique,
which should be simple given that the input is guaranteed unique. It
is also a requirement that the hash not resemble the original input.

My questions are:

What is the best way to do this?

I have this code

long hash = clientReferenceNumber + client.ToString().GetHashCode();

I am pretty sure that will be guaranteed unique, but I don't know
exactly what GetHashCode() is doing to my string. I also don't know
what length ranges I should expect to get back.

GetHashCode() is not guaranteed to return unique values. It's not even
possible for it to: it returns a 32-bit int, and there are far more
possible strings than there are int values. This is true of _any_ hash
algorithm, so you should stop thinking about this conversion as a "hash".
Just consider it some kind of conversion.

Perhaps you could be more specific about "the hash not resemble the
original input". If you simply mean that you don't want a human to be
able to look at it and readily recognize it as the original input, then a
simple transformation of the original input should be fine. For example,
just invert all the bits in the integer values. Or (since I'm posting
this via Usenet :) ), use the ROT13 algorithm. Anything along those lines
would be fine.
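
As a rough sketch of the bit-inversion idea: because flipping every bit is reversible, unique inputs give unique outputs. The class and method names here are hypothetical, and the sketch assumes the reference number is held in a long.

```csharp
using System;

public static class ReferenceObfuscator
{
    // Flip every bit of the 64-bit value. The inverted value of a positive
    // long is negative, so it is reinterpreted as unsigned to avoid a
    // minus sign when printed.
    public static ulong Obfuscate(long clientReferenceNumber)
    {
        return unchecked((ulong)~clientReferenceNumber);
    }

    // Flipping the bits a second time restores the original value.
    public static long Recover(ulong accountNumber)
    {
        return unchecked((long)~accountNumber);
    }
}
```

Calling ReferenceObfuscator.Recover(ReferenceObfuscator.Obfuscate(n)) returns n for any long, which is what makes the mapping collision-free. It only hides the input from a casual reader; anyone who knows the trick can undo it.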

If instead you mean that it should not be possible to recover the original
input from the converted value, then you will want to use some form of
strong encryption to convert the original input. There are cryptographic
classes in .NET that can help you with that.

As far as the output always having a fixed length, as long as you're
representing some numerical value as a string, you'll have to do something
to pad it out to your desired length. The easiest way to do that is to
just provide the necessary "0" placeholders in a custom numeric format
string.
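
A minimal sketch of padding with a custom numeric format string, assuming a width of twelve digits is wanted (the width, class name and method name are illustrative only):

```csharp
using System;
using System.Globalization;

public static class AccountNumberFormatter
{
    // Twelve "0" placeholders: shorter values are left-padded with zeros.
    // Values with more than twelve digits are not truncated; they simply
    // come out longer, so the width must cover the largest possible value.
    public static string ToFixedWidth(long value)
    {
        return value.ToString("000000000000", CultureInfo.InvariantCulture);
    }
}
```

With this, AccountNumberFormatter.ToFixedWidth(123456) gives "000000123456" and a 10-digit input such as 9999999999 gives "009999999999".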

Pete
 
Peter Duniho

Omatase said:
Correction, I am using this code:

long hash = 0;
hash += clientReferenceNumber.ToString().GetHashCode();
hash += client.ToString().GetHashCode();

The number ends up being smaller.

See my other reply. The above code is not even a good way to composite
hash values, but in any case there's no way to guarantee a unique outcome
for all possible inputs.

You need something other than a hash code here.
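
To illustrate one reason summing hash codes composes poorly: addition does not care about order, so swapping the two inputs always yields the same total. The sketch below mirrors the quoted snippet but takes both values as strings, which is an assumption made for the demonstration; the class and method names are hypothetical.

```csharp
using System;

public static class HashSumDemo
{
    // Same approach as the quoted snippet: add the two hash codes together.
    public static long Combine(string clientReferenceNumber, string source)
    {
        long hash = 0;
        hash += clientReferenceNumber.GetHashCode();
        hash += source.GetHashCode();
        return hash;
    }
}
```

HashSumDemo.Combine("123456", "ABCDE") and HashSumDemo.Combine("ABCDE", "123456") return the same value within a process. Beyond that, the sum of two int values can take only about 2^33 distinct values, so distinct input pairs must eventually collide.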
 
Family Tree Mike

Omatase said:
I need to generate an account number, the account number needs to be
unique. Currently I am generating it with the following fields:
ClientReferenceNumber (guaranteed to be unique in the client's system)
and Source (unique in my system).

The ClientReferenceNumber can be anywhere between 6 and 10 digits long
and the Source can also be differing lengths (it's a 5 character
string for our first client) but I want the hash algorithm to generate
a number of equal length each time. I don't know if this is something
I can only guarantee with a pad statement, or if I can just build that
in to the algorithm. I also need the generated hash to be unique,
which should be simple given that the input is guaranteed unique. It
is also a requirement that the hash not resemble the original input.

My questions are:

What is the best way to do this?

I have this code

long hash = clientReferenceNumber + client.ToString().GetHashCode();

I am pretty sure that will be guaranteed unique, but I don't know
exactly what GetHashCode() is doing to my string. I also don't know
what length ranges I should expect to get back.

Thanks

OK, maybe I have had a long day...

If you want the account number not to resemble the ClientReferenceNumber
and the Source, then why base it on that at all? Presumably you will
have a table linking this new account number back to the other two, so
why not generate a sequence starting from some arbitrarily high value,
such as 1843243873289?
 
Faust

Family Tree Mike wrote on Thursday:
OK, maybe I have had a long day...
If you want the account number not to resemble the ClientReferenceNumber and
the Source, then why base it on that at all? Presumably you will have a
table linking this new account number back to the other two, so why not
generate a sequence starting from some arbitrarily high value, such as
1843243873289?

and why not simply use a Guid?
- it's not absolutely guaranteed to be unique, but very close to it
- it has constant length
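
A small sketch of the Guid suggestion, assuming the identifier is stored as text (the class and method names are hypothetical):

```csharp
using System;

public static class GuidAccountId
{
    // "N" formats the GUID as 32 hexadecimal digits with no hyphens or braces,
    // so every identifier has the same length.
    public static string NewId()
    {
        return Guid.NewGuid().ToString("N");
    }
}
```

Note that the result contains the letters a to f as well as digits, so it is not a purely numeric account number.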

--
*/Teträm/*
http://www.tetram.org

"Swallow everything without thinking; whatever isn't edible will always
come back out" - Troll proverb
 
Family Tree Mike

Faust said:
and why not simply use a Guid?
- it's not absolutely guaranteed to be unique, but very close to it
- it has constant length

One drawback I see is that many ATM machines only have numeric data entry.
Other than that, your suggestion is completely valid.

Mike
 
Omatase

I decided to use the bit toggle and throw out one of the numbers. Now
all I use is the client reference number. The client id will be in the
database and, together with the generated account number, will serve as
a composite key for the table.

Here is my final code

// ones' complement the client number to obfuscate it
// (the UInt32 cast assumes the value fits in 32 bits)
long onesComplement = ~(UInt32)clientReferenceNumber;

Thanks for everyone's help.
 
