Random String

shapper · Mar 12, 2009

Hello,

I am creating a random string, 4 characters long, as follows:

Int16 length = 4;
Random random = new Random();
String characters = "abcdefghijklmnopqrstuvwxyz1234567890";
StringBuilder alias = new StringBuilder();
while (length-- > 0)
alias.Append(characters[(int)(random.NextDouble() *
characters.Length)]);

I suppose this is a good option but any suggestion is welcome.

I am inserting the random string into an SQL table. I need to be sure
that is unique ...
.... if it isn't then I should recreate a new one.

What is the best way to do this?

Thanks,
Miguel

shapper · Mar 12, 2009

Hello,

Click to expand...

I am creating a random string, 4 characters long, as follows:

Click to expand...

Int16 length = 4;
Random random = new Random();
String characters = "abcdefghijklmnopqrstuvwxyz1234567890";
StringBuilder alias = new StringBuilder();
while (length-- > 0)
alias.Append(characters[(int)(random.NextDouble() *
characters.Length)]);

Click to expand...

I suppose this is a good option but any suggestion is welcome.

Click to expand...

Some minor cleanup:

Int16 length = 4;
Random random = new Random();
String characters = "abcdefghijklmnopqrstuvwxyz1234567890";
StringBuilder alias = new StringBuilder(length);
while (length-- > 0)
alias.Append(characters[random.Next(characters.Length)]);

I like to initialize StringBuilders explicitly if I know the exact length
ahead of time, even if that length is the same or smaller than the default
initial capacity. And of course, the Random class already has a perfectly
fine way to get an integer in the range [0, upper-bound).

You could also make the "characters" variable a constant instead of
initializing it each time through the code.

I am inserting the random string into an SQL table. I need to be sure
that is unique ...
... if it isn't then I should recreate a new one.

Click to expand...

What is the best way to do this?

Click to expand...

To do what? Make sure the string is unique? That depends on what sources
the string could have, and over what period of time you need for the
instances to be unique. But, an obvious possibility is to simply inspect
the table before inserting it to see if it already exists. Or (I'm no SQL
expert, so bear with me if this makes no sense) use the string as the (a?)
key for the table, and simply check for an error trying to insert it (my
understanding is that SQL has an idea of a column in a table that has to
have a unique value for each record...I'm calling that column the "key").

If the string only needs to be unique for a given execution of your
program, you could maintain a dictionary (Hashtable, Dictionary<TKey,
TValue>, etc.) and check the dictionary for the presence of the key before
using it in your table. For that matter, if the table exists outside of a
given execution of your program (not uncommon for databases ), you
could initialize the dictionary when you start using the table, using the
current records in the table, before you start trying to add things to it..

Note that your 4-character key has "only" about a million and a half
possibilities. You may find the Guid class more useful, if you happen to
be dealing with really large numbers of data records. For that matter,
even a standard int (even if converted to a string) would give you about 4
billion possibilities instead.

Finally, note that it might make more sense just to assign these keys in
sequence, rather than picking them randomly. Unless you have some
specific need to obfuscate the selection pattern for the keys, the code
would be simpler and could deal with collisions in a much more efficient
way (i.e. just pick the next available one, rather than repeatedly trying
a random one until you find a non-duplicate).

Pete

Pete,

I am trying to replicate TinyUrl system on my ASP.NET MVC Application.

The application creates Twitter entries where:
http://mywebsite.com/Article/Edit/568aad33-c406-46c3-a760-a486f9b35370

Would become:
http://mywebsite.com/ab12

Every time a twitter entry is created a new alias is added to a table:
AliasId = ab12;
SourceUrl = Article/Edit/568aad33-c406-46c3-a760-a486f9b35370

With a 4 characters length I get 36^4 = 1 679 616
I don't believe I will get more than 10 000 alias (this is even a very
high limit) so it means that I get at worse a 0.6% probability of
create a repeated alias.
And the probability of making it twice is almost null (0.0035%) ... So
I think I would be safe to use the random method.

I create a new Alias and check if it exists. If it does I create it
again.

Your idea of creating it in sequence it is interesting but I would
always need to check what was the last alias created so I would need
to add a new column to the SQL table (CreateAt), order it by
CreatedAt, get the last one and determine what would be the next one.

I think in terms of performance it will be the same.

Note: if I use an alias with 36^5 = 60 566 176 then all this numbers
get even better so ...

Any suggestion is welcome.

Thanks,
Miguel

Pavel Minaev · Mar 12, 2009

To do what? Make sure the string is unique? That depends on what sources
the string could have, and over what period of time you need for the
instances to be unique. But, an obvious possibility is to simply inspect
the table before inserting it to see if it already exists.

Which, of course, presents some interesting race conditions, given the
concurrent nature of your typical DBMS

Or (I'm no SQL

expert, so bear with me if this makes no sense) use the string as the (a?)
key for the table, and simply check for an error trying to insert it (my
understanding is that SQL has an idea of a column in a table that has to
have a unique value for each record...I'm calling that column the "key").

In SQL it's not called a key, but you can create a UNIQUE constraint
on any column (or combination thereof) with the same effect.

Pavel Minaev · Mar 12, 2009

Your idea of creating it in sequence it is interesting but I would
always need to check what was the last alias created so I would need
to add a new column to the SQL table (CreateAt), order it by
CreatedAt, get the last one and determine what would be the next one.

Er, no, you wouldn't. Pretty much any SQL DBMS out there has some ways
of auto-generating a number for a column. For MSSQL, you'd use
IDENTITY (http://msdn.microsoft.com/en-us/library/aa933196(SQL.
80).aspx). For Oracle, CREATE SEQUENCE and associated functions.

Then, of course, you can always use GUIDs.

shapper · Mar 12, 2009

Er, no, you wouldn't. Pretty much any SQL DBMS out there has some ways
of auto-generating a number for a column. For MSSQL, you'd use
IDENTITY (http://msdn.microsoft.com/en-us/library/aa933196(SQL.
80).aspx). For Oracle, CREATE SEQUENCE and associated functions.

Then, of course, you can always use GUIDs.

Pavel, yes I can use Guid's and yes I can use Int.
(I usually use Guid's even if the performance is a little bit
worse ... but there are some advantages)

But that is not the point here because I need a 4/5 character long
id ...

Hans Kesting · Mar 12, 2009

shapper presented the following explanation :

Hello,

I am creating a random string, 4 characters long, as follows:

Int16 length = 4;
Random random = new Random();

[snip]

One thing to note is that if you do this very often, you might end up
with the same "random" sequence.
When you create an instance of the Random class, it initializes itself
based on the current clock. If you create a second instance within the
same clock-tick, it will start at and generate the same values!

The solution is simple: instead of a new instance every time, use a
single static instance
private static readonly Random random = new Random();

Hans Kesting

proxyuser · Mar 12, 2009

shapper said:
I am inserting the random string into an SQL table. I need to be sure
that is unique ...
... if it isn't then I should recreate a new one.

What is the best way to do this?

You have around 1,700,000 possibilities. Do you plan on using a lot of
them? Is this something that's going to go on for a long time over months
or years of a large application? If so, you could generate a table of all
the strings, and pick one at random to remove. Then you don't have to worry
about checking for duplicates. For example, install the
(file/table/whatever) initially with 1,700,000 entries. Generate a random
number between 1 and 1,700,000 and remove that entry. Next time check the
size of the (f/t/w) - it will be 1,699,999. Generate a random number
between 1 and 1,699,999 and remove that entry. etc.

This is probably not what you want though.

Random String Extension. How to get all available characters?	10	Sep 29, 2009
Random Numbers - Will this produce working results?	7	Feb 18, 2007
Generate password	5	Apr 9, 2008
Random Record. Alternative.	6	Aug 5, 2009
Generate Unique Random String	2	Aug 27, 2005
More on Randomness (with text this time :-)	3	Jan 17, 2007
Random Extension	10	Nov 16, 2009
Random Double with a given Range	1	Sep 29, 2006

Random String

shapper

shapper

Pavel Minaev

Pavel Minaev

shapper

Hans Kesting

proxyuser

Ask a Question

Similar Threads