Random String Extension. How to get all available characters?

shapper

Hello,

I created the following String extension, which takes the characters of a
String and builds a new random string of a given length from them:

    public static String Random(this String value, Int32 length) {
      // "random" is a System.Random instance declared elsewhere in the class
      StringBuilder randomized = new StringBuilder(length);
      while (length-- > 0)
        randomized.Append(value[random.Next(value.Length)]);
      return randomized.ToString();
    } // Random

If a length is not provided, the length of value is used.

How can I use all characters if a value is not provided, e.g., when value is
String.Empty?

My problem is getting a list of all characters.

Thanks,
Miguel
 
shapper

I think I found a way:
    Enumerable.Range(char.MinValue, char.MaxValue)
      .Select(c => (char)c)
      .Where(c => !char.IsControl(c))
      .ToArray().ToString()
 
Peter Duniho

Are you sure you really want to use "all characters"? In theory, a 16-bit
Char (a UTF-16 code unit, the format used by String and Char) can take
65536 different values, but as far as I know there are gaps where code
points are not actually assigned, and I'm not aware of any efficient way
to validate individual 16-bit values as valid Char values.

So, you could just pick a random number from 0-65535, and cast that to
Char, but that wouldn't necessarily ensure you are picking valid
characters.
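
As a rough illustration of the validity concern, .NET does expose a few per-character checks. The sketch below treats a value as unusable if it is a lone surrogate, a control character, unassigned, or in a private-use area. The class and method names (CharChecks, IsUsable) are made up for this example, and the choice of excluded categories is an assumption rather than an official definition of a "valid" character.

```csharp
using System;
using System.Globalization;

public static class CharChecks
{
    // Hypothetical helper: rejects values that are unlikely to be
    // meaningful as a single character in a generated string.
    public static bool IsUsable(char c)
    {
        if (char.IsSurrogate(c)) return false; // only valid as half of a pair
        if (char.IsControl(c)) return false;   // e.g. NUL, TAB, DEL

        UnicodeCategory category = char.GetUnicodeCategory(c);
        return category != UnicodeCategory.OtherNotAssigned // unassigned code point
            && category != UnicodeCategory.PrivateUse;      // no standard meaning
    }
}
```

A check like this could be combined with random 16-bit values, but it does not make the result readable or typeable, which is part of why a smaller fixed range is attractive.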

It seems to me that a better approach would be to make your source
character set in the default case be some easier-to-define subset. For
example, all ASCII characters (values 0-127), or all characters from some
other 8-bit character encoding (e.g. ISO-8859-1). Personally, I'd prefer
the former, since you can simply cast a random number from 0-127 to Char
and be done with it.

Pete
 
Peter Duniho

> I think I found a way:
> Enumerable.Range(char.MinValue, char.MaxValue).Select(c => (char)c).Where(c => !char.IsControl(c)).ToArray().ToString()

Ouch. I think you definitely don't want to do that. Not only do you
potentially have the problem of using invalid characters (which I
mentioned in my other reply), you are generating a nearly 128K string
every time you execute that line of code, not to mention the 128K array
that gets created and discarded immediately.
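
Separately from the size concern, two details of the quoted expression are worth flagging: the second argument of Enumerable.Range is a count rather than an upper bound, and calling ToString() on a char[] returns the type name "System.Char[]" rather than the characters. A minimal sketch of an expression that does build the string, assuming the goal really is every non-control Char value (the class and method names are invented for illustration):

```csharp
using System;
using System.Linq;

public static class CharPool
{
    // Builds a string of every non-control UTF-16 code unit.
    // Still subject to the objections raised in this thread: the result
    // is large and includes surrogates and unassigned values.
    public static string AllNonControl()
    {
        char[] chars = Enumerable
            .Range(char.MinValue, char.MaxValue - char.MinValue + 1) // (start, count)
            .Select(i => (char)i)
            .Where(c => !char.IsControl(c))
            .ToArray();

        return new string(chars); // chars.ToString() would yield "System.Char[]"
    }
}
```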

I think the basic logic is flawed, because of the potential for invalid
characters, but if you insist on the basic logic anyway, at the very
least, just cast to Char as you pick random numbers between 0 and 65535
(Char.MinValue, Char.MaxValue). If you want to filter out control
characters, then just discard any character you generate that happens to
be a control character; it's much more efficient to just make another try
until you get a non-control character than to waste time and memory
enumerating every possible Char value.
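
The retry approach described here might look like the following sketch. The names (RandomChars, NextNonControl) and the shared Random instance are assumptions for illustration; the thread does not show how `random` is declared.

```csharp
using System;

public static class RandomChars
{
    // Assumed shared generator; note that System.Random is not thread-safe.
    private static readonly Random random = new Random();

    // Draws 16-bit values until one is not a control character.
    public static char NextNonControl()
    {
        char c;
        do
        {
            // The upper bound of Next is exclusive, so add 1 to include char.MaxValue.
            c = (char)random.Next(char.MinValue, char.MaxValue + 1);
        } while (char.IsControl(c));
        return c;
    }
}
```

Only 65 of the 65536 possible values are control characters, so a retry is rarely needed.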

Pete
 
shapper

Hi Pete,

I didn't realize that my approach would produce so many characters.

So you think I should use:

    Enumerable.Range(0, 127).Select(c => (char)c).Where(c => !char.IsControl(c)).ToArray().ToString()

This would give me a..z, A..Z, 0..9 and punctuation marks.

Is that it?

Thank You,
Miguel
 
Peter Duniho

[...]
> So you think I should use:
>
> Enumerable.Range(0, 127).Select(c => (char)c).Where(c => !char.IsControl(c)).ToArray().ToString()
>
> This would give me a..z, A..Z, 0..9 and punctuation marks.

Actually, I stated too broad a range. 32-127 will eliminate the control
characters from consideration.

Even so, IMHO creating the whole source array and converting to a string
is unnecessary. Just pick a random number between 32 and 127 for each
character and cast it as necessary. Yes, that means you have different
code paths depending on whether your method was called with an empty list
of characters or not, but IMHO that's better code than the code that
creates extra objects when not necessary. :)

Pete
 
shapper

> Actually, I stated too broad a range. 32-127 will eliminate the control
> characters from consideration.

In fact I was using 20 to 126:
http://en.wikipedia.org/wiki/ASCII#ASCII_printable_characters

This is to create a random String to be used as a password.
Of course I will hash it after ...

This is for when a user needs to reset the password. I create a new random
one, hash it and send it by email.
I suppose 20 to 126 is OK, or maybe I should restrict it a little more?

> Even so, IMHO creating the whole source array and converting to a string
> is unnecessary. Just pick a random number between 32 and 127 for each
> character and cast it as necessary. Yes, that means you have different
> code paths depending on whether your method was called with an empty list
> of characters or not, but IMHO that's better code than the code that
> creates extra objects when not necessary. :)

Sorry, got lost :)
 
Peter Duniho


If you're going to use that table, you need to make sure you are
consistent about which number base you're looking at. The printable
characters start at 0x20, which is 32.

And you're right...you should stop at 126, not 127.
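
To make the decimal/hex point concrete, here is a small sketch (names are illustrative, not from the thread) that builds the printable ASCII pool from 0x20 (32) through 0x7E (126), purely to check the bounds. Code 127 is DEL, which char.IsControl reports as a control character.

```csharp
using System;
using System.Linq;

public static class AsciiPool
{
    public const int First = 0x20; // 32, space
    public const int Last = 0x7E;  // 126, '~'

    // Enumerable.Range takes (start, count), so the count is Last - First + 1.
    public static string Printable()
    {
        return new string(
            Enumerable.Range(First, Last - First + 1)
                      .Select(i => (char)i)
                      .ToArray());
    }
}
```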
[...]
>> Even so, IMHO creating the whole source array and converting to a string
>> is unnecessary. Just pick a random number between 32 and 127 for each
>> character and cast it as necessary. Yes, that means you have different
>> code paths depending on whether your method was called with an empty list
>> of characters or not, but IMHO that's better code than the code that
>> creates extra objects when not necessary. :)
>
> Sorry, got lost :)

This loop generates a random string using the ASCII characters from 32 to
126, without creating a string representing the pool of possible
characters first:

    StringBuilder randomized = new StringBuilder(length);

    // Next(32, 127) returns 32..126: the upper bound is exclusive.
    while (length-- > 0)
    {
      randomized.Append((char)random.Next(32, 127));
    }

Pete
 
shapper

On Sep 30, 12:09 am, "Peter Duniho" wrote:

> This loop generates a random string using the ASCII characters from 32 to
> 126, without creating a string representing the pool of possible
> characters first:
>
>     StringBuilder randomized = new StringBuilder(length);
>
>     while (length-- > 0)
>     {
>        randomized.Append((char)random.Next(32, 127));
>     }
>
> Pete

The reason I used this extension is that I can specify not only a
range of characters, for example from 32 to 127, but also specific
characters, like:

Create a random string from these characters: "abcfh$%#$$#". This was my
initial idea, because I have another situation where I use it.

I could maybe create a method to get all characters from StartCode to
EndCode and apply the extension:

    String RandomString = StringHelpers.Chars(32, 127).Random(10);

This would generate a string with all chars from 32 to 127 and then
apply Random to it.

I am just trying to create a flexible solution to use in different
situations.

What do you think?

Thanks,
Miguel
 
Peter Duniho

> The reason why I used this extension is because I could specify not a
> sequence of characters, for example from 32 to 127, but also specific
> characters like:
>
> Create random string from these characters "abcfh$%#$$#". This was my
> initial idea because I have another situation where I use it.

Seems fine to me. As you stated earlier, we are currently discussing what
that method should do if the caller passes an empty string (or IMHO a null
reference would be better, with an empty string being illegal, but it just
depends on what makes the most sense to you).

> I could maybe create a method to get all characters from StartCode to
> EndCode and apply the extension.
>
> String RandomString = StringHelpers.Chars(32, 127).Random(10);
>
> This would generate a string with all chars from 32 to 127 and then
> apply a Random.

That's not necessary.

> I am just trying to create a flexible solution to use in different
> situations.

Right. Just make that one method flexible.

> What do you think?

I still think what I thought before: just change the existing method so
that if the input string is empty, it goes through a different code path
in which the characters are just selected directly from random values (per
the code example I posted in my previous message), rather than generating a
100-or-so character string from which characters are selected.

One way or the other, you have to do something different depending on
whether the caller provides a proper selection string or not. Seems to
me, you might as well put all that right into the one method.
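
Putting both code paths into the one method could look roughly like this. It is a sketch under a few assumptions not settled in the thread: both null and empty input select the default path, the default range is printable ASCII 32 to 126, and the generator is a shared static System.Random (the class name StringExtensions is also invented here).

```csharp
using System;
using System.Text;

public static class StringExtensions
{
    // Assumed shared generator; the thread never shows its declaration.
    private static readonly System.Random random = new System.Random();

    public static string Random(this string value, int length)
    {
        if (length < 0) throw new ArgumentOutOfRangeException(nameof(length));

        StringBuilder randomized = new StringBuilder(length);
        bool usePrintableAscii = string.IsNullOrEmpty(value);

        while (length-- > 0)
        {
            randomized.Append(usePrintableAscii
                ? (char)random.Next(32, 127)          // 32..126; upper bound is exclusive
                : value[random.Next(value.Length)]);  // pick from the caller's characters
        }
        return randomized.ToString();
    }
}
```

With this shape, "abcfh$%#".Random(5) draws only from the supplied characters, while String.Empty.Random(10) falls back to the printable ASCII range without building a pool string first.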

All that said, I'm unconvinced that even allowing the caller to provide
their own character selection string makes sense. You said earlier that
this is to auto-generate passwords. Seems to me that allowing the caller
to provide the list of characters to choose from is just asking for
trouble, letting them generate passwords with very little variability
(what happens if they pass in a string of just one character?).

If you intend for this method to exist in a library where it can be used
more generally, then maybe that's okay. But for the password-specific
application, you should pick what you feel is a reasonable set of
characters (either a range as above, or a hard-coded string constant if
you want to limit the number to something smaller), and then always use
that.
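
For the password case, a sketch of the "fixed set of characters" idea follows. The alphabet below (which leaves out look-alike characters such as 0/O and 1/l/I) and the class name are assumptions chosen for illustration. It also swaps System.Random for RandomNumberGenerator.GetInt32, available in .NET Core 3.0 and later, because System.Random is not designed for security-sensitive values; that substitution is not something the thread itself proposes.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class PasswordGenerator
{
    // Illustrative alphabet: no 0/O and no 1/l/I, to avoid look-alikes.
    private const string Alphabet =
        "ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz23456789";

    public static string Generate(int length)
    {
        if (length <= 0) throw new ArgumentOutOfRangeException(nameof(length));

        var password = new StringBuilder(length);
        for (int i = 0; i < length; i++)
        {
            // GetInt32(n) returns a value in the range [0, n).
            password.Append(Alphabet[RandomNumberGenerator.GetInt32(Alphabet.Length)]);
        }
        return password.ToString();
    }
}
```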

Pete
 
shapper

> All that said, I'm unconvinced that even allowing the caller to provide
> their own character selection string makes sense. You said earlier that
> this is to auto-generate passwords. Seems to me that allowing the caller
> to provide the list of characters to choose from is just asking for
> trouble, letting them generate passwords with very little variability
> (what happens if they pass in a string of just one character?).
>
> If you intend for this method to exist in a library where it can be used
> more generally, then maybe that's okay.

I have a base library which I use in all my projects.
When the need arises, I create something generic enough to be used in
the situations I anticipate.

For example, in this web application I have a system that implements
an internal TinyUrl system:
http://www.appdomain.com/uh443 leads to http://www.appdomain.com/document/show/232

The key "uh443" is generated using this extension.

The password generator is just another way to use the extension.
Again, this is to be used by me ...

So I would like to keep this functionality but extend it to the type of
characters.

Thank You,
Miguel
 
