Unit testing something which is supposed to be random?

  • Thread starter: Ethan Strauss

Ethan Strauss

Hi,
I have a method which produces a random DNA sequence. I would like to
write a unit test for it, but can't quite figure out what makes sense. It is
easy to test that a DNA sequence is produced. What I can't figure out is how
to test randomness. By its very nature, random is random. I have written a simple
test that generates 500 sequences and checks that none of them are the same.
That works, but seems very weak; besides, if the sequences are truly random,
the same one should be generated occasionally. I can imagine that I could
create thousands of sequences and then do statistics on them to see if they
are random, but I don't want to go there. Any other thoughts about what to
do?
Thanks!
Ethan
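
How often "occasionally" is depends on the sequence length, which the post does not state. As a rough sketch, assuming each base is drawn independently and uniformly from A, C, G, T, the chance of at least one duplicate among N sequences can be computed with the birthday-problem formula. The function name and the lengths used below are hypothetical, chosen only for illustration.

```python
import math

def duplicate_probability(num_sequences, sequence_length, alphabet_size=4):
    """Chance that at least two of `num_sequences` uniformly random
    sequences of `sequence_length` symbols are identical."""
    space = alphabet_size ** sequence_length  # number of possible sequences
    if num_sequences > space:
        return 1.0  # pigeonhole: a repeat is unavoidable
    # log P(all distinct) = sum over i of log(1 - i / space)
    log_all_distinct = sum(
        math.log1p(-i / space) for i in range(num_sequences)
    )
    # 1 - exp(x), computed in a way that stays accurate for tiny values
    return -math.expm1(log_all_distinct)

# Assumed lengths, for illustration only.
print(duplicate_probability(500, 8))
print(duplicate_probability(500, 20))
```

Under these assumptions, 500 sequences of length 8 collide with probability roughly 0.85, while at length 20 the probability is on the order of 1e-7. So a "no duplicates in 500" check would fail most of the time for short sequences and says almost nothing for long ones.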
 
Ethan said:
I have a method which produces a random DNA sequence. I would like to
write a unit test for it, but can't quite figure out what makes sense. It is
easy to test that a DNA sequence is produced. What I can't figure out is how
to test randomness. By its very nature, random is random. I have written a simple
test that generates 500 sequences and checks that none of them are the same.
That works, but seems very weak; besides, if the sequences are truly random,
the same one should be generated occasionally. I can imagine that I could
create thousands of sequences and then do statistics on them to see if they
are random, but I don't want to go there. Any other thoughts about what to
do?

I think you can only do two things:
- verify that, given a specific initialization (seed) of the RNG,
it produces the expected result
- do a simulation with millions of runs and check the statistics
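
The first option might look something like the sketch below. It is written in Python rather than .NET, and `random_dna` is a hypothetical stand-in for the method under test; the key assumption is that the method accepts its random source as a parameter so the test can seed it.

```python
import random

BASES = "ACGT"

def random_dna(length, rng):
    """Hypothetical stand-in for the method under test:
    draws `length` bases from the supplied random source."""
    return "".join(rng.choice(BASES) for _ in range(length))

def test_same_seed_gives_same_sequence():
    # Two generators seeded identically must yield identical output.
    first = random_dna(50, random.Random(12345))
    second = random_dna(50, random.Random(12345))
    assert first == second

def test_output_is_well_formed():
    sequence = random_dna(50, random.Random(12345))
    assert len(sequence) == 50
    assert set(sequence) <= set(BASES)

test_same_seed_gives_same_sequence()
test_output_is_well_formed()
```

A test like this checks that the method consumes the RNG deterministically and produces valid output. It says nothing about the quality of the RNG itself, which is what the second option is for.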

Arne
 
Ethan,

Well, it seems like you also want the random distribution to be even.

Statistics here don't really test how random a generator is; they test
whether the distribution is even (which is different from random).
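
To make the distinction concrete, here is a minimal Python sketch of an evenness check, assuming the four bases are meant to be equally likely. It computes a chi-square goodness-of-fit statistic on the base counts; the function name is hypothetical, and 16.266 is the standard critical value for 3 degrees of freedom at p = 0.001.

```python
import random
from collections import Counter

BASES = "ACGT"
CRITICAL_DF3_P001 = 16.266  # chi-square critical value, df = 3, p = 0.001

def chi_square_uniform(sequence):
    """Chi-square statistic of the base counts in a non-empty
    sequence against a uniform distribution over BASES."""
    counts = Counter(sequence)
    expected = len(sequence) / len(BASES)
    return sum((counts[b] - expected) ** 2 / expected for b in BASES)

# A seeded sample from the standard PRNG; a statistic below the
# critical value means no evidence of uneven base frequencies.
rng = random.Random(2024)
sample = "".join(rng.choice(BASES) for _ in range(100_000))
print(chi_square_uniform(sample) < CRITICAL_DF3_P001)

print(chi_square_uniform("A" * 1000))    # 3000.0: badly uneven
print(chi_square_uniform("ACGT" * 250))  # 0.0: perfectly even
```

The last line is the point: the repeating pattern "ACGTACGT..." scores as perfectly even, yet it is completely predictable. Passing a frequency test shows evenness, not randomness.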

Rather, I think if you truly want to test whether something is random,
you need to try to predict what the sequence generator will produce given
the current conditions. If your code can do that, then it is not random.

How are you generating the random number sequences? To be honest, I
don't know if you have to test for this if you are using a cryptographically
secure random number generator. That's not to say that such generators are
not predictable at all, but they are MUCH less predictable than the Random class.

You want to make sure you are using the RNGCryptoServiceProvider class.
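
For comparison, a rough Python analog of this advice is sketched below; it is not the .NET API. It assumes `secrets.SystemRandom`, which draws from the operating system's cryptographic source, as the counterpart of RNGCryptoServiceProvider, and a seeded `random.Random` as the counterpart of Random. The `random_dna` function and its `rng` parameter are hypothetical.

```python
import random
import secrets

BASES = "ACGT"

def random_dna(length, rng=None):
    """Hypothetical generator: uses the OS cryptographic source by
    default, or any compatible random source passed in by the caller."""
    if rng is None:
        rng = secrets.SystemRandom()  # cannot be seeded or replayed
    return "".join(rng.choice(BASES) for _ in range(length))

# Production-style call: unpredictable, so only shape can be asserted.
sequence = random_dna(30)
assert len(sequence) == 30 and set(sequence) <= set(BASES)

# Test-style call: a seeded source makes the output reproducible.
assert random_dna(30, random.Random(1)) == random_dna(30, random.Random(1))
```

Passing the random source in as a parameter keeps both options open: the unpredictable generator in normal use, and a seeded one wherever a repeatable result is needed.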
 
Thanks,
For what I need, it sounds like I should just test for the appearance of
randomness and not even try to validate that the sequences really are truly
random. This question was more for curiosity than for a critical need. If I do
need to test the randomness at a deeper level, I will probably try to
generate enough random sequences to be able to do appropriate statistics, and
then I'll also have to figure out the stats involved.

Ethan
 