Tuesday, May 20, 2008

PHP rand(0,1) on Windows < OpenSSL rand() on Debian

512x512 image made using `if rand(0,1)` on PHP/Windows

(Please excuse the inflammatory title.)

Bo Allen recently made a post regarding the difference between true random and pseudorandom numbers, using 512x512 bitmaps as illustration.

Some might doubt his result on Windows, as the image seems to be too predictable. Initially, I passed it off as PHP using the default Windows pseudorandom number generator, which is a Linear Congruential Generator.

The default windows rand() function is implemented as follows (coefficients from this site, many others have them also):

unsigned long randState = 0;

//Get random number
unsigned long rand(void) {
randState = randState * 214013 + 2531011L;
return (randState >> 16) & 0x7fff;
}

//Seed RNG
void srand(unsigned long seed) {
randState = seed;
}

Here is Bo Allen's code:
// Requires the GD Library
header ("Content-type: image/png");
$im = @imagecreatetruecolor(512, 512)
or die("Cannot Initialize new GD image stream");
$white = imagecolorallocate($im, 255, 255, 255);
for($y=0;$y<512;$y++){
for($x=0;$x<512;$x++){
if(rand(0,1)===1){
imagesetpixel($im, $x, $y, $white);
}
}
$x=0;
}
imagepng($im);
imagedestroy($im);


Let's have a look at how PHP does rand(0,1). All code is from the php-5.2.6 source release.

//ext/standard/rand.c:296-315
/* {{{ proto int rand([int min, int max])
Returns a random number */
PHP_FUNCTION(rand)
{
long min;
long max;
long number;
int argc = ZEND_NUM_ARGS();

if (argc != 0 &&
zend_parse_parameters(argc TSRMLS_CC, "ll", &min, &max) == FAILURE)
return;

number = php_rand(TSRMLS_C);
if (argc == 2) {
RAND_RANGE(number, min, max, PHP_RAND_MAX);
}

RETURN_LONG(number);
}
/* }}} */


A bit of digging reveals that php_rand is just a wrapper to the system's random() function. Okay, so what exactly is RAND_RANGE?

//ext/standard/php_rand.h:43-44
#define RAND_RANGE(__n, __min, __max, __tmax) \
(__n) = (__min) + (long) ((double) ( (double) (__max) - (__min) + 1.0) \
* ((__n) / ((__tmax) + 1.0)))

Wow, that is some ugly code.

Python's default random generator uses Mersenne Twister, which is an excellent choice for any application outside of cryptography. Here's code that replicates Bo's experiment, including a re-implementation of php's broken rand() function.


import PIL.Image, random, time

class MsftRand(random.Random):
'''This is equivalent to rand() and srand() on the Windows platform'''
state = 0
def __init__(self): # constructor method
self.state = int(time.time()) # seconds since the epoch
def seed(self,s):
self.state = s
def setstate(self,s):
self.state = s
def getstate(self):
return self.state
def random(self): # returns a float [0,1)
self.state = (self.state * 214013 + 2531011) % 2**32
return float((self.state >> 16) & 0x7fff) / 2**15

class bogoRand(MsftRand):
'''This replicates how PHP does randint()'''
def randint(self, a, b):
self.state = (self.state * 214013 + 2531011) % 2**32
return a + int(float((float(b-a+1)*((self.state&0x7fff)/((1<<15)+1.0)))))

image = PIL.Image.new("1",(512,512)) # Creates a 512x512 bitmap

# creates a list of different RNG objects and names
rand_mapping = [ (MsftRand(), "Microsoft"),
(random.Random(), "Mersenne"),
(bogoRand(), "bogoRand") ]

for rnd, name in rand_mapping:
for y in xrange(512):
for x in xrange(512):
image.putpixel((x,y), rnd.randint(0,1))
image.save("random_"+name+".png")

This will create three images when you run it--random_Mersenne.png, which uses Python's default random generator and randint() implementations, random_Microsoft.png, which uses the Windows LCG and Python's randint() implementation, and random_bogoRand.png, which uses the Windows LCG and PHP's randint() implementation.

Here's what random_bogoRand.png looks like:

This doesn't look exactly like Bo's image, as I probably didn't exactly replicate all the quirks of PHP's RAND_RANGE macro. I could be using the wrong LCG as well.

The other two images have no patterns obvious to the eye. How does Python do randint()?

from Lib/rand.py:

def randint(self, a, b):
"""Return random integer in range [a, b], including both end points.
"""
return self.randrange(a, b+1)
def randrange(self, start, stop):
#I'm paraphrasing here, it does checks to make sure the arguments
# are integers and other sanity stuff
return start + int(self.random() * (stop - start))

It's not that PRNGs are visibly flawed, it's that PHP has a terrible method of generating a random range.

Labels: , , ,

9 Comments:

Blogger Bo said...

Wow, excellent follow-up! I appreciate the time you spent looking into the details of this.

Using PHP's mt_rand() function (which utilizes the Mersenne Twister) provides a better random value.

May 27, 2008 at 5:34 PM  
OpenID tml said...

After digging deeper into this, it seems likely that the real problem is two-fold:

first, rand() on Win32 has a period of 32767;

secondly, the RAND_RANGE define is trying to scale the generated number into the range you requested. This can result in some loss of randomness (see http://bugs.php.net/bug.php?id=45184).

Python's randrange() solves this by calling _randbelow() in a loop while you're outside the requested range - which is probably the best way to do it. There is some work on this that should be coming out as of 5.3 - thanks for bringing it up!

October 20, 2008 at 5:40 PM  
Blogger Hubadu said...

deja-vu on random.org

September 6, 2009 at 7:56 AM  
Blogger Eric said...

Even with PHP 5.3, the results aren't that much different. Using mt_rand instead, though, is mind-blowing...

PHP 5.3 with rand: http://twitpic.com/gq81b

PHP 5.3 with mt_rand: http://twitpic.com/gq827

September 6, 2009 at 10:01 AM  
Blogger Juln the Squid said...

Yes, you have to use mt_rand. That's why they describe it as "Generate a better random value".

September 6, 2009 at 2:47 PM  
Blogger jacquesm said...

If you want to do a comparison at least make it a meaningful one and use the rand function the way it is intended:

This will give your bit: "rand() & 1;", so imagesetpixel($im,$x,$y,rand() &1);

will do the job. To save you the typing you can look here:

http://ww.com/random.php

That's on a linux box running php 4.4.2, the results look pretty random to me. To make the result reproducible I've seeded the PRNG with '0' using srand(0); at the beginning of the program.

If you want to use the 'scale' function in PHP for something that you should be doing by a one bit and then you will reap the results of that, you are effectively sampling the lowest bit after a bunch of float operations, which is clearly sub-optimal.

PHP can't correct for incorrect use.

As for the rest of your code, the $x=0; line can be dropped, there is no need to 'reset' your loop variable at the end of a loop, last I checked.

I'm no PHP fan but before you criticize a project make sure you understand the material.

September 6, 2009 at 4:06 PM  
Blogger abadinfluence said...

http://pastebin.com/m40b85179

Public domain. Have fun.

September 8, 2009 at 12:11 AM  
Blogger Bo said...

@jacquesm

1.) Linux was never the problem. Read the post title...

2.) Your code is wrong.

imagesetpixel($im,$x,$y,rand() &1);

...Pretty much just makes a black square.

3.) After fixing your code, your bitwise use of rand() on Windows actually makes an even worse random graphic.

4.) rand(0, 1) works just fine. Yes, rand() &1 is faster, but that doesn't mean rand(0, 1) is "wrong"... Especially since, in this case, rand(0, 1) produces better randomness.

September 3, 2011 at 4:06 PM  
Blogger Emmanuel Jeandel said...

Your analysis is completely wrong.

You made a mistake in the code for bogoRand. IF php were to call windows random(), the call would be:

return a + int(float((float(b-a+1)*((self.state&0x7fff >> 16)/((1<<15)+1.0)))))

You forgot the ">> 16", I hope it's not intentional. But IF php were to call windows random(), there would be no problem, as anyone can see by adding back the ">> 16" that is missing.

What's really happening is that PHP in the ZTS mode (I don't know what it stands for, but it has something to do with Thread Safety) uses its own version of random() which is defined by:

return ((*ctx = *ctx * 1103515245 + 12345) % ((u_long)PHP_RAND_MAX + 1));


And you can test that by replacing your code for bogoRand by:
def randint(self, a, b):
self.state = (self.state* 1103515245 + 12345) % (1 << 15)
return a + int(float((float(b-a+1)*((self.state/((1<<15)+1.0))))))

you will obtain EXACTLY what happens in Bo Allen post.

October 5, 2012 at 3:42 AM  

Post a Comment

Subscribe to Post Comments [Atom]

<< Home