Tuesday, May 20, 2008

PHP rand(0,1) on Windows < OpenSSL rand() on Debian

512x512 image made using `if rand(0,1)` on PHP/Windows

(Please excuse the inflammatory title.)

Bo Allen recently made a post regarding the difference between true random and pseudorandom numbers, using 512x512 bitmaps as illustration.

Some might doubt his result on Windows, as the image seems to be too predictable. Initially, I passed it off as PHP using the default Windows pseudorandom number generator, which is a Linear Congruential Generator.

The default Windows rand() function is implemented as follows (coefficients from this site; many others list them as well):

unsigned long randState = 0;

//Get random number
unsigned long rand(void) {
    randState = randState * 214013 + 2531011L;
    return (randState >> 16) & 0x7fff;
}

//Seed RNG
void srand(unsigned long seed) {
    randState = seed;
}
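
To sanity-check the coefficients, here is a minimal Python sketch of the same recurrence. The function name msvc_rand_sequence is mine, and the explicit 32-bit mask assumes the C state is 32 bits wide, as unsigned long is on Windows.

```python
def msvc_rand_sequence(seed, count):
    """Yield `count` outputs of the LCG shown above, starting from `seed`."""
    state = seed
    for _ in range(count):
        # Same recurrence as the C code; the mask emulates 32-bit overflow.
        state = (state * 214013 + 2531011) & 0xFFFFFFFF
        # Discard the low 16 bits and keep the next 15.
        yield (state >> 16) & 0x7FFF


print(list(msvc_rand_sequence(1, 2)))  # [41, 18467]
```

Every output fits in 15 bits, so the largest value this generator can return is 32767.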

Here is Bo Allen's code:
// Requires the GD Library
header ("Content-type: image/png");
$im = @imagecreatetruecolor(512, 512)
    or die("Cannot Initialize new GD image stream");
$white = imagecolorallocate($im, 255, 255, 255);
for($y=0;$y<512;$y++){
    for($x=0;$x<512;$x++){
        if(rand(0,1)===1){
            imagesetpixel($im, $x, $y, $white);
        }
    }
    $x=0;
}
imagepng($im);
imagedestroy($im);


Let's have a look at how PHP does rand(0,1). All code is from the php-5.2.6 source release.

//ext/standard/rand.c:296-315
/* {{{ proto int rand([int min, int max])
   Returns a random number */
PHP_FUNCTION(rand)
{
    long min;
    long max;
    long number;
    int argc = ZEND_NUM_ARGS();

    if (argc != 0 &&
        zend_parse_parameters(argc TSRMLS_CC, "ll", &min, &max) == FAILURE)
        return;

    number = php_rand(TSRMLS_C);
    if (argc == 2) {
        RAND_RANGE(number, min, max, PHP_RAND_MAX);
    }

    RETURN_LONG(number);
}
/* }}} */


A bit of digging reveals that php_rand() is just a wrapper around the platform's random number function (random() where it is available). Okay, so what exactly is RAND_RANGE?

//ext/standard/php_rand.h:43-44
#define RAND_RANGE(__n, __min, __max, __tmax) \
    (__n) = (__min) + (long) ((double) ( (double) (__max) - (__min) + 1.0) \
    * ((__n) / ((__tmax) + 1.0)))

Wow, that is some ugly code.
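
To make the macro easier to read, here is a step-by-step Python transliteration. It is only a sketch: the names rand_range and TMAX are mine, and TMAX = 32767 assumes PHP_RAND_MAX equals the C library's RAND_MAX on Windows.

```python
def rand_range(n, lo, hi, tmax):
    """Transliteration of RAND_RANGE: map n in [0, tmax] onto [lo, hi]."""
    # Scale n down to a fraction in [0, 1) ...
    fraction = n / (tmax + 1.0)
    # ... stretch it across the (hi - lo + 1) possible results ...
    width = float(hi) - lo + 1.0
    # ... and truncate toward zero, as the (long) cast does in C.
    return lo + int(width * fraction)


TMAX = 32767  # assumed value of PHP_RAND_MAX on Windows

for n in (0, 16383, 16384, 32767):
    print(n, rand_range(n, 0, 1, TMAX))
# 0 0
# 16383 0
# 16384 1
# 32767 1
```

Under that assumption, rand(0,1) returns 0 for the lower half of the generator's output range and 1 for the upper half.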

Python's default random generator uses the Mersenne Twister, which is an excellent choice for any application outside of cryptography. Here's code that replicates Bo's experiment, including a re-implementation of PHP's broken rand() function.


import PIL.Image, random, time

class MsftRand(random.Random):
    '''This is equivalent to rand() and srand() on the Windows platform'''
    state = 0
    def __init__(self): # constructor method
        self.state = int(time.time()) # seconds since the epoch
    def seed(self,s):
        self.state = s
    def setstate(self,s):
        self.state = s
    def getstate(self):
        return self.state
    def random(self): # returns a float [0,1)
        self.state = (self.state * 214013 + 2531011) % 2**32
        return float((self.state >> 16) & 0x7fff) / 2**15

class bogoRand(MsftRand):
    '''This replicates how PHP does randint()'''
    def randint(self, a, b):
        self.state = (self.state * 214013 + 2531011) % 2**32
        return a + int(float((float(b-a+1)*((self.state&0x7fff)/((1<<15)+1.0)))))

image = PIL.Image.new("1",(512,512)) # Creates a 512x512 bitmap

# creates a list of different RNG objects and names
rand_mapping = [ (MsftRand(), "Microsoft"),
                 (random.Random(), "Mersenne"),
                 (bogoRand(), "bogoRand") ]

for rnd, name in rand_mapping:
    for y in range(512):
        for x in range(512):
            image.putpixel((x,y), rnd.randint(0,1))
    image.save("random_"+name+".png")

Running this creates three images: random_Mersenne.png uses Python's default random generator and randint() implementation; random_Microsoft.png uses the Windows LCG with Python's randint() implementation; and random_bogoRand.png uses the Windows LCG with PHP's randint() implementation.

Here's what random_bogoRand.png looks like:

This doesn't look exactly like Bo's image, as I probably didn't exactly replicate all the quirks of PHP's RAND_RANGE macro. I could be using the wrong LCG as well.
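
One detail of the script above worth checking: MsftRand.random() keeps bits 16 to 30 of the LCG state, while bogoRand.randint() masks off the low 15 bits instead. The sketch below, using the same constants and an arbitrary seed (the helper name states is mine), compares how quickly each selection repeats.

```python
A, C, M = 214013, 2531011, 2**32  # constants from the LCG above


def states(seed, count):
    """Return `count` successive raw LCG states."""
    out, state = [], seed
    for _ in range(count):
        state = (state * A + C) % M
        out.append(state)
    return out


raw = states(12345, 2 * 32768)             # arbitrary seed
low = [s & 0x7FFF for s in raw]            # bits bogoRand.randint() keeps
high = [(s >> 16) & 0x7FFF for s in raw]   # bits MsftRand.random() keeps

print(low[:32768] == low[32768:])    # True: the low bits repeat after 2**15 draws
print(high[:32768] == high[32768:])  # False: the upper bits do not
```

In a 512-pixel-wide image, 2**15 draws is 64 rows, so a picture built only from those low bits would be expected to repeat every 64 rows.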

The other two images have no patterns obvious to the eye. How does Python do randint()?

From Lib/random.py:

def randint(self, a, b):
    """Return random integer in range [a, b], including both end points.
    """
    return self.randrange(a, b+1)

def randrange(self, start, stop):
    #I'm paraphrasing here, it does checks to make sure the arguments
    # are integers and other sanity stuff
    return start + int(self.random() * (stop - start))
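
As a rough check of that scaling step, the sketch below tallies how often each value comes up for the range [0, 1]. The helper scaled_randint is my own paraphrase of the two methods above, not the library's code, and the seed is arbitrary.

```python
import random


def scaled_randint(rng, a, b):
    """Paraphrase of randint(): scale a float in [0, 1) onto [a, b]."""
    return a + int(rng.random() * (b + 1 - a))


rng = random.Random(2008)  # fixed seed, chosen arbitrarily
counts = {0: 0, 1: 0}
for _ in range(100000):
    counts[scaled_randint(rng, 0, 1)] += 1
print(counts)
```

With the Mersenne Twister behind rng.random(), the two counts should come out close to 50000 each.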

It's not that the PRNGs are visibly flawed; it's that PHP has a terrible method of generating a random range.
