=========================================================================
Date: Fri, 21 Feb 2003 17:12:05 -0500
Reply-To: Ian Whitlock <WHITLOI1@WESTAT.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Ian Whitlock <WHITLOI1@WESTAT.COM>
Subject: Re: Using a macro to simulate data
Content-Type: text/plain; charset="iso-8859-1"
David,
I agree that it is better not to use the clock time as the seed. I only
suggested it to keep the spirit of the original code. For your one-liner I
expected, in SAS, to use
%let seed = %sysevalf ( %sysfunc( ranuni(3426) )*(2**31 - 1) ) ;
(more code, less documentation). However, I ran into problems.
644 data _null_ ;
645 x = ranuni ( 3426 ) * ( 2**31 - 1 ) ;
646 put x = ;
647 run ;
x=1464077493
NOTE: DATA statement used:
real time 0.04 seconds
648
649 %let seed = %sysevalf ( %sysfunc( ranuni(3426) )*(2**31 - 1) ) ;
650 %put seed = &seed ;
seed = 543941940.99998
First I noticed that the arithmetic is not the same. Then I noticed the
wildly different values. My only explanation for the wildly different
values is that, when RANUNI is called with %SYSFUNC, the seed is ignored
after the first call, just as in a single DATA step. Did I miss something in
the documentation again?
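As a cross-check on the DATA step value in the log, here is a minimal
Python sketch. It assumes RANUNI is the multiplicative congruential
generator described in the SAS documentation (multiplier 397204094,
modulus 2**31 - 1) and that each call returns the updated seed divided by
the modulus. The names MULTIPLIER, MODULUS, next_seed and ranuni_like are
illustrative, not SAS's. Under that assumption, ranuni(3426)*(2**31 - 1)
is simply the next seed in the stream.

```python
# Assumed RANUNI constants (multiplicative congruential generator).
MULTIPLIER = 397204094
MODULUS = 2**31 - 1


def next_seed(seed):
    """Advance the assumed generator by one step."""
    return (MULTIPLIER * seed) % MODULUS


def ranuni_like(seed):
    """Return (updated seed, uniform variate), loosely mirroring CALL RANUNI."""
    new_seed = next_seed(seed)
    return new_seed, new_seed / MODULUS


seed, x = ranuni_like(3426)
print(seed)  # 1464077493, the value shown in the DATA step log
print(x)     # uniform variate strictly between 0 and 1
```

The macro result 543941940.99998 does not match this first step, which is
at least consistent with the guess that the %SYSFUNC stream had already
been started earlier in the session. The trailing .99998 would fit the
uniform value passing through text with finite precision before %SYSEVALF
multiplies it, though that is a guess rather than something the log shows.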
Oh well, this three-liner
%let seed = 3426 ;
%let x = 0 ;
%syscall ranuni(seed,x) ;
gets the seed. Now, do I need 5 lines of documentation to explain why I
have a variable X? I hope not; perhaps one more line of code will do it.
%symdel x ;
Oh, I took you too seriously about using clock time. Perhaps the first line
should be
%let seed = 0 ;
First law of computer languages: When the ratio of documentation lines to
code lines is greater than 2, don't trust the programmer.
Second law: When the ratio is less than epsilon, the program had better be
very short.
IanWhitlock@westat.com
-----Original Message-----
From: David L. Cassell [mailto:cassell.david@EPAMAIL.EPA.GOV]
Sent: Friday, February 21, 2003 1:46 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: Using a macro to simulate data
Ian Whitlock <WHITLOI1@WESTAT.COM>, in one of his replies, wrote in
part:
> I suggest that you use
>
> %let seed = %sysevalf(%sysfunc(time()) + 1001 * &id) ;
While I would never disagree with Ian, I much prefer his guidance in
another post in the same thread. I strongly recommend using a fixed
seed at the beginning of your simulation, and then transferring the
sequences from the pseudo-random number generator into later parts
of one's (macro) code, so that all the code runs from a single seed.
Being able to replicate one's analysis is crucial. Particularly since
the Big Boss will only think to ask you to repeat your work exactly in
the case where you can't replicate your work. :-) Seriously, you need
to be able to do this replication so that others can validate your
research.
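The single-seed idea can be sketched in Python. This is only an
illustration under assumptions: the generator constants are the ones
documented for RANUNI, and the functions draw and run_simulation are
hypothetical stand-ins for the stages of a real simulation. Each stage
receives the seed left behind by the previous stage, so the whole run is
determined by the one starting seed.

```python
MULTIPLIER = 397204094   # assumed RANUNI multiplier
MODULUS = 2**31 - 1      # assumed RANUNI modulus


def draw(seed, n):
    """Draw n uniforms; return them with the seed to hand to the next stage."""
    values = []
    for _ in range(n):
        seed = (MULTIPLIER * seed) % MODULUS
        values.append(seed / MODULUS)
    return values, seed


def run_simulation(start_seed, stages=3, draws_per_stage=4):
    """Hypothetical multi-stage simulation driven by one starting seed."""
    seed = start_seed
    results = []
    for _ in range(stages):
        values, seed = draw(seed, draws_per_stage)  # carry the seed forward
        results.append(values)
    return results, seed


first, last_seed = run_simulation(3426)
second, _ = run_simulation(3426)
print(first == second)  # True: same starting seed, same numbers throughout
```

Rerunning with the same starting seed reproduces every stage exactly,
which is the property that makes the analysis replicable.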
Ian's above code is a fine way to generate a random seed for you to use
as a starting value. I often use the following snippet of Perl code to
get some random starts for some SAS programs we run here:
#!/usr/local/bin/perl
# gr3.pl - *G*enerate *R*andoms for SAS design programs
# last mod: David L. Cassell, 2001/09/20
# Usage: gr3.pl [n]
# [where n is an optional number of random numbers, between 0 and 2**31-1,
# to be printed in %010d format, with the default set at 10 random numbers]
# Note 1: 2**31 is used in the rand() call because SAS uses the range 0 to 2**31-1
# Note 2: a call to srand() is no longer needed, as of Perl 5.004
map { printf "%010d\n", int rand 2**31 } ( 1..(shift||10) )
Umm, yes, that is *one* line of code, with 7 lines of documentation.
David
--
David Cassell, CSC
Cassell.David@epa.gov