Date: Sun, 5 Jan 2003 16:35:06 +0000
Reply-To: sashole@bellsouth.net
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Paul Dorfman <paul_dorfman@HOTMAIL.COM>
Subject: Re: what does this mean?
Content-Type: text/plain; format=flowed
Andy,
I have seen the reply posted by Roland; however, I beg to differ with his
interpretation.
First, as you are a C, rather than a SAS, expert, you may be surprised to
know that SAS gets all its mileage from just 2 data types: numeric and
character. The latter is self-explanatory; the former is always a
double-precision float (8-byte real binary). Not all combinations of the 64
bits in this format represent numbers, so SAS has chosen to use these NaNs
(not-a-numbers) to represent missing numeric values, or, if you will,
null-values. Since SAS is, among many other things, an analytic and
statistical package, missing values are of utmost importance and should
almost never be replaced (as some people do) by zeroes or other real
numeric values.
Now, there is more than one NaN, and SAS uses this fact to discriminate
between missing values themselves - which is quite useful because different
missing values can represent different missing entities (e.g. hair and/or
brains). There are 28 numeric missing values having their own collating
sequence. SAS uses the "dot-symbol" notation to hard-code missing values
(below, "<" is just what it is, i.e. the "less than" sign):
._ < . < .a < .b < .c < ...... < .x < .y < .z
Note that i) it naturally corresponds to the collating sequence of the
characters used to the right of the period to represent different missing
values and ii) the notation is case-insensitive, id est (.a = .A) and so on.
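To make the ordering concrete for a C reader, here is a minimal sketch. The encoding is invented purely for illustration (a quiet NaN carrying a rank 0..27 in its low payload bits); it is not SAS's internal bit layout, and the names tag_rank, make_missing and missing_rank are hypothetical. It assumes IEEE 754 doubles and a character set where 'a'..'z' are contiguous.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Rank in the collating sequence ._ < . < .a < ... < .z, or -1 if the
   tag is not one of the 28 valid ones. Letters are case-insensitive. */
int tag_rank(char tag)
{
    if (tag == '_') return 0;
    if (tag == '.' || tag == ' ') return 1;      /* the lone dot, "dot-blank" */
    if (tag >= 'a' && tag <= 'z') return 2 + (tag - 'a');
    if (tag >= 'A' && tag <= 'Z') return 2 + (tag - 'A');
    return -1;
}

/* Hypothetical encoding (NOT the real SAS one): a quiet NaN whose low
   five payload bits hold the rank. */
double make_missing(char tag)
{
    int rank = tag_rank(tag);
    uint64_t bits;
    double d;

    assert(rank >= 0);                           /* reject invalid tags */
    bits = UINT64_C(0x7FF8000000000000) | (uint64_t)rank;
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Recover the rank from a value built by make_missing();
   returns -1 for any ordinary (non-NaN) number. */
int missing_rank(double d)
{
    uint64_t bits;

    if (!isnan(d)) return -1;
    memcpy(&bits, &d, sizeof bits);
    return (int)(bits & 0x1F);
}
```

All 28 values are NaNs as far as C is concerned; only the rank stored in the payload tells them apart and gives them an order.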
The lone period (second from the left) is actually dot-blank. It is
considered the "standard" numeric missing value, because that is what SAS
uses by default for its own diverse purposes when a missing value must be
assigned to a variable, for example, when a function call that is supposed
to return a numeric value cannot be executed, as in
x = sqrt (-4) ;
y = log2 (-123) ;
or a user asks to interpret a string 'A*B-C' as a number straight-up. The
choice of the dot-blank for the standard numeric is somewhat arbitrary (for
it is not the smallest, nor the largest missing value), but kind of
consistent, since SAS's only character missing value is represented by a
blank (no diversity in this department). So, for the standard (and only)
character null-value you've got a blank; for the standard numeric null-value
you've got a dot-blank. Seems logical.
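A rough C analogue of the 'A*B-C' case, offered as a sketch only: parse_or_missing is a hypothetical helper (neither a SAS routine nor a C library one), and it uses NaN to stand in for the standard missing value. Be aware that strtod also accepts spellings such as hex floats, "inf" and "nan", which may not match what SAS would accept.

```c
#include <assert.h>
#include <ctype.h>
#include <math.h>
#include <stdlib.h>

/* Interpret s as a number; return NaN (standing in for the standard
   missing value) if s is not, in its entirety, a number. */
double parse_or_missing(const char *s)
{
    char *end;
    double value;

    assert(s != NULL);
    value = strtod(s, &end);
    if (end == s)                       /* nothing was converted */
        return NAN;
    while (isspace((unsigned char)*end))
        end++;                          /* tolerate trailing blanks */
    if (*end != '\0')                   /* leftover junk, e.g. "12abc" */
        return NAN;
    return value;
}
```

With this, parse_or_missing("A*B-C") yields a NaN, while parse_or_missing("12.5") yields 12.5.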
Now, back to the .Z business, what is the programming intent of using a
comparison like
if (RV > .z) then... ?
Well, since .Z is the largest missing value, the expression evaluates to
true if RV is not missing, i.e. is not equal to *any* missing value, and to
false otherwise, i.e. if RV is equal to any of the missing values, or, in
other words, is missing. The current version of SAS provides the MISSING()
function, so the same could (and I think should) be coded as
if not missing (rv) then ...
The function works regardless of the argument type, i.e. it will test RV for
missing whether it is a numeric or a character variable. Note that
something like
if n (rv) > 0 then ...
("if the number of non-missing arguments is greater than 0") or
if nmiss (rv) = 0 then ...
("if the number of missing arguments is zero")
will do the same, but only if the argument is numeric. Finally, lest there
be any confusion left, let me reiterate that "numeric variable" or "SAS
number" means a variable of the numeric data type (i.e. SAS-computable value
in the double-float real-binary representation) and *not* a string
consisting of digits, looking like a number of some kind, etc. SAS provides
numerous routines called informats to interpret a great variety of things
like that and convert them into its RB8 number, but this is a topic for a
different discussion.
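For the C side of the conversion, here is a minimal sketch of the tests discussed above. It assumes that missing values arrive in C as NaN doubles, which is my assumption about the generated code and not something established in this thread; the names is_missing, count_nonmissing and count_missing are hypothetical. Keep in mind that in C every ordered comparison involving a NaN is false, so (rv > z) with a NaN z cannot be carried over literally; the intent has to be expressed with isnan().

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Analogue of MISSING(rv) for numerics: 1 if rv is any NaN, else 0. */
int is_missing(double rv)
{
    return isnan(rv) != 0;
}

/* Analogue of N(...): how many of v[0..n-1] are non-missing. */
size_t count_nonmissing(const double *v, size_t n)
{
    size_t k = 0;

    assert(v != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (!isnan(v[i]))
            k++;
    return k;
}

/* Analogue of NMISS(...): how many of v[0..n-1] are missing. */
size_t count_missing(const double *v, size_t n)
{
    return n - count_nonmissing(v, n);
}
```

Under that assumption, the intent of the line in question would read: if (!is_missing(RV)) goto N0_5;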
Kind regards,
------------------
Paul M. Dorfman
Jacksonville, FL
------------------
>From: Andy Walsh <aw@SOMELAB.NOSPAM.COM>
>Reply-To: Andy Walsh <aw@SOMELAB.NOSPAM.COM>
>To: SAS-L@LISTSERV.UGA.EDU
>Subject: what does this mean?
>Date: Mon, 6 Jan 2003 00:58:53 +1100
>
>I am converting SAS code output to C (I know nothing about SAS).
>
>What does the following line of code mean? (surely should be easy for
>experts!)
>
>if ( RV > .z ) goto N0_5;
>
>I just don't get the .z part...