LISTSERV at the University of Georgia
Date:         Sun, 5 Jan 2003 16:35:06 +0000
Reply-To:     sashole@bellsouth.net
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Paul Dorfman <paul_dorfman@HOTMAIL.COM>
Subject:      Re: what does this mean?
Comments: To: aw@SOMELAB.NOSPAM.COM
Content-Type: text/plain; format=flowed

Andy,

I have seen the reply posted by Roland; however, I beg to differ with his interpretation.

First, as you are a C, rather than a SAS, expert, you may be surprised to know that SAS gets all its mileage from just 2 data types: numeric and character. The latter is self-explanatory; the former is always a double-precision float (8-byte real binary). Not all combinations of the 64 bits in this format represent numbers, so SAS has chosen to use these NaNs (not-a-numbers) to represent missing numeric values, or, if you will, null-values. Since SAS is, among many other things, an analytic and statistical package, missing values are of utmost importance and should almost never be replaced (as some people do) by zeroes or other real numeric values.
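Since you are coming from C, here is a minimal sketch of the "not every bit pattern is a number" point. It assumes your platform uses IEEE 754 binary64 for double. The function names and the two bit patterns are made up for illustration and are not the encodings SAS actually uses for its missing values.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 64-bit pattern as a double (assumes IEEE 754 binary64). */
double double_from_bits(uint64_t bits)
{
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Recover the raw 64-bit pattern of a double. */
uint64_t bits_from_double(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits;
}

/* Two arbitrary quiet-NaN patterns: exponent bits all ones, fraction non-zero.
   Illustrative payloads only, NOT the patterns SAS uses. */
#define NAN_PATTERN_1 UINT64_C(0x7FF8000000000001)
#define NAN_PATTERN_2 UINT64_C(0x7FF8000000000002)
```

Both patterns test true under isnan(), yet their bits differ, which is the room a package needs to tell one "missing" from another.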

Now, there is more than one NaN, and SAS uses this fact to discriminate between missing values themselves - which is quite useful because different missing values can represent different missing entities (e.g. hair and/or brains). There are 28 numeric missing values having their own collating sequence. SAS uses the "dot-symbol" notation to hard-code missing values (below, "<" is just what it is, i.e. the "less than" sign):

._ < . < .a < .b < .c < ...... < .x < .y < .z

Note that i) it naturally corresponds to the collating sequence of the characters used to the right of the period to represent different missing values and ii) the notation is case-insensitive, id est (.a = .A) and so on. The lone period (second from the left) is actually dot-blank. It is considered the "standard" numeric missing value, because that is what SAS uses by default for its own diverse purposes when a missing value must be assigned to a variable, for example, when a function call that is supposed to return a numeric value cannot be executed, as in

x = arsin (4) ; y = log2 (-123) ;

or a user asks to interpret a string 'A*B-C' as a number straight-up. The choice of the dot-blank for the standard numeric missing value is somewhat arbitrary (for it is neither the smallest nor the largest missing value), but kind of consistent, since SAS's only character missing value is represented by a blank (no diversity in this department). So, for the standard (and only) character null-value you've got a blank; for the standard numeric null-value you've got a dot-blank. Seems logical.
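If it helps to see the ordering of the 28 missing values in C terms, here is a rough model. The struct, the function names, and the rank numbering are my own invention for illustration, not anything SAS exposes. The sketch also assumes the letters a..z are contiguous in the character set (as in ASCII) and that every missing value sorts below every real number.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical model of a SAS numeric: either a real number or one of
   the 28 missing values. The rank encodes the collating sequence
   ._ (0) < . (1) < .a (2) < ... < .z (27). */
typedef struct {
    bool   is_missing;
    int    missing_rank;   /* 0..27, meaningful only if is_missing */
    double value;          /* meaningful only if !is_missing */
} sas_num;

/* Wrap an ordinary number. */
sas_num sas_number(double v)
{
    sas_num n = { false, 0, v };
    return n;
}

/* Build a missing value from the character after the dot: '_' for ._,
   a letter (either case) for .a to .z, anything else for the standard
   dot-blank. */
sas_num sas_missing(char tag)
{
    sas_num n = { true, 1, 0.0 };   /* default: standard missing (.) */
    if (tag == '_')
        n.missing_rank = 0;
    else if (isalpha((unsigned char)tag))
        n.missing_rank = 2 + (tolower((unsigned char)tag) - 'a');
    return n;
}

/* Three-way comparison: negative, zero, or positive, like strcmp. */
int sas_compare(sas_num a, sas_num b)
{
    if (a.is_missing && b.is_missing)
        return a.missing_rank - b.missing_rank;
    if (a.is_missing)
        return -1;                  /* missing sorts below any number */
    if (b.is_missing)
        return 1;
    return (a.value > b.value) - (a.value < b.value);
}
```

With this model, sas_compare(sas_missing('a'), sas_missing('A')) is zero, which mirrors the case-insensitivity of the dot notation.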

Now, back to the .Z business: what is the programming intent of using a comparison like

if (RV > .z) then... ?

Well, since .Z is the largest missing value, the expression evaluates to true if RV is not missing, i.e. is not equal to *any* missing value, and to false otherwise, i.e. if RV is equal to any of the missing values, or, in other words, is missing. The current version of SAS provides the MISSING() function, so the same could (and I think should) be coded as

if not missing (rv) then ...
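For your conversion, one possible C rendering of this test is sketched below. It rests on an assumption of mine: that in the translated program every SAS missing value ends up stored as a NaN in a double. The function names are hypothetical. The second function shows why a literal translation of "RV > .z" with a NaN standing in for .z would not work in C.

```c
#include <math.h>
#include <stdbool.h>

/* Assumption: every SAS missing value is stored as a NaN in the C
   program, so "rv > .z" (that is, "rv is not missing") becomes: */
bool sas_not_missing(double rv)
{
    return !isnan(rv);
}

/* What NOT to do: compare against a NaN used as a stand-in for .z.
   In C, any ordered comparison involving a NaN is false. */
bool naive_greater_than_z(double rv)
{
    const double dot_z = NAN;   /* hypothetical stand-in for .z */
    return rv > dot_z;          /* false for every rv */
}
```

So, under that assumption, the line in question would read something like `if (sas_not_missing(RV)) goto N0_5;`.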

The MISSING function works regardless of the argument type, i.e. it will test RV for missing whether it is a numeric or a character variable. Note that something like

if n (rv) > 0 then ...

("if the number of non-missing arguments is greater than 0") or

if nmiss (rv) = 0 then ...

("if the number of missing arguments is zero")

will do the same, but only if the argument is numeric. Finally, lest there be any confusion left, let me reiterate that "numeric variable" or "SAS number" means a variable of the numeric data type (i.e. SAS-computable value in the double-float real-binary representation) and *not* a string consisting of digits, looking like a number of some kind, etc. SAS provides numerous routines called informats to interpret a great variety of things like that and convert them into its RB8 number, but this is a topic for a different discussion.
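The N() and NMISS() style of test mentioned above could be mimicked in C roughly as follows. Again this assumes missing values are carried as NaNs in the C program, and the function names are mine, not SAS's.

```c
#include <math.h>
#include <stddef.h>

/* Count the non-missing (non-NaN) arguments, in the spirit of N(). */
size_t count_nonmissing(const double *args, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (!isnan(args[i]))
            count++;
    return count;
}

/* Count the missing (NaN) arguments, in the spirit of NMISS(). */
size_t count_missing(const double *args, size_t n)
{
    return n - count_nonmissing(args, n);
}
```

For a single argument, count_nonmissing(&rv, 1) > 0 and count_missing(&rv, 1) == 0 say the same thing as the "not missing" test.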

Kind regards,
------------------
Paul M. Dorfman
Jacksonville, FL
------------------

>From: Andy Walsh <aw@SOMELAB.NOSPAM.COM>
>Reply-To: Andy Walsh <aw@SOMELAB.NOSPAM.COM>
>To: SAS-L@LISTSERV.UGA.EDU
>Subject: what does this mean?
>Date: Mon, 6 Jan 2003 00:58:53 +1100
>
>I am converting SAS code output to C (I know nothing about SAS).
>
>What does the following line of code mean? (surely should be easy for
>experts!)
>
>if ( RV > .z ) goto N0_5;
>
>I just don't get the .z part...
