Date: Fri, 30 Dec 2005 10:24:42 -0500
Reply-To: Michael Raithel <michaelraithel@WESTAT.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Michael Raithel <michaelraithel@WESTAT.COM>
Subject: Re: loading sas dates into sql server
Content-Type: text/plain; charset="us-ascii"
Dear SAS-L-ers,
In further discussing this issue, Washington, DC area's own Howard
Schreier posted the following:
> Afraid I have to disagree with Mike here. Using valid numeric
> values to record non-quantitative information via convention
> is not good practice and can get you in trouble.
>
> If SQL Server has nothing like SAS special missing values,
> create additional variables to record the explanatory information.
>
Howard, I guess that great minds really do think alike! Actually, your
suggestion is my "PLAN B". I didn't put it in my original posting
because it damaged the streamlined message that I had so carefully
crafted. But, I was tempted to.
I call the entities that you described "satellite variables". A
satellite variables is a variable associated with another variable that
specifies whether the value of the original variable is "missing",
"could not be obtained", etc. In my world, satellite variables always
have a the same name to the original variable, but end with an "_s".
(Other people may have different conventions--though mine is obviously
the very best convention of all). Consequently, "gender" would have a
satellite variable of "gender_s". Satellite variables allow
me/you/us/them to get around tough-minded integrity constraints.
There are two reasons that satellite variables are my PLAN B. The first
is that they lead to data set bloat. One is adding at least one byte of
satellite variable for each variable that may have missing values (and
other problems). Like calories, those darn bytes seem to add
up--especially around the holidays. The second is that people using the
data have to know that the satellite variables exist, the function of
the satellite variables, and what the satellite variable values signify.
They could easily get into trouble by not knowing or understanding these
issues. This is not very different from having to have specialized
knowledge that the value of 01/01/1800 is the 'could not obtain' value.
So, I am not sure that we are ahead of the game with PLAN B unless PLAN
A is untenable.
Howard, have Happy New Year! I will look forward to seeing you at DCSUG
meetings in the new year!
I hope that this suggestion proves helpful now, and in the future!
Of course, all of these opinions and insights are my own, and do not
reflect those of my organization or my associates. All SAS code and/or
methodologies specified in this posting are for illustrative purposes
only and no warranty is stated or implied as to their accuracy or
applicability. People deciding to use information in this posting do so
at their own risk.
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Michael A. Raithel
"The man who wrote the book on performance"
E-mail: MichaelRaithel@westat.com
Author: Tuning SAS Applications in the MVS Environment
Author: Tuning SAS Applications in the OS/390 and z/OS Environments,
Second Edition
http://www.sas.com/apps/pubscat/bookdetails.jsp?catid=1&pc=58172
Currently Writing: The Complete Guide to SAS Indexes (due February 28,
2006)
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Part of the inhumanity of the computer is that, once it is
competently programmed and working smoothly, it is completely
honest. - Isaac Asimov
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++