LISTSERV at the University of Georgia
Date:         Wed, 15 Jan 1997 12:20:00 EST
Reply-To:     "Seeman, G. Matthew" <gcs7@ASI.EM.CDC.GOV>
Sender:       "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From:         "Seeman, G. Matthew" <gcs7@ASI.EM.CDC.GOV>
Subject:      Re: Changing the order/Use ARRAY not RETAIN
Comments: cc: "Whittington, John" <johnw@MAG-NET.CO.UK>,
          "Shecter, Robert" <robert_shecter@MERCK.COM>,
          "Whitlock, Ian" <whitloi1@westatpo.westat.com>

Regarding the HORRENDOUS snag in my approach... In my last email I mentioned that the RETAIN, ARRAY, and LENGTH statements all have uses in reordering the dataset vector. I admit the ARRAY statement would give a default length to variables that are being newly created, while existing variables would keep their lengths. If I were actually reordering new variables that didn't already have a pre-defined length, guess which one of the above SAS statements I would use to reorder the vector (hint: it isn't ARRAY). If I were actually reordering variables that were numerous and constantly alternating between CHAR and NUM, guess which one I would use (hint: still not ARRAY).

Back to the HORRENDOUS snag. John, please reread your own words, "It is the last four words there that matter!!", referring to "when re-ordering the variables in an existing dataset". Correct me if I am wrong, but ARRAY statements don't reset the length of "variables in an existing dataset".

Aren't we beating a dead horse? (he says in order to get the last word)

- Matthew Seeman

----------
From: John Whittington[SMTP:johnw@MAG-NET.CO.UK]
Sent: Wednesday, January 15, 1997 3:53 AM
To: Multiple recipients of list SAS-L
Subject: Re: Changing the order/Use ARRAY not RETAIN

Date: Tue, 14 Jan 1997 14:47:00 -0500
Reply-To: Robert Schechter <robert_schechter@MERCK.COM>

>You owe me a beer.
>
>The problem with RETAIN is caused when you list all the variables you
>need in the new (re-arranged) data set in the order you need them AND
>then it turns out some of these variables were calculated within the
>data step, THEN the RETAIN statement COULD affect your result. I know,
>it just happened to me.

No - I don't owe anyone any beers yet. You have moved the goalpost. You will recall that what I said was:

JW|As I keep trying to say, there are NO 'potential RETAIN problems' (at least
JW|one beer at SUG22 for anyone who can prove me wrong!) when re-ordering the
JW|variables in an existing dataset ....

It is the last four words there that matter!! I agree that there are potential problems in terms of variables CREATED WITHIN the DATA step in question, but then we are not talking about what I said, and we are not even talking about RE-ordering variables.
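To make the distinction concrete, here is a minimal sketch of the side effect being conceded. The dataset OLD, its values, and the variable FLAG are made up for illustration and do not come from anyone's actual program. FLAG is created within the step and is listed in RETAIN only to fix its position, so it is no longer reset to missing at the top of each iteration:

```sas
/* Hypothetical input data, invented for this illustration */
data old;
  input id x;
  datalines;
1 10
2 .
3 30
;
run;

data new;
  retain flag id x;           /* FLAG is listed only to place it first    */
  set old;
  if x > 0 then flag = 1;     /* FLAG is created here and never reset     */
run;

/* With the RETAIN above, observation 2 (x missing) should carry over      */
/* FLAG=1 from observation 1; without FLAG in the RETAIN list it would     */
/* be missing for that observation.                                        */
proc print data=new;
run;
```

ID and X come from OLD via SET and are overwritten on every iteration, so listing them in RETAIN changes nothing except their position.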

In this new scenario you have posed, I reckon it is still often/usually sensible to use RETAIN, but with manual resetting to missing of any variables created within that DATA step. If there are just a few of them, one can just do this explicitly with assignment statements at the 'top' of the DATA step; more generally, one can do it with arrays (although one then has to do the resetting at the 'bottom' of the DATA step, after an OUTPUT statement, in order for all variables created within the step to be included in the arrays). For example:

   data new ;
      retain char1 num1 char2 num2 char3 num3 etc. ;
      set old ;
      [ other statements, including some creating variables]
      output ;
      array char _character_ ;
      array num _numeric_ ;
      do over char ; char = '' ; end ;
      do over num ; num = . ; end ;
   run ;

One never needs more than these two arrays and two DO OVER loops (maybe only one, if one knows that all variables created in the DATA step are character, or all are numeric). In contrast, Matthew's approach (using ARRAY statements to do the ordering), could need as many ARRAY statements as variables if the desired ordering alternated character and numeric ones.
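As a rough sketch of that contrast: the variable names follow the example above, but the dataset OLD, its values, and the length of 12 are assumptions made for illustration. Because a single ARRAY cannot mix character and numeric elements, an alternating order needs one ARRAY statement per variable, whereas one RETAIN statement covers the whole list:

```sas
/* Hypothetical input; stored order is CHAR1 CHAR2 NUM1 NUM2 */
data old;
  length char1 char2 $ 12;
  char1 = 'a'; char2 = 'b';
  num1 = 1;    num2 = 2;
run;

/* ARRAY-based ordering: one statement per variable when types alternate. */
/* The array names A1-A4 are arbitrary; $ 12 is stated to match OLD.      */
data new_array;
  array a1{*} $ 12 char1;
  array a2{*}      num1;
  array a3{*} $ 12 char2;
  array a4{*}      num2;
  set old;
run;

/* RETAIN-based ordering: a single statement, no types or lengths needed. */
data new_retain;
  retain char1 num1 char2 num2;
  set old;
run;
```

Both steps should produce the order CHAR1 NUM1 CHAR2 NUM2, which can be checked with PROC CONTENTS and the VARNUM option.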

In any event, I think that there is one HORRENDOUS snag of Matthew's approach which has not yet been mentioned (I had overlooked it when I responded to Matthew). Unlike RETAIN (with no initial value stated), an ARRAY statement irrevocably defines the length of variables (defaulting to 8, for either numeric or character). This means that one has to define, in the ARRAY statements (which *must* precede SET if they are to re-order) the lengths of all character variables (unless one accepts default of 8), INCLUDING those that have come from a previous dataset via SET. If one does not do this, any character variables, including those from the input dataset, will be truncated to length 8 if longer. This, to my mind, represents at least as much a potential problem and 'hazard' as anything about RETAIN !!
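The truncation hazard can be sketched as follows. The dataset OLD, the variable NAME, and its length of 20 are hypothetical, chosen only to show the effect. The first step uses ARRAY before SET without stating a length, and the second uses RETAIN with no initial value:

```sas
/* Hypothetical input: NAME is character with length 20 */
data old;
  length name $ 20;
  name = 'Whittington, John';
  id = 1;
run;

/* ARRAY before SET, no length given: NAME is defined here with the      */
/* default character length of 8, so the value read by SET is expected   */
/* to be truncated to its first 8 characters.                            */
data via_array;
  array n{*}   id;
  array c{*} $ name;
  set old;
run;

/* RETAIN before SET, no initial value: only the position is fixed;      */
/* NAME takes its length of 20 from OLD.                                 */
data via_retain;
  retain id name;
  set old;
run;

proc contents data=via_array  varnum; run;
proc contents data=via_retain varnum; run;
```

Writing `array c{*} $ 20 name;` would avoid the truncation, but only by restating in the ARRAY statement a length that OLD already carries, which is the burden described above.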

The beer offer is still on, but only for those who leave the goalposts unmoved!!

Regards

John

-----------------------------------------------------------
Dr John Whittington,        Voice:      +44 1296 730225
Mediscience Services        Fax:        +44 1296 738893
Twyford Manor, Twyford,     E-mail:     johnw@mag-net.co.uk
Buckingham MK18 4EL, UK     CompuServe: 100517,3677
-----------------------------------------------------------

