LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (June 1999, week 1)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Mon, 7 Jun 1999 09:52:21 -0400
Reply-To:     WHITLOI1 <WHITLOI1@WESTAT.COM>
Sender:       "SAS(r) Discussion" <SAS-L@UGA.CC.UGA.EDU>
From:         WHITLOI1 <WHITLOI1@WESTAT.COM>
Subject:      Re[2]: Copying One Observation within a BY Group
Comments: To: Howard Schreier <Howard_Schreier@ITA.DOC.GOV>
Content-Type: text/plain; charset=US-ASCII

Howard, Bob, Paul, and Stuart,

I am not sure what all the fuss is about. Bob's technique with Howard's AND is correct when X3 and X4 are non-missing for the same observation. If Paul's assumption that they could be non-missing for different observations is correct then just use Bob's technique twisted a little harder.

data w ; merge a ( keep = x1 x2 ) a ( keep = x1 x3 where = ( x3 ^= . ) ) a ( keep = x1 x4 where = ( x4 ^= . ) ) ; by x1 ;

if not first.x1 then do ; x3 = . ; x4 = . ; end ; run ;

If you want X3 and X4 to come from different observations, they clearly have to come from different subsets of A.

By the way, thanks Howard for a great presentation our WesSug meeting last month.

Ian Whitlock <whitloi1@westat.com>

____________________Reply Separator____________________ Subject: Re: Copying One Observation within a BY Group Author: Howard Schreier <Howard_Schreier@ITA.DOC.GOV> Date: 6/7/1999 8:53 AM

It looks to me like Bob's solution is pretty close. I have changed the "OR" to "AND" and also added the conditional assignment of missing values which Paul suggested.

My code

DATA R; MERGE A (DROP=X3 X4) A (DROP=X2 WHERE=(X3 > . and X4 > .)); BY X1; if not first.x1 then do; x3 = .; x4 = .; end; run;

This reflects my interpretation of the original problem statement. It's usually a good idea to articulate assumptions about data and to analyze the consequences if they are violated. Here, the key point point is that the non-missing values of X3 and X4 which are to be "floated" up to the first observation are found together in a single observation, and that there is exactly one such observation in each BY group. If the X3 and X4 values are in different observations (as in Paul's test data), my code will lose them. If there is more than one pair in a BY group, the first one will be used and others ignored.

I'm not sure why Paul calls X4 a "satellite". Stuart calls for symmetric treatment of X3 and X4; Bob's code provides that, as does mine. X2 is the model for variables to be preserved in their original observations. If there are others, they can be treated as X2 is treated. Note that this technique can change the order of variables in the data set (example: if there is an X5, treated like X2, the PDV sequence will be X1 X2 X5 X3 X4).

> From: "Paul M. Dorfman" <sashole@EARTHLINK.NET> > Subject: Re: Copying One Observation within a BY Group > > Robert Virgile <virgile@MEDIAONE.NET> wrote: > > >The original question came from Stuart Wugalter, with an answer by Paul > >Dorfman. Here is an easy approach: > > > >data repeated; > >merge old (drop=var3 var4) > > old (drop=x2 where=(var3 > . or var4 > .)); > >by x1; > > > >The intent is to take the one record per X1 that has data for x3 and x4, > >and copy those values to all matching X1 values. > > Robert, > > This is an impressively short solution, but it will not work. (The original > question and my solution can be found below the sig line.) Suppose the test > dataset is the same I have used, i.e. > > DATA A; > INPUT X1 X2 $ X3 X4; > CARDS; > 1 X 5 8 > 1 Y . . > 1 Z . . > 2 A . 9 > 2 B 7 . > 2 C . . > 2 D . . > 4 P . 6 > 4 Q . . > 4 R 3 . > 7 I . . > 7 J . . > 9 G . . > 9 H 4 . > ; > > Assuming that you really meant x3 and x4 rather than var3 and var4, your code in > the form > > DATA R; > MERGE A (DROP=X3 X4) > A (DROP=X2 WHERE=(X3 > . OR X4 > .)); > BY X1; > > will produce the following output: > > OBS X1 X2 X3 X4 > 1 1 X 5 8 > 2 1 Y 5 8 > 3 1 Z 5 8 > 4 2 A . 9 > 5 2 B 7 . > 6 2 C 7 . > 7 2 D 7 . > 8 4 P . 6 > 9 4 Q 3 . > 10 4 R 3 . > 11 7 I . . > 12 7 J . . > 13 9 G 4 . > 14 9 H 4 . > > Note that it does not achieve the goal with respect to x3 and alters the values > for x4. Besides, the nature of the task should not involve testing of x4 at all > - it is just a satellite and must stay where it was. We can probably do better > by eliminating x4 > . altogether, but, first, it will do more than asked, namely > fill ALL the x3 values in a by-group with that group's nonmissing value, and, > second, invariably screw up x4 (and whatever else satellites there might be): > > OBS X1 X2 X3 X4 > > 1 1 X 5 8 > 2 1 Y 5 8 > 3 1 Z 5 8 > 4 2 A 7 . > 5 2 B 7 . > 6 2 C 7 . > 7 2 D 7 . > 8 4 P 3 . > 9 4 Q 3 . > 10 4 R 3 . > 11 7 I . . > 12 7 J . . > 13 9 G 4 . > 14 9 H 4 . > > Of course, not-first x3 values could be set to missing with an extra IF. [snip] > Kind regards, > > ================= > Paul M. Dorfman > Jacksonville, FL > ================= > > "Stuart E. Wugalter" wrote: > > > > I swear someone asked this about 6 months ago, but I'll be @#$%'ed if I > > can find it in the newsgroup archive. Below is my sample dataset. There > > are four variables, say x1-x4, where x1 identifies an observation. Note > > that variables x3 & x4 are missing for all but 1 line per observation. I > > want to be able to check each observation to make sure that x3 & x4 > > exist in the first line of each observation and, if not, then copy the > > values from one of the other lines. In the example below, observation > > x1=1 is okay, but observation x2=2 needs the values of x3 & x4 copied to > > the first line of observation 2 (i.e. where x1=2 and x2=a). > > > > I know one solution would be to create another dataset with x1, x3 & x4 > > (where x3 ne .) and merge back into this dataset. Is there a more > > efficient solution? TIA > > > > Stuart E. Wugalter > > SAS Consultant > > > > 1 x .0001 47 > > 1 y . . > > 1 z . . > > 2 a . . > > 2 b .0527 47 > > 2 c . . > > 2 d . .

(signed) Howard_Schreier@ita.doc.gov


Back to: Top of message | Previous page | Main SAS-L page