```Date: Fri, 4 Jun 1999 15:01:42 -0400 Reply-To: "Paul M. Dorfman" Sender: "SAS(r) Discussion" From: "Paul M. Dorfman" Subject: Re: Copying One Observation within a BY Group Comments: To: sas-l@listserv.uga.edu Content-Type: text/plain; charset=koi8-r Robert Virgile wrote: >The original question came from Stuart Wugalter, with an answer by Paul >Dorfman. Here is an easy approach: > >data repeated; >merge old (drop=var3 var4) > old (drop=x2 where=(var3 > . or var4 > .)); >by x1; > >The intent is to take the one record per X1 that has data for x3 and x4, >and copy those values to all matching X1 values. Robert, This is an impressively short solution, but it will not work. (The original question and my solution can be found below the sig line.) Suppose the test dataset is the same I have used, i.e. DATA A; INPUT X1 X2 \$ X3 X4; CARDS; 1 X 5 8 1 Y . . 1 Z . . 2 A . 9 2 B 7 . 2 C . . 2 D . . 4 P . 6 4 Q . . 4 R 3 . 7 I . . 7 J . . 9 G . . 9 H 4 . ; Assuming that you really meant x3 and x4 rather than var3 and var4, your code in the form DATA R; MERGE A (DROP=X3 X4) A (DROP=X2 WHERE=(X3 > . OR X4 > .)); BY X1; will produce the following output: OBS X1 X2 X3 X4 1 1 X 5 8 2 1 Y 5 8 3 1 Z 5 8 4 2 A . 9 5 2 B 7 . 6 2 C 7 . 7 2 D 7 . 8 4 P . 6 9 4 Q 3 . 10 4 R 3 . 11 7 I . . 12 7 J . . 13 9 G 4 . 14 9 H 4 . Note that it does not achieve the goal with respect to x3 and alters the values for x4. Besides, the nature of the task should not involve testing of x4 at all - it is just a satellite and must stay where it was. We can probably do better by eliminating x4 > . altogether, but, first, it will do more than asked, namely fill ALL the x3 values in a by-group with that group's nonmissing value, and, second, invariably screw up x4 (and whatever else satellites there might be): OBS X1 X2 X3 X4 1 1 X 5 8 2 1 Y 5 8 3 1 Z 5 8 4 2 A 7 . 5 2 B 7 . 6 2 C 7 . 7 2 D 7 . 8 4 P 3 . 9 4 Q 3 . 10 4 R 3 . 11 7 I . . 12 7 J . . 13 9 G 4 . 14 9 H 4 . Of course, not-first x3 values could be set to missing with an extra IF. Nevertheless, it appears to me that even though POINT= (or array) logic is more wordy and may look more complex on the surface, it is, in fact, more flexible (one can choose between COPY and MOVE types), and it is more straightforward algorithmically. Kind regards, ================ Paul M. Dorfman Jacksonville, FL ================ ======================================================================= Stuart, One way of doing it is to use 'look-ahead' capabilities of direct access: DATA A; INPUT X1 X2 \$ X3 X4; CARDS; 1 X 5 8 1 Y . . 1 Z . . 2 A . 9 2 B 7 . 2 C . . 2 D . . 4 P . 6 4 Q . . 4 R 3 . 7 I . . 7 J . . 9 G . . 9 H 4 . ; RUN; DATA B (DROP=X); SET A NOBS=NOBS; BY X1; IF FIRST.X1 THEN DO PTR=_N_+1 TO NOBS; IF X3 > . OR PTR > NOBS THEN LEAVE; SET A (KEEP=X1 X3 RENAME=(X1=X)) POINT=PTR; IF X NE X1 THEN LEAVE; END; ELSE X3 = .; RUN; Here is the printout of the dataset B: OBS X1 X2 X3 X4 1 1 X 5 8 2 1 Y . . 3 1 Z . . 4 2 A 7 9 5 2 B . . 6 2 C . . 7 2 D . . 8 4 P 3 6 9 4 Q . . 10 4 R . . 11 7 I . . 12 7 J . . 13 9 G 4 . 14 9 H . . I assumed that you actually wanted to MOVE the non-missing value to the first record in the BY-group. However, if you meant COPY literally, then remove the instruction ELSE X = . from the code. If your file is really large, and performance is at stake more than the brevity, _temporary_ arrays should be used instead to implement the logic of 'postponed output', rather than 'look-ahead'. Kind regards, ================= Paul M. Dorfman Jacksonville, FL ================= "Stuart E. Wugalter" wrote: > > I swear someone asked this about 6 months ago, but I'll be @#\$%'ed if I > can find it in the newsgroup archive. Below is my sample dataset. There > are four variables, say x1-x4, where x1 identifies an observation. Note > that variables x3 & x4 are missing for all but 1 line per observation. I > want to be able to check each observation to make sure that x3 & x4 > exist in the first line of each observation and, if not, then copy the > values from one of the other lines. In the example below, observation > x1=1 is okay, but observation x2=2 needs the values of x3 & x4 copied to > the first line of observation 2 (i.e. where x1=2 and x2=a). > > I know one solution would be to create another dataset with x1, x3 & x4 > (where x3 ne .) and merge back into this dataset. Is there a more > efficient solution? TIA > > Stuart E. Wugalter > SAS Consultant > > 1 x .0001 47 > 1 y . . > 1 z . . > 2 a . . > 2 b .0527 47 > 2 c . . > 2 d . . > . > . > . ======================================================================= ```

Back to: Top of message | Previous page | Main SAS-L page