Date: Wed, 8 Dec 1999 15:31:03 -0700
Reply-To: Mark S Dehaan/MSD/LMITCO/INEEL/US <MSD@INEL.GOV>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Mark S Dehaan/MSD/LMITCO/INEEL/US <MSD@INEL.GOV>
Subject: Re: a cumbersome situation with data manipulation
Lou,
>The lags make me a little uncomfortable
They should. I would avoid putting the lags in the IF stmt. It is
notoriously hard to predict whether SAS will check every boolean, or
whether it jumps out as soon as one fails and skips the rest of the IF
stmt. That can make LAG flaky, in that it is essentially being called
conditionally. Each LAG keeps its own queue and returns the value stored
the last time it executed, which is not necessarily the value from the
previous observation. There has been considerable discussion of avoiding
conditional calls to LAG - you get hard-to-predict results (though SAS is
not in error).
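As a rough sketch (untested, like the code quoted below), the same step
with each LAG called exactly once per observation, outside any condition,
might look like this. Data set a and its variables come from the original
post; the names a_desc, prev_answer, prev_id and prev_test are made up
here purely for illustration.

```sas
/* Reverse the order so that LAG sees the record that originally came next. */
proc sort data=a out=a_desc;   /* a_desc: hypothetical name for the sorted copy */
  by descending obs;
run;

data b;
  set a_desc;
  /* Unconditional assignments: every LAG queue is updated on every row, */
  /* no matter how the IF below is evaluated.                            */
  prev_answer = lag(answer);
  prev_id     = lag(id);
  prev_test   = lag(test);
  /* Subsetting IF on plain variables only; no function calls in here.   */
  if prev_answer = 'd' and prev_id = id and prev_test = test;
  drop prev_answer prev_id prev_test;
run;
```

Working through the sample data by hand, this should keep observations 12,
10, 7 and 3, which agrees with the examples given in the original question.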
HTH,
Mark DeHaan
Lou Pogoda <lpogoda@HOME.NOSPAM.COM>@LISTSERV.VT.EDU> on 12/08/99 03:00:09 PM
Please respond to Lou Pogoda <lpogoda@HOME.NOSPAM.COM>
Sent by: "SAS(r) Discussion" <SAS-L@LISTSERV.VT.EDU>
To: SAS-L@LISTSERV.VT.EDU
cc:
Subject: Re: a cumbersome situation with data manipulation
I guess it really comes down to how large your input data set is, but the
easiest to *code* (for me, at any rate) would be something like the
following (untested) code:
proc sort data = a;
  by descending obs;
run;
data b;
  set a;
  if lag(answer) = 'd' and
     lag(id) = id and
     lag(test) = test;
run;
The lags make me a little uncomfortable - without trying it out first, I'm
never quite sure they'll do what I want.
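If the lags are the worry, a self-join in PROC SQL avoids them (and the
sort) altogether. This is only a sketch, also untested, and it assumes
that sequence always goes up by exactly 1 within an id/test combination,
as it does in the sample data. The macro variable target, the table name
b_sql and the aliases prev and nxt are made-up names for illustration.

```sas
%let target = d;   /* hypothetical macro variable: the answer to look for */

proc sql;
  create table b_sql as
  select prev.*
  from a as prev, a as nxt
  where prev.id = nxt.id
    and prev.test = nxt.test
    and prev.sequence = nxt.sequence - 1   /* prev is one step before nxt */
    and nxt.answer = "&target"
  order by prev.obs;
quit;
```

Setting target to c instead of d would select the records one step before
each 'c', which is the extension mentioned in the question.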
diltilia@my-deja.com wrote in message <82ltqv$9sp$1@nnrp1.deja.com>...
>Here's the situation:
>
>data a;
> input obs id test sequence answer $;
>cards;
>1 1 11 0 a
>2 1 11 1 b
>3 1 11 2 c
>4 1 11 3 d
>5 1 12 0 a
>6 1 12 1 b
>7 2 21 0 b
>8 2 21 1 d
>9 3 31 0 a
>10 3 31 1 c
>11 3 31 2 d
>12 3 31 3 b
>13 3 31 4 d
>14 3 32 0 d
>;
>
>What I need to do here is: within each 'test' for each user, select the
>records where the answer is the one right before answer 'd'. In other
>words, select the records whose sequence number is 1 less than the
>sequence number for answer 'd'.
>
>For example, for id '1' test '11', I want to select observation 3,
>for id '2' test '21', I want to select obs 7, and so on.
>
>What I did was to calculate a number 'newseq' (which equals 'obs'-1) for
>each occurrence where answer = 'd', and put all these numbers into a data
>set, then use sql to select * from the original data set where the "obs"
>number is equal to this new variable 'newseq'. It does the job, but it is
>not very efficient. Also, what if I want to extend this logic? Let's say I
>find out that 'c' occurs most often before the answer 'd', and I want to
>select the records that are one step before 'c'...
>
>I suspect that there's a simpler way to do this.
>
>Thanks in advance for your patience in reading through this question.
>
>-Diltilia