LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (August 2008, week 4)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Fri, 22 Aug 2008 15:26:48 -0400
Reply-To:     "Howard Schreier <hs AT dc-sug DOT org>"
              <schreier.junk.mail@GMAIL.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "Howard Schreier <hs AT dc-sug DOT org>"
              <schreier.junk.mail@GMAIL.COM>
Subject:      Re: SQL to keep records

On Tue, 19 Aug 2008 06:12:05 -0400, KNuak <asom77@HOTMAIL.COM> wrote:

>Could someone help me with the SAS code to obtain ds2 USING SQL, please? >Ie. to retain the ids and the aws, that will give me difw less than 3? I >HAVE 300 DIFFERENT IDS, ONLY 2 ARE SHOWN. IDs are sorted. Aw = age in >weeks, and difw is difference in weeks given by current minus previous week. >* My ultimate aim is to arrive at ds2 below USING SQL. ANY HELP,PLEASE? >Data w1; >Set w; >BY ID; >Run; >Ds1: >Id aw difw >37 3 . >37 6 3 >37 13 7 >37 14 1 >37 18 4 >39 7 . >39 10 3 >39 12 2 >39 17 5 >Ds2: >Id aw difw >37 13 . >37 14 1 >39 10 . >39 12 2

SQL is not particularly well suited for this task, but ...

The critical thing to recognize is that the decision on whether to keep a row requires inspection of both the preceding and following rows (with row order being reflected in the strictly monotonic values of AW within each ID group). Look at the SQL models in http://www.sascommunity.org/wiki/Look-Ahead_and_Look-Back. One is not applicable because it requires an index column which explicitly specifies the row order. The other depends only on the monotonicity of AW, so it can be used. Once it is adapted, it's only matter of coding a WHERE clause to select the rows and coding a CASE structure to suppress DIFW values at the start of each subseries.

Test data:

data one; input id aw; cards; 37 3 37 6 37 13 37 14 37 18 39 7 39 10 39 12 39 17 39 18 39 20 ;

Notice that I've added a few rows to increase generality. I've also skipped DIFW, since it is easily computed on the fly.

Solution:

proc sql; create table two as select lookahead.id , lookahead.aw , case when lookahead.aw - prev.aw > 2 then . else lookahead.aw - prev.aw end as difw from ( select one.*, next.aw as next_aw from one left join one as next on one.ID = next.ID and next.aw > one.aw group by one.id, one.aw having next.aw = min(next.aw) ) as lookahead left join one as prev on lookahead.ID = prev.ID and prev.aw < lookahead.aw where 0 < next_aw - lookahead.aw < 3 or 0 < lookahead.aw - prev.aw < 3 group by lookahead.id, lookahead.aw having prev.aw = max(prev.aw) order by lookahead.id, lookahead.aw ; quit;

Result:

id aw difw

37 13 . 37 14 1 39 10 . 39 12 2 39 17 . 39 18 1 39 20 2


Back to: Top of message | Previous page | Main SAS-L page