|
On Tue, 19 Aug 2008 06:12:05 -0400, KNuak <asom77@HOTMAIL.COM> wrote:
>Could someone help me with the SAS code to obtain ds2 USING SQL, please?
>Ie. to retain the ids and the aws, that will give me difw less than 3? I
>HAVE 300 DIFFERENT IDS, ONLY 2 ARE SHOWN. IDs are sorted. Aw = age in
>weeks, and difw is difference in weeks given by current minus previous week.
>* My ultimate aim is to arrive at ds2 below USING SQL. ANY HELP,PLEASE?
>Data w1;
>Set w;
>BY ID;
>Run;
>Ds1:
>Id aw difw
>37 3 .
>37 6 3
>37 13 7
>37 14 1
>37 18 4
>39 7 .
>39 10 3
>39 12 2
>39 17 5
>Ds2:
>Id aw difw
>37 13 .
>37 14 1
>39 10 .
>39 12 2
SQL is not particularly well suited for this task, but ...
The critical thing to recognize is that the decision on whether to keep a
row requires inspection of both the preceding and following rows (with row
order being reflected in the strictly monotonic values of AW within each ID
group). Look at the SQL models in
http://www.sascommunity.org/wiki/Look-Ahead_and_Look-Back. One is not
applicable because it requires an index column which explicitly specifies
the row order. The other depends only on the monotonicity of AW, so it can
be used. Once it is adapted, it's only matter of coding a WHERE clause to
select the rows and coding a CASE structure to suppress DIFW values at the
start of each subseries.
Test data:
data one;
input id aw;
cards;
37 3
37 6
37 13
37 14
37 18
39 7
39 10
39 12
39 17
39 18
39 20
;
Notice that I've added a few rows to increase generality. I've also skipped
DIFW, since it is easily computed on the fly.
Solution:
proc sql;
create table two as
select lookahead.id
, lookahead.aw
, case when lookahead.aw - prev.aw > 2 then .
else lookahead.aw - prev.aw
end as difw
from ( select one.*, next.aw as next_aw
from one left join one as next
on one.ID = next.ID and next.aw > one.aw
group by one.id, one.aw
having next.aw = min(next.aw)
) as lookahead
left join one as prev
on lookahead.ID = prev.ID and prev.aw < lookahead.aw
where 0 < next_aw - lookahead.aw < 3 or
0 < lookahead.aw - prev.aw < 3
group by lookahead.id, lookahead.aw
having prev.aw = max(prev.aw)
order by lookahead.id, lookahead.aw
;
quit;
Result:
id aw difw
37 13 .
37 14 1
39 10 .
39 12 2
39 17 .
39 18 1
39 20 2
|