Date: Fri, 17 Aug 2007 20:30:54 -0400
Reply-To: Ken Borowiak <EvilPettingZoo97@AOL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Ken Borowiak <EvilPettingZoo97@AOL.COM>
Subject: Re: Padding out a denormalized dset--what's the easy way?
On Fri, 17 Aug 2007 16:00:44 -0700, Pardee, Roy <pardee.r@GHC.ORG> wrote:
>Hey All,
>
>I've got to de-normalize a dataset--that is, take something like this:
>
>mrn dx
>Roy V90
>Roy 402.2
>Roy 333
>Mary 200.3
>
>And turn it into:
>
>mrn dx1 dx2 dx3
>Roy V90 402.2 333
>Mary 200.3
>
>I'm on top of that--can get there w/a call to PROC TRANSPOSE. My source
>dset will change every month, so I'm not sure how many dx<x> fields I'll
>wind up with on a given run, but let's say I know it will never be more
>than 15.
>
>I then need to append that data onto a dset w/vars dx1-dx15 defined. I
>can use the FORCE option on proc append to get it to actually do the
>append, but then I wind up with nastygrams in my log like these here:
>
>WARNING: Variable Dx6 was not found on DATA file.
>WARNING: Variable Dx7 was not found on DATA file.
>
>Question: is there a graceful way of padding out my de-normed dset w/the
>missing dx<x> vars prior to the append? My first thought is a macro w/a
>query on dictionary.columns to find out what my highest dx<x> var is,
>and a loop to generate them, but that seems like a lot of trouble to go
>to. Is there some slick way of getting that work done?
>
>Thanks!
>
>-Roy
>
Roy,
You can flatten your file with DATA step.
data have ;
input mrn $ dx $ ;
datalines ;
Roy V90
Roy 402.2
Roy 333
Mary 200.3
Oprah 250.1
Oprah 378.55
Oprah 410.1
Oprah 689
Oprah 555.10
;
run ;
data want( drop=dx ) ;
array _dx[*] $ dx1-dx15 ;
do _n_=1 by 1 until( last.mrn ) ;
set have ;
by mrn notsorted ;
_dx[_n_]=dx ;
end ;
run ;
HTH,
Ken
|