LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (August 2008, week 4)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Fri, 22 Aug 2008 10:54:17 -0500
Reply-To:     Mary <mlhoward@avalon.net>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Mary <mlhoward@AVALON.NET>
Subject:      Re: how to copy a row
Comments: To: "./ ADD NAME=Data _null_," <iebupdte@GMAIL.COM>
Content-Type: text/plain; charset="iso-8859-1"

The problem is that your approach loses all formats and informats on the original variables, and with mixed character/numeric data, it drops all the character variables. When I try an original data set like this:

data wide;

informat adm 4.0 dose 4.0 wt 4.1 drug_name $40.;

format adm 4.0 dose 4.0 wt 4.1 drug_name $40.;

input id:$3. visit adm dose wt drug_name/*OP metioned many more*/;

cards;

001 1 52 32 78.2 Prozac

001 2 25 54 78.5

001 3 . . .

002 1 56 54 90.0 Zoloft

002 2 . . .

003 1 52 32 78.5 Wellbutrin

003 2 . 54 78.2 Prozac

003 3 25 . 79.4

;

run;

It drops my drug name, and the resulting data set has no informats or formats. Also, you've got no control on individual variables, as you might want to update some and not update others, or ask more complicate IF statements, like the length of the time of the lag. Here's a longer approach, but in my mind clearer, and also it allows for mixed variable types such as integer, decimal, character, and date, and retains all informats and formats of the original variables:

data wide;

infile cards missover;

informat adm 4.0 dose 4.0 wt 4.1 drug_name $40. flu_vaccine_date mmddyy10.;

format adm 4.0 dose 4.0 wt 4.1 drug_name $40. flu_vaccine_date mmddyy10.;

input id $3. visit adm dose wt drug_name flu_vaccine_date;

cards;

001 1 52 32 78.2 Prozac 11/1/2006

001 2 25 54 78.5

001 3 . . .

002 1 56 54 90.0 Zoloft 10/7/2007

002 2 . . .

003 1 52 32 78.5 Wellbutrin 11/8/2006

003 2 . 54 78.2 Prozac 11/5/2007

003 3 25 . 79.4

;

run;

data wide2;

set wide;

by id;

retain prev_adm prev_dose prev_wt prev_drug_name prev_flu_vaccine_date;

lag_adm=lag(adm);

lag_dose=lag(dose);

lag_wt=lag(wt);

lag_drug_name=lag(drug_name);

lag_flu_vaccine_date=lag(flu_vaccine_date);

if lag_adm ^= . then prev_adm=lag_adm;

if lag_dose ^=. then prev_dose=lag_dose;

if lag_wt ^=. then prev_wt=lag_wt;

if lag_drug_name ^= ' ' then prev_drug_name=lag_drug_name;

if lag_flu_vaccine_date ^= . then prev_flu_vaccine_date=lag_flu_vaccine_date;

if adm=. and not first.id then adm=prev_adm;

if dose=. and not first.id then dose=prev_dose;

if wt=. and not first.id then wt=prev_wt;

if drug_name=' ' and not first.id then drug_name=prev_drug_name;

if flu_vaccine_date=. and not first.id then flu_vaccine_date=prev_flu_vaccine_date;

keep id visit adm dose wt drug_name flu_vaccine_date;

run;

-Mary

----- Original Message ----- From: ./ ADD NAME=Data _null_, To: SAS-L@LISTSERV.UGA.EDU Sent: Friday, August 22, 2008 9:50 AM Subject: Re: how to copy a row

This does not address LOCF for character variables. I can't recall doing LOFC for a character variable but there is no reason not too. Here I assume all _NUMERIC_.

I believe this code is pretty close to what I would consider the current standard LOCF that I have seen posted here on SAS-L or written about is SUG.

Makeing DATA LOCF below a view and then using that as input to PROC SORT would be a bit more efficient espically with more data.

data wide; input id:$3. visit adm dose wt /*OP metioned many more*/; cards; 001 1 52 32 78 001 2 25 54 78 001 3 . . . 002 1 56 54 90 002 2 . . . 003 1 52 32 78 003 2 . 54 78 003 3 25 . 79 ;;;; run; /* normalize */ proc transpose out=tall; by id visit; var adm--wt; run; /* LOCF */ proc sort; by id _name_ visit; run; data locf; do until(last._name_); set tall; by id _name_; locf = ifn(missing(col1),locf,col1); output; end; run; proc print; run; /* back to wide if you like */ proc sort; by id visit; proc transpose out=widelocf; by id visit; var locf; id _name_; run; proc print; run;

On 8/22/08, Mary <mlhoward@avalon.net> wrote: > > And what would your view of "normalizing the data" look like? This data > does look normalized to me- you wouldn't want data to be variables like > patid, visit, variable name, value, since variables have values of > substantially different types and ranges of values, like a visit would be an > integer, a strength would be a MG value like 20MG, a number of tabs would be > an integer, like 2 tabs per day, a weight would be a value perhaps with a > decimal, and a height might have a decimal as well, whereas the name of the > drug might be something like "Prozac" and be character. Thus, having > variables across **does** seem to be appropriate normalization for data like > this, and does not seem to be difficult to deal with in "carrying forward" > previous values. > > Yes, each variable should be carried forward individually; certainly missing > character variables and missing numeric variables could not be combined in > the same array, like this: > > > > do i=1 to dim(vararray); > > a=lag(vararray(i)); > > if vararray(i)=' ' and not first.visit then vararray(i)=a; > > end; > > drop i a; > > run; > > > > But this code could be done on groups of variables with the same type that > are variables that make sense to carry forward values, by specifying the > variables desired in the array and either missing or space in the IF > condition depending on the variable type. > > > > -Mary > > > > > ----- Original Message ----- > From: ./ ADD NAME=Data _null_, > To: SAS-L@LISTSERV.UGA.EDU > Sent: Friday, August 22, 2008 7:00 AM > Subject: Re: how to copy a row > > It should also be pointed out that the example of carrying for an > entire observation as the OP asked for and as provided by my example, > (which suites the sample but not the real world, it wont work with two > or more missing visits) is not the "accepted" LOCF technique. > > The missing pattern is rarely "complete" across and entire > observation. A more general solution would involve carrying forward > each variable individually. This is easily accomplished by > normalizing the data. Makes the code much simpler, too. > > There are many examples on SAS-L and in SUG papers. > > > On 8/21/08, Lou <lpogoda@verizon.net> wrote: > > "Kevin Viel" <citam.sasl@GMAIL.COM> wrote in message > > > news:200808212013.m7LGKBff032286@malibu.cc.uga.edu... > > > On Mon, 18 Aug 2008 08:50:11 -0700, pausha <pausha1@GMAIL.COM> wrote: > > > > > > >i have this strangle condition and need some help > > > >I have a dataset with patients who visit every week ,say if they miss > > > >a week i still have to have their data copied form previous week for > > > >that missed one. > > > >eg > > > > > > > >pat-id visit adm dose wt > > > > > > > >001 1 52 32 78 > > > >001 2 25 54 78 > > > >001 3 . . . > > > >002 1 56 54 90 > > > >002 2 . . . > > > > > > > >in this case i would have to have the previous row copied to the > > > >visits that is missing > > > >help with logic to populate the whole row > > > > > > > >i have 40 variables in a row. > > > > > > I have not read through any but one of the responses, so I hope that I > am > > > not duplicating my warning. No, check that, I hope I am repeating it. > > > > > > I would strongly advise against this. You could bias your results. > > > Perhaps wt (weight) does not change from visit to visit (or more), but > > > other things may. > > > > > > > In a clinical trials setting, it's pretty common to do analyses on a "last > > observation carried forward" basis to fill in missing data in addition to > > having analyses based solely on the data collected. Any tables generated > on > > an LOCF basis are so labeled. > >


Back to: Top of message | Previous page | Main SAS-L page