LISTSERV at the University of Georgia
Date:   Tue, 20 Sep 2005 12:52:57 -0700
Reply-To:   David L Cassell <davidlcassell@MSN.COM>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   David L Cassell <davidlcassell@MSN.COM>
Subject:   Re: Imputation
In-Reply-To:   <200509201752.j8KGHO2j029564@malibu.cc.uga.edu>
Content-Type:   text/plain; format=flowed

mailprems@YAHOO.COM wrote:
>I have two data sets named ‘good’ and ‘bad’. I want to impute values for
>the data in the ‘bad’ data set based on the ‘good’ set. I have variables
>like model year, make and type which can be used to compare the datasets.
>
>Actually, I want to retrieve values from the ‘good’ data set based on model
>year, make and type and assign it to the ‘bad’ data set.

I'm going to get all curmudgeon-like (you know, just like always) and disagree with everyone else.

If you are doing real imputation, and not just filling in holes in your data with fixed exact values that are not random in any way, then please do NOT use single imputation methods such as the SAS code offered so far. Look into multiple imputation so that you can later (statistically) assess the consequences of your actions!
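To illustrate what "assess the consequences" can mean in practice, here is a minimal Python sketch of Rubin's combining rules, the usual way to pool an estimate across m imputed data sets. The function name pool_rubin and the numbers are made up for illustration; they come from neither the original question nor any SAS procedure. The point is that the between-imputation variance is added to the total, which a single imputation cannot give you.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and their variances with Rubin's rules.

    estimates: point estimate of the same quantity from each imputed data set
    variances: squared standard error of that estimate from each data set
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists from at least two imputations")
    q_bar = mean(estimates)          # pooled point estimate
    within = mean(variances)         # average within-imputation variance
    between = variance(estimates)    # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return q_bar, within, between, total


# Hypothetical results from m = 3 imputed data sets (illustrative values only).
q_bar, within, between, total = pool_rubin([10.0, 12.0, 11.0], [1.0, 1.0, 1.0])
print(q_bar, within, between, total)
```

With a single imputation there is no between-imputation term at all, so the reported variance would be just the within part and would understate the uncertainty.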

PROC MI does *not* provide a mechanism for inserting values from alternative data files. I'm not sure that this is even a good idea for multiple imputation, unless you can do something to establish that the 'good' data come from the EXACT same population as the 'bad' data. Not pretty much the same, with just a few tweaks, but exactly the same target population. Otherwise, you risk introducing a host of biases, because your 'good' data would represent a different target population which may differ in important ways in some underlying characteristics.

If you really do want imputation in the statistical sense, perhaps you should write back to SAS-L and explain your process more fully.

HTH,
David
--
David L. Cassell
mathematical statistician
Design Pathways
3115 NW Norwood Pl.
Corvallis OR 97330


