LISTSERV at the University of Georgia
Date:         Thu, 6 Jul 2006 07:55:06 -0400
Reply-To:     Peter Flom <Flom@NDRI.ORG>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Peter Flom <Flom@NDRI.ORG>
Subject:      Re: survey regression analysis
Comments: To: David L Cassell <davidlcassell@MSN.COM>
In-Reply-To:  <BAY103-F31013E1D1681B076D7ACCAB0770@phx.gbl>
Content-Type: text/plain; charset=US-ASCII

I wrote

In a lecture that he gave here at NDRI, and in other lectures I have heard him give, Joe Schafer has indicated that some of his results show that MI is a better technique than listwise deletion even when the data are MNAR.

I haven't got any formal published cites for this, although there may be some by now, but thought it apropos. He indicated that the degree of bias introduced by MNAR would have to be quite extreme for listwise to be better than MI.

and David replied <<< I agree with Joe. (Of course!) I prefer MI to listwise deletion in survey sample analysis.

HOWEVER, the problem remains that treating MNAR (Missing Not At Random) points as if they have the same distributional properties as the sampled data can be fundamentally error-prone. I mean, so what if MI does better than listwise deletion if both of them stink at filling in the holes that are there because the missings are not random and NOT from the same population?

When I teach survey sampling classes, I make a big stink about this, because quite often there is a *reason* why the data are missing.

We sample 98 out of 100 lakes for mercury, but for the other 2 lakes we cannot get permission. Do we assume those 2 lakes are just like everything else we measured? (That's MAR.) Maybe. Maybe not. Perhaps those 2 lakes are owned by curmudgeons who just don't want a bunch of Feds on their land nosing around. Or perhaps those lakes ought to be Superfund sites because of the dumping that has been going on in them for decades. (Uh-oh. That's MNAR. Those lakes are drastically different, and perhaps are not even part of the same population, depending on our sample frame.)

So can MI or listwise deletion help us here? If the lakes are heavily contaminated and the owners don't want us to find out, then I would say no. The imputation or deletion assumes that we can fill in 'reasonable' values from the sampled observations, which is not the case. >>>
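The lake example can be made concrete with a small simulation. What follows is only a rough sketch under stated assumptions: the mercury values, their ranges, and the function name simulate_lakes are invented for illustration, and the "imputation" is a crude repeated hot-deck draw from the observed lakes, not proper MI (it ignores parameter uncertainty and is not what PROC MI does). It shows only that when the two refused lakes lie far outside the observed range, both listwise deletion and imputation from the observed values underestimate the true mean.

```python
import random
import statistics


def simulate_lakes(seed=0, n_lakes=100, n_refused=2, n_imputations=20):
    """Compare listwise deletion and naive hot-deck imputation under MNAR.

    All numbers are hypothetical: ordinary lakes get mercury values in
    [0.1, 1.0], while the lakes whose owners refuse access are heavily
    contaminated, with values in [10.0, 20.0] (arbitrary units).
    """
    rng = random.Random(seed)

    # Lakes we were allowed to sample.
    observed = [rng.uniform(0.1, 1.0) for _ in range(n_lakes - n_refused)]
    # Lakes we could not sample; missingness depends on the value itself (MNAR).
    refused = [rng.uniform(10.0, 20.0) for _ in range(n_refused)]

    # The quantity we would like to estimate, known only inside the simulation.
    true_mean = statistics.mean(observed + refused)

    # Listwise deletion: analyze the observed lakes only.
    listwise_mean = statistics.mean(observed)

    # Naive repeated imputation: fill each hole with a random observed value,
    # repeat, and average the resulting estimates. This treats the missing
    # lakes as if they came from the same distribution as the sampled ones.
    estimates = []
    for _ in range(n_imputations):
        fills = [rng.choice(observed) for _ in range(n_refused)]
        estimates.append(statistics.mean(observed + fills))
    imputed_mean = statistics.mean(estimates)

    return true_mean, listwise_mean, imputed_mean


if __name__ == "__main__":
    true_mean, listwise_mean, imputed_mean = simulate_lakes()
    print(f"true mean:     {true_mean:.3f}")
    print(f"listwise mean: {listwise_mean:.3f}")
    print(f"imputed mean:  {imputed_mean:.3f}")
```

Because every imputed value is drawn from the observed lakes, neither estimate can reach the true mean in this setup, whatever the seed. The gap shrinks as the refused lakes' values move closer to the observed range.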

I am not surprised that you agree with Joe. Disagreeing with Joe about missing data would be odd. And, of course, I agree with you. The case you bring up with the lakes is analogous to one I presented to Joe about our own data: in studying treatment plans for drug abuse, loss to follow-up is often directly due to drug abuse. He agreed with me (and you) that, in this case, there are no good methods.

His point was, I think, that in cases where the data do not strictly meet the MAR assumption, MI may still produce useful results. Like many assumptions, MAR can be grossly or mildly violated. However, since the assumption is almost always impossible to test, the practical upshot is that judgement is necessary. That's good. Keeps us employed.

Peter

Peter L. Flom, PhD
Assistant Director, Statistics and Data Analysis Core
Center for Drug Use and HIV Research
National Development and Research Institutes
71 W. 23rd St
New York, NY 10010
http://cduhr.ndri.org
www.peterflom.com
(212) 845-4485 (voice)
(917) 438-0894 (fax)

