Date: Fri, 4 Feb 2005 09:07:26 -0200
Reply-To: Marcos Sanches <marcos.sanches@IPSOS.COM.BR>
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: Marcos Sanches <marcos.sanches@IPSOS.COM.BR>
Subject: RES: Problem with factor analysis
Thanks very much, Dirk, you couldn't be clearer!
Now I can understand perfectly why pairwise deletion can be a problem.
From: Dirk Enzmann [mailto:firstname.lastname@example.org]
Sent: Friday, February 04, 2005 8:53 AM
To: SPSSX(r) Discussion; marcos.sanches@IPSOS.COM.BR
Subject: Re: Problem with factor analysis
The problem with pairwise deletion is that it can produce correlation
matrices that do not satisfy the triangular inequality conditions among
triples of correlation coefficients.
For an explanation I will draw heavily from Wothke (1993) (see below):
The admissible range of correlations between two variables i and j is
codetermined by the correlations of all other variables with i and j.
Consider the following correlation matrix:
       v1     v2     v3
v1    1.0
v2     .9    1.0
v3    -.5     .9    1.0
V2 correlates .9 with both variables v1 and v3, which, in turn, are
correlated at -.5. However, if r(v1,v2) and r(v2,v3) are taken to be .9,
r(v1,v3) can range only between .62 and 1.0.
You can check the admissible range by calculating the limiting values of
limit1 = cos(acos(r(vi,vx)) - acos(r(vj,vx)))
limit2 = cos(acos(r(vi,vx)) + acos(r(vj,vx)))
where x stands for any arbitrary third variable in the same correlation
matrix. In the example, the range of r(v1,v3) is limited between
cos(acos(.9) - acos(.9)) = cos(.451 - .451) = 1.0
cos(acos(.9) + acos(.9)) = cos(.451 + .451) = .62.
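For anyone who wants to check other triples, the two limits can be computed with a few lines of Python. This is only a minimal sketch of the formulas above; the function name admissible_range is made up for illustration and is not part of SPSS or of Wothke's text.

```python
import math

def admissible_range(r_ix, r_jx):
    """Bounds for r(i,j), given r(i,x) and r(j,x) with a third variable x."""
    a = math.acos(r_ix)  # angle between variables i and x, in radians
    b = math.acos(r_jx)  # angle between variables j and x, in radians
    upper = math.cos(a - b)  # limit1 in the text
    lower = math.cos(a + b)  # limit2 in the text
    return lower, upper

# The example from the text: r(v1,v2) = r(v2,v3) = .9
lower, upper = admissible_range(0.9, 0.9)
print(round(lower, 2), round(upper, 2))  # 0.62 1.0

# The reported r(v1,v3) = -.5 lies outside that range
print(lower <= -0.5 <= upper)  # False
```

Since both angles lie between 0 and pi, cos(a - b) is never smaller than cos(a + b), so the first limit is always the upper one.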
If, due to pairwise deletion of missing values, r(v1,v3) happens to fall
outside these limits, the correlation matrix is not positive definite,
that is, at least one eigenvalue is negative. The above correlation
matrix yields the following eigenvalues:
( 2.047, 1.500, -.547)
As a consequence, the SPSS procedure FACTOR will stop with the warning:
Matrix not positive definite.
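The eigenvalues quoted above can be reproduced outside SPSS. The sketch below uses the standard trigonometric closed form for a symmetric 3x3 matrix in plain Python; the helper name eigenvalues_sym3 is hypothetical, and this is not how FACTOR computes eigenvalues internally. For larger matrices a numerical library routine would be the usual choice.

```python
import math

def eigenvalues_sym3(m):
    """Eigenvalues of a symmetric 3x3 matrix, largest first (closed form)."""
    a, b, c = m[0][1], m[0][2], m[1][2]          # off-diagonal elements
    q = (m[0][0] + m[1][1] + m[2][2]) / 3.0      # mean of the diagonal
    p1 = a * a + b * b + c * c
    if p1 == 0.0:
        # Matrix is already diagonal
        return sorted((m[0][0], m[1][1], m[2][2]), reverse=True)
    d = [m[i][i] - q for i in range(3)]          # diagonal of (m - q*I)
    p = math.sqrt((d[0] ** 2 + d[1] ** 2 + d[2] ** 2 + 2.0 * p1) / 6.0)
    # Determinant of (m - q*I)
    det = (d[0] * (d[1] * d[2] - c * c)
           - a * (a * d[2] - c * b)
           + b * (a * c - d[1] * b))
    # r = det((m - q*I) / p) / 2, clamped against rounding error
    r = max(-1.0, min(1.0, det / (2.0 * p ** 3)))
    phi = math.acos(r) / 3.0
    e1 = q + 2.0 * p * math.cos(phi)
    e3 = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    e2 = 3.0 * q - e1 - e3                       # trace equals sum of eigenvalues
    return [e1, e2, e3]

# Correlation matrix from the example
R = [[1.0, 0.9, -0.5],
     [0.9, 1.0, 0.9],
     [-0.5, 0.9, 1.0]]

eig = eigenvalues_sym3(R)
print([round(e, 3) for e in eig])  # [2.047, 1.5, -0.547]
print(min(eig) > 0)                # False: the matrix is not positive definite
```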
For a more detailed discussion see:
Wothke, W. (1993). Nonpositive definite matrices in structural modeling.
In K.A. Bollen & J.S. Long (Eds.), Testing structural equation models
(pp. 256-293). Newbury Park, CA: Sage.
At Thu, 3 Feb 2005 08:49:48 -0200
Marcos Sanches <marcos.sanches@IPSOS.COM.BR> wrote:
> I am performing a factor analysis with 60 attributes. Each one was
> rated on an importance scale, from 0 for Not Important to 10 for Very
> Important. As there are so many attributes, each respondent rated only
> a random subset of 39 attributes. That means I have 60 - 39 = 21
> missing values for each case. I want to run a factor analysis with
> this data.
> If I select the listwise method of missing values deletion, of
> course I end up with no case left.
> If I select the pairwise method of missing values deletion, I
> get an error message - "The matrix is not positive definite. This may
> be due to pairwise deletion of missing values.".
> If I select the "replace with means" method of missing values
> substitution, then it works well.
> My questions are:
> 1) Why does the pairwise method not work? Even when I select a
> subsample it does not work. I know what a 'not positive definite
> matrix' is, but why does this happen? Is there a way to handle this?
> Before the interviews had been done, I ran a simulation using
> another study. I randomly deleted some values for every case so that
> every case had some missing values, then I ran a factor analysis with
> pairwise deletion. It worked pretty well; in fact the final factors
> were almost the same as the ones I got with the complete data. So I didn't
> expect this problem to happen.
Dr. Dirk Enzmann
Institute of Criminal Sciences
Dept. of Criminology
phone: +49-040-42838.7498 (office)