Date: Fri, 4 Feb 2005 11:53:26 +0100
From: Dirk Enzmann
Sender: "SPSSX(r) Discussion"
Subject: Re: Problem with factor analysis
To: marcos.sanches@IPSOS.COM.BR
In-Reply-To: <200502040508.j1458dvY018124@mx01.rrz.uni-hamburg.de>
Content-Type: text/plain; charset=us-ascii; format=flowed

Marcos,

The problem with pairwise deletion is that each correlation coefficient is computed from a different subset of cases, so it can produce correlation matrices that do not satisfy the triangular inequality conditions among triples of correlation coefficients.

For an explanation I will draw heavily on Wothke (1993) (see below):

The admissible range of correlations between two variables i and j is codetermined by the correlations of all other variables with i and j. Consider the following correlation matrix:

        v1     v2     v3
v1     1.0
v2      .9    1.0
v3     -.5     .9    1.0

V2 correlates .9 with both v1 and v3, which, in turn, are correlated at -.5. However, if r(v1,v2) and r(v2,v3) are taken to be .9, r(v1,v3) can range only between .62 and 1.0, so a value of -.5 is not admissible.

You can check the admissible range by calculating the limiting values of r(vi,vj) using:

limit1 = cos(acos(r(vi,vx)) - acos(r(vj,vx))) and
limit2 = cos(acos(r(vi,vx)) + acos(r(vj,vx)))

where x stands for any arbitrary third variable in the same correlation matrix. In the example, the range of r(v1,v3) is limited between

cos(acos(.9) - acos(.9)) = cos(.451 - .451) = 1.0 and
cos(acos(.9) + acos(.9)) = cos(.451 + .451) = .62.
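If you want to check such limits outside SPSS, the calculation can be sketched in a few lines of Python. This is only an illustration: the function name admissible_range and its argument names are my own, not part of any package. The upper limit uses the difference of the two arc cosines and the lower limit uses their sum.

```python
import math

def admissible_range(r_ix, r_jx):
    """Return (lower, upper) limits for r(vi,vj), given r(vi,vx) and r(vj,vx).

    Hypothetical helper for illustration; the name is not from any library.
    """
    a = math.acos(r_ix)  # angle between vi and vx, in radians
    b = math.acos(r_jx)  # angle between vj and vx, in radians
    lower = math.cos(a + b)  # sum of the angles gives the lower limit
    upper = math.cos(a - b)  # difference of the angles gives the upper limit
    return lower, upper

low, high = admissible_range(0.9, 0.9)
print(round(low, 2), round(high, 2))  # 0.62 1.0
print(low <= -0.5 <= high)            # False: -.5 lies outside the range
```

For a larger matrix the same check would have to be repeated for every third variable x, and passing all of these triple checks is necessary but not sufficient for the whole matrix to be positive definite.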

If, due to pairwise deletion of missing values, r(v1,v3) happens to fall outside these limits, the correlation matrix is not positive definite, that is, at least one eigenvalue is negative. The above correlation matrix yields the following eigenvalues:

( 2.047, 1.500, -.547)
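These eigenvalues can be reproduced without SPSS. The following is a minimal sketch in plain Python that uses the well-known trigonometric closed form for the eigenvalues of a symmetric 3x3 matrix; it is not what FACTOR does internally, and the function name eigenvalues_sym3 is my own. For larger matrices a numerical library routine would be the usual choice.

```python
import math

def eigenvalues_sym3(m):
    """Eigenvalues (largest first) of a symmetric 3x3 matrix.

    Hypothetical helper using the trigonometric closed form.
    """
    a11, a12, a13 = m[0]
    a22, a23 = m[1][1], m[1][2]
    a33 = m[2][2]
    p1 = a12 ** 2 + a13 ** 2 + a23 ** 2
    if p1 == 0.0:
        # Matrix is already diagonal
        return sorted((a11, a22, a33), reverse=True)
    q = (a11 + a22 + a33) / 3.0  # mean of the eigenvalues (trace / 3)
    p2 = (a11 - q) ** 2 + (a22 - q) ** 2 + (a33 - q) ** 2 + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    # Elements of B = (A - q*I) / p
    b11, b22, b33 = (a11 - q) / p, (a22 - q) / p, (a33 - q) / p
    b12, b13, b23 = a12 / p, a13 / p, a23 / p
    det_b = (b11 * (b22 * b33 - b23 * b23)
             - b12 * (b12 * b33 - b23 * b13)
             + b13 * (b12 * b23 - b22 * b13))
    r = max(-1.0, min(1.0, det_b / 2.0))  # clamp against rounding error
    phi = math.acos(r) / 3.0
    e1 = q + 2.0 * p * math.cos(phi)
    e3 = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    e2 = 3.0 * q - e1 - e3  # the three eigenvalues sum to the trace
    return [e1, e2, e3]

R = [[1.0, 0.9, -0.5],
     [0.9, 1.0, 0.9],
     [-0.5, 0.9, 1.0]]
eig = eigenvalues_sym3(R)
print([round(e, 3) for e in eig])  # [2.047, 1.5, -0.547]
print(min(eig) > 0)                # False: the matrix is not positive definite
```

The eigenvalues sum to 3, the trace of the matrix, and their product equals its determinant, which is negative here (about -1.68).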

As a consequence, the SPSS procedure FACTOR will stop with the warning: Matrix not positive definite.

For a more detailed discussion see:

Wothke, W. (1993). Nonpositive definite matrices in structural modeling. In K.A. Bollen & J.S. Long (Eds.), Testing structural equation models (pp. 256-293). Newbury Park, CA: Sage.

Dirk

------------------------------

At Thu, 3 Feb 2005 08:49:48 -0200 Marcos Sanches <marcos.sanches@IPSOS.COM.BR> wrote:

> I am performing a factor analysis with 60 attributes. Each one was
> rated on an importance scale - 0 for Not Important to 10 for Very
> Important. As there are so many attributes each respondent rated only a
> random subset of 39 attributes. That means I have 60 - 39 = 21 missing
> values for each case. I want to run a factor analysis with this data.
>
> If I select the listwise method of missing values deletion, of
> course I end up with no case left.
>
> If I select the pairwise method of missing values deletion, I
> get an error message - "The matrix is not positive definite. This may be
> due to pairwise deletion of missing values.".
>
> If I select the "replace with means" method of missing values
> substitution, then it works well.
>
> My questions are:
>
> 1) Why does the pairwise method not work? Even when I select a
> subsample it does not work. I know what a 'not positive definite
> matrix' is, but why does this happen? Is there a way to handle this?
>
> Before the interviews had been done, I made a simulation using another
> study. I randomly deleted some values for every case so that every case
> had some missing values, then I ran a factor analysis with pairwise
> deletion. It worked pretty well, in fact the final factors were almost
> the same as the ones I got with the complete data. So I didn't expect this
> problem could happen.

*************************************************
Dr. Dirk Enzmann
Institute of Criminal Sciences
Dept. of Criminology
Schlueterstr. 28
D-20146 Hamburg
Germany

phone: +49-040-42838.7498 (office)
       +49-040-42838.4591 (Billon)
fax:   +49-040-42838.2344
email: dirk.enzmann@jura.uni-hamburg.de
www:   http://www2.jura.uni-hamburg.de/instkrim/kriminologie/Mitarbeiter/Enzmann/Enzmann.html
*************************************************
