LISTSERV at the University of Georgia
Date:         Wed, 13 Jan 2010 20:47:21 -0300
Reply-To:     Hector Maletta <hmaletta@fibertel.com.ar>
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         Hector Maletta <hmaletta@fibertel.com.ar>
Subject:      Re: Factor Analysis on dichotomous variables
Comments: To: "Angelina S. MacKewn" <amackewn@utm.edu>
In-Reply-To:  <6F692813E2221C42B90DA62B4021FBD007DF7FB6@EXVS1.utm.edu>
Content-Type: text/plain; charset="us-ascii"

Angelina,

The number of factors (or components) worth retaining depends largely on the degree of linear correlation or association among the observed variables, whether dichotomous or not. If all the variables are highly correlated with one another, one (or perhaps two) factors will probably explain most of the total or common variance, regardless of the type of variable involved.

Besides, there is no single unequivocal criterion for deciding how many factors to retain, and much depends on the purpose of the analysis. Sometimes you are after one factor only (which should explain a large fraction of total variance); sometimes you look for several underlying dimensions, either orthogonal to each other or correlated (the latter case is obtained through oblique rotation). The common criteria of retaining only factors with an eigenvalue above 1, or of using the scree plot to identify the cutoff factor, are only rules of thumb and are not always useful. One must also understand that factors are mathematical constructs, not real objects, and therefore one can heuristically select the most useful variant.

I am of course speaking of exploratory factor analysis. What is called confirmatory factor analysis is more properly treated as a structural equation model with latent variables. However, in my humble opinion, these "confirmatory" analyses cannot "confirm" that the model is right, nor "prove" causal links between variables. Factor analysis simply replaces the observed variables with a (possibly smaller) number of underlying scales, all of which are linear functions of the observed variables.
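The eigenvalue-above-1 rule of thumb mentioned above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are simulated (six hypothetical yes/no items driven by two hypothetical latent traits), NumPy is used instead of SPSS, and the eigenvalues are taken from the ordinary Pearson correlation matrix, as a classical PCA would do.

```python
import numpy as np

rng = np.random.default_rng(42)  # fixed seed so the simulation is reproducible

# Simulated data (an assumption, not real survey data): 500 cases, 6 yes/no items.
n_cases, n_items = 500, 6
latent = rng.standard_normal((n_cases, 2))           # two hypothetical latent traits
loadings = np.array([[1, 0], [1, 0], [1, 0],
                     [0, 1], [0, 1], [0, 1]], dtype=float)
noise = rng.standard_normal((n_cases, n_items))
items = ((latent @ loadings.T + noise) > 0).astype(int)  # dichotomize to 0/1

# Pearson correlations between the 0/1 items, then their eigenvalues.
corr = np.corrcoef(items, rowvar=False)
eigenvalues = np.linalg.eigvalsh(corr)[::-1]          # sorted largest first

# Rule of thumb: keep components whose eigenvalue exceeds 1.
retained = int((eigenvalues > 1.0).sum())
explained = eigenvalues / eigenvalues.sum()           # share of total variance

print("eigenvalues:", np.round(eigenvalues, 3))
print("share of variance:", np.round(explained, 3))
print("components with eigenvalue > 1:", retained)
```

The eigenvalues of a correlation matrix always sum to the number of variables, which is why 1 (the variance of a single standardized variable) serves as the cutoff. As the message says, this is only a rule of thumb; the printed list can also be read as a scree sequence.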

Hector

-----Original Message-----
From: Angelina S. MacKewn [mailto:amackewn@utm.edu]
Sent: 13 January 2010 20:32
To: Hector Maletta
Subject: RE: Factor Analysis on dichotomous variables

Hector,

I have read the argument that dichotomous variables in a PCA produce too many components. Do you think this is something that we would get nailed on when we go to publish this?

Thanks for an answer I could understand. I am not a statistician, just a researcher trying to write a paper.

Cheers, Angie

-----Original Message-----
From: Hector Maletta [mailto:hmaletta@fibertel.com.ar]
Sent: Wed 1/13/2010 5:29 PM
To: Angelina S. MacKewn; SPSSX-L@LISTSERV.UGA.EDU
Subject: RE: Factor Analysis on dichotomous variables

Any factor analysis can be run on dichotomous variables, because these variables can legitimately be considered interval measures. As only one interval is involved (from 0 to 1), there is no question of comparing unequal intervals. Their mean is the proportion (p) of cases with the value 1, and their variance is p(1-p).

There is a specific SPSS procedure, CATPCA, for principal component analysis of categorical variables (ordinal or nominal, with any number of categories). However, for dichotomous variables CATPCA gives the same solution as classical Principal Components Analysis of interval variables (PCA is one of the variants of factor analysis).

Purists insist that dichotomous variables cannot be used in anything related to regression, because when such a variable is the one being predicted its residuals are not normally distributed. To see this, note that the predicted value for a dichotomous variable is either a value between 0 and 1 or a value outside that interval. In the first case, the actual values will be either 1 or 0, so the residuals pile up at the two ends of the 0,1 interval and not around the predicted value. In the second case, the residuals all lie on one side of the predicted value. In either case, their distribution is not normal.
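The two numerical claims above (mean equal to p, variance equal to p(1-p), and residuals that cannot be normally distributed) can be checked with a small sketch. The data are simulated and the two predicted values (0.3 and 1.2) are arbitrary illustrations, not anything taken from Angelina's study.

```python
import random
import statistics

random.seed(0)  # fixed seed so the simulated answers are reproducible

# Hypothetical yes/no item: 500 simulated cases, roughly 30% answering "yes" (1).
answers = [1 if random.random() < 0.3 else 0 for _ in range(500)]

p = sum(answers) / len(answers)            # proportion of cases with value 1
mean_value = statistics.fmean(answers)     # equals p
variance = statistics.pvariance(answers)   # population variance, equals p * (1 - p)

print(f"p = {p:.3f}, mean = {mean_value:.3f}")
print(f"variance = {variance:.4f}, p(1-p) = {p * (1 - p):.4f}")


def distinct_residuals(values, predicted):
    """Return the sorted distinct residuals (actual value minus predicted value)."""
    return sorted({v - predicted for v in values})


# Predicted value inside the 0-1 interval: only two residual values occur,
# one negative (actual 0) and one positive (actual 1).
residuals_inside = distinct_residuals(answers, 0.3)

# Predicted value outside the interval: every residual has the same sign.
residuals_outside = distinct_residuals(answers, 1.2)

print("residuals for prediction 0.3:", residuals_inside)
print("residuals for prediction 1.2:", residuals_outside)
```

With a single predicted value the residuals take only two distinct values, so they cannot follow a bell-shaped curve around zero, whichever of the two cases applies.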

However, dummy variables (i.e. variables with value 0 or 1) are routinely used in regression. Factor analysis is a variant of linear regression (or, more widely, a variant of the Generalized Linear Model) and therefore this habitual use applies also to it.

-----Original Message-----
From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On Behalf Of Angelina S. MacKewn
Sent: 13 January 2010 19:41
To: SPSSX-L@LISTSERV.UGA.EDU
Subject: Factor Analysis on dichotomous variables

What is the factor analysis (PCA) equivalent that can be run on dichotomous variables? I have 50 exhibited behaviours (yes/no) that I want to factor together. I have a sample size of about 500. I would be using SPSS and could use syntax if it is available.

Thanks, Angie

===================== To manage your subscription to SPSSX-L, send a message to LISTSERV@LISTSERV.UGA.EDU (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
