|Date: ||Thu, 20 Apr 2006 16:46:37 -0400|
|Reply-To: ||"Thompson, Carol" <CThompson@anteon.com>|
|Sender: ||"SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>|
|From: ||"Thompson, Carol" <CThompson@anteon.com>|
|Subject: ||FW: PCA on Categorical and continous data|
|Content-Type: ||text/plain; charset="us-ascii"|
A reference I used a couple of years ago (Principal Component Analysis
by I. T. Jolliffe, Springer-Verlag 2000 (newer edition in 2003))
indicates the following about PCA that may or may not be useful in your
situation. Please note the distinction in how the results are used:
descriptive versus inferential. We used PCA descriptively, as part of a
sorting-out process among a very large number of variables.
"Section 11.1 Principal Component Analysis for Discrete Data

When PCA is used as a descriptive technique, there is no reason for the variables in
the analysis to be of any particular type. At one extreme, x may have a
multivariate normal distribution, in which case all the relevant
inferential results mentioned in Section 3.7 may be used. At the
opposite extreme, the variables could be a mixture of continuous,
ordinal, or even binary (0-1) variables. It is true that variances,
covariances and correlations have especial relevance for multivariate
normal x, and that linear functions of binary variables are less readily
interpretable than linear functions of continuous variables. However,
the basic objective of PCA, to summarize most of the 'variation' which
is present in the original set of p variables, using a smaller number of
derived variables, can be achieved regardless of the nature of the

This section is one of several addressing the use of PCA for special
types of data, including non-independent, time-series, and compositional
data, and data with missing values ....
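
As a rough illustration of the descriptive use described in the quoted passage, here is a minimal Python sketch (standard library only) that extracts the first principal component from the correlation matrix of one continuous, one ordinal, and one binary variable. The data values and the helper names (`standardize`, `correlation_matrix`, `first_component`) are made up for this example and do not come from Jolliffe's book or from SPSS. Power iteration is used only to keep the sketch dependency-free; a statistics package would normally compute the full eigendecomposition.

```python
import math

# Made-up illustrative data: one continuous, one ordinal (1-5), one binary (0-1).
continuous = [2.1, 3.4, 1.9, 5.6, 4.8, 3.0, 6.2, 2.7]
ordinal = [1, 3, 2, 5, 4, 2, 5, 2]
binary = [0, 1, 0, 1, 1, 0, 1, 0]


def standardize(columns):
    """Centre each column and scale it to unit (population) variance."""
    out = []
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
        out.append([(v - mean) / sd for v in col])
    return out


def correlation_matrix(columns):
    """Pearson correlations between every pair of columns."""
    z = standardize(columns)
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / n for zj in z] for zi in z]


def first_component(matrix, iterations=500):
    """Leading eigenvalue and unit eigenvector by power iteration."""
    p = len(matrix)
    vec = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        nxt = [sum(row[j] * vec[j] for j in range(p)) for row in matrix]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    # Rayleigh quotient v' M v gives the eigenvalue for the unit vector v.
    eigenvalue = sum(
        vec[i] * sum(matrix[i][j] * vec[j] for j in range(p)) for i in range(p)
    )
    return eigenvalue, vec


corr = correlation_matrix([continuous, ordinal, binary])
eigenvalue, loadings = first_component(corr)

# For a correlation matrix the total variance equals the number of variables.
share = eigenvalue / len(corr)

print("loadings:", [round(v, 3) for v in loadings])
print("share of total variance:", round(share, 3))
```

The sketch only summarizes variation. It makes no distributional assumptions and supports no inferential statements, which is the descriptive-versus-inferential distinction noted above.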
There was also a short discussion thread on 11/3/04 with the subject:
"Normal distribution and factor Analysis & Dichotomous Items" where I
brought up this reference and Hector Maletta discussed the issues
relating to an inferential situation. Perhaps these will also help in
Carol B. Thompson
181 N. Arroyo Grande Blvd, Ste 105
Henderson, NV 89074-1624
Ph: (702) 731-5550 x 111
Fax: (702) 731-4027
From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On Behalf Of
Sent: Wednesday, April 19, 2006 4:11 PM
Subject: PCA on Categorical and continous data
Recently, my colleague sat her PhD viva.
Her examiner criticised her use of principal component analysis on
categorical and continuous data taken together.
Can we employ PCA on these datasets? If yes then what are the advantages
and disadvantages of using PCA on categorical and continuous data?
Is there any article or book citing such analyses?
Thanks in advance