Date: Thu, 20 Apr 2006 16:46:37 -0400
From: "Thompson, Carol"
Sender: "SPSSX(r) Discussion"
Subject: FW: PCA on Categorical and continous data

A reference I used a couple of years ago (Principal Component Analysis by I. T. Jolliffe, Springer-Verlag 2000 (newer edition in 2003)) indicates the following about PCA that may or may not be useful in your situation. Please note the distinction in how the results are used: descriptive versus inferential. Our use was descriptive, as part of a sorting-out process among a very large number of variables.

"Section 11.1 Principal Component Analysis for Discrete Data

When PCA is used as a descriptive technique, there is no reason for the variables in the analysis to be of any particular type. At one extreme, x may have a multivariate normal distribution, in which case all the relevant inferential results mentioned in Section 3.7 may be used. At the opposite extreme, the variables could be a mixture of continuous, ordinal, or even binary (0-1) variables. It is true that variances, covariances and correlations have especial relevance for multivariate normal x, and that linear functions of binary variables are less readily interpretable than linear functions of continuous variables. However, the basic objective of PCA, to summarize most of the 'variation' which is present in the original set of p variables, using a smaller number of derived variables, can be achieved regardless of the nature of the original variables."

This section is one of several addressing the use of PCA for special types of data, including non-independent data, time series, compositional data, and data with missing values.
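
As a rough illustration of the descriptive use described in the quoted passage, here is a minimal Python sketch that extracts the first principal component from a mix of continuous, ordinal, and binary variables. The data are simulated, the function names (standardize, correlation_matrix, first_component) are made up for this example, and power iteration stands in for a full eigendecomposition. None of this comes from Jolliffe's book or from SPSS; it is only meant to show that the arithmetic goes through regardless of variable type.

```python
import math
import random


def standardize(columns):
    """Center each column and scale it to unit sample variance."""
    result = []
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        result.append([(v - mean) / sd for v in col])
    return result


def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equal-length columns."""
    z = standardize(columns)
    n = len(z[0])
    p = len(z)
    return [
        [sum(a * b for a, b in zip(z[i], z[j])) / (n - 1) for j in range(p)]
        for i in range(p)
    ]


def first_component(matrix, iterations=500):
    """Leading eigenvalue and unit eigenvector by power iteration."""
    p = len(matrix)
    vec = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        new = [sum(matrix[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in new))
        vec = [v / norm for v in new]
    # Rayleigh quotient gives the eigenvalue for the converged vector.
    eigenvalue = sum(
        vec[i] * sum(matrix[i][j] * vec[j] for j in range(p)) for i in range(p)
    )
    return eigenvalue, vec


# Simulated data (an assumption for illustration): one latent trait drives
# a continuous score, a 1-5 ordinal rating, and a 0/1 indicator.
random.seed(0)
latent = [random.gauss(0, 1) for _ in range(200)]
continuous = [t + random.gauss(0, 0.5) for t in latent]
ordinal = [min(5, max(1, round(3 + t + random.gauss(0, 0.5)))) for t in latent]
binary = [1 if t + random.gauss(0, 0.5) > 0 else 0 for t in latent]

corr = correlation_matrix([continuous, ordinal, binary])
eigenvalue, loadings = first_component(corr)

# For a correlation matrix the eigenvalues sum to the number of variables,
# so this ratio is the share of total variance summarized by component 1.
share = eigenvalue / len(corr)
print("Loadings on component 1:", [round(v, 3) for v in loadings])
print("Share of total variance:", round(share, 3))
```

With variables on such different scales, working from the correlation matrix rather than the covariance matrix keeps any one variable from dominating simply because of its units. The loadings on the binary variable are, as the passage says, harder to interpret than those on the continuous one.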

There was also a short discussion thread on 11/3/04 with the subject: "Normal distribution and factor Analysis & Dichotomous Items" where I brought up this reference and Hector Maletta discussed the issues relating to an inferential situation. Perhaps these will also help in your considerations.

Carol

Carol B. Thompson
Sr. Programmer/Analyst
Anteon Corporation
181 N. Arroyo Grande Blvd, Ste 105
Henderson, NV 89074-1624
Ph: (702) 731-5550 x 111
Fax: (702) 731-4027

-----Original Message-----
From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On Behalf Of Jatender mohal
Sent: Wednesday, April 19, 2006 4:11 PM
To: SPSSX-L@LISTSERV.UGA.EDU
Subject: PCA on Categorical and continous data

Hi List,

Recently, my colleague sat her PhD viva.

Her examiner criticised her use of principal component analysis on categorical and continuous data taken together.

Can we employ PCA on such datasets? If so, what are the advantages and disadvantages of using PCA on categorical and continuous data?

Is there any article or book citing such analyses?