LISTSERV at the University of Georgia
Date:         Sun, 7 Jun 2009 08:43:03 +0800
Reply-To:     Eins Bernardo <einsbernardo@yahoo.com.ph>
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         Eins Bernardo <einsbernardo@yahoo.com.ph>
Subject:      Re: detecting linear combinations/high correlations in a data set
Comments: To: Art@DrKendall.org

Hi Mark, Art, etc

What do you mean by "singularities"? Thank you.

Eins

--- On Sat, 6/6/09, Art Kendall <Art@DrKendall.org> wrote:

From: Art Kendall <Art@DrKendall.org>
Subject: Re: detecting linear combinations/high correlations in a data set
To: SPSSX-L@LISTSERV.UGA.EDU
Date: Saturday, 6 June, 2009, 12:04 PM

When I pseudorandomly generate 150 cases with 550 variables, I of course get singularities.

Please describe the nature of your data. Then we may be able to make suggestions. Are these some sort of repeated measures, e.g., items intended to be in scales, prices over time, energy at different wavelengths, etc.?

RELIABILITY can be useful for tracking down singularities. Open a new instance of SPSS. Copy the syntax below to a syntax file. Click <run>. Click <all>. Then go back to the syntax and put fewer items into the scale. Finally, try using just 150 items. You will see that the SMC (squared multiple correlation) column now has entries, but they are all 1.000. You can edit the RELIABILITY syntax to produce the whole correlation matrix, but in this instance that would be futile.

new file.
input program.
vector x (550,f3).
loop id = 1 to 150.
  loop #p = 1 to 550.
    compute x(#p) = rnd(rv.normal(50,10)).
  end loop.
  end case.
end loop.
end file.
end input program.
reliability variables= x1 to x550
  /scale (bigbunch) = x1 to x550
  /SUMMARY =all.

Art Kendall
Social Research Consultants

M wrote:


Hi - I've got a large dataset (over 500 variables, 150K rows) and would like to detect

a) variables that are highly correlated with one another
b) linear combinations of variables likely to cause conditioning problems or correlation matrices that fail to be positive definite.

Whether I'm sampling or not, the CORRELATIONS procedure won't take more than 100 variables, and it wouldn't help with b) anyway, so I'm working with FACTOR and /EXTRACTION PC.

Question:
---------

Before chiseling the wheel, does someone have the code handy to produce the linear combination coefficients of the input variables leading to singularities? Thanks.

Marc.
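[Editor's sketch, not part of the original thread.] As an illustration of what "linear combination coefficients leading to singularities" could look like, here is a minimal Python sketch that finds exact linear relations among the columns of a small data matrix by row reduction. It is not SPSS code and not anything the posters proposed. The function name linear_dependencies, the "(const)" label, and the toy data are all hypothetical. It uses exact rational arithmetic, so it only finds perfect dependencies and would not scale to 150K rows as written; near-collinearity would need a tolerance-based approach instead.

```python
from fractions import Fraction


def linear_dependencies(rows, names):
    """Return exact linear relations among the columns of `rows`.

    Each relation is a dict {label: coefficient}; the weighted sum of the
    labelled columns is zero for every case. A constant column is added
    (an assumption of this sketch) so that relations with an intercept,
    such as x3 = x1 + x2 + 5, are also detected.
    """
    labels = ["(const)"] + list(names)
    m = [[Fraction(1)] + [Fraction(v) for v in row] for row in rows]
    n_cols = len(labels)

    # Reduce to reduced row echelon form using exact fractions.
    pivots = []  # pivots[i] is the pivot column of row i
    r = 0
    for c in range(n_cols):
        pr = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pr is None:
            continue  # no pivot here: this column depends on earlier ones
        m[r], m[pr] = m[pr], m[r]
        pv = m[r][c]
        m[r] = [v / pv for v in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break  # rank equals the number of cases; the rest are free

    # Every non-pivot column yields one null-space vector (one relation).
    relations = []
    for free in range(n_cols):
        if free in pivots:
            continue
        coef = {labels[free]: Fraction(1)}
        for row_idx, pc in enumerate(pivots):
            if m[row_idx][free] != 0:
                coef[labels[pc]] = -m[row_idx][free]
        relations.append(coef)
    return relations


# Toy data: x3 was constructed as x1 + x2 + 5 for every case.
rows = [[1, 2, 8], [2, 1, 8], [3, 5, 13], [4, 4, 13], [0, 7, 12]]
rels = linear_dependencies(rows, ["x1", "x2", "x3"])
for rel in rels:
    print({k: str(v) for k, v in rel.items()})

# More variables than cases: relations are guaranteed to exist.
wide_rows = [[1, 2, 3, 4], [2, 3, 5, 9]]
wide_rels = linear_dependencies(wide_rows, ["x1", "x2", "x3", "x4"])
print(len(wide_rels), "relations among 4 variables with only 2 cases")
```

The second call mirrors Art's point on a tiny scale: when there are fewer cases than variables, the rank of the data cannot exceed the number of cases, so some variables are necessarily exact linear combinations of the others.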


