LISTSERV at the University of Georgia
Date:         Mon, 24 Jan 2005 11:15:52 -0800
Reply-To:     cassell.david@EPAMAIL.EPA.GOV
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "David L. Cassell" <cassell.david@EPAMAIL.EPA.GOV>
Subject:      Re: Proc CORR running out of memory
In-Reply-To:  <FE56865C459CD6468D6465A15B908FC0052272CC@exchny54.ny.ssmb.com>
Content-type: text/plain; charset=US-ASCII

"Chaudhury, Jayati [IT]" <jayati.chaudhury@CITIGROUP.COM> wrote:
> I am getting an out of memory error when I try to run proc corr on a dataset
> with 784 observations and 6122 variables. The code is provided below.
>
> Is there a limitation on the size that Proc CORR can handle? I am
> using SAS 9.1 on Linux.
>
> This is urgent. So a prompt response will be greatly appreciated.

[1] This is a mailing list/newsgroup of SAS users who have jobs and deadlines themselves. I'm sure you didn't mean to sound so rude as to demand a 'prompt response' when we all have jobs just as time-oriented as you do.

[2] The PROC CORR documentation has specifics on the memory usage given the number of variables, etc. It would have been much faster for you to check that yourself. If you don't have a copy locally, you can read the docs at sas.com.

[3] 6122 variables is ridiculous. How can you possibly work with a 6122x6122 matrix of correlations? That's 18,736,381 unique correlations, counting only the upper triangle of the matrix and ignoring the main diagonal. If you want to pick out 'large' correlations, you'll get so many spuriously significant results that you won't be able to separate the valuable from the random. With that many correlations to wade through, under the null hypothesis you would *expect* to find over 900,000 false positives.
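
The arithmetic in [3] can be checked with a short sketch. It is written in Python rather than SAS, purely for illustration. The function names are hypothetical, and the 0.05 significance level is an assumption: the post does not state a level, but 0.05 is consistent with the "over 900,000" figure. The sketch also assumes every pairwise null hypothesis is true, as the post does.

```python
from math import comb


def unique_pairs(n_vars: int) -> int:
    """Number of distinct off-diagonal correlations among n_vars variables,
    i.e. the entries strictly above the main diagonal."""
    return comb(n_vars, 2)


def expected_false_positives(n_tests: int, alpha: float = 0.05) -> float:
    """Expected number of tests that come out 'significant' by chance alone
    when every null hypothesis is true. alpha = 0.05 is an assumed level."""
    return n_tests * alpha


n_pairs = unique_pairs(6122)
print(n_pairs)                                   # 18736381
print(round(expected_false_positives(n_pairs)))  # 936819
```

At a stricter assumed level such as 0.01, the same calculation still gives roughly 187,000 expected false positives, so the problem does not go away by tightening the threshold alone.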

[4] With 6122 variables, you cannot manage when you have only 784 records. What do you expect to achieve with this sort of approach? Perhaps you should re-think your business process here.

HTH,
David
--
David Cassell, CSC
Cassell.David@epa.gov
Senior computing specialist
mathematical statistician

