LISTSERV at the University of Georgia
Date:         Wed, 8 Sep 2010 14:50:22 -0700
Reply-To:     Justin Carroll <jrc.csus@GMAIL.COM>
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         Justin Carroll <jrc.csus@GMAIL.COM>
Subject:      Re: Large Data Files
Comments: cc: "Pirritano, Matthew" <MPirritano@ochca.com>
In-Reply-To:  <97D6F0A82A6E894DAF44B9F575305CC90F05790C@HCAMAIL03.ochca.com>

I haven't looked into it too much, but I would imagine that a RAID-0 setup with higher-RPM hard drives (faster read/write) or short-stroked drives (faster seek times, I believe) would also increase 'speed'.

Question: does anyone know whether "Set Workspace" improves performance? (I know the help files say to use it only when SPSS reports that it is out of memory, and that it only works for 'certain procedures'.)

*Also, I've read somewhere that the SPSS single-client version can only use a single core of a multi-core processor, meaning your home/work computer is probably running at a fraction of the processing speed it is capable of. For instance, my home DIY computer has 6 cores; SPSS can only use one of them, and the other 5 are left to other applications. The SPSS Server edition, by contrast, can use all cores of a processor, dramatically increasing processing speed.* I am not sure about this, and was unable to find any "Google-derived" evidence to support the claim, but I am 100% positive I read it just a few months back. Can anyone confirm this?

My files are not as large as the ones you guys are using (measured in GB), but they run to hundreds of thousands of cases and sometimes 1000+ variables (400-600 MB in file size). I use both SPSS 15 and 17, in both client and server editions, and I know that a procedure like Crosstabs takes about 5-10 minutes to run on the single-client version, whereas it takes about 30 seconds on the server edition.

////// *Some quick references:*

Forum post on RAIDs:
http://forums.hexus.net/hexus-hardware/130603-how-much-speed-difference-there-raid-0-a.html

Article by SPSS on hardware recommendations (dated 2008):
http://www.spss.com/media/collateral/SSSWP-0608.pdf

Discussion a few months back on this Listserv:
http://spssx-discussion.1045642.n5.nabble.com/Quad-Core-Processors-td1092004.html

//////

HTH,

J. R. Carroll
Grad. Student in Pre-Doc Psychology at CSUS
Research Assistant for Just About Everyone.
Email: jrc.csus@gmail.com -or- jrcarroll@jrcresearch.net
Phone: (916) 628-4204

On Wed, Sep 8, 2010 at 2:14 PM, Pirritano, Matthew <MPirritano@ochca.com> wrote:

> I work with large files > 3 GB, > 4 million lines.
>
> 1. For big data processing jobs use python without the spss front end. Much faster.
> 2. Start with the main file. Eliminate all unnecessary variables and cases for each analysis. Or if possible use aggregate to pare down the size of the file. The first step or two will take some time, but then the file gets smaller and things speed up.
> 3. I’ve not tried this one but have read on the list that 64 bit processor with multiple cpu’s and max ram majorly speeds things up.
>
> Thanks
>
> Matt
>
> Matthew Pirritano, Ph.D.
> Research Analyst IV
> Medical Services Initiative (MSI)
> Orange County Health Care Agency
> (714) 568-5648
> ------------------------------
> *From:* SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] *On Behalf Of* Marcos Sanches
> *Sent:* Wednesday, September 08, 2010 2:04 PM
> *To:* SPSSX-L@LISTSERV.UGA.EDU
> *Subject:* Large Data Files
>
> Hi all,
>
> I wonder if anybody has any suggestion for working with large data file in SPSS. My data has around 10 millions observation and 30 variables and everything I do takes a looooooong time...
>
> Thanks a lot!
>
> Marcos
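
For what it's worth, the "drop what you don't need, then aggregate" advice quoted above can be sketched in plain Python. This is only a rough illustration under some assumptions: the data has already been exported from SPSS to CSV, and the function name `aggregate_stream` and the column names (`id`, `region`, `cost`, `notes`) are made up for the example. It is not the SPSS AGGREGATE command, and it is not necessarily how anyone on the list actually does it.

```python
import csv
import io
from collections import defaultdict


def aggregate_stream(lines, keep, group_by, value):
    """Read CSV rows one at a time, keep only the needed columns,
    and return {group: (case count, sum of value)}.

    Rows are never all held in memory at once, so the input can be
    far larger than RAM. All names here are hypothetical.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(lines):
        # Step 1: discard every column we do not need for this analysis.
        slim = {name: row[name] for name in keep}
        # Step 2: fold the row into a per-group running count and sum.
        key = slim[group_by]
        totals[key] += float(slim[value])
        counts[key] += 1
    return {key: (counts[key], totals[key]) for key in totals}


# Tiny in-memory stand-in for a large exported file (made-up data).
sample = io.StringIO(
    "id,region,cost,notes\n"
    "1,north,10.0,a\n"
    "2,south,5.5,b\n"
    "3,north,2.5,c\n"
)

result = aggregate_stream(
    sample, keep=("region", "cost"), group_by="region", value="cost"
)
print(result)
```

For a real file you would pass an open file object (e.g. `open(path, newline="")`) instead of the `io.StringIO` sample; the pared-down result is then small enough to analyze quickly in whatever tool you prefer.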



