LISTSERV at the University of Georgia
Date:         Sat, 15 Jan 2005 19:53:00 -0500
Reply-To:     Raynald Levesque <rlevesque@videotron.ca>
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         Raynald Levesque <rlevesque@videotron.ca>
Subject:      Re: spss performance issues (workspace, cache, execute)
In-Reply-To:  <200501152333.j0FNXfVM024299@listserv.cc.uga.edu>
Content-type: text/plain; charset=iso-8859-1

Hi Alexander,

I ran the following code using v12.0.2 on a 1.5 GHz PC with 512 MB of RAM. Time required: 5 minutes.

I also ran it using v13 on a 2.6 GHz PC with 1.5 GB of RAM. Time required: 2.7 minutes.

*///////////.
DEFINE !test(!POS=!TOKENS(1))
TITLE nb vars = !1 .
NEW FILE.
INPUT PROGRAM.
VECTOR v(!1).
LOOP cnt=1 TO 40.
- LOOP idx=1 TO !1.
- COMPUTE v(idx)=UNIFORM(1000).
- END LOOP.
- END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.

DO REPEAT var=v1 TO !CONCAT('v',!1) /log=vlog1 TO !CONCAT('vlog',!1).
- COMPUTE log=LG10(var).
END REPEAT.
EXECUTE.
!ENDDEFINE.
*///////////.

SET MPRINT=YES.
!test 5000.

I have the following comments on the problem:

***** 1. The following extract is from the spssbase.pdf section on the SET WORKSPACE command:

"WORKSPACE is used to allocate more memory for some procedures when you receive a message that the available memory has been used up or that only a given number of variables can be processed.
...
WORKSPACE allocates workspace memory in kilobytes for some procedures that allocate only one block of memory, such as Crosstabs or Frequencies. The default is 4096.
• Do not increase either the workspace memory allocation unless the program issues a message that there is not enough memory to complete a procedure."

So you should keep WORKSPACE at its default value of 4096 UNLESS you receive a message telling you to do otherwise.
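As a small sketch of the advice above (assuming the default of 4096 quoted from the manual applies to your version), you can check the current allocation and set it back if it was raised earlier:

```spss
* Display the current workspace allocation, in kilobytes.
SHOW WORKSPACE.
* Return to the default quoted above if it was increased earlier in the session.
SET WORKSPACE=4096.
```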

***** 2. The default value of CACHE is 20, which is probably already too high for most situations. I would expect a higher CACHE value to hinder performance rather than improve it. You can run the command SHOW CACHE. at any time to display the number of caches (that is, the number of files) currently used by SPSS.

Suppose you run the following code with the default value for CACHE (=20):

GET FILE='c:\program files\spss\employee data.sav'.
SHOW CACHE.
COMPUTE x1=1.
EXECUTE.
COMPUTE x2=2.
EXECUTE.
COMPUTE x3=3.
EXECUTE.
SHOW CACHE.

You will see that there is one cache at the beginning and 4 when the code has finished running. A new cache is created each time a data pass is done. In this case the data passes are forced by the EXECUTE commands. Any procedure that reads the data would have the same effect.

The following code is more efficient because it has only one data pass and hence creates only one new cache. (The total number of caches is 2 once the program has finished.)

GET FILE='c:\program files\spss\employee data.sav'.
SHOW CACHE.
COMPUTE x1=1.
COMPUTE x2=2.
COMPUTE x3=3.
EXECUTE.
SHOW CACHE.

So you should try to group your transformation commands into blocks and then run the procedures.
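To illustrate the point, here is a minimal sketch in which a procedure takes the place of EXECUTE. DESCRIPTIVES is only an example; any procedure that reads the data would trigger the same single data pass:

```spss
GET FILE='c:\program files\spss\employee data.sav'.
* The transformations below are only queued, no data pass is done yet.
COMPUTE x1=1.
COMPUTE x2=2.
* The procedure reads the data and carries out the pending transformations in the same pass.
DESCRIPTIVES VARIABLES=x1 x2.
```

No separate EXECUTE is needed here because the procedure itself forces the data pass.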

***** 3. From 2. above, you can see that adding EXECUTE commands does not help: each one forces an extra data pass.

***** 4. I am pretty sure that using a DO REPEAT block is more efficient than having 5000 lines of COMPUTE vlog=LG10(varname). It is also much more compact to write and maintain.
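As a rough, untested sketch of how the COMPUTE lines from the original message could be condensed: this assumes that safe1 to safe2394 are adjacent in the file, that thrt1 to thrt2394 are adjacent as well, and that the new variables can be named lgsf1 to lgsf2394 and lgth1 to lgth2394 (names inferred from the posted syntax, not confirmed):

```spss
* Assumes safe1 to safe2394 are contiguous in the file, TO follows file order for existing variables.
DO REPEAT src=safe1 TO safe2394 /tgt=lgsf1 TO lgsf2394.
- COMPUTE tgt=LG10(src).
END REPEAT.
* Same assumption for thrt1 to thrt2394.
DO REPEAT src=thrt1 TO thrt2394 /tgt=lgth1 TO lgth2394.
- COMPUTE tgt=LG10(src).
END REPEAT.
* A single EXECUTE means a single data pass for all 4788 new variables.
EXECUTE.
```

Both lists in each DO REPEAT must contain the same number of variables, otherwise SPSS will report an error.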

Hope the above helps.

Regards

Raynald Levesque Raynald@spsstools.net
Visit my SPSS site: http://www.spsstools.net

-----Original Message-----
From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On Behalf Of Alexander J. Shackman
Sent: January 15, 2005 6:34 PM
To: SPSSX-L@LISTSERV.UGA.EDU
Subject: Re: spss performance issues (workspace, cache, execute)

PS: I'm using spss v12 (with patches)...

On Sat, 15 Jan 2005 18:27:35 -0500, Alexander J. Shackman <shackman@PSYPHW.PSYCH.WISC.EDU> wrote:

>Listmembers --
>
>I have encountered slow (hours of CPU time) performance trying to run a
>simple compute statement (see below) on a moderate sized file (4788 numeric
>vars X 40 cases; <5Mb; stored on the local harddisk) on my desktop and
>laptop PCs (>1.75GHz; >425Mb RAM). By contrast, I observe quite reasonable
>performance when I run a Student's t test (2394 condition A's vs. 2394
>condition B's) on the same data.
>
>In attempting to diagnose the problem, I've scoured Raynald Levesque's site
>and book, as well as spss's corporate site. From these sources, I've
>experimented with modifying (1) the workspace size -- increasing it to
>400Mb, (b) frequency of caches -- increasing it from the default n=20 to
>n=5000, and number of EXECUTE statements -- from 1 per 4388 COMPUTE
>statements to 1 per 100 COMPUTEs (see below). I have also confirmed that
>the syntax runs on a small subset of variables (1st 100) to test whether
>there was a simple syntax error.
>
>The syntax for modifying the spss settings looks like:
>
>CACHE.
>SET WORKSPACE=399000.
>SET CACHE 4788.
>show all.
>
>The syntax for the transformation looks like:
>
>COMPUTE lgsf1=LG10(safe1) .
>COMPUTE lgsf2=LG10(safe2) .
>COMPUTE lgsf3=LG10(safe3) .
>COMPUTE lgsf4=LG10(safe4) .
>COMPUTE lgsf5=LG10(safe5) .
>COMPUTE lgsf6=LG10(safe6) .
>COMPUTE lgsf7=LG10(safe7) .
>COMPUTE lgsf8=LG10(safe8) .
>COMPUTE lgsafe9=LG10(safe9) .
>.
>.
>.
>COMPUTE lgth2391=LG10(thrt2391) .
>COMPUTE lgth2392=LG10(thrt2392) .
>COMPUTE lgth2393=LG10(thrt2393) .
>COMPUTE lgth2394=LG10(thrt2394) .
>EXECUTE .
>
>As I said, I've experimented with the frequency of interspersing EXECUTE
>statements.
>
>If anyone has any suggestions for either improving performance or
>diagnosing the problem, I would much appreciate it. Perhaps by condensing
>the code into a more elegant form, performance would be improved??
>
>Thanks,
>Alex Shackman
>------------------------------------------------------------------
>Alexander J. Shackman
>Laboratory for Affective Neuroscience | W.M. Keck Laboratory for
>Functional Brain Imaging & Behavior
>University of Wisconsin-Madison
>1202 West Johnson Street
>Madison, Wisconsin 53706
>
>PH: +1 (608) 358-5025 (cell)
>FAX: +1 (608) 265-2875
>EMAIL: ajshackman@gmail.com
>WWW: http://psyphz.psych.wisc.edu/~shackman |
>http://brainimaging.waisman.wisc.edu/~shackman/

