LISTSERV at the University of Georgia
Date:         Fri, 20 Jun 2008 10:53:44 -0400
Reply-To:     Art@DrKendall.org
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         Art Kendall <Art@DrKendall.org>
Organization: Social Research Consultants
Subject:      Re: FAQ: Avoid using EXECUTE
Comments: To: Christian Ganser <Christian.Ganser@soziologie.uni-muenchen.de>
In-Reply-To:  <485BBE80.8050507@soziologie.uni-muenchen.de>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

There are *additional* things that save time. The FAQ that was posted is not the only thing that could be a FAQ. See Raynald's book, which is free from SPSS.

What else would save time depends on the nature of your data, the nature of your transformations, and the nature of your analysis. For example, it is a rare occasion that you will not want to refine your syntax in some way. You can save time IN THE LONG RUN by being sure that you can go back and redo any or all of your work, e.g., by never saving with the same filename, never writing over variables you transform, etc.

It is also a rare occasion that you will do the whole project with no interruptions such as meals, bathroom breaks, etc. You can save time if you try to write in such a way that you yourself, or someone else at another time, will know what you were thinking.

Art

Christian Ganser wrote:
> Well, I didn't want to question Richard Ristow's experience. What I
> thought of weren't spreadsheets but rather Stata as an example, which I
> don't know to have a similar command and where I don't eyeball the
> results, as I don't in SPSS - just seems to me that in many cases
> there are other things that would save more time than dropping some
> executes. But I might be thinking too much of my own problems with up
> to 100,000 cases, not more, mostly less.
>
> Art Kendall wrote:
>> There are some programming practices that don't do a lot of harm on
>> small exercise-type applications, but that can be harmful on many real
>> problems. Overuse of EXECUTE is one that we see quite often in posts
>> on lists.
>> It is helpful for beginners to be aware of things that people with
>> decades of experience like Richard Ristow have seen to be problematic
>> for themselves and clients.
>> Unfortunately, it is often true that people telling us where some of the
>> pitfalls are is an exercise in futility. Teachers and consultants often
>> feel like Cassandra must have felt.
>>
>> Spreadsheets are optimized for small scale problems. Aside: There are
>> reasons why spreadsheets are not accepted by ISO for accounting
>> purposes. They are great for what they do well and I use them all the
>> time to send clients things output from SPSS and small problems with
>> small amounts of data transformation.
>>
>> A major part of optimization for software designed for small problems is
>> to put all of the data in memory.
>> Also the amount of transformation done is typically limited compared
>> to what is done with a stat package. Statistical applications in
>> spreadsheets, especially those involving matrix inversion and
>> probability functions, are well known to have numerical analysis problems
>> resulting in unrealistic results.
>>
>> Packages like SPSS are optimized to handle a wide variety of file
>> sizes. Part of SPSS's efficiency is that for most procedures it keeps
>> only one case at a time in memory and can therefore use memory for
>> summary information.
>>
>> If you have only a few cases (e.g., 1000) and a very up to date machine,
>> the time saving by dropping execute commands will be small. That is one
>> reason it is a common practice to use a test data set with something
>> like 1000 cases while developing the application.
>>
>> Transformation syntax can end up being very long. For example, if one
>> has a mid-sized set of syntax of 300 lines, eliminating 50 or 60
>> executes (and therefore a read pass, a write pass, then another read
>> pass for each execute) can be very time-saving. Although it is true
>> that this is no longer in terms of days or even many hours, it can be a
>> substantial savings.
>>
>> During the development and debugging phase, it is often more useful to
>> look at intermediate results to check the logic of transformations, by
>> using additional transformation tests, doing descriptive stats, etc.,
>> than to eyeball the results.
>>
>> Art Kendall
>> Social Research Consultants
>>
>> Christian Ganser wrote:
>>> Funny discussion imho which arises from time to time. Why is there a
>>> command one shouldn't use? Why aren't transformations carried out
>>> immediately as in other software-packages? And: Is EXECUTE really a
>>> major reason for slow behaviour of SPSS? This depends on the size of
>>> the dataset, and the new interface definitely wastes much more time
>>> than some executes on some 1000 cases. So to me it seems the behaviour
>>> of SPSS should be improved in this respect, not the behaviour of the
>>> users.
>>>
>>> Richard Ristow wrote:
>>>> I haven't posted this for a long time, but several recent postings
>>>> have EXECUTE or exe. statements in example code. None of those
>>>> recently posted are needed, and it's important to know this;
>>>> unnecessary EXECUTEs can slow processing badly.
>>>>
>>>> (For a recent EXECUTE that is needed, see my "Re: Question: print
>>>> or list in if condition", Wed, 11 Jun 2008.)
>>>>
>>>>
>>>> FAQ: Avoid using EXECUTE
>>>>
>>>> An occasional reminder: there are very few occasions when EXECUTE
>>>> is needed.
>>>>
>>>> EXECUTE is not needed after a transformation, or several
>>>> transformations; the transformations are carried out when they are
>>>> needed, when the next procedure or SAVE is executed.
>>>>
>>>> It's confusing that you don't *see* transformation results in the
>>>> Data Editor, unless you run EXECUTE, or click "Run Pending
>>>> Transformations" (which is the same thing). It's often worth doing
>>>> that, just to see what you've done. But if you don't, the next
>>>> procedure or save will still get the results of the transformations.
>>>>
>>>> EXECUTE is treated very well in section "Use EXECUTE Sparingly" in
>>>> any edition of Raynald Levesque's book: Levesque, Raynald, "SPSS®
>>>> Programming and Data Management, A Guide for SPSS® and SAS® Users".
>>>> SPSS, Inc., Chicago, IL, 2005.
>>>> (Downloadable free from the SPSS, Inc., Web site.)
>>>>
>>>> And EXECUTE isn't harmless. EXECUTE makes SPSS read the whole data
>>>> file; multiple EXECUTEs can badly slow processing of big files.
>>>>
>>>> .....................
>>>> The logic of EXECUTE:
>>>>
>>>> In the transformations,
>>>>
>>>> COMPUTE C = A + B.
>>>> EXECUTE.
>>>> COMPUTE D = E/C.
>>>> EXECUTE.
>>>>
>>>> At the first EXECUTE, the file is read; the value of C is computed
>>>> for every case; and the resulting file (with all variables) is
>>>> saved, as a scratch file. At the second EXECUTE, the file is read
>>>> again; D is computed for every case, using the computed value of C;
>>>> and the file is saved again. Four passes through the data: reading
>>>> twice, writing twice. (Recent versions of SPSS do optimizations
>>>> that will save some of this.)
>>>>
>>>> If you write, instead
>>>>
>>>> COMPUTE C = A + B.
>>>> COMPUTE D = E/C.
>>>>
>>>> and then whatever procedure or SAVE is desired, the computations
>>>> are done when the file is read for the procedure or SAVE, needing
>>>> no extra data passes for the computation. In this logic, SPSS
>>>> computes the value of C for every case, then computes the value of
>>>> D for the same case, and then proceeds to the next case.
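
The pass counting described in the quoted FAQ can be sketched with a toy model. This is Python, not SPSS syntax, and it is only an illustration under stated assumptions: the class ToyDataset and its methods compute, execute, and procedure_mean are hypothetical names, each EXECUTE is modeled as one read pass plus one write pass, and a procedure is modeled as a single read pass that applies all pending transformations to a case before moving to the next case. It does not model the optimizations that recent SPSS versions are said to perform.

```python
class ToyDataset:
    """Toy model of a case-by-case data file (hypothetical, not SPSS internals)."""

    def __init__(self, cases):
        self.cases = [dict(case) for case in cases]
        self.pending = []   # transformations queued but not yet carried out
        self.reads = 0      # number of read passes over the data
        self.writes = 0     # number of write passes (scratch files)

    def compute(self, target, func):
        # Like COMPUTE: the transformation is only queued; no data pass happens.
        self.pending.append((target, func))

    def _read_pass(self):
        # One pass over the data. All pending transformations are applied
        # to a case before the pass proceeds to the next case.
        self.reads += 1
        result = []
        for case in self.cases:
            case = dict(case)
            for target, func in self.pending:
                case[target] = func(case)
            result.append(case)
        self.pending = []
        return result

    def execute(self):
        # Assumption: EXECUTE reads the file, then writes the result out.
        self.cases = self._read_pass()
        self.writes += 1

    def procedure_mean(self, var):
        # Stand-in for any procedure: a single read pass, no scratch file.
        self.cases = self._read_pass()
        return sum(case[var] for case in self.cases) / len(self.cases)


data = [{"A": 1, "B": 2, "E": 6}, {"A": 3, "B": 1, "E": 8}]

# Version 1: an EXECUTE after every COMPUTE.
with_exec = ToyDataset(data)
with_exec.compute("C", lambda c: c["A"] + c["B"])
with_exec.execute()
with_exec.compute("D", lambda c: c["E"] / c["C"])
with_exec.execute()
passes_after_executes = with_exec.reads + with_exec.writes
mean_with = with_exec.procedure_mean("D")

# Version 2: no EXECUTE; the procedure triggers the transformations.
without_exec = ToyDataset(data)
without_exec.compute("C", lambda c: c["A"] + c["B"])
without_exec.compute("D", lambda c: c["E"] / c["C"])
mean_without = without_exec.procedure_mean("D")

print(passes_after_executes, with_exec.reads, with_exec.writes)
print(without_exec.reads, without_exec.writes)
print(mean_with, mean_without)
```

In this model both versions give the same mean for D. The two EXECUTEs cost two reads and two writes before the procedure has even started, while the version without EXECUTE needs only the one read that the procedure makes anyway.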

=====================
To manage your subscription to SPSSX-L, send a message to
LISTSERV@LISTSERV.UGA.EDU (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

