| Date: | Wed, 27 Sep 2006 16:47:52 -0400 |
| Reply-To: | Richard Ristow <wrristow@mindspring.com> |
| Sender: | "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU> |
| From: | Richard Ristow <wrristow@mindspring.com> |
| Subject: | Re: quick question |
| In-Reply-To: | <5290ED5BE539294F9C1534FBC9C28E9D049A1239@EXCHANGE.ad.umass p.edu> |
| Content-Type: | text/plain; charset="iso-8859-1"; format=flowed |
OK, you knew you were waving a red flag at THIS bull (grin).
At 10:42 AM 9/27/2006, Hoover, Matthew wrote:
>I'm unsure of what procedures require an execute
>command. For example, for code like this:
>
>
>DO IF (SYSMIS(gradef03)).
>. COMPUTE gradef03=d_grf03.
>END IF.
>EXECUTE.
>
>DO IF (SYSMIS(grades04)).
>. COMPUTE grades04=d_grs04.
>END IF.
>EXECUTE.
>
>Could I re-write this [without either EXECUTE]?
Yes, you could; it would work fine, and faster.
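For concreteness, here is roughly what the rewritten code could look like. It is only a sketch: it assumes the file with these variables is already open, and the FREQUENCIES at the end is a stand-in (my assumption, not part of the original question) for whatever procedure you were going to run next.

```spss
* Both DO IF blocks now belong to one transformation program.
DO IF (SYSMIS(gradef03)).
. COMPUTE gradef03=d_grf03.
END IF.

DO IF (SYSMIS(grades04)).
. COMPUTE grades04=d_grs04.
END IF.

* Stand-in procedure (an assumption); it causes the single data pass.
FREQUENCIES VARIABLES=gradef03 grades04.
```

Nothing is computed until the procedure runs; at that point both replacements are made in the same pass through the file.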
By way of a short tutorial, and skipping Python
(it's just in the way, for this discussion), the
standard order of SPSS programs is a
'transformation program' followed by a
'procedure', then another transformation program and procedure, etc.
(Before this starts, there needs to be a file, so
there needs to be a command that defines one:
DATA LIST, GET FILE, MATCH FILES, ...)
The transformation program is made up of
transformation commands: COMPUTE, IF, RECODE, DO
IF, LOOP, ... They explicitly define the values,
or properties, of variables.
The 'procedures' are the statistical and
reporting commands: FREQUENCIES, REGRESSION, all
the others. EXECUTE is a procedure. Procedures
use the whole file, and require that the whole
file be read. Transformation programs work on a
case (or record) at a time; the program is
'handed' one case at a time, does its work, and
'hands' the result to the procedure, until the
procedure has 'seen' the whole file.
If you write
COMPUTE C = A + B.
EXECUTE.
COMPUTE D = C * E.
then the first COMPUTE statement is a
transformation program by itself. For each record
of the file, it computes the new value of C, then
passes the record to EXECUTE (which does nothing
with it). The new values of C remain in the file,
and the second COMPUTE, which is also a
transformation program, 'sees' the new values and uses them to compute D.
If you leave out the EXECUTE, and write
COMPUTE C = A + B.
COMPUTE D = C * E.
then the two COMPUTEs together are a single
transformation program. For each record, the
program first computes the new value of C; then,
using that new value, it computes the new value
of D; then, it passes that record to whatever procedure follows.
Put in EXECUTE, and SPSS runs through the file
doing the first computation, then again doing the
second. Leave it out, and it runs through once,
doing both computations.
(Think of the file as a table, with the variables
being columns, and the cases or records being
rows. If you have an EXECUTE between those
COMPUTEs, SPSS moves down the rows making the
first computation; then, starts back at the top
and makes the second. If you have no EXECUTE,
SPSS moves *once* down the rows, making all the
computations row by row.)
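If you want to try the one-pass version yourself, a small self-contained job along these lines should do. The data values are made up purely for illustration, and LIST is just one convenient choice of procedure.

```spss
* Made-up data, for illustration only.
DATA LIST FREE / A B E.
BEGIN DATA
1 2 3
4 5 6
END DATA.

* One transformation program: C, then D, for each case in turn.
COMPUTE C = A + B.
COMPUTE D = C * E.

* LIST is a procedure, so this is where the file is read, once.
LIST.
```

For the first case, C is 1 + 2 = 3, and D is then 3 * 3 = 9, computed in the same trip through that row.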
Now, to comment on comments:
>using the syntax below will reduce the number of
>required transformations and should give you the
>same results as the DO IF's you're using currently.
>
>IF (SYSMIS(gradef03)) gradef03=d_grf03 .
>IF (SYSMIS(grades04)) grades04=d_grs04 .
>EXE .
It will give the same results. It is fewer
transformation lines than using the DO IFs, but
which you prefer is mainly taste; you'll never
see any difference in the speed. But this "EXE."
is not needed, either; it, too, reads the whole
file to no purpose. The next procedure, or SAVE
FILE, will cause the file to be read and the computations performed.
>>Will SPSS automatically run execute commands
>>when it needs to calculate a value for a new
>>variable (therefore, you can eliminate them
>>completely in syntax except perhaps at the end)?
>
>I believe SPSS executes commands when necessary
>to generate a variable so it can be used, but not until then.
That last is it. It isn't that SPSS runs
'EXECUTE' commands; it's that it doesn't need
them. SPSS runs the whole transformation program
when it 'needs' the results: for a procedure, or a SAVE.
>My mental checklist for using .exe is:
>
>a.) immediate after the lag function is used to compute a var
No; almost never needed. If you use the lagged
value of a variable WHICH IS MODIFIED IN THE SAME
TRANSFORMATION PROGRAM, then EXECUTE may be
needed; see section "Use EXECUTE sparingly" in Raynald Levesque's book.
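A small experiment in the spirit of that section, as a sketch only: the data and the variable names (var1, var2, lagvar1, lagvar2) are made up for illustration. The first pair of COMPUTEs takes the lag of a variable that is modified in the same transformation program; the second pair separates the two with an EXECUTE.

```spss
* Made-up data, for illustration only.
DATA LIST FREE / var1.
BEGIN DATA
1 2 3 4 5
END DATA.
COMPUTE var2 = var1.

* No EXECUTE in between: LAG and the change to var1 share one program.
COMPUTE lagvar1 = LAG(var1).
COMPUTE var1 = var1 * 2.

* EXECUTE in between: the lag is taken before var2 is changed.
COMPUTE lagvar2 = LAG(var2).
EXECUTE.
COMPUTE var2 = var2 * 2.

LIST.
```

Compare lagvar1 and lagvar2 in the listing on your own version; if they differ, this is one of the few cases where the EXECUTE matters.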
>b.) after writing out to an .sps file and before
>inserting that file within the same syntax
Yes, but it may not be for the reason you're
thinking. If you're using PRINT or WRITE or XSAVE
in a transformation program, they won't be
executed until the transformation program is run,
which happens together with the next procedure.
You do need the do-nothing procedure "EXECUTE" to
cause the transformation program to be run and
the results made available.
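A minimal sketch of that pattern, under some assumptions: the data are made up, 'generated.sps' is a hypothetical file name (it will land in whatever your working directory is), and the generated command is just a placeholder.

```spss
* Made-up data, for illustration only.
DATA LIST FREE / x.
BEGIN DATA
1 2 3
END DATA.

* WRITE is a transformation: nothing is written until the data are read.
DO IF ($CASENUM = 1).
. WRITE OUTFILE='generated.sps' / 'FREQUENCIES VARIABLES=x.'.
END IF.

* Do-nothing procedure: runs the transformation program above.
EXECUTE.

* Only now does the generated file exist to be included.
INCLUDE FILE='generated.sps'.
```

Without the EXECUTE, the INCLUDE would be reached before the WRITE had ever run.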
>c.) before using the delete vars command
That's needed; DELETE VARS cannot run while
transformations are pending, so it needs to be at the
head of a transformation program. However, you
can almost always rearrange your logic to avoid
the EXECUTE; for example, put DELETE VARS
directly after the next procedure you were going to run anyway.
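As a sketch of that rearrangement (made-up data and variable names; FREQUENCIES stands in for whatever procedure you were going to run anyway):

```spss
* Made-up data, for illustration only.
DATA LIST FREE / a b.
BEGIN DATA
1 2
3 4
END DATA.

COMPUTE total = a + b.

* The procedure you wanted anyway reads the data here.
FREQUENCIES VARIABLES=total.

* No transformations are pending now, so no EXECUTE is needed first.
DELETE VARIABLES a b.
```

The procedure does the data pass that the EXECUTE would otherwise have done, so the EXECUTE drops out.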
Finally, do see "Use EXECUTE Sparingly" in Raynald Levesque's book:
Levesque, Raynald, "SPSS® Programming and Data
Management, 3rd Edition/A Guide for SPSS® and
SAS® Users". SPSS, Inc., Chicago, IL, 2005.
You can download it free as a PDF file, from
http://www.spss.com/spss/SPSS_programming_data_mgmt.pdf.
(The third edition includes a lot on Python
programming. For any SPSS version earlier than 14, the
second edition will be at least as good.)
.......
Cheers, and onward. My apologies for any place
I'm not clear - a tutorial may not come out
right, the first time you write it.