LISTSERV at the University of Georgia
Date:         Wed, 2 Jan 2008 14:08:10 +0100
Reply-To:     Vicent Giner Bosch <vigibos@eio.upv.es>
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         Vicent Giner Bosch <vigibos@eio.upv.es>
Organization: Universitat Politecnica de Valencia
Subject:      Re: Creating a new dataset with Python
Content-Type: text/plain; charset="us-ascii"

Hello again.

I managed to solve it following the suggestion from Jon:

> To make your logic work in SPSS 15, you need to add the new variable
> to the active dataset, and then if you want a subset of cases, select
> on an appropriate indicator variable after you close the cursor.

Well, in fact, it was more of an "inspiration": I did what I needed using plain SPSS syntax, with neither Python-SPSS nor cursors.

This is more or less what I've done:

* Using the "big" dataset as active dataset .

COMPUTE in_200606 = ... create/compute a variable that amounts to 1 if the row belongs to 2006-06, and "blank/null" if not ... .

COMPUTE in_200607 = ... idem for this other date ... .

.. idem for all the months to be considered ...

EXECUTE .

* So, now we have several variables in_YYYYMM-like in the big dataset,
* so that in_YYYYMM takes the value 1 for a given row if and only if that row
* has DATE = YYYY-MM .

* Now, we create a new dataset, as an aggregation of the previous one, by
* customer ID. For each customer, we compute the maximum of the variables
* in_YYYYMM .

DATASET DECLARE customers_months.
AGGREGATE
  /OUTFILE='customers_months'
  /PRESORTED
  /BREAK= ID
  /in_200606 = MAX(in_200606)
  /in_200607 = MAX(in_200607)
  ... the same for all the months to be considered... .
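
For readers who find it easier to follow in Python, the flag-then-aggregate logic above can be sketched in plain Python. This is only an illustration of the idea, not SPSS or Python-SPSS plug-in code: the sample rows, the month list, and the helper names flag_name and customers_by_month are all made up for the example, and None stands in for a "blank/null" value.

```python
# Plain-Python illustration of the flag-then-aggregate logic (not SPSS code).
# The rows and the month list below are made-up sample data.
rows = [
    {"ID": 1, "DATE": "2006-06"},
    {"ID": 1, "DATE": "2006-07"},
    {"ID": 2, "DATE": "2006-07"},
    {"ID": 3, "DATE": "2006-05"},
]
months = ["2006-06", "2006-07"]


def flag_name(month):
    """Build a name such as in_200606 from '2006-06' (hypothetical helper)."""
    return "in_" + month.replace("-", "")


def customers_by_month(rows, months):
    """Return {ID: {in_YYYYMM: 1 or None}}, mimicking MAX() per customer."""
    result = {}
    for row in rows:
        # Every customer gets one output record, initially all "missing".
        flags = result.setdefault(row["ID"], {flag_name(m): None for m in months})
        for m in months:
            # Step 1: per-row indicator (1 if the row is in month m, else missing).
            indicator = 1 if row["DATE"] == m else None
            # Step 2: maximum over the customer's rows, ignoring missing values.
            if indicator is not None:
                flags[flag_name(m)] = 1
    return result


customers_months = customers_by_month(rows, months)
for customer_id, flags in sorted(customers_months.items()):
    print(customer_id, flags)
```

Each customer ends up with one record whose in_YYYYMM entry is 1 if at least one of that customer's rows falls in the month, and stays missing otherwise. This is the result the COMPUTE and AGGREGATE steps are meant to produce.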

This worked for me. Now, I have to delete the auxiliary columns in the big dataset.

This way of doing things is not very smart (you spend a lot of time and disk space computing the auxiliary variables when the "big" dataset is really big), but it worked in this case.

Suggestions or comments are welcome, anyway.

Thank you.

-- Vicent

===================== To manage your subscription to SPSSX-L, send a message to LISTSERV@LISTSERV.UGA.EDU (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD
