Date: Mon, 3 Jan 2005 16:43:49 -0300
Reply-To: Hector Maletta <hmaletta@fibertel.com.ar>
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: Hector Maletta <hmaletta@fibertel.com.ar>
Subject: Re: Forward lag
In-Reply-To: <DAEJJEJKGDKPDEDHOIELEELKEEAA.rlevesque@videotron.ca>
Content-Type: text/plain; charset="us-ascii"
Dear Ray,
Thanks so much. I have received several suggestions, but yours is
undoubtedly the best. Your advice is this: to compare each case's value for
a variable with the value of that variable for the next case in the file,
copy the variable in question (say salary) for cases 2 to n into a separate
file, rename it there, and then (mis)match those n-1 cases in the new file
with cases 1 to n-1 in the original file. The nth case, of course, has no
forward lag because no cases follow it; in the resulting matched file it is
system-missing for the shifted variable, just as the first case is missing
for a backward lag.
If one needs not the next case but the second next, or in general the kth
next, one just drops the first two (in general the first k) cases and
matches cases 3 to n with cases 1 to n-2 (in general, cases k+1 to n with
cases 1 to n-k). Newcomers should note that MATCH FILES without a BY
subcommand simply pairs the ith case in one file with the ith case in the
other, whatever they are. Brilliant, Ray, as usual. It is a great blessing
to have you on this list.
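For the general case, the same steps might be sketched as follows for k=2.
This is only an adaptation of Ray's syntax quoted below, not something
tested on a large file; the file name lead2.sav and the variable name sal_k2
are made up for illustration.

```spss
* Sketch: forward lag of k=2 cases, adapted from the k=1 syntax.
* Assumed names: c:\temp\lead2.sav and sal_k2 are illustrative only.
GET FILE='c:\program files\spss\employee data.sav'.
* Flag cases k+1 to n (here cases 3 to n).
COMPUTE flag=$CASENUM>2.
FILTER BY flag.
* Save only the flagged cases, keeping salary under a new name.
SAVE OUTFILE='c:\temp\lead2.sav'
 /KEEP=salary
 /RENAME (salary=sal_k2)
 /UNSELECTED=DELETE.
FILTER OFF.
* No BY subcommand, so case i is paired with case i of lead2.sav.
MATCH FILES FILE=*
 /FILE='c:\temp\lead2.sav'
 /DROP=flag.
EXECUTE.
```

After this runs, the last two cases should be system-missing on sal_k2, and
something like COMPUTE diff=sal_k2-salary would then give the comparison
with the second next case.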
The time is ripe, I would suggest, for SPSS to include a LEAD function in
the COMPUTE command (and in similar commands such as IF), or simply to allow
a negative number of steps in LAG, which is currently impossible.
Hector
> -----Original Message-----
> From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU]
> On Behalf Of Raynald Levesque
> Sent: Monday, January 03, 2005 4:13 PM
> To: SPSSX-L@LISTSERV.UGA.EDU
> Subject: Re: Forward lag
>
>
> Hi
>
> Personally (because I often work with large files) I avoid
> the use of CREATE. I use the following alternative syntax,
> which does not depend on available memory.
>
> GET FILE='c:\program files\spss\employee data.sav'.
> * Alternative syntax to CREATE sal2=LEAD(salary,1).
> COMPUTE flag=$CASENUM>1.
> FILTER BY flag.
> SAVE OUTFILE='c:\temp\lead1.sav'
> /KEEP=salary
> /RENAME (salary=sal2)
> /UNSELECTED=DELETE.
> FILTER OFF.
> MATCH FILES FILE=*
> /FILE='c:\temp\lead1.sav'
> /DROP=flag.
> EXECUTE.
>
> Regards
>
> Raynald Levesque Raynald@spsstools.net
> Visit my SPSS site: http://www.spsstools.net
>
>
>
> -----Original Message-----
> From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU]On
> Behalf Of Hector Maletta
> Sent: January 3, 2005 12:04 PM
> To: SPSSX-L@LISTSERV.UGA.EDU
> Subject: Re: Forward lag
>
>
> I see. I hadn't thought of CREATE. But even with WORKSPACE
> at its maximum (what is its maximum, by the way?) it could
> hardly accommodate my dataset of eight-plus million cases
> with many variables, could it? The dataset is currently about
> 4 GB. Even holding the single variable affected could take a
> lot of memory. Why can't CREATE hold in memory just the
> few next cases necessary for LEAD to work, and not the entire
> dataset? It seems a very clumsy and memory-wasteful way of
> handling the problem, doesn't it? I suppose this limitation is
> an unwanted legacy that CREATE carries over from its former
> life under TRENDS.
>
> Hector
>
> > -----Original Message-----
> > From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU]
> > On Behalf Of Raynald Levesque
> > Sent: Monday, January 03, 2005 1:52 PM
> > To: SPSSX-L@LISTSERV.UGA.EDU
> > Subject: Re: Forward lag
> >
> >
> > Hi Hector
> >
> > The following syntax assigns the salary of the following case
> > to sal2:
> >
> > GET FILE='c:\program files\spss\employee data.sav'.
> > CREATE sal2=LEAD(salary,1).
> >
> > Note that CREATE needs to have all data in memory in order to
> > work, so there can be memory problems with very large files. (In
> > those cases, you need to increase SET WORKSPACE allowance)
> >
> > Regards
> >
> > Raynald Levesque Raynald@spsstools.net
> > Visit my SPSS site: http://www.spsstools.net
> >
> >
> >
> >
> > -----Original Message-----
> > From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU]On
> > Behalf Of Hector Maletta
> > Sent: January 3, 2005 11:41 AM
> > To: SPSSX-L@LISTSERV.UGA.EDU
> > Subject: Forward lag
> >
> >
> > The cross-case function LAG(varname, n) returns the value of
> > <varname> n cases earlier in the file. Is there any inverse or
> > forward LAG-equivalent function returning the value of a variable
> > n cases later? I know I could sort the file in reverse order to
> > achieve this, but sorting and sorting back millions of cases is a
> > hassle I prefer to avoid. According to the Syntax Reference, LAG is
> > the sole cross-case function available for COMPUTE, but perhaps
> > there is another way.
> >
> > Thanks for any help on this.
> >
> > Hector
> >
>