Date: Mon, 26 Dec 2011 04:25:45 -0800
Reply-To: David Marso <david.marso@gmail.com>
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: David Marso <david.marso@gmail.com>
Subject: Re: create cumulative count & percent for a repeating term table
In-Reply-To: <1324902157238-5101466.post@n5.nabble.com>
Content-Type: text/plain; charset=us-ascii
My last example was *NOT* exactly illustrative of what I was bending towards.
I was playing around with different idioms ;-)
--
This is more in line:
data list free /id id2 .
begin data
1 1 1 1 1 2 1 2 1 3 1 3
2 1 2 1 2 2 2 2 2 2 2 3
end data.
compute z=SUM(1,
(id EQ LAG(id))*(id2 EQ LAG(id2))
* LAG(z)).
David Marso wrote
>
>> COMPUTE C= SUM(1, LAG(C) *(LAG(ID) EQ ID)).
> --
> What I find interesting about this is that this particular construction
> (according to my intuitions) *SHOULD* fail with the LAG(C) because at that
> point C is not defined. SPSS must be doing something to initialize C at
> the onset of the COMPUTE .
> To unpack this a bit.
> Note
> 1. SUM(X,$SYSMIS)=X.
> 2. LAG(C) *(LAG(ID) EQ ID) = SYSMIS until ID changes and then = 0.
> ----
> So first record gets assigned 1 [ SUM(1,SYSMIS) ], second record (assuming
> same value of ID) gets SUM(1, 1*1)=2 .
> AT the point where the value of ID changes:
> SUM(1,WhoCares*0)=1.
> So, we get a very compact technique for building counters within cases
> WITHOUT using $CASENUM or LAG(I) NE ID as means of reinitializing the
> value within each strata.
> Taking this approach one step further (let's say we have multiple strata):
> data list free /id id2 .
> begin data
> 1 1 1 1 1 2 1 2 1 3 1 3
> 2 1 2 1 2 2 2 2 2 2 2 3
> end data.
> compute z=MAX(1,
> (id EQ LAG(id))*(id2 EQ LAG(id2))
> * LAG(z)+1).
> list
> ID ID2 Z
>
> 1.00 1.00 1.00
> 1.00 1.00 2.00
> 1.00 2.00 1.00
> 1.00 2.00 2.00
> 1.00 3.00 1.00
> 1.00 3.00 2.00
> 2.00 1.00 1.00
> 2.00 1.00 2.00
> 2.00 2.00 1.00
> 2.00 2.00 2.00
> 2.00 2.00 3.00
> 2.00 3.00 1.00
>
>
> Number of cases read: 12 Number of cases listed: 12
>
>
>
>
>
>
> David Wright-6 wrote
>>
>> David,
>>
>> What does the "1," in the second compute line do? Is this serving the
>> same as your earlier "DO IF $CASENUM=1" to get around the starting case
>> in the file? If so, without the DO IF, how does SPSS treat the 1 after
>> the first case processing?
>>
>>
>> On Sun, Dec 25, 2011 at 4:52 PM, David Marso wrote:
>>
>>> You are welcome!
>>> Even leaner and meaner (if perhaps slightly obscure ;-)
>>> ** Alternatively **.
>>> COMPUTE CN=SUM(prob,LAG(CN)*(LAG(ID) EQ ID)).
>>> COMPUTE C= SUM(1, LAG(C) *(LAG(ID) EQ ID)).
>>> COMPUTE ppct=cn/c.
>>> exe.
>>>
>>> David Wright-6 wrote
>>>>
>>>> Thanks David, that should do the trick. Happy Holidays...
>>>>
>>>>
>>>> On Sat, Dec 24, 2011 at 11:48 AM, David Marso wrote:
>>>>
>>>>> DATA LIST FREE / ID term prob pcntX ppctX.
>>>>> begin data
>>>>> 12 19973 0 0 0
>>>>> 12 19981 0 0 0
>>>>> 12 19982 1 1 .33
>>>>> 12 19993 0 1 .25
>>>>> 12 20001 1 2 .40
>>>>> 12 20003 1 3 .50
>>>>> 12 20011 0 3 .43
>>>>> 12 20023 1 4 .50
>>>>> 12 20031 0 4 .44
>>>>> 13 19973 0 0 0
>>>>> 13 19981 0 0 0
>>>>> 13 19982 1 1 .33
>>>>> 13 19993 0 1 .25
>>>>> 13 20001 1 2 .40
>>>>> 13 20003 1 3 .50
>>>>> 13 20011 0 3 .43
>>>>> 13 20023 1 4 .50
>>>>> 13 20031 0 4 .44
>>>>> end data.
>>>>> SPLIT FILE BY ID.
>>>>> COMPUTE C=1.
>>>>> CREATE CN CP=CSUM(C prob).
>>>>> COMPUTE ppct=cp/cn.
>>>>>
>>>>> ** Alternatively **.
>>>>> DO IF $CASENUM=1 OR LAG(ID) NE ID.
>>>>> + COMPUTE CN=prob.
>>>>> + COMPUTE c=1.
>>>>> ELSE.
>>>>> + COMPUTE CN=SUM(LAG(CN),prob).
>>>>> + COMPUTE c=lag(c)+1.
>>>>> END IF.
>>>>> COMPUTE ppct=cn/c.
>>>>> exe.
>>>>>
>>>>> David Wright-6 wrote
>>>>>>
>>>>>> I have a table that contains repeating terms (term) by student ID
>>>>>> (ID)
>>>>>> which also contains a binary (prob) of whether for a specific term
>>>>>> the
>>>>>> student has been on probation. I want to create two new columns
>>>>>> one
>>>>>> of
>>>>>> which is a cumulative count (pcnt) and the other a cumulative
>>>>>> percent
>>>>>> (ppct) of all probation activity for the student from any term
>>>>>> backwards
>>>>>> in time. I can easily create the total count & percent for all
>>>>>> time
>>>>>> of
>>>>>> the student via the aggregate command & append to each record but I
>>>>>> would like a cumulative count/percent to appear at each term. I
>>>>>> would
>>>>>> appreciate any suggestions (no python please). Listed below are
>>>>>> the
>>>>>> 3
>>>>>> original columns & the two desired columns when coding is complete.
>>>>>>
>>>>>> ID, term, prob, pcnt, ppct
>>>>>> 12 19973 0 0 0
>>>>>> 12 19981 0 0 0
>>>>>> 12 19982 1 1 .33
>>>>>> 12 19993 0 1 .25
>>>>>> 12 20001 1 2 .40
>>>>>> 12 20003 1 3 .50
>>>>>> 12 20011 0 3 .43
>>>>>> 12 20023 1 4 .50
>>>>>> 12 20031 0 4 .44
>>>>>>
>>>>>> (term is defined as calendar year digits 1-4 and semester code 5th
>>>>>> digit
>>>>>> in which 1=spring, 2=summer,3=fall)
>>>>>>
>>>>>> =====================
>>>>>> To manage your subscription to SPSSX-L, send a message to
>>>>>> LISTSERV@.UGA (not to SPSSX-L), with no body text except the
>>>>>> command. To leave the list, send the command
>>>>>> SIGNOFF SPSSX-L
>>>>>> For a list of commands to manage subscriptions, send the command
>>>>>> INFO REFCARD
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> View this message in context:
>>>>>
>>>>> http://spssx-discussion.1045642.n5.nabble.com/create-cumulative-count-percent-for-a-repeating-term-table-tp5099383p5099491.html
>>>>> Sent from the SPSSX Discussion mailing list archive at Nabble.com.
>>>>>
>>>>> =====================
>>>>> To manage your subscription to SPSSX-L, send a message to
>>>>> LISTSERV@.UGA (not to SPSSX-L), with no body text except
>>>>> the
>>>>> command. To leave the list, send the command
>>>>> SIGNOFF SPSSX-L
>>>>> For a list of commands to manage subscriptions, send the command
>>>>> INFO REFCARD
>>>>
>>>> =====================
>>>> To manage your subscription to SPSSX-L, send a message to
>>>> LISTSERV@.UGA (not to SPSSX-L), with no body text except the
>>>> command. To leave the list, send the command
>>>> SIGNOFF SPSSX-L
>>>> For a list of commands to manage subscriptions, send the command
>>>> INFO REFCARD
>>>>
>>>
>>>
>>> --
>>> View this message in context:
>>> http://spssx-discussion.1045642.n5.nabble.com/create-cumulative-count-percent-for-a-repeating-term-table-tp5099383p5100722.html
>>> Sent from the SPSSX Discussion mailing list archive at Nabble.com.
>>>
>>> =====================
>>> To manage your subscription to SPSSX-L, send a message to
>>> LISTSERV@.UGA (not to SPSSX-L), with no body text except
>>> the
>>> command. To leave the list, send the command
>>> SIGNOFF SPSSX-L
>>> For a list of commands to manage subscriptions, send the command
>>> INFO REFCARD
>>
>> =====================
>> To manage your subscription to SPSSX-L, send a message to
>> LISTSERV@.UGA (not to SPSSX-L), with no body text except the
>> command. To leave the list, send the command
>> SIGNOFF SPSSX-L
>> For a list of commands to manage subscriptions, send the command
>> INFO REFCARD
>>
>
--
View this message in context: http://spssx-discussion.1045642.n5.nabble.com/create-cumulative-count-percent-for-a-repeating-term-table-tp5099383p5101472.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.
=====================
To manage your subscription to SPSSX-L, send a message to
LISTSERV@LISTSERV.UGA.EDU (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD
|