```Date: Mon, 26 Dec 2011 04:25:45 -0800 Reply-To: David Marso Sender: "SPSSX(r) Discussion" From: David Marso Subject: Re: create cumulative count & percent for a repeating term table In-Reply-To: <1324902157238-5101466.post@n5.nabble.com> Content-Type: text/plain; charset=us-ascii My last example was *NOT* exactly illustrative of what I was bending towards. I was playing around with different idioms ;-) -- This is more in line: data list free /id id2 . begin data 1 1 1 1 1 2 1 2 1 3 1 3 2 1 2 1 2 2 2 2 2 2 2 3 end data. compute z=SUM(1, (id EQ LAG(id))*(id2 EQ LAG(id2)) * LAG(z)). David Marso wrote > >> COMPUTE C= SUM(1, LAG(C) *(LAG(ID) EQ ID)). > -- > What I find interesting about this is that this particular construction > (according to my intuitions) *SHOULD* fail with the LAG(C) because at that > point C is not defined. SPSS must be doing something to initialize C at > the onset of the COMPUTE . > To unpack this a bit. > Note > 1. SUM(X,\$SYSMIS)=X. > 2. LAG(C) *(LAG(ID) EQ ID) = SYSMIS until ID changes and then = 0. > ---- > So first record gets assigned 1 [ SUM(1,SYSMIS) ], second record (assuming > same value of ID) gets SUM(1, 1*1)=2 . > AT the point where the value of ID changes: > SUM(1,WhoCares*0)=1. > So, we get a very compact technique for building counters within cases > WITHOUT using \$CASENUM or LAG(I) NE ID as means of reinitializing the > value within each strata. > Taking this approach one step further (let's say we have multiple strata): > data list free /id id2 . > begin data > 1 1 1 1 1 2 1 2 1 3 1 3 > 2 1 2 1 2 2 2 2 2 2 2 3 > end data. > compute z=MAX(1, > (id EQ LAG(id))*(id2 EQ LAG(id2)) > * LAG(z)+1). > list > ID ID2 Z > > 1.00 1.00 1.00 > 1.00 1.00 2.00 > 1.00 2.00 1.00 > 1.00 2.00 2.00 > 1.00 3.00 1.00 > 1.00 3.00 2.00 > 2.00 1.00 1.00 > 2.00 1.00 2.00 > 2.00 2.00 1.00 > 2.00 2.00 2.00 > 2.00 2.00 3.00 > 2.00 3.00 1.00 > > > Number of cases read: 12 Number of cases listed: 12 > > > > > > > David Wright-6 wrote >> >> David, >> >> What does the "1," in the second compute line do? Is this serving the >> same as your earlier "DO IF \$CASENUM=1" to get around the starting case >> in the file? If so, without the DO IF, how does SPSS treat the 1 after >> the first case processing? >> >> >> On Sun, Dec 25, 2011 at 4:52 PM, David Marso wrote: >> >>> You are welcome! >>> Even leaner and meaner (if perhaps slightly obscure ;-) >>> ** Alternatively **. >>> COMPUTE CN=SUM(prob,LAG(CN)*(LAG(ID) EQ ID)). >>> COMPUTE C= SUM(1, LAG(C) *(LAG(ID) EQ ID)). >>> COMPUTE ppct=cn/c. >>> exe. >>> >>> David Wright-6 wrote >>>> >>>> Thanks David, that should do the trick. Happy Holidays... >>>> >>>> >>>> On Sat, Dec 24, 2011 at 11:48 AM, David Marso wrote: >>>> >>>>> DATA LIST FREE / ID term prob pcntX ppctX. >>>>> begin data >>>>> 12 19973 0 0 0 >>>>> 12 19981 0 0 0 >>>>> 12 19982 1 1 .33 >>>>> 12 19993 0 1 .25 >>>>> 12 20001 1 2 .40 >>>>> 12 20003 1 3 .50 >>>>> 12 20011 0 3 .43 >>>>> 12 20023 1 4 .50 >>>>> 12 20031 0 4 .44 >>>>> 13 19973 0 0 0 >>>>> 13 19981 0 0 0 >>>>> 13 19982 1 1 .33 >>>>> 13 19993 0 1 .25 >>>>> 13 20001 1 2 .40 >>>>> 13 20003 1 3 .50 >>>>> 13 20011 0 3 .43 >>>>> 13 20023 1 4 .50 >>>>> 13 20031 0 4 .44 >>>>> end data. >>>>> SPLIT FILE BY ID. >>>>> COMPUTE C=1. >>>>> CREATE CN CP=CSUM(C prob). >>>>> COMPUTE ppct=cp/cn. >>>>> >>>>> ** Alternatively **. >>>>> DO IF \$CASENUM=1 OR LAG(ID) NE ID. >>>>> + COMPUTE CN=prob. >>>>> + COMPUTE c=1. >>>>> ELSE. >>>>> + COMPUTE CN=SUM(LAG(CN),prob). >>>>> + COMPUTE c=lag(c)+1. >>>>> END IF. >>>>> COMPUTE ppct=cn/c. >>>>> exe. >>>>> >>>>> David Wright-6 wrote >>>>>> >>>>>> I have a table that contains repeating terms (term) by student ID >>>>>> (ID) >>>>>> which also contains a binary (prob) of whether for a specific term >>>>>> the >>>>>> student has been on probation. I want to create two new columns >>>>>> one >>>>>> of >>>>>> which is a cumulative count (pcnt) and the other a cumulative >>>>>> percent >>>>>> (ppct) of all probation activity for the student from any term >>>>>> backwards >>>>>> in time. I can easily create the total count & percent for all >>>>>> time >>>>>> of >>>>>> the student via the aggregate command & append to each record but I >>>>>> would like a cumulative count/percent to appear at each term. I >>>>>> would >>>>>> appreciate any suggestions (no python please). Listed below are >>>>>> the >>>>>> 3 >>>>>> original columns & the two desired columns when coding is complete. >>>>>> >>>>>> ID, term, prob, pcnt, ppct >>>>>> 12 19973 0 0 0 >>>>>> 12 19981 0 0 0 >>>>>> 12 19982 1 1 .33 >>>>>> 12 19993 0 1 .25 >>>>>> 12 20001 1 2 .40 >>>>>> 12 20003 1 3 .50 >>>>>> 12 20011 0 3 .43 >>>>>> 12 20023 1 4 .50 >>>>>> 12 20031 0 4 .44 >>>>>> >>>>>> (term is defined as calendar year digits 1-4 and semester code 5th >>>>>> digit >>>>>> in which 1=spring, 2=summer,3=fall) >>>>>> >>>>>> ===================== >>>>>> To manage your subscription to SPSSX-L, send a message to >>>>>> LISTSERV@.UGA (not to SPSSX-L), with no body text except the >>>>>> command. To leave the list, send the command >>>>>> SIGNOFF SPSSX-L >>>>>> For a list of commands to manage subscriptions, send the command >>>>>> INFO REFCARD >>>>>> >>>>> >>>>> >>>>> -- >>>>> View this message in context: >>>>> >>>>> http://spssx-discussion.1045642.n5.nabble.com/create-cumulative-count-percent-for-a-repeating-term-table-tp5099383p5099491.html >>>>> Sent from the SPSSX Discussion mailing list archive at Nabble.com. >>>>> >>>>> ===================== >>>>> To manage your subscription to SPSSX-L, send a message to >>>>> LISTSERV@.UGA (not to SPSSX-L), with no body text except >>>>> the >>>>> command. To leave the list, send the command >>>>> SIGNOFF SPSSX-L >>>>> For a list of commands to manage subscriptions, send the command >>>>> INFO REFCARD >>>> >>>> ===================== >>>> To manage your subscription to SPSSX-L, send a message to >>>> LISTSERV@.UGA (not to SPSSX-L), with no body text except the >>>> command. To leave the list, send the command >>>> SIGNOFF SPSSX-L >>>> For a list of commands to manage subscriptions, send the command >>>> INFO REFCARD >>>> >>> >>> >>> -- >>> View this message in context: >>> http://spssx-discussion.1045642.n5.nabble.com/create-cumulative-count-percent-for-a-repeating-term-table-tp5099383p5100722.html >>> Sent from the SPSSX Discussion mailing list archive at Nabble.com. >>> >>> ===================== >>> To manage your subscription to SPSSX-L, send a message to >>> LISTSERV@.UGA (not to SPSSX-L), with no body text except >>> the >>> command. To leave the list, send the command >>> SIGNOFF SPSSX-L >>> For a list of commands to manage subscriptions, send the command >>> INFO REFCARD >> >> ===================== >> To manage your subscription to SPSSX-L, send a message to >> LISTSERV@.UGA (not to SPSSX-L), with no body text except the >> command. To leave the list, send the command >> SIGNOFF SPSSX-L >> For a list of commands to manage subscriptions, send the command >> INFO REFCARD >> > -- View this message in context: http://spssx-discussion.1045642.n5.nabble.com/create-cumulative-count-percent-for-a-repeating-term-table-tp5099383p5101472.html Sent from the SPSSX Discussion mailing list archive at Nabble.com. ===================== To manage your subscription to SPSSX-L, send a message to LISTSERV@LISTSERV.UGA.EDU (not to SPSSX-L), with no body text except the command. To leave the list, send the command SIGNOFF SPSSX-L For a list of commands to manage subscriptions, send the command INFO REFCARD ```

Back to: Top of message | Previous page | Main SPSSX-L page