LISTSERV at the University of Georgia
Date:         Mon, 26 Dec 2011 04:25:45 -0800
Reply-To:     David Marso <david.marso@gmail.com>
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         David Marso <david.marso@gmail.com>
Subject:      Re: create cumulative count & percent for a repeating term table
In-Reply-To:  <1324902157238-5101466.post@n5.nabble.com>
Content-Type: text/plain; charset=us-ascii

My last example was *NOT* exactly illustrative of what I was bending towards.
I was playing around with different idioms ;-)
--
This is more in line:
data list free /id id2 .
begin data
1 1 1 1 1 2 1 2 1 3 1 3
2 1 2 1 2 2 2 2 2 2 2 3
end data.
compute z=SUM(1, (id EQ LAG(id))*(id2 EQ LAG(id2)) * LAG(z)).
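For anyone who wants to trace the SUM/LAG counter outside SPSS, here is a minimal Python sketch of the COMPUTE above. It is only an emulation under stated assumptions: None stands in for $SYSMIS, LAG() is taken to be missing on the first case, SUM() skips missing arguments, and 0 times a missing value is taken to be 0. The helper names (spss_mul, spss_sum, spss_eq, nested_counter) are made up for illustration and are not part of SPSS.

```python
# Emulation of SPSS missing-value arithmetic; None stands in for $SYSMIS.
# All helper names here are hypothetical, chosen only for this sketch.

def spss_mul(*xs):
    # Assumption: 0 times a missing value is 0; otherwise missing propagates.
    if any(x == 0 for x in xs if x is not None):
        return 0
    if any(x is None for x in xs):
        return None
    out = 1
    for x in xs:
        out *= x
    return out

def spss_sum(*xs):
    # SUM() adds the valid arguments; it is missing only if all are missing.
    valid = [x for x in xs if x is not None]
    return sum(valid) if valid else None

def spss_eq(a, b):
    # A comparison involving a missing value is itself missing.
    if a is None or b is None:
        return None
    return 1 if a == b else 0

def nested_counter(rows):
    """rows: list of (id, id2) pairs, already sorted by id and then id2."""
    z_values = []
    lag_id = lag_id2 = lag_z = None  # LAG() is missing on the first case
    for id_, id2 in rows:
        # compute z=SUM(1, (id EQ LAG(id))*(id2 EQ LAG(id2)) * LAG(z)).
        z = spss_sum(1, spss_mul(spss_eq(id_, lag_id),
                                 spss_eq(id2, lag_id2),
                                 lag_z))
        z_values.append(z)
        lag_id, lag_id2, lag_z = id_, id2, z
    return z_values

rows = [(1, 1), (1, 1), (1, 2), (1, 2), (1, 3), (1, 3),
        (2, 1), (2, 1), (2, 2), (2, 2), (2, 2), (2, 3)]
print(nested_counter(rows))
# [1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 3, 1]
```

The counter restarts at 1 whenever either key changes, because a single 0 in the product wipes out the lagged value and SUM(1, 0) is 1.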

David Marso wrote
> >> COMPUTE C= SUM(1, LAG(C) *(LAG(ID) EQ ID)).
> --
> What I find interesting about this is that this particular construction
> (according to my intuitions) *SHOULD* fail with the LAG(C) because at that
> point C is not defined. SPSS must be doing something to initialize C at
> the onset of the COMPUTE .
> To unpack this a bit.
> Note
> 1. SUM(X,$SYSMIS)=X.
> 2. LAG(C) *(LAG(ID) EQ ID) = SYSMIS on the first record, = LAG(C) while ID
> is unchanged, and = 0 when ID changes.
> ----
> So first record gets assigned 1 [ SUM(1,SYSMIS) ], second record (assuming
> same value of ID) gets SUM(1, 1*1)=2 .
> At the point where the value of ID changes:
> SUM(1,WhoCares*0)=1.
> So, we get a very compact technique for building counters within cases
> WITHOUT using $CASENUM or LAG(ID) NE ID as means of reinitializing the
> value within each stratum.
> Taking this approach one step further (let's say we have multiple strata):
> data list free /id id2 .
> begin data
> 1 1 1 1 1 2 1 2 1 3 1 3
> 2 1 2 1 2 2 2 2 2 2 2 3
> end data.
> compute z=MAX(1,
> (id EQ LAG(id))*(id2 EQ LAG(id2))
> * LAG(z)+1).
> list.
>
>   ID  ID2    Z
>
> 1.00 1.00 1.00
> 1.00 1.00 2.00
> 1.00 2.00 1.00
> 1.00 2.00 2.00
> 1.00 3.00 1.00
> 1.00 3.00 2.00
> 2.00 1.00 1.00
> 2.00 1.00 2.00
> 2.00 2.00 1.00
> 2.00 2.00 2.00
> 2.00 2.00 3.00
> 2.00 3.00 1.00
>
> Number of cases read: 12 Number of cases listed: 12
>
> David Wright-6 wrote
>>
>> David,
>>
>> What does the "1," in the second compute line do? Is this serving the
>> same as your earlier "DO IF $CASENUM=1" to get around the starting case
>> in the file? If so, without the DO IF, how does SPSS treat the 1 after
>> the first case processing?
>>
>>
>> On Sun, Dec 25, 2011 at 4:52 PM, David Marso wrote:
>>
>>> You are welcome!
>>> Even leaner and meaner (if perhaps slightly obscure ;-)
>>> ** Alternatively **.
>>> COMPUTE CN=SUM(prob,LAG(CN)*(LAG(ID) EQ ID)).
>>> COMPUTE C= SUM(1, LAG(C) *(LAG(ID) EQ ID)).
>>> COMPUTE ppct=cn/c.
>>> exe.
>>>
>>> David Wright-6 wrote
>>>>
>>>> Thanks David, that should do the trick. Happy Holidays...
>>>>
>>>>
>>>> On Sat, Dec 24, 2011 at 11:48 AM, David Marso wrote:
>>>>
>>>>> DATA LIST FREE / ID term prob pcntX ppctX.
>>>>> begin data
>>>>> 12 19973 0 0 0
>>>>> 12 19981 0 0 0
>>>>> 12 19982 1 1 .33
>>>>> 12 19993 0 1 .25
>>>>> 12 20001 1 2 .40
>>>>> 12 20003 1 3 .50
>>>>> 12 20011 0 3 .43
>>>>> 12 20023 1 4 .50
>>>>> 12 20031 0 4 .44
>>>>> 13 19973 0 0 0
>>>>> 13 19981 0 0 0
>>>>> 13 19982 1 1 .33
>>>>> 13 19993 0 1 .25
>>>>> 13 20001 1 2 .40
>>>>> 13 20003 1 3 .50
>>>>> 13 20011 0 3 .43
>>>>> 13 20023 1 4 .50
>>>>> 13 20031 0 4 .44
>>>>> end data.
>>>>> SPLIT FILE BY ID.
>>>>> COMPUTE C=1.
>>>>> CREATE CN CP=CSUM(C prob).
>>>>> COMPUTE ppct=cp/cn.
>>>>>
>>>>> ** Alternatively **.
>>>>> DO IF $CASENUM=1 OR LAG(ID) NE ID.
>>>>> + COMPUTE CN=prob.
>>>>> + COMPUTE c=1.
>>>>> ELSE.
>>>>> + COMPUTE CN=SUM(LAG(CN),prob).
>>>>> + COMPUTE c=lag(c)+1.
>>>>> END IF.
>>>>> COMPUTE ppct=cn/c.
>>>>> exe.
>>>>>
>>>>> David Wright-6 wrote
>>>>>>
>>>>>> I have a table that contains repeating terms (term) by student ID (ID)
>>>>>> which also contains a binary (prob) of whether for a specific term the
>>>>>> student has been on probation. I want to create two new columns one of
>>>>>> which is a cumulative count (pcnt) and the other a cumulative percent
>>>>>> (ppct) of all probation activity for the student from any term backwards
>>>>>> in time. I can easily create the total count & percent for all time of
>>>>>> the student via the aggregate command & append to each record but I
>>>>>> would like a cumulative count/percent to appear at each term. I would
>>>>>> appreciate any suggestions (no python please). Listed below are the 3
>>>>>> original columns & the two desired columns when coding is complete.
>>>>>>
>>>>>> ID, term, prob, pcnt, ppct
>>>>>> 12 19973 0 0 0
>>>>>> 12 19981 0 0 0
>>>>>> 12 19982 1 1 .33
>>>>>> 12 19993 0 1 .25
>>>>>> 12 20001 1 2 .40
>>>>>> 12 20003 1 3 .50
>>>>>> 12 20011 0 3 .43
>>>>>> 12 20023 1 4 .50
>>>>>> 12 20031 0 4 .44
>>>>>>
>>>>>> (term is defined as calendar year digits 1-4 and semester code 5th
>>>>>> digit in which 1=spring, 2=summer, 3=fall)
>>>>>>
>>>>>> =====================
>>>>>> To manage your subscription to SPSSX-L, send a message to
>>>>>> LISTSERV@.UGA (not to SPSSX-L), with no body text except the
>>>>>> command. To leave the list, send the command
>>>>>> SIGNOFF SPSSX-L
>>>>>> For a list of commands to manage subscriptions, send the command
>>>>>> INFO REFCARD
>>>>>>
>>>>>
>>>>> --
>>>>> View this message in context:
>>>>> http://spssx-discussion.1045642.n5.nabble.com/create-cumulative-count-percent-for-a-repeating-term-table-tp5099383p5099491.html
>>>>> Sent from the SPSSX Discussion mailing list archive at Nabble.com.
>>>>>
>>>
>>> --
>>> View this message in context:
>>> http://spssx-discussion.1045642.n5.nabble.com/create-cumulative-count-percent-for-a-repeating-term-table-tp5099383p5100722.html
>>> Sent from the SPSSX Discussion mailing list archive at Nabble.com.
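As a cross-check on the quoted thread, the two-line SUM/LAG version of the cumulative count and percent can be traced by hand. The Python sketch below emulates it under the same assumptions as before: None stands in for $SYSMIS, SUM() skips missing arguments, and a lagged value times 0 is 0 even when the lagged value is missing. The names spss_sum, carry and cumulative_probation are hypothetical, the data are the example rows from the original question, and rounding to two decimals is added only to match the posted ppct column.

```python
# Emulation only; None stands in for $SYSMIS. Helper names are hypothetical.

def spss_sum(*xs):
    # SUM() adds the valid arguments; it is missing only if all are missing.
    valid = [x for x in xs if x is not None]
    return sum(valid) if valid else None

def carry(lag_value, lag_id, current_id):
    # Emulates LAG(value) * (LAG(ID) EQ ID).
    if lag_id is None:
        return None                      # first case: everything lagged is missing
    return lag_value if lag_id == current_id else 0

def cumulative_probation(records):
    """records: (id, term, prob) tuples sorted by id and then term."""
    out = []
    lag_id = lag_cn = lag_c = None
    for id_, term, prob in records:
        cn = spss_sum(prob, carry(lag_cn, lag_id, id_))  # COMPUTE CN=SUM(prob,...)
        c = spss_sum(1, carry(lag_c, lag_id, id_))       # COMPUTE C= SUM(1,...)
        out.append((id_, term, cn, round(cn / c, 2)))    # COMPUTE ppct=cn/c
        lag_id, lag_cn, lag_c = id_, cn, c
    return out

TERMS = [19973, 19981, 19982, 19993, 20001, 20003, 20011, 20023, 20031]
PROB = [0, 0, 1, 0, 1, 1, 0, 1, 0]
records = [(sid, t, p) for sid in (12, 13) for t, p in zip(TERMS, PROB)]

result = cumulative_probation(records)
for row in result:
    print(row)
```

Both counters restart at the first record of ID 13 without any DO IF, and the values for each ID reproduce the desired pcnt and ppct columns from the original question.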

--
View this message in context:
http://spssx-discussion.1045642.n5.nabble.com/create-cumulative-count-percent-for-a-repeating-term-table-tp5099383p5101472.html
Sent from the SPSSX Discussion mailing list archive at Nabble.com.

=====================
To manage your subscription to SPSSX-L, send a message to
LISTSERV@LISTSERV.UGA.EDU (not to SPSSX-L), with no body text except the
command. To leave the list, send the command
SIGNOFF SPSSX-L
For a list of commands to manage subscriptions, send the command
INFO REFCARD

