```
=========================================================================
Date:         Thu, 20 Jul 2006 15:28:12 -0500
Reply-To:     "Beadle, ViAnn"
Sender:       "SPSSX(r) Discussion"
From:         "Beadle, ViAnn"
Subject:      Re: 10 most frequent occurring values of a multiple response set
Comments:     To: Edward Boadi
Content-Type: text/plain; charset="us-ascii"

Then ignore the whole concept of a multiple response set and just compute some variable which is a combination of all three values. For example, if z1, z2, and z3 take on two values you'll need something like:

Compute z=z1 + z2*1000 + z3*100000.

The second step is to rank occurrences, not values. You need to use AGGREGATE to capture the occurrences into a variable, using the N function and z as your break variable. This will give you a dataset with one row for each unique value of z, with its count N. Sort that dataset in descending order on N and then compute nrank = $casenum after the sort. So your aggregated dataset has z, N, and nrank.

You have to get nrank onto your original dataset through a table match. But to do so, you need to sort both the aggregated dataset and the original dataset on z and use z as the matching key.

Once nrank is on your dataset, you can either filter or select cases with nrank less than or equal to 10.

Here's some syntax that I pasted from SPSS, release 14+, that might do the trick:

GET FILE='C:\Program Files\SPSS\originaldata.sav'.
DATASET NAME DataSet1 WINDOW=FRONT.
* Combine the three variables into a single key.
COMPUTE z=z1+z2*1000+z3*100000.
* Count the occurrences of each key in a second dataset.
DATASET DECLARE ranked_data.
AGGREGATE
  /OUTFILE='ranked_data'
  /BREAK=z
  /N=N.
DATASET ACTIVATE ranked_data.
SORT CASES BY N (D).
COMPUTE nrank = $casenum.
EXECUTE.
* Sort both datasets on the key, then match nrank onto the original cases.
DATASET ACTIVATE DataSet1.
SORT CASES BY z (A).
DATASET ACTIVATE ranked_data.
SORT CASES BY z (A).
DATASET ACTIVATE DataSet1.
SAVE OUTFILE='C:\Program Files\SPSS\originaldata.sav' /COMPRESSED.
MATCH FILES /FILE=* /TABLE='ranked_data' /BY z.
EXECUTE.
* Filter on the 10 highest-ranked keys.
USE ALL.
COMPUTE filter_$=(nrank <= 10).
VARIABLE LABEL filter_$ 'nrank <= 10 (FILTER)'.
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMAT filter_$ (f1.0).
FILTER BY filter_$.

... rest of analysis goes here

I think the big issue here is what to do about ties. In my example the 10th most frequent count was shared by 5 values, and this code takes the first 10 frequencies, which happen to be sorted on the z variable.

-----Original Message-----
From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On Behalf Of Edward Boadi
Sent: Thursday, July 20, 2006 2:25 PM
To: SPSSX-L@LISTSERV.UGA.EDU
Subject: Re: 10 most frequent occurring values of a multiple response set

This is not RFM analysis. Yes, I am looking for the 10 most frequently occurring combinations of the three variables as my initial step. Then select X, y1, y2, z1, z2 and z3 where (z1,z2,z3) = z, i.e. where z1, z2 and z3 correspond to one of the 10 most frequently occurring combinations of z1, z2 and z3.

Regards.

-----Original Message-----
From: Beadle, ViAnn [mailto:viann@spss.com]
Sent: Thursday, July 20, 2006 3:15 PM
To: Edward Boadi; SPSSX-L@LISTSERV.UGA.EDU
Subject: RE: 10 most frequent occurring values of a multiple response set

I'm not quite sure what it means to rank z since it is a set of 3 values. Are you looking for the most frequently occurring combinations of the three variables? Is this some sort of RFM analysis?

-----Original Message-----
From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On Behalf Of Edward Boadi
Sent: Thursday, July 20, 2006 2:03 PM
To: SPSSX-L@LISTSERV.UGA.EDU
Subject: 10 most frequent occurring values of a multiple response set

Dear List,

I have a data file with variables: X, y1, y2, z1, z2 and z3.
I want to accomplish the following task:
1. Create a multiple response set z from z1, z2 and z3.
2. Rank z and select cases for rank z <= 10.
3.
Select cases from my original data file where z = z1, z2 or z3.

My objective is to create a new dataset restricted to the 10 most frequently occurring values of a multiple response set created from z1, z2 and z3.

Any ideas on how to accomplish this will be most welcome.
```
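
The thread's procedure (build a combined key, count occurrences, rank, then keep the top 10) can be cross-checked outside SPSS. Below is a minimal Python sketch of that logic, not the poster's method. The function names (`combined_key`, `top_combinations`, `select_cases`), the dict-based rows, the sample data and the `include_ties` switch are all illustrative assumptions. It shows two things the thread touches on: the key arithmetic is only unique within a limited value range, and the tie question changes how many cases are kept.

```python
from collections import Counter


def combined_key(z1, z2, z3):
    """Mirror COMPUTE z = z1 + z2*1000 + z3*100000 from the post.

    Assumption: the key is unique only for non-negative integers with
    z1 <= 999 and z2 <= 99; larger values can collide.
    """
    return z1 + z2 * 1000 + z3 * 100000


def top_combinations(rows, k=10, include_ties=True):
    """Return the keys of the k most frequent (z1, z2, z3) combinations."""
    counts = Counter(combined_key(r["z1"], r["z2"], r["z3"]) for r in rows)
    # Highest count first; equal counts ordered by key, as in the post's example.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    if len(ranked) <= k:
        return {key for key, _ in ranked}
    if include_ties:
        cutoff = ranked[k - 1][1]  # count of the k-th ranked combination
        return {key for key, n in ranked if n >= cutoff}
    return {key for key, _ in ranked[:k]}


def select_cases(rows, k=10, include_ties=True):
    """Keep only the rows whose combination is among the selected keys."""
    keep = top_combinations(rows, k, include_ties)
    return [r for r in rows if combined_key(r["z1"], r["z2"], r["z3"]) in keep]


if __name__ == "__main__":
    # Hypothetical data: four combinations occurring 3, 2, 2 and 1 times.
    rows = [
        {"z1": z1, "z2": 1, "z3": 1}
        for z1, n in [(1, 3), (2, 2), (3, 2), (4, 1)]
        for _ in range(n)
    ]
    print(len(select_cases(rows, k=2, include_ties=False)))  # 5
    print(len(select_cases(rows, k=2, include_ties=True)))   # 7
    # Out-of-range values collide: both of these keys equal 1000.
    print(combined_key(1000, 0, 0) == combined_key(0, 1, 0))  # True
```

With `include_ties=False` the sketch behaves like the posted syntax, cutting off at exactly k keys with ties broken by the key value. With `include_ties=True` every combination sharing the k-th count is kept, so more than k combinations may survive.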
