Date: Wed, 2 Mar 2011 20:12:45 -0800
Reply-To: JASPAL CHEEMA <email@example.com>
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: JASPAL CHEEMA <firstname.lastname@example.org>
Subject: Split a variable into 5 groups and compute usual stats
SPSS 18.0 (Windows XP)
Would you please help me figure out how to split a sorted variable SOLD$ into 5 groups (20% each, rounded up. The first 4 groups from lowest to highest should be equal, and the last, highest priced, could be higher or lower), and then compute usual stats (e.g., mean, N, median, min, max, SD, etc.) for each group within each year month and district.
These results are based on a sample data file for 3 months in 2010 (file is available if needed). However, my actual data file will have monthly data for the last 20 years with over a 100 districts. I cannot imagine recoding the SOLD$ variable manually. There must be an elegant solution for this.
Here's my code so far, and the result I get is at the bottom. The SECOND line shows that in October 2010 there were 227 homes sold in District W24. Average List Price was $361,060.49 and average Sold Price was $351,995.95. I'd like to somehow split 305 homes into 5 groups: 46, 46, 46, 46, and highest priced group would have 43 homes (total=227). The objective is to compute the usual stats (mean, median, SD, min, max etc.) for each of these groups automatically, without having to recode manually.
I've struggled with this for quite a while now and your help would be greatly appreciated.
GET FILE 'F:\New Folder\STORE N GO (F)\EXCEL Documents\SampleFile9.sav'.
SORT CASES BY YEAR MONTH DIST SOLD$.
/OUTFILE='F:\New Folder\STORE N GO (F)\EXCEL Documents\W23W24.sav'
/BREAK=YEAR MONTH DIST
/HOMESW 'Number of Homes' = N(SOLD$)
/AV_LIST$W 'Average List Price' = MEAN(LIST$)
/AV_SOLD$W 'Average Sold Price' = MEAN(SOLD$)
/MED_SOLD$W 'Median Sold Price' = MEDIAN(SOLD$)
/AV_DOMW 'Average Days on Market' = MEAN(DOM)
/MIN_W = MIN(SOLD$)
/MAX_W = MAX(SOLD$)
/SD_SOLDW = SD(SOLD$)
/SD_LISTW = SD(LIST$)
/SD_DOMW = SD(DOM)
/VOLUMEW = SUM(SOLD$).
GET FILE 'F:\New Folder\STORE N GO (F)\EXCEL Documents\W23W24.sav'.
Here is the data I get from the GET FILE (partial - file is cut off from right).
Year Month Dist HOMESW AV_LIST$W AV_SOLD$W
2010 10 W23 305 350646.74 341276.27
2010 10 W24 227 361060.49 351995.95
2010 11 W23 278 350934.32 341229.87
2010 11 W24 241 362561.73 352203.74
2010 12 W23 207 342913.62 332607.20
2010 12 W24 152 388466.22 377617.38