LISTSERV at the University of Georgia
Date:         Wed, 16 Jan 2008 03:34:19 -0500
Reply-To:     Richard Ristow <wrristow@mindspring.com>
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         Richard Ristow <wrristow@mindspring.com>
Subject:      Re: drawing samples for hundreds of workers
Comments: To: "Raffe, Sydelle, SSA" <DRaffe@acgov.org>
Comments: cc: King Douglas <king_d@swbell.net>
Content-Type: text/plain; charset="us-ascii"; format=flowed

At 08:33 PM 1/14/2008, Raffe, Sydelle, SSA wrote:

>In my file, there are unique case records. These are apportioned to
>hundreds of different workers such that each worker has multiple cases.
>
>We want to make a random selection of each workers cases. I don't
>think that's what I led [John Norton] to understand.

And at 01:40 PM 1/15/2008, Raffe, Sydelle, SSA wrote:

>Actually, we want 6 cases randomly selected for each worker.

King Douglas gave a nice implementation using SORT CASES and RANK. As an alternative, here's an implementation with AGGREGATE and 'k/n' logic. (It requires that the file be grouped, but not necessarily sorted, by ID.) I'm selecting three records per worker rather than six; change the 3 in MIN(3,#N) to draw a different number.
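For anyone who hasn't met 'k/n' logic: each record is taken with probability (selections still needed)/(records still remaining), and both counts are updated as the pass goes on. As a rough check on why that gives every record the same chance, here is a sketch in Python rather than SPSS. The function name `inclusion_probabilities` is made up for this illustration and is not part of the SPSS code. It computes the exact probability that each position in a group of n records is taken.

```python
from fractions import Fraction


def inclusion_probabilities(k, n):
    """Exact chance that each of n records is taken under 'k/n' logic.

    Illustrative helper only (hypothetical name). It tracks the
    probability distribution of "selections still needed" as the
    records are processed in order.
    """
    k = min(k, n)                     # cannot take more records than exist
    dist = {k: Fraction(1)}           # needed -> probability of that state
    probs = []
    for remaining in range(n, 0, -1):
        take_prob = Fraction(0)
        next_dist = {}
        for need, p in dist.items():
            take = Fraction(need, remaining)   # the k/n ratio for this record
            take_prob += p * take
            if take > 0:              # record taken: one fewer needed
                next_dist[need - 1] = next_dist.get(need - 1, 0) + p * take
            if take < 1:              # record skipped: need is unchanged
                next_dist[need] = next_dist.get(need, 0) + p * (1 - take)
        probs.append(take_prob)
        dist = next_dist
    return probs


# Three records wanted from a group of four: every entry equals 3/4.
print(inclusion_probabilities(3, 4))
```

The ratio reaches 1 when the records remaining equal the selections still needed, and 0 once enough have been taken, so exactly min(k, n) records come out of each group.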

|-----------------------------|---------------------------|
|Output Created               |16-JAN-2008 03:32:16       |
|-----------------------------|---------------------------|
ID   Fname    Lname        RecdDate

A35  Aaron    Aardvark     18-DEC-2004
A35  Aaron    Aardvark     25-MAY-2005
A35  Aaron    Aardvark     16-JUL-2005
A42  Bethany  Birkinwell   30-OCT-2004
A42  Bethany  Birkinwell   05-DEC-2004
A42  Bethany  Birkinwell   24-DEC-2004
A42  Bethany  Birkinwell   25-DEC-2004
C19  Charles  Cubbage      25-JUL-2003
C19  Charles  Cubbage      02-SEP-2003
C21  Dorothy  Dickens      14-NOV-2002
D98  Ellis    Etheridge    19-SEP-2000

Number of cases read:  11    Number of cases listed:  11

AGGREGATE OUTFILE=* MODE=ADDVARIABLES
   /BREAK=ID
   /NRecords 'Number of records for employee'=NU.

NUMERIC #K #N (F3).

DO IF    $CASENUM EQ 1 OR ID NE LAG(ID).
.  COMPUTE #N = NRecords   /* Total records, per worker    */.
.  COMPUTE #K = MIN(3,#N)  /* Number to sample, per worker */.
END IF.

.  /*-- PRINT / 'Record ' ID Fname Lname RecdDate ':  ' /*-*/
   /*--         'K=' #K ', N=' #N                       /*-*/.

COMPUTE #Take_It = RV.BERNOULLI(#K/#N).
COMPUTE #K       = #K - #Take_It.
COMPUTE #N       = #N - 1.

. /*-- PRINT / ' TAKE=' #Take_It /*-*/.

SELECT IF #Take_It.

. /*-- EXECUTE /*-*/.

LIST.
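For readers who don't use SPSS, the pass above can be mirrored roughly as follows. This is a Python sketch under stated assumptions, not a translation of any SPSS API: `sample_per_group` and its parameters are hypothetical names, the group counts stand in for the AGGREGATE step, and the records must already be grouped by ID, as in the SPSS version. The IDs and dates are the test records listed earlier.

```python
import random
from collections import Counter


def sample_per_group(records, key, k, rng=random):
    """Keep min(k, group size) records per group in a single pass.

    Hypothetical helper for illustration. Assumes all records of a
    group are adjacent in `records` (grouped, not necessarily sorted).
    """
    counts = Counter(key(r) for r in records)  # role of AGGREGATE ... =NU
    kept = []
    previous = object()            # sentinel that equals no real ID
    need = remaining = 0
    for rec in records:
        group = key(rec)
        if group != previous:      # first record of a new group: reset
            remaining = counts[group]
            need = min(k, remaining)
            previous = group
        # Bernoulli draw with probability need/remaining
        take = rng.random() < need / remaining
        if take:
            need -= 1
            kept.append(rec)
        remaining -= 1
    return kept


records = [
    ("A35", "18-DEC-2004"), ("A35", "25-MAY-2005"), ("A35", "16-JUL-2005"),
    ("A42", "30-OCT-2004"), ("A42", "05-DEC-2004"), ("A42", "24-DEC-2004"),
    ("A42", "25-DEC-2004"), ("C19", "25-JUL-2003"), ("C19", "02-SEP-2003"),
    ("C21", "14-NOV-2002"), ("D98", "19-SEP-2000"),
]
kept = sample_per_group(records, key=lambda r: r[0], k=3)
print(dict(Counter(r[0] for r in kept)))
# {'A35': 3, 'A42': 3, 'C19': 2, 'C21': 1, 'D98': 1}
```

Which A42 record is dropped varies from run to run, but the per-worker counts do not: workers with three or fewer records keep all of them.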

List
|-----------------------------|---------------------------|
|Output Created               |16-JAN-2008 03:32:17       |
|-----------------------------|---------------------------|
ID   Fname    Lname        RecdDate    NRecords

A35  Aaron    Aardvark     18-DEC-2004     3
A35  Aaron    Aardvark     25-MAY-2005     3
A35  Aaron    Aardvark     16-JUL-2005     3
A42  Bethany  Birkinwell   30-OCT-2004     4
A42  Bethany  Birkinwell   05-DEC-2004     4
A42  Bethany  Birkinwell   25-DEC-2004     4
C19  Charles  Cubbage      25-JUL-2003     2
C19  Charles  Cubbage      02-SEP-2003     2
C21  Dorothy  Dickens      14-NOV-2002     1
D98  Ellis    Etheridge    19-SEP-2000     1

Number of cases read:  10    Number of cases listed:  10

===================
APPENDIX: Test data
===================
*  ................................................................. .
*  .................   Test data               ..................... .
SET RNG     = MT    /* 'Mersenne twister' random number generator */ .
SET MTINDEX = 3605  /*  Providence, RI telephone book              */ .

INPUT PROGRAM.
.  DATA LIST LIST
      /ID Fname Lname
      (A4,A8,   A12).
.  LEAVE   ID Fname Lname.
.  NUMERIC RecdDate (DATE11).
.  LEAVE   RecdDate.
.  COMPUTE RecdDate=RV.UNIFORM(DATE.MDY(01,01,2000),
                               DATE.MDY(01,01,2005)).
.  COMPUTE RecdDate=XDATE.DATE(RecdDate).

.  NUMERIC #NRecrds #RecdNum (F3).
.  COMPUTE #NRecrds = TRUNC(RV.UNIFORM(1,5)).
.  LOOP    #RecdNum = 1 TO #NRecrds.
.     COMPUTE RecdDate = RecdDate + RV.EXP(1/TIME.DAYS(45)).
.     COMPUTE RecdDate=XDATE.DATE(RecdDate).
.     END CASE.
.  END LOOP.
END INPUT PROGRAM.

BEGIN DATA
A35 Aaron   Aardvark
A42 Bethany Birkinwell
C19 Charles Cubbage
C21 Dorothy Dickens
D98 Ellis   Etheridge
END DATA.

LIST.


