LISTSERV at the University of Georgia
Date:         Tue, 20 May 2008 23:37:26 -0400
Reply-To:     Richard Ristow <wrristow@mindspring.com>
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         Richard Ristow <wrristow@mindspring.com>
Subject:      Re: Random Sample by Counselor
Comments: To: Allen Frommelt <rfrommelt@NURTURHEALTH.COM>
In-Reply-To:  <51F45499BCFE674F8A4EA4841F9CE6920B98E1B9B2@STLEXVN01P.cent
              ene.com>
Content-Type: text/plain; charset="us-ascii"; format=flowed

At 02:57 PM 5/20/2008, Allen Frommelt wrote:

>I have a database of participant id, and health counselor. I need
>to create a 25% random sample by counselor. Is there a way to do
>this in SPSS? Thanks!

All the participants for 25% of the counselors, or 25% of the participants for all counselors?

If you want all participants for 25% of the counselors, get a list of the counselors, sample 25% of them, and merge the result back with the original file. For the sampling itself, use any method you please. Note, though, that the SAMPLE command requires you to hard-code the exact number of counselors if you want a sample as near as possible to an exact 25%.

The following code is tested, but this is simply the code, not a listing. (The LIST commands should be removed for production use.) It uses dataset logic (SPSS 14 and later), and assumes that the data is in an active dataset named PartiList.

DATASET DECLARE  CounsList.

AGGREGATE OUTFILE=CounsList
   /BREAK=Counselor
   /CaseLoad 'No. of clients for counselor' = NU.

DATASET ACTIVATE CounsList WINDOW=FRONT.

*  Sample 25% of the counselors by the 'K/N' method:  .

COMPUTE NOBREAK = 1.
DATASET DECLARE  CounsCount.
AGGREGATE OUTFILE=CounsCount
   /BREAK=NOBREAK
   /N 'Number of counselors' = NU.
DATASET ACTIVATE CounsCount WINDOW=FRONT.

NUMERIC   K (F3).
VAR LABEL K 'Counselors to sample'.
COMPUTE   K = RND(0.25*N).
FORMATS   N K (F3).

DATASET ACTIVATE CounsList WINDOW=FRONT.
MATCH FILES
   /FILE =*
   /TABLE=CounsCount
   /BY NOBREAK.

DO IF    $CASENUM EQ 1.
.  COMPUTE #K = K.
.  COMPUTE #N = N.
END IF.

NUMERIC    InSample (F2).
VAR LABELS InSample 'Indicator: Counselor is in sample'.

COMPUTE InSample = RV.BERNOULLI(#K/#N).
COMPUTE #K = #K - InSample.
COMPUTE #N = #N - 1.

LIST Counselor InSample.

DATASET ACTIVATE PartiList WINDOW=FRONT.

*  Attach 'Sampled' flag to participant records  .
MATCH FILES
   /FILE =PartiList
   /TABLE=CounsList
   /BY Counselor
   /DROP = NOBREAK K N CaseLoad.

. /**/ LIST /*-*/.
SELECT IF InSample.

============================
APPENDIX: Test data and code
============================
*  ................................................................. .
*  .................   Test data               ..................... .
SET RNG     = MT   /* 'Mersenne twister' random number generator */ .
SET MTINDEX = 9518 /* A phone number in Maryland */ .

INPUT PROGRAM.
.  NUMERIC Counselor (N3) Participant(F5).
.  LEAVE   Counselor.
.  LOOP #I_Couns = 1 TO 12.
.     COMPUTE Counselor = TRUNC(RV.UNIFORM(100,1000)).
.     COMPUTE #N_Client = RV.POISSON(5).
.     LOOP #I_Client = 1 TO #N_Client.
.        COMPUTE Participant = TRUNC(RV.UNIFORM(1E4,1E5)).
.        END CASE.
.     END LOOP.
.  END LOOP.
END FILE.
END INPUT PROGRAM.
SORT CASES BY Counselor Participant.
DATASET NAME PartiList WINDOW=FRONT.
LIST.

*  .................   Post after this point   ..................... .
*  ................................................................. .

DATASET DECLARE  CounsList.

AGGREGATE OUTFILE=CounsList
   /BREAK=Counselor
   /CaseLoad 'No. of clients for counselor' = NU.

DATASET ACTIVATE CounsList WINDOW=FRONT.

*  .................   Post after this point   ..................... .
*  Sample 25% of the counselors by the 'K/N' method:  .

COMPUTE NOBREAK = 1.
DATASET DECLARE  CounsCount.
AGGREGATE OUTFILE=CounsCount
   /BREAK=NOBREAK
   /N 'Number of counselors' = NU.
DATASET ACTIVATE CounsCount WINDOW=FRONT.

NUMERIC   K (F3).
VAR LABEL K 'Counselors to sample'.
COMPUTE   K = RND(0.25*N).
FORMATS   N K (F3).

DATASET ACTIVATE CounsList WINDOW=FRONT.
MATCH FILES
   /FILE =*
   /TABLE=CounsCount
   /BY NOBREAK.

DO IF    $CASENUM EQ 1.
.  COMPUTE #K = K.
.  COMPUTE #N = N.
END IF.

NUMERIC    InSample (F2).
VAR LABELS InSample 'Indicator: Counselor is in sample'.

COMPUTE InSample = RV.BERNOULLI(#K/#N).
COMPUTE #K = #K - InSample.
COMPUTE #N = #N - 1.

LIST Counselor InSample.

DATASET ACTIVATE PartiList WINDOW=FRONT.

*  Attach 'Sampled' flag to participant records  .
MATCH FILES
   /FILE =PartiList
   /TABLE=CounsList
   /BY Counselor
   /DROP = NOBREAK K N CaseLoad.

. /**/ LIST /*-*/.
SELECT IF InSample.
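
For readers who want to see the 'K/N' logic on its own, outside SPSS, here is a minimal Python sketch of what the sampling code above appears to do: each counselor is taken with probability (number still needed)/(number still remaining), so exactly K end up selected. This is an illustration only, not part of the tested SPSS code. The function names kn_flags and sample_counselors, and the made-up data at the bottom, are assumptions of mine.

```python
import math
import random


def kn_flags(n, k, rng):
    """Return n 0/1 flags containing exactly k ones (requires 0 <= k <= n).

    Each item is taken with probability need/left, which mirrors the
    #K/#N countdown in the SPSS code: #K drops by one on every hit,
    #N drops by one on every case.
    """
    flags = []
    need, left = k, n
    for _ in range(n):
        take = 1 if rng.random() < need / left else 0
        flags.append(take)
        need -= take
        left -= 1
    return flags


def sample_counselors(rows, fraction, rng):
    """rows: list of (participant, counselor) pairs (hypothetical layout).

    Keeps every participant belonging to a random `fraction` of counselors.
    """
    counselors = sorted({c for _, c in rows})
    n = len(counselors)
    # Round half up; intended to match SPSS RND for positive values.
    k = math.floor(fraction * n + 0.5)
    chosen = {c for c, f in zip(counselors, kn_flags(n, k, rng)) if f}
    return [row for row in rows if row[1] in chosen]


if __name__ == "__main__":
    rng = random.Random(9518)
    # Made-up data: 60 participants spread evenly over 12 counselors.
    rows = [(1000 + i, 100 + i % 12) for i in range(60)]
    kept = sample_counselors(rows, 0.25, rng)
    print(sorted({c for _, c in kept}))
```

Because the probability reaches 1 when the number still needed equals the number remaining, and 0 once enough have been taken, the count of selected counselors is always exactly K, which SAMPLE with a fixed proportion would not guarantee.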
