LISTSERV at the University of Georgia
Date:         Wed, 1 Jul 2009 11:47:19 -0700
Reply-To:     "Pardee, Roy" <pardee.r@GHC.ORG>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "Pardee, Roy" <pardee.r@GHC.ORG>
Subject:      Re: Using SURVEYSELECT for random assignment?
Comments: To: Joe Matise <snoopy369@gmail.com>
In-Reply-To:  <b7a7fa630906302050t734d28d1q381e87f4166fe09c@mail.gmail.com>
Content-Type: text/plain; charset="us-ascii"

Thanks much for this Joe--very helpful.

I wonder whether method=seq would do what I want, but I'm not at all familiar w/the lit cited in the help docs, and I really shouldn't guess, so I very much appreciate the code you've shared.

Thanks!

-Roy

________________________________

From: Joe Matise [mailto:snoopy369@gmail.com]
Sent: Tuesday, June 30, 2009 8:50 PM
To: Pardee, Roy
Cc: SAS-L@listserv.uga.edu
Subject: Re: Using SURVEYSELECT for random assignment?

If it were me, with my account groups, I'd not use PROC SURVEYSELECT directly for this, because of rule 4. I am not a PROC SURVEYSELECT expert, but reading over the sampling methods, none of them does what rule 4 requests. Specifically, SYS takes every other observation, exactly; not one from each pair randomly. It might work perfectly fine for what you want, but I don't think it is precisely what rule 4 calls for.

You could do this pretty trivially in a data step, of course. I'll actually use PROC SURVEYSELECT at the end just for fun, though you could just as easily (and probably easier) do this in a data step.

data docs ;
  input
    @1  doc         $char4.
    @7  clinic      $char9.
    @17 pts_over_50
    @23 category    $char6.
  ;
datalines ;
bob   central   398   small
mary  central   400   small
erin  central   505   small
john  central   1000  medium
lori  central   1400  medium
suzy  central   2000  large
raul  central   2500  large
jill  central   3100  large
roy   central   5000  large
joe   central   8000  large
jan   central   8000  large
jim   central   8000  large
stan  central   8500  large
jack  central   8500  large
carl  eastside  391   small
jane  eastside  4000  large
jess  eastside  3999  large
;
run ;

proc sort data=docs;
  by clinic category DESCENDING pts_over_50 ;
  *taking advantage of large,medium,small being in correct order already - if numeric, then DESCENDING category;
run ;

data docs_stratified / view=docs_stratified;
  set docs;
  by clinic category descending pts_over_50;
  if first.category then do; *or possibly first.clinic?;
    strata = 0;
    rowcount = 0;
  end;
  rowcount+1;
  strata + mod(rowcount,2);
run;

proc surveyselect data=docs_stratified out=tx_docs rate=0.5 seed=987654;
  strata clinic category strata;
run;
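A possible variation on the call above, offered as a sketch rather than as part of the original code: adding the OUTALL option keeps every doc in the output and adds a Selected flag (1 = sampled), so tx and control docs land in one data set. Here n=1 asks for exactly one doc per pair. The names all_docs, all_docs_labeled and arm are illustrative, not from the thread.

```sas
* Sketch: sample one doc per pair; OUTALL keeps the unsampled docs too;
proc surveyselect data=docs_stratified out=all_docs
                  method=srs n=1 outall seed=987654;
  strata clinic category strata;
run;

* Selected = 1 marks the sampled doc in each pair;
data all_docs_labeled;
  set all_docs;
  length arm $7;
  if selected = 1 then arm = 'tx';
  else arm = 'control';
run;
```

With n=1, a doc left without a partner at the end of a clinic/category is always sampled, so it always ends up in tx.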

*one way to do it using datastep - this does NOT guarantee the extra strata have a Tx selection, like the PROC SURVEYSELECT does;

data tx_docs cn_docs;
  set docs_stratified;
  retain selected;
  by clinic category strata;
  if first.strata then do;
    if round(ranuni(987654),1) = 1 then selected = 1;
    else selected = 0;
  end;
  else selected = 1 - selected;
  if selected = 1 then output tx_docs;
  else output cn_docs;
run;
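A quick sanity check one could run after the data step above (a sketch, assuming tx_docs and cn_docs are the data step outputs and so both carry the selected variable): stack the two data sets and list any pair that ended up with more than one tx or more than one control doc. The names both_arms, n_tx and n_control are illustrative.

```sas
* Stack both arms; selected is 1 for tx docs and 0 for control docs;
data both_arms;
  set tx_docs cn_docs;
run;

* List any clinic/category/strata pair that is not split one-and-one;
proc sql;
  select clinic, category, strata,
         sum(selected)     as n_tx,
         sum(1 - selected) as n_control
  from both_arms
  group by clinic, category, strata
  having calculated n_tx > 1 or calculated n_control > 1;
quit;
```

If the pairing worked as intended, the query should return no rows. A doc without a partner shows up as a pair with a single tx or a single control, which this check deliberately does not flag.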

-Joe

On Tue, Jun 30, 2009 at 7:56 PM, Pardee, Roy <pardee.r@ghc.org> wrote:

Hey All,

I've got a set of about 100 physicians I need to randomize into treatment & control conditions, blocking on clinic and the # of patients in their panel over age 50. The instructions I have are:

1. Assign each doc into a large/medium/small panel size category, on the basis of # of patients over age 50 (pts_over_50).

2. Add some extra, pretend randomization slots to each clinic/category for future docs to slot into if/when new docs are added later.

3. Sort the docs by clinic & then descending pts_over_50.

4. Proceeding from the top of the list, randomize two docs at a time--so for example, flip a coin and if the result is heads, the first doc becomes tx & second is control, if result is tails, then first doc is control & second is tx.

I'm wondering if SURVEYSELECT can do the basic randomization for me & save me the row-by-row programming? (I'm content to deal w/adding the extra slots in step 2).

Looking at the sample in the help file entry for SURVEYSELECT, the below call seems promising.

* =============================== ;

data docs ;
  input
    @1  doc         $char4.
    @7  clinic      $char9.
    @17 pts_over_50
    @23 category    $char6.
  ;
datalines ;
bob   central   398   small
mary  central   400   small
erin  central   505   small
john  central   1000  medium
lori  central   1400  medium
suzy  central   2000  large
roy   central   5000  large
carl  eastside  391   small
jane  eastside  4000  large
jess  eastside  3999  large
;
run ;

proc sort ;
  by clinic DESCENDING pts_over_50 ;
run ;

proc print ;
run ;

proc surveyselect
  data   = docs
  method = sys
  rate   = 0.5
  seed   = 987654
  out    = tx_docs
;
  strata clinic category ;
  control pts_over_50 ;
run ;

proc print data = tx_docs ;
run ;

* =============================== ;

(So--anybody in the tx_docs dataset gets assigned to the treatment condition, and the balance are controls.)

Do any of you understand SURVEYSELECT well enough to say whether that call is equivalent to the instructions above? Is there a better way? Or should I just suck it up & try to literally carry out the instructions I have?

I realize this is a long-shot, but figured I'd try...

Many thanks!

-Roy
