Date:         Mon, 24 Mar 2003 10:40:07 -0800
Reply-To:     cassell.david@EPAMAIL.EPA.GOV
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "David L. Cassell" <cassell.david@EPAMAIL.EPA.GOV>
Subject:      Re: Help with making SQL/Data Step More Efficient
Content-type: text/plain; charset=us-ascii

"Gerstle, John" <yzg9@CDC.GOV> replied:
> Dale,

Oops, you appear to have confused me with MIXEDmaster McLerran. But it's a common mistake. All us statisticians look alike. Like in the old Patty Duke show, where Patty Duke played identical cousin statisticians. Remember the old theme song? "You could lose your mind, when statisticians are two of a kind!" Hey, that's why, whenever we get together, wackiness ensues. :-)

> The "later sampling analyses", as I've succinctly put it, will be using the
> straight forward sampling and PROC SURVEYSELECT. But I would still have to
> create the large dataset.

Okay, here's my question. Why? What is the reason you need the larger Cartesian-product data set in order to do these sampling exercises? My original point was that, if you explicated more fully, we might be able to find a way out of the need for the Cartesian product.

> Are you suggesting that it would be wiser to NOT create/save smaller
> datasets, but just save the large one? I can easily add an indicator
> variable that would distinguish the groups to sample for use later.

Yes. If all you need is stratified sampling, then you can do that from a single data set with your 'indicator variable' serving as your stratum variable. But I *still* would like to hear why the Cartesian product is needed for the sampling.
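
For what it's worth, here is a minimal sketch of stratified sampling from one data set, with the indicator variable doing duty as the stratum variable. The data set name WORK.BIG, the variable name GROUP, the per-stratum sample size of 100, and the seed are all made up for illustration; none of them come from your actual setup.

```sas
/* Hypothetical names: WORK.BIG is the single large data set, */
/* GROUP is the indicator variable that marks the strata.     */
proc sort data=work.big;
   by group;                      /* input must be sorted by the STRATA variable */
run;

proc surveyselect data=work.big
                  out=work.sample
                  method=srs      /* simple random sampling within each stratum */
                  sampsize=100    /* assumed: 100 units drawn per stratum       */
                  seed=20030324;  /* assumed seed, only so the draw is repeatable */
   strata group;
run;
```

With SAMPSIZE=100, every stratum needs at least 100 observations; if some groups might be smaller, adding the SELECTALL option tells the procedure to take all units from those strata instead of stopping with an error.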

> BTW...thanks for the front-line reporting. I'm quite excited to visit your
> fair city. My brother has mentioned that it's my kind of town. Coffee, mmm
> good.

There's a Starbucks every thirty feet in Seattle (some sort of city ordinance), so you can't miss the coffee. In fact, if it's raining, just cut through the Starbucks stores, one after the other, until you reach your intended destination. There are enough of them now that they're nearly adjoining. I can't wait to find out whether there's a Starbucks in every meeting room at the convention center.

:-)
David
--
David Cassell, CSC
Senior computing specialist
mathematical statistician

