Date: Wed, 29 Mar 2000 13:47:06 -0500
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Mark Moran <Mark.K.Moran@CCMAIL.CENSUS.GOV>
Subject: Recognition Challenges from Two Piles
This is an optional extra-credit classroom exercise that I am trying to
develop for a quality assurance class I will eventually teach.
We have 2 nonoverlapping piles of manufactured objects, which can be called
Pile ACCEPT and Pile REJECT. The piles were built artificially, with each
object placed one by one where it now sits, rather than accumulating in a
"natural" way ... and, again, the piles have no overlap. Every object belongs
to exactly one pile and can be described by its x, y location along with
other variables z_1, z_2, . . . , z_i.
As part of the validation/recognition challenge for a quality inspection
program, objects are drawn from the two piles at random (with replacement, I
think) and then logged into a database that stores x, y, z_1, z_2, . . . , z_i.
The variables z_1, z_2, . . . , z_i for the object (which are the only measures
readily identified in the manufacturing process) are presented to an inspector
without the x, y position info. (Had the inspector known the x,y positions, his
or her job would become much easier at this point because of the non-overlap.)
After many trials with different objects sampled for the inspector to label on
the basis of the z_1, z_2, . . . , z_i variables, the question to be analysed is
simply whether the inspector's method of attaching REJECT/ACCEPT labels based on
the z variables surpasses, exactly equals, or bombs worse than someone randomly
guessing which was which. Ideally a p-value and standard deviation would go
with this answer.
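
One way to make that comparison concrete is an exact binomial test of the inspector's hit count against coin-flip guessing. The sketch below is in Python rather than SAS, and everything in it is an assumption made for illustration: the function name guessing_test, the 50% chance rate, and the choice of an exact two-sided test are not part of the exercise as stated.

```python
from math import comb, sqrt

def guessing_test(n_correct, n_trials, p_chance=0.5):
    """Exact two-sided binomial test of a hit count against pure guessing.

    Hypothetical helper for illustration. Assumes independent trials and a
    guesser who is right with probability p_chance on every trial.
    Suitable for trial counts up to roughly a thousand (float range limit).
    """
    if not 0 <= n_correct <= n_trials:
        raise ValueError("n_correct must lie between 0 and n_trials")
    # Probability of each possible hit count if the inspector were guessing.
    pmf = [
        comb(n_trials, k) * p_chance**k * (1 - p_chance) ** (n_trials - k)
        for k in range(n_trials + 1)
    ]
    observed = pmf[n_correct]
    # Two-sided p-value: total probability of every outcome that is no more
    # likely than the observed one (small tolerance for rounding error).
    p_value = min(1.0, sum(p for p in pmf if p <= observed * (1 + 1e-9)))
    accuracy = n_correct / n_trials
    # Standard deviation of the accuracy under the guessing hypothesis.
    sd_null = sqrt(p_chance * (1 - p_chance) / n_trials)
    return accuracy, sd_null, p_value

if __name__ == "__main__":
    acc, sd, p = guessing_test(n_correct=130, n_trials=200)
    print(f"accuracy={acc:.3f}  sd under guessing={sd:.4f}  p-value={p:.3g}")
```

An accuracy well above 0.5 with a small p-value suggests the inspector beats guessing; an accuracy well below 0.5 with a small p-value corresponds to the "bombs worse" case.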
If each of the two piles can be given a specific bivariate distribution
[bivariate N(mu, sigma), chi-square, etc.] with its own bivariate parameters,
and the two piles are placed a specific two-dimensional distance apart,* is
there a logical way to simulate and answer this question in SAS 6.12? A more
advanced version of the SAS program would accommodate not only different
distributions and spacings of the piles, but also different values of i
(e.g., if there are 5 z vars then z_i = z_5) and sampling with or without
replacement.
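
A rough sketch of how such a simulation might be laid out follows, written in Python rather than SAS 6.12 so that it can be run and checked. Several things here are assumptions not given in the exercise: the x and y coordinates are uncorrelated normals, the z variables are normal with a pile-specific mean (the post does not say how the z variables relate to pile membership), and the inspector is a made-up rule that thresholds the mean of the z values. All function names, pile sizes, and parameter values are hypothetical.

```python
import random

def make_pile(label, n, center, sd_xy, z_means, sd_z, rng):
    """Build one pile as a list of records holding x, y, z values and the label."""
    pile = []
    for _ in range(n):
        pile.append({
            "label": label,
            # Assumed: uncorrelated normal x and y around the pile center.
            "x": rng.gauss(center[0], sd_xy),
            "y": rng.gauss(center[1], sd_xy),
            # Assumed: each z variable is normal around a pile-specific mean.
            "z": [rng.gauss(m, sd_z) for m in z_means],
        })
    return pile

def draw(objects, k, replace, rng):
    """Sample k objects, with or without replacement."""
    return rng.choices(objects, k=k) if replace else rng.sample(objects, k)

def inspector(z, cutoff):
    """Hypothetical inspector: REJECT when the mean z value exceeds the cutoff."""
    return "REJECT" if sum(z) / len(z) > cutoff else "ACCEPT"

def run_trial(n_draws=200, d=5.0, n_z=5, replace=True, seed=2000):
    """Simulate one inspection session; return (number of correct labels, n_draws)."""
    rng = random.Random(seed)
    # Pile ACCEPT sits at the origin, Pile REJECT at distance d along the x axis.
    accept = make_pile("ACCEPT", 500, (0.0, 0.0), 1.0, [0.0] * n_z, 1.0, rng)
    reject = make_pile("REJECT", 500, (d, 0.0), 1.0, [0.5] * n_z, 1.0, rng)
    sample = draw(accept + reject, n_draws, replace, rng)
    # The inspector sees only the z values, never x or y.
    hits = sum(inspector(obj["z"], 0.25) == obj["label"] for obj in sample)
    return hits, n_draws

if __name__ == "__main__":
    for replace in (True, False):
        hits, n = run_trial(replace=replace)
        print(f"replace={replace}: {hits} of {n} labelled correctly")
```

The hit count from each run can then be compared against what random guessing would produce. Changing n_z, d, or the replace flag covers the "more advanced" variations, while swapping in other distributions would mean editing make_pile.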
How much of a charlie horse between the ears have I given you? :) It's not so
easy, is it?
*If both piles are bivariate N(mu, sigma) distributions and one of the two piles
is centered at (x=0, y=0), then for a distance d apart the center of the other
pile must satisfy d^2 = [diff in x]^2 + [diff in y]^2 by the Pythagorean theorem.
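
As a small illustration of the footnote, the center of the second pile can be placed at distance d from the origin in any chosen direction. The Python helper below is a hypothetical sketch; the name second_center and the angle parameter are assumptions, not part of the exercise.

```python
from math import cos, sin, hypot, radians

def second_center(d, angle_deg=0.0):
    """Center of the second pile, at distance d from (0, 0) in the given direction."""
    theta = radians(angle_deg)
    return d * cos(theta), d * sin(theta)

if __name__ == "__main__":
    cx, cy = second_center(5.0, 36.87)
    # hypot(cx, cy) is sqrt(cx^2 + cy^2), so it recovers d up to rounding error.
    print(cx, cy, hypot(cx, cy))
```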