Date: Mon, 4 Aug 2008 15:48:38 -0400
Reply-To: Muthia Kachirayan <muthia.kachirayan@GMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Muthia Kachirayan <muthia.kachirayan@GMAIL.COM>
Subject: Re: Using a non-standard Cumulative Distribution
In-Reply-To: <37c08ecf-9ab2-42f9-8b33-4548be4f4c4f@z72g2000hsb.googlegroups.com>
Content-Type: text/plain; charset=ISO-8859-1
On Mon, Aug 4, 2008 at 1:19 PM, FPAStatman <brentvtimothy@gmail.com> wrote:
> I have a data set that contains a cumulative distribution:
> x cumprob
> 1 1
> 1 2
> 1 3
> 2 4
> 3 5
> 6 6
> . .
> . .
> . .
> 60 90
> 61 91
> 61 92
> 65 93
> 66 94
> 68 95
> 70 96
> 71 97
> 72 98
> 72 99
> 75 100
> I need to match this with another data set
> x y z
> 50 10 100
> 62 20 100
> 72 30 200
> etc.
>
> I am having problems in either I cannot match enough values (62 is not
> in the original table, so result is .) or too many values (72 occurs
> in cum prob table twice, so I get two rows instead of one).
> I would like to get the maximum cumulative percent for each value.
> What would be the best way to do this?
>
> Thank you for the help.
>
FPAStatman,
After seeing the result of Mike's SQL solution, I am not sure it is what
you need.
My understanding is that you want the records of the FIRST (ONE) dataset that
match the SECOND (TWO) dataset on x, and where an x occurs more than once in
ONE, the record with the maximum cumprob.
I have borrowed Mike's example datasets. I store cumprob from ONE in an array
indexed by x and then use TWO to look it up.
data one;
input x cumprob @@;
datalines;
1 1 1 2 1 3 2 4 3 5 6 6 50 4 60 90
61 91 61 92 65 93 66 94 68 95 70 96 71 97 72 98
72 99 75 100
;
run;
data two;
input x y z @@;
datalines;
50 10 100 62 20 100 72 30 200
;
run;
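data need;
  set one (in = a) two (in = b);
  /* Choose the dimension of k to cover the maximum X in dataset ONE; */
  /* X must be a positive integer to serve as the array index.        */
  array k[100] _temporary_;
  /* ONE is read first. A later row with the same X overwrites the    */
  /* earlier one, so with cumprob ascending k[x] holds the maximum.   */
  if a then k[x] = cumprob;
  /* TWO is read next. Keep only the rows whose X was found in ONE.   */
  if b then do;
    if not missing(k[x]) then do;
      cumprob = k[x];
      output;
    end;
  end;
run;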
The output is:
x cumprob y z
50 4 10 100
72 99 30 200
Is this the result you expect?
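If instead an x that is missing from ONE (such as 62) should pick up the
cumprob of the nearest lower x, a variant along the following lines might do.
This is only a sketch under my own assumptions: x is a positive integer no
larger than 100, the dataset name NEED2 and the loop index i are names I made
up for illustration, and "nearest lower x" is my guess at the rule you want.
data need2;
  set one (in = a) two (in = b);
  /* Assumption: x is an integer between 1 and 100 */
  array k[100] _temporary_;
  /* MAX ignores missing values, so this keeps the largest cumprob */
  /* per x even when ONE is not sorted by cumprob.                  */
  if a then k[x] = max(k[x], cumprob);
  if b then do;
    cumprob = .;
    /* Step down from x until a stored cumprob is found */
    do i = x to 1 by -1 while (missing(cumprob));
      cumprob = k[i];
    end;
    /* Rows with nothing at or below x are dropped */
    if not missing(cumprob) then output;
  end;
  drop i;
run;
With the sample data above, x = 62 would take the value stored for x = 61,
while 50 and 72 would be matched exactly as before.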
Regards,
Muthia Kachirayan