LISTSERV at the University of Georgia
Date:         Mon, 4 Aug 2008 15:48:38 -0400
Reply-To:     Muthia Kachirayan <muthia.kachirayan@GMAIL.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Muthia Kachirayan <muthia.kachirayan@GMAIL.COM>
Subject:      Re: Using a non-standard Cumulative Distribution
Comments: To: FPAStatman <brentvtimothy@gmail.com>
In-Reply-To:  <37c08ecf-9ab2-42f9-8b33-4548be4f4c4f@z72g2000hsb.googlegroups.com>
Content-Type: text/plain; charset=ISO-8859-1

On Mon, Aug 4, 2008 at 1:19 PM, FPAStatman <brentvtimothy@gmail.com> wrote:

> I have a data set that contains a cumulative distribution:
> x cumprob
> 1 1
> 1 2
> 1 3
> 2 4
> 3 5
> 6 6
> . .
> . .
> . .
> 60 90
> 61 91
> 61 92
> 65 93
> 66 94
> 68 95
> 70 96
> 71 97
> 72 98
> 72 99
> 75 100
> I need to match this with another data set
> x y z
> 50 10 100
> 62 20 100
> 72 30 200
> etc.
>
> I am having problems in either I cannot match enough values (62 is not
> in the original table, so result is .) or too many values (72 occurs
> in cum prob table twice, so I get two rows instead of one).
> I would like to get the maximum cumulative percent for each value.
> What would be the best way to do this?
>
> Thank you for the help.
>

FPAStatman,

After seeing the result of Mike's SQL solution, I am not sure whether that is what you require.

I thought you needed the records of the first dataset (ONE) that match the second dataset (TWO), with the provision that, for each x, the record of ONE with the maximum cumprob is selected.

I borrow Mike's example datasets. I use an array to store cumprob from ONE and then use TWO to look up the stored values.

data one;
   input x cumprob @@;
   datalines;
1 1 1 2 1 3 2 4 3 5 6 6 50 4 60 90
61 91 61 92 65 93 66 94 68 95 70 96 71 97 72 98 72 99 75 100
;
run;

data two;
   input x y z @@;
   datalines;
50 10 100 62 20 100 72 30 200
;
run;

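data need;
   set one (in = a) two (in = b);
   * choose the dimension of k to be at least the maximum of X in dataset ONE;
   array k[100] _temporary_;
   * records from ONE: store cumprob under index x (a later row for the same x overwrites an earlier one);
   if a then do;
      k[x] = cumprob;
   end;
   * records from TWO: look up x and output only when a value was stored;
   if b then do;
      if k[x] then do;
         cumprob = k[x];
         output;
      end;
   end;
run;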
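A more defensive variant of the step above can be sketched as follows. It is only a sketch under stated assumptions: the dataset name NEED2 is hypothetical, x is assumed to hold positive integers, and the bound of 100 is carried over from the array above. It takes the maximum explicitly with MAX (which ignores missing arguments) instead of relying on ONE being sorted by cumprob, tests for a stored value with MISSING so that a cumprob of 0 would still count as a match, and skips any x outside the array bounds.

```sas
data need2;                                /* NEED2 is a hypothetical name */
   array k[100] _temporary_;               /* temporary arrays are retained across rows */
   set one (in = a) two (in = b);
   if a then do;
      /* keep the largest cumprob seen so far for this x */
      if 1 <= x <= dim(k) then k[x] = max(k[x], cumprob);
   end;
   else if b then do;
      /* only look up x values that fall inside the array */
      if 1 <= x <= dim(k) then do;
         if not missing(k[x]) then do;
            cumprob = k[x];
            output;
         end;
      end;
   end;
run;
```

On the example datasets this should select the same rows as the step above (x = 50 and x = 72); x = 62 is still dropped because it has no entry in ONE.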

The output comes to be:

 x    cumprob     y      z

50       4       10    100
72      99       30    200

Is this the result you expect?

Regards,

Muthia Kachirayan

