```
Date:     Thu, 12 Jun 2008 19:43:20 -0400
Reply-To: "Howard Schreier "
Sender:   "SAS(r) Discussion"
From:     "Howard Schreier "
Subject:  Re: group count and plot

On Tue, 10 Jun 2008 14:27:18 -0400, Amit Sharma wrote:

>Dear All,
>
>As I am new to the SAS world, am again stuck and need help.
>
>There is a dataset (name = ds) having var 'A' and 'B'.
>'A' is a binary var (attribute = 0 or 1). I want to plot a graph of
>fraction of '0's versus fraction of '1's of var A. These values of 'A' are
>already arranged (and have to be kept like that only) in increasing order
>of a var B (var B is a continuous variable).
>
>So, I want to find that "when one of the attributes(say '0') reaches a
>multiple of decile (10%, 20%, 30%...100%), what part of the other
>attribute had been covered till then".
>
>For plotting the graph, I would want this kind of a table:
>PLEASE NOTE THAT one of the attribute will surely reach 100% before the
>other. So, there have to be 11 pair of x and y values.
>
>Dummy Table: plot_this
>Attribute=0   Attribute=1
>10%           3%
>20%           7%
>30%           12%
>40%           19%
>50%           24%
>60%           30%
>70%           37%
>80%           57%
>90%           74%
>100%          89%
>100%          100%
>
>Thanks and Regards,
>Amit

Test data:

   %let size=16;

   data ds;
      do _n_ = 1 to &size;
         A = round(ranuni(23) + _n_ / (2 * &size) );
         B + floor(ranuni(23) * 5);
         output;
         end;
      run;

Note that you might have had more replies, and sooner, if you had
provided such data.

Next count the 1's and 0's:

   proc summary data=ds nway;
      class a;
      output out=count0(rename = (_freq_=count0) where = (a=0) );
      output out=count1(rename = (_freq_=count1) where = (a=1) );
      run;

Now it's possible to compute the cumulative counts and percentages:

   data plotpoints(drop = count0 count1);
      /* read the two totals once; SET variables are retained */
      if _n_=1 then do;
         set count0(keep = count0);
         set count1(keep = count1);
         end;
      seq + 1;
      set ds;
      /* running counts of 0's and 1's, in the existing order of ds */
      A_0cum + (1 - a);
      A_1cum +      a ;
      A_0cumpct = 100 * a_0cum / count0;
      A_1cumpct = 100 * a_1cum / count1;
      run;

Results:

   seq    A     B    A_0cum    A_1cum    A_0cumpct    A_1cumpct

     1    0     1       1         0          20          0.000
     2    0     5       2         0          40          0.000
     3    0     6       3         0          60          0.000
     4    1     7       3         1          60          9.091
     5    0     9       4         1          80          9.091
     6    1     9       4         2          80         18.182
     7    1    11       4         3          80         27.273
     8    0    15       5         3         100         27.273
     9    1    19       5         4         100         36.364
    10    1    20       5         5         100         45.455
    11    1    22       5         6         100         54.545
    12    1    25       5         7         100         63.636
    13    1    27       5         8         100         72.727
    14    1    29       5         9         100         81.818
    15    1    32       5        10         100         90.909
    16    1    35       5        11         100        100.000

Then you can subset and interpolate, but I just let the last two columns
go into the plot:

   proc gplot data=plotpoints;
      plot A_1cumpct * A_0cumpct;
      run;
      quit;
```
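The reply leaves the "subset and interpolate" step to the reader. One way it might be sketched is below. This is not code from the original message: the dataset name decilepoints and the variables target and emitted are made up for illustration, and it uses a simple step lookup (the first row at which A_0cumpct reaches each multiple of 10) rather than true interpolation. It assumes plotpoints exists as built above and is still in its original order.

```sas
/* Sketch only: decilepoints, target and emitted are hypothetical names. */
data decilepoints(keep = target A_1cumpct);
   set plotpoints end=last;
   retain target 10;     /* next decile mark of the 0's still to be reached */
   emitted = 0;
   /* a single row can cross several decile marks when there are few 0's */
   do while (target <= 100 and A_0cumpct >= target);
      output;
      emitted = 1;
      target + 10;
      end;
   /* closing (100, 100) point, unless the loop wrote it on this same row */
   if last and not emitted then do;
      target = 100;
      output;
      end;
   run;
```

With the 16-row test data above, this should give 11 rows: targets 10 through 60 paired with 0, 70 and 80 with 9.091, 90 and 100 with 27.273, and a final 100 paired with 100. Because only 5 of the rows have A=0, each of those rows covers two decile marks; with more data the steps would be finer.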
