|
Is there not a simpler way of using a retain statement?
So evenwhile the elements in the Group column will always be random, they
are always ordered [ 95 .... something<100 ]. So in one data step
data _null_;
set mygrouptabnew;
/*97 is in the range of group and retain_group*/
if (retain_group<97<=group) then do;
if group=97 then symput("x97",y);
else symput("x97",retainy);
end;
run;
Since my original table is already sorted by Group, so that the values are
ordered (95 < 95.3 <....), when I do two retain statements, then the
previous (group,y)=(96.9,y969) is joined with (97.2,y972) for a lateral
search and save in the above.
To create the two retain statements?
data mygrouptabnew;
set mygrouptab;
????
run;
Is this alright?
On Mon, Aug 17, 2009 at 6:20 AM, Muthia Kachirayan <
muthia.kachirayan@gmail.com> wrote:
> On Sun, Aug 16, 2009 at 5:27 PM, OR Stats <stats112@gmail.com> wrote:
>
> > Ok, I will try that to see if it runs faster. Another question that is
> > similar in flavor is
> >
> > Group Y
> > 95 y95
> > 95.1 y951
> > .
> > 96.9 y969
> > 97.2 y972
> > 98 y98
> > 99.1 y991
> >
> > I would like to store as macro variables y's for specified values in the
> > Group column. However, when that value is not found in the group and is
> in
> > between two values in the group, I would like the associated y to be the
> > smaller of the two y's. For example,
> > x(95)=y95
> > x(97)=y969
> > x(98)=y98
> > x(99)=y98
> >
> > Note how x(98) and x(99) would be the same. Since my Group is random, I
> > would be repeated this over and over. What is the fastest data step to
> > search and store these x's from the original dataset? Thank you!
> >
> >
>
> OR Stats,
>
> The most efficient method would be the use of Key Indexing. But the use of
> array to do
> it requires the index to be integer. For example, group 95 and 95.1 will
> end
> up to 95
> when it is used as array index as the decimals are truncated automatically.
> Assuming
> that you have no more than one decimal in group, it is multiplied by 10 for
> the use of
> array.
>
> The second assumption that Lookup value does not lie outside the limits of
> group.
> Your values are 95, 97 98 and 99. Suppose you want 10 to be looked up. It
> can be searched
> iteratively until a limit, say 1.0. Any such value will eventually takes up
> running time.
>
> All your lookup values go into a dataset LOOKUP as below. I have added 10
> to
> it for
> checking purpose.
>
>
> data have;
> input group Y $;
> cards;
> 95 y95
> 95.1 y951
> 96.9 y969
> 97.2 y972
> 98 y98
> 99.1 y991
> ;
> run;
>
> data lookup;
> input x;
> cards;
> 10
> 95
> 97
> 98
> 99
> ;
> run;
>
>
> data need(keep = group Y);
> do until(eof);
> set have end = eof;
> newg = group * 10;
> array k[1000] $ _temporary_;
> if k[newg] = ' ' then k[newg] = Y;
> end;
> do until(last);
> set lookup end = last;
> found = 1;
> group = x;
> *** Look in the left neighborhood of X in steps of 1 until found or reaches
> 1.0;
> do i = x * 10 to 1 by -1 while (found);
> if k[i] ne ' ' then do;
> Y = k[i];
> output;
> found = 0;
> end;
> else if i = 1 then
> put group = ' not found';
> end;
> end;
> run;
>
> The next best method is Hash method. There is no need to convert group
> value
> to an
> integer by multiplying by 10. However, to travel to next lower to match,
> decrement
> by 0.1 ends up with a floating point and FIND() method fails to detect the
> match.
> To avoid this problem,it is multiplied by 10 and then the integer-part is
> divided
> by 10.
>
>
> data need(keep = group Y);
> if _n_ = 1 then do;
> if 0 then set have;
> declare hash h(dataset:'have');
> h.definekey('group');
> h.definedata('Y');
> h.definedone();
> end;
> do until(last);
> set lookup end = last;
> found = 1;
> m = 0;
> group = x;
> if h.find() = 0 then output ;
> *** Look in the left neighborhood of X in steps of 0.1 until found or
> reaches 1.0;
> else do x = x - 0.1 to 1.0 by - 0.1 ;
> m = int(x * 10) / 10;
> rc = h.find(key:m);
> if rc = 0 then do;
> output;
> leave;
> end;
> end;
> if m = 1.0 then put group = ' not found';
>
> end;
> run;
>
> Muthia Kachirayan
>
|