Date: Wed, 6 Sep 2000 21:38:45 GMT
Reply-To: sashole@mediaone.net
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Paul Dorfman <paul_dorfman@HOTMAIL.COM>
Subject: Re: selecting the second smallest value
Content-Type: text/plain; format=flowed
Kevin,
After some testing, I have found that the fastest solution is the simplest
one:
data _null_;
  array v(22) (0 2 0 2 0 3 0 4 0 5 0 6 0 7 0 8 9 0 4 0 6 0);
  minv = 1e99;   /* start above any value that can occur */
  do i=1 to 22;
    /* skip zeros; keep the smallest positive value seen so far */
    if v(i) > 0 then if v(i) < minv then minv = v(i);
  end;
  put minv=;
run;
MINV=2
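Applied to a data set rather than to a test array, the same loop might look like the sketch below. The data set names QUOTES and WANT are hypothetical, and starting MINV at missing instead of 1e99 is my assumption about what is wanted when every value in the row is zero (MINV then stays missing).

```sas
data want;                  /* hypothetical output data set          */
  set quotes;               /* hypothetical input holding v1-v22     */
  array v(22) v1-v22;
  minv = .;                 /* stays missing if the whole row is 0   */
  do i = 1 to dim(v);
    /* missing compares low, so v(i) > 0 also skips missing values  */
    if v(i) > 0 then
      if minv = . or v(i) < minv then minv = v(i);
  end;
  drop i;
run;
```

For a different grouping of the variables, only the list in the ARRAY statement needs to change.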
However, as an interesting instance of using the direct-addressing functions,
consider the following:
data _null_;
  array v(22) (0 2 0 2 0 3 0 4 0 5 0 6 0 7 0 8 9 0 4 0 6 0);
  /* read all 22*8 bytes, swap each 8-byte zero for an 8-byte missing, */
  /* and write the result back to the same address                     */
  call poke(tranwrd(peekc(addr(v1),22*8),
            put(0,rb8.), put(.,rb8.)), addr(v1),22*8);
  minv = min(of v(*));
  put minv=;
  put v(*);
run;
MINV=2
. 2 . 2 . 3 . 4 . 5 . 6 . 7 . 8 9 . 4 . 6 .
The idea above is simple. Fetch the entire array representation from memory
as a 22*8-byte string. The latter contains some 8-byte chunks representing
zeros as put(0,rb8.). Use TRANWRD to replace all such chunks with the
numeric representation of a missing value, put(.,rb8.), and stick the
transformed string back into memory at the original address addr(v1). Once
CALL POKE is complete, the array contains missing values where it
previously had zeros, and since MIN ignores missing values, it now returns
the smallest nonzero value. Note that the array is not actually needed - I
only used it for the sake of initialization. However, for the method to
work, it is *paramount* that the V-variables be contiguous in the PDV.
Series of variables like V1-V22 usually are, but there might be exceptions,
so this needs to be verified.
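One way to check contiguity, sketched under a few assumptions: fill V1-V22 with distinct test values, build the byte image they should have if stored back to back in order, and compare it with what PEEKC actually returns. The names IMAGE and EXPECTED are mine, and the layout is only guaranteed to match your real step if the variables are created there the same way, so the comparison is best placed in that step itself.

```sas
data _null_;
  array v(22) v1-v22;
  length image expected $176;   /* 22 variables x 8 bytes each        */
  do i = 1 to dim(v);
    v(i) = i;                   /* distinct values, so order matters  */
    /* expected image: RB8. bytes of each variable, back to back      */
    substr(expected, 8*(i-1)+1, 8) = put(v(i), rb8.);
  end;
  image = peekc(addr(v1), 22*8);   /* what is really stored at V1     */
  if image = expected then put 'V1-V22 are contiguous and in order';
  else put 'V1-V22 are NOT contiguous: do not use the POKE approach';
run;
```

This is written for a 32-bit platform such as the one in the question; on 64-bit SAS the ADDRLONG, PEEKCLONG and CALL POKELONG counterparts would be needed instead.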
Kind regards,
========================
Paul M. Dorfman
Jacksonville, Fl
========================
>From: Kevin Brunson <kbrunson@ND.EDU>
>Reply-To: Kevin Brunson <kbrunson@ND.EDU>
>To: SAS-L@LISTSERV.UGA.EDU
>Subject: selecting the second smallest value
>Date: Wed, 6 Sep 2000 12:52:34 -0500
>
>Configuration: SAS 6.12, NT, 600+ MhZ Pentium, many gigs of disk.
>Data: 250,000 obs X 22 variables
>
>I'm looking for the best (I realize the implications of this adjective) way
>to select the smallest nonzero value (negative values don't exist) among
>several variables. For instance, if I want the smallest of v1-v3 and all
>are positive then the min function works. However, if any of v1-v3 are zero
>(it's possible all can be zero) then it fails. For now, I resorted to simple
>if-then statements that recode 0's to a large value temporarily. I can think
>of ways to accomplish this using retain, select, indexing, inverting and
>then using the max function, etc. but maybe someone has experience with this
>problem?
>
>Oh yeah, I'm still a data step programmer and want to avoid introducing proc
>sql, but I wouldn't ignore a slick solution using it.
>
>As you can gather from the configuration and data, resources generally won't
>be a problem, but right now 2 different combinations of the 22 variables has
>to be analyzed and in the future it will likely expand to more groupings and
>dozens of different datasets (I'm programming for a new prof and don't know
>how it will expand).
>
>Here's an example of the data (it's interday bid/ask quotes)
>
>v1 v2 v3
>0 3.75 0
>0 3.75 0
>3.125 3.625 0
>3.125 3.75 3.0
>
>--
>-----------------------------------------------------------
>Kevin Brunson, Research Analyst 219-631-9448
>314 Mendoza College of Business FAX 219-631-5255
>Notre Dame, Indiana 46556
>-----------------------------------------------------------
>---- One way smokers pay for themselves is by dying early.
>---- Thomas Humber, President, National Smokers Alliance