Date: Sat, 30 Apr 2005 10:30:49 -0400
Reply-To: Peter Crawford <peter.crawford@BLUEYONDER.CO.UK>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Peter Crawford <peter.crawford@BLUEYONDER.CO.UK>
Subject: Re: Count
MEA CULPA
correction:
if someone doesn't like a missing value being treated as a valid "value",
then the compare loop can be revised, from
> do i= 2 to dim( _mthd1_a );
> if _mthd1_a(i) ne _mthd1_a(i-1) then uniques +1 ;
> end;
to
do i= dim( _mthd1_a ) to 2 by -1 while( _mthd1_a(i-1) ne . ) ;
   uniques +( _mthd1_a(i) ne _mthd1_a(i-1) ) ;
end;
( call sortn() places missing values first, so changing the
order of the compare to "top down" allows the until/while
feature to terminate counting once any missings are discovered )
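
For anyone who wants to paste the whole thing, here is a minimal sketch of the complete step with the revised loop. It assumes the STARTING data set from the quoted post below. Two things are not in the original: uniques starts from the largest sorted value, so an all-missing row should give 0 rather than 1, and i is dropped. Note that call sortn() leaves x1-x6 in sorted order in the output data set.

```sas
data wanted ;
   set starting ;
   array _mthd1_a(*) x: ;

   /* sort the row; missing values end up at the front */
   call sortn( of _mthd1_a(*) ) ;

   /* assumption (not in the original): start at 0 if even the
      largest value is missing, i.e. the whole row is missing */
   uniques = ( _mthd1_a( dim( _mthd1_a ) ) ne . ) ;

   /* walk from the top down, stopping at the first missing value */
   do i = dim( _mthd1_a ) to 2 by -1 while( _mthd1_a(i-1) ne . ) ;
      uniques + ( _mthd1_a(i) ne _mthd1_a(i-1) ) ;
   end ;

   drop i ;
run ;
```

Worked by hand against the four sample rows, this should give 4, 2, 6 and 3, which matches the count_diff column Thomas asked for.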
MEA CULPA
let me know if you want log results to prove this improvement
Peter Crawford
On Sat, 30 Apr 2005 09:55:42 -0400, Peter Crawford
<peter.crawford@BLUEYONDER.CO.UK> wrote:
>>-----Original Message-----
>>Date: Sat, 30 Apr 2005 01:14:10 -0400
>>Reply-To: Thomas tythong@YAHOO.COM
>>Subject: Count
>>
>>Hi,
>>
>>May I know how to count the non-missing variables with different numbers
>>but treat the same numbers as one count?
>>
>>*data;
>>id x1 x2 x3 x4 x5 x6
>>a 1234 2345 0349 1234 5678 1234
>>b 7890 0123 7890 7890 . 0123
>>c 1234 4567 8901 3456 2456 0012
>>d 0987 8765 3456 . . .
>>
>>*expected;
>>id x1 x2 x3 x4 x5 x6 count_diff
>>a 1234 2345 0349 1234 5678 1234 4
>>b 7890 0123 7890 7890 . 0123 2
>>c 1234 4567 8901 3456 2456 0012 6
>>d 0987 8765 3456 . . . 3
>--------------------------------------------------------------------
>
>*the data ;
>data starting ;
> input id $ x1-x6 ;
> list; cards;
>a 1234 2345 0349 1234 5678 1234
>b 7890 0123 7890 7890 . 0123
>c 1234 4567 8901 3456 2456 0012
>d 0987 8765 3456 . . .
>;/*
>The following approach accepts some complexity in order to support
>extremes of data volume. Probably no solution is universal,
>and this one may be less suitable for cases where step
>boundaries can be crossed.
>
>Having suffered from data quantities too substantial to sort,
>I prefer a single-pass solution to this query:
> "count distinct values within the row"
>I guess I should be advocating the SAS9 hash table and iterator
>objects, but since the sample data leaves little to the
>imagination, may I offer an array-based solution, along with
>this novel feature of SAS9:
>
>NOTE: The SORTN function or routine is experimental in release 9.1.
>
>*/
>data wanted ;
> set starting ;
>
>
> array _mthd1_a(*) x: ;
>
> * keep a copy of the original array values "in order" ;
> array _mthd1_b(10000) _temporary_;
> do i= 1 to dim( _mthd1_a) ;
>    _mthd1_b( i ) = _mthd1_a( i );
> end; *could be faster with call poke !! ;
>
> * sort those columns (missing values sort first);
> call sortn( of _mthd1_a(*) );
>
> * count the unique values ;
> uniques = 1 ;
> do i= 2 to dim( _mthd1_a );
>    if _mthd1_a(i) ne _mthd1_a(i-1) then uniques +1 ;
> end;
>
> * print the original values, then the count ;
> do i= 1 to dim( _mthd1_a) ;
>    put _mthd1_b( i ) 6. @;
> end;
> put +2 uniques= ;
>
>
>
>run;
>
> =======================================================================
>
>That data step generated these lines in my log
>NOTE: The SORTN function or routine is experimental in release 9.1.
> 1234 2345 349 1234 5678 1234 uniques=4
> 7890 123 7890 7890 . 123 uniques=3
> 1234 4567 8901 3456 2456 12 uniques=6
> 987 8765 3456 . . . uniques=4
>NOTE: There were 4 observations read from the data set WORK.STARTING.
>NOTE: The data set WORK.WANTED has 4 observations and 9 variables.
>NOTE: DATA statement used (Total process time):
> real time 0.52 seconds
> cpu time 0.03 seconds
>===================================================================
>
>On the issue of "experimental" ...............................
>
>The sortn() call routine can be found documented in
>http://support.sas.com/rnd/base/index-datastep.html
>at
> Papers
> DATA Step in Version 9: What's New? (updated for SUGI 29)
>
>whose link points to
>http://support.sas.com/rnd/base/topics/datastep/dsv9-sugi-v3.pdf
>
>"SUGI29" implies that the paper defines the situation a year ago.
>
>If we make use of these facilities, they'll move from
> experimental to production
> more quickly.
>So, please let me encourage your use of these ::
> call sortn()
>and
> call sortc()
>
>
>The more we experiment, the sooner they'll be "production"
>
>(ensure sas customer support hear your feedback on issues)
>
> I think there is nothing as "testing" as "live use" by
> this community !!
>
>
>"sort-on sas_L -ers" !!
>
>
>
>Peter Crawford