LISTSERV at the University of Georgia
Date:         Sat, 30 Apr 2005 10:30:49 -0400
Reply-To:     Peter Crawford <peter.crawford@BLUEYONDER.CO.UK>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Peter Crawford <peter.crawford@BLUEYONDER.CO.UK>
Subject:      Re: Count

MEA CULPA correction:

If you don't want a missing value treated as a valid "value", the compare loop can be revised from

> do i= 2 to dim( _mthd1_a );
>   if _mthd1_a(i) ne _mthd1_a(i-1) then uniques +1 ;
> end;

to

do i= dim( _mthd1_a ) to 2 by -1 while( _mthd1_a(i-1) ne . ) ;
  uniques +( _mthd1_a(i) ne _mthd1_a(i-1) ) ;
end;

( Changing the order of the compare to "top down" lets the while() clause stop the counting as soon as a missing value is reached. This works because call sortn() sorts ascending, which leaves the missings at the low end of the array. )
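
For anyone who wants to try the revision, here is a minimal sketch of the whole step with the top-down loop in place, using the sample data from the quoted post. Two things in it are assumptions rather than part of the original step: the n() guard that starts the count at 0 when every value in the row is missing (the loop as posted would report 1 for such a row), and the drop statement. It also only tests for the ordinary missing value ( . ), not special missings such as .A.

```sas
* sample data as posted in the original question ;
data starting ;
   input id $ x1-x6 ;
   cards;
a 1234 2345 0349 1234 5678 1234
b 7890 0123 7890 7890 . 0123
c 1234 4567 8901 3456 2456 0012
d 0987 8765 3456 . . .
;

data wanted ;
   set starting ;
   array _mthd1_a(*) x: ;

   * sort the row ascending so that missing values end up at the low end ;
   call sortn( of _mthd1_a(*) ) ;

   * assumption: start at 1 only when the row has a non-missing value ;
   uniques = ( n( of _mthd1_a(*) ) > 0 ) ;

   * compare top down and stop once the lower neighbour is missing ;
   do i = dim( _mthd1_a ) to 2 by -1 while( _mthd1_a(i-1) ne . ) ;
      uniques + ( _mthd1_a(i) ne _mthd1_a(i-1) ) ;
   end ;

   put id= uniques= ;
   drop i ;
run ;
```

Note that call sortn() reorders x1-x6 in place, so WORK.WANTED holds the sorted values. The quoted step keeps a copy of the original order in a temporary array for that reason.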

MEA CULPA let me know if you want log results to prove this improvement

Peter Crawford

On Sat, 30 Apr 2005 09:55:42 -0400, Peter Crawford <peter.crawford@BLUEYONDER.CO.UK> wrote:

>>-----Original Message-----
>>Date: Sat, 30 Apr 2005 01:14:10 -0400
>>Reply-To: Thomas tythong@YAHOO.COM
>>Subject: Count
>>
>>Hi,
>>
>>May I know how to count the non-missing variables with different numbers
>>but treat the same numbers as one count?
>>
>>*data;
>>id x1 x2 x3 x4 x5 x6
>>a 1234 2345 0349 1234 5678 1234
>>b 7890 0123 7890 7890 . 0123
>>c 1234 4567 8901 3456 2456 0012
>>d 0987 8765 3456 . . .
>>
>>*expected;
>>id x1 x2 x3 x4 x5 x6 count_diff
>>a 1234 2345 0349 1234 5678 1234 4
>>b 7890 0123 7890 7890 . 0123 2
>>c 1234 4567 8901 3456 2456 0012 6
>>d 0987 8765 3456 . . . 3
>--------------------------------------------------------------------
>
>*the data ;
>data starting ;
> input id $ x1-x6 ;
> list; cards;
>a 1234 2345 0349 1234 5678 1234
>b 7890 0123 7890 7890 . 0123
>c 1234 4567 8901 3456 2456 0012
>d 0987 8765 3456 . . .
>;/*
>The following approach uses complexity in order to support
>extremes of data volume. Probably no solution is universal
>and the may be less suitable for cases where step-
>boundaries can be crossed.
>
>Having suffered from data quantities too substantial to sort,
>I prefer a single pass solution, to this query:
> "count distinct values within the row"
>I guess I should be advocating the sas9 hash table and iteration
>objects, but since the sample data leaves little to the
>imagination, may I offer a array based solution, along with
>this novel feature of SAS9
>
>NOTE: The SORTN function or routine is experimental in release 9.1.
>
>*/
>data wanted ;
> set starting ;
>
> array _mthd1_a(*) x: ;
>
> * keep a copy of the original array values "in order" ;
> array _mthd1_b(10000) _temporary_;
> do i= 1 to dim( _mthd1_a) ;
>   _mthd1_b( i ) = _mthd1_a( i );
> end; *could be faster with call poke !! ;
>
> * sort those columns;
> call sortn( of _mthd1_a(*) );
>
> * count the unique values ;
> uniques = 1 ;
> do i= 2 to dim( _mthd1_a );
>   if _mthd1_a(i) ne _mthd1_a(i-1) then uniques +1 ;
> end;
>
> do i= 1 to dim( _mthd1_a) ;
>   put _mthd1_b( i ) 6. @;
> end;
> put +2 uniques= ;
>
>run;
>
> =======================================================================
>
>That data step generated these lines in my log
>NOTE: The SORTN function or routine is experimental in release 9.1.
>  1234  2345   349  1234  5678  1234  uniques=4
>  7890   123  7890  7890     .   123  uniques=3
>  1234  4567  8901  3456  2456    12  uniques=6
>   987  8765  3456     .     .     .  uniques=4
>NOTE: There were 4 observations read from the data set WORK.STARTING.
>NOTE: The data set WORK.WANTED has 4 observations and 9 variables.
>NOTE: DATA statement used (Total process time):
>      real time 0.52 seconds
>      cpu time 0.03 seconds
>===================================================================
>
>On the issue of "experimental" ...............................
>
>The sortn() call routine can be found documented in
>http://support.sas.com/rnd/base/index-datastep.html
>at
> Papers
> DATA Step in Version 9: What's New? (updated for SUGI 29)
>
>whose link points to
>http://support.sas.com/rnd/base/topics/datastep/dsv9-sugi-v3.pdf
>
>"SUGI29" implies that the paper defines the situation a year ago.
>
>If we make use of these facilities, they'll move from
> experimental to production
> more quickly.
>So, please let me encourage your use of these ::
> call sortn()
>and
> call sortc()
>
>The more we experiment, the sooner they'll be "production"
>
>(ensure sas customer support hear your feedback on issues)
>
> I think there is nothing as "testing" as "live use" by
> this community !!
>
>"sort-on sas_L -ers" !!
>
>Peter Crawford

