Date: Thu, 5 Oct 2006 08:20:46 -0600
Reply-To: "Workman, Rob" <Rob.Workman@SORIN-NA.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Workman, Rob" <Rob.Workman@SORIN-NA.COM>
Subject: Re: To check whether the value in a SAS dataset is in uppercase
Content-Type: text/plain; charset="us-ascii"
If the regular expression is compiled once, outside the loop, with the
prxparse function, the two methods are on par with each other, at least
in terms of real time (cpu time still favors upcase, 3.30 vs. 5.96 seconds).
data _null_;

   rc = prxparse("/[a-z]/");
   do until (end);
      set foo end=end;
      if not prxmatch(rc,var) then allup_count+1;
   end;

   put allup_count;
   stop;
run;
754828
NOTE: There were 5000000 observations read from the data set WORK.FOO.
NOTE: DATA statement used (Total process time):
real time 42.35 seconds
cpu time 5.96 seconds

data _null_;

   do until (end);
      set foo end=end;

      if upcase(var) = var then allup_count+1;
   end;

   put allup_count;
   stop;
run;
754828
NOTE: There were 5000000 observations read from the data set WORK.FOO.
NOTE: DATA statement used (Total process time):
real time 42.40 seconds
cpu time 3.30 seconds
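
The same compile-once idea can also be written with the ordinary implicit
data step loop instead of an explicit do until. This is only a sketch and
was not timed for this thread; the variable name re is an assumption, while
foo and var are the dataset and variable from the benchmark.

```sas
data _null_;
   /* Compile the pattern on the first iteration only; RETAIN keeps */
   /* the pattern id from being reset to missing on later rows.     */
   retain re;
   if _n_ = 1 then re = prxparse("/[a-z]/");

   set foo end=end;

   /* No lowercase letter found: count the value as all uppercase. */
   if not prxmatch(re, var) then allup_count + 1;

   if end then put allup_count=;
run;
```

The explicit do until form above reaches the same result by running the
whole read inside a single iteration of the data step.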
Rob Workman
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of
Richard A. DeVenezia
Sent: Wednesday, October 04, 2006 6:08 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: To check whether the value in a SAS dataset is in uppercase
plw213 wrote:
> If you are using SAS V9, you can use regular expressions to check for
> uppercase
> by using something like this:
>
> if not prxmatch("/[a-z]/",var) then result="all uppercase";
For small-scale problems, the difference between prxmatch(pat,x) and
upcase(x)=x is pretty negligible. However, in a simple benchmark on my
system, as the data volume grows, the upcase way has speed advantages.
-------------------------------------
data foo;
   seed = 1234;
   do rowid = 1 to 5e6;
      length var $100;
      do until (length(var)=100);
         var = cats(var,repeat(byte(20+108*ranuni(seed)),30*ranuni(seed)));
      end;
      output;
      var = '';
   end;
   keep rowid var;
run;

sasfile foo open;

data _null_;
   set foo;
run;

data _null_;
   set foo;
run;

data _null_;
   set foo end=end;
   if upcase(var) = var then allup_count+1;
   if end then put allup_count;
run;

data _null_;
   set foo end=end;
   if not prxmatch("/[a-z]/",var) then allup_count+1;
   if end then put allup_count;
run;

sasfile foo close;
-------------------------------------
data _null_;
   set foo end=end;
   if upcase(var) = var then allup_count+1;
   if end then put allup_count;
run;
754828
NOTE: There were 5000000 observations read from the data set WORK.FOO.
NOTE: DATA statement used (Total process time):
real time 2.45 seconds
cpu time 2.45 seconds

data _null_;
   set foo end=end;
   if not prxmatch("/[a-z]/",var) then allup_count+1;
   if end then put allup_count;
run;
754828
NOTE: There were 5000000 observations read from the data set WORK.FOO.
NOTE: DATA statement used (Total process time):
real time 4.35 seconds
cpu time 4.36 seconds
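
As a quick sanity check that the two tests being timed flag the same rows,
a sketch like the following can be run on a handful of hand-picked values.
The sample strings and the flag names by_upcase and by_prx are made up for
illustration and are not part of the benchmark.

```sas
data _null_;
   length var $20;
   /* Hypothetical sample values: mixed content, mixed case, all lower, blank. */
   do var = 'ABC 123', 'Abc', 'abc', ' ';
      by_upcase = (upcase(var) = var);            /* 1 if upcasing changes nothing */
      by_prx    = (not prxmatch("/[a-z]/", var)); /* 1 if no lowercase letter found */
      put var= by_upcase= by_prx=;
   end;
run;
```

The two flags should agree on every row. Note that a value with no letters
at all, including a blank one, counts as uppercase under either test.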
Richard A. DeVenezia
http://www.devenezia.com/downloads/sas/samples