Date: Fri, 6 Oct 2006 14:53:20 -0700
Reply-To: David L Cassell <davidlcassell@MSN.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: David L Cassell <davidlcassell@MSN.COM>
Subject: Re: To check whether the value in a SAS dataset is in uppercase
In-Reply-To: <A225EC41EB4D8644809E3C790763095D2A33E8@AUSMAIL.us.dmn>
Content-Type: text/plain; format=flowed
Rob.Workman@sorin-na.com replied:
>
>The documentation supports David's comments that a text literal is not
>repeatedly recompiled on each call to the prxmatch function. However,
>my tests repeatedly show that there still seems to be some overhead with
>the text literal. David's suggestion of the /o does work to eliminate
>the penalty. Great tip David!
Thanks. Useless information *is* my life. :-) :-)
Prior to SAS 9.1, the /o modifier was more useful, since there was not
as much optimization under the hood with regard to when to pass stuff
off to the Perl regex engine and when not to.
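
If you would rather not depend on either /o or the optimizer, the usual
idiom is to compile the pattern yourself on the first iteration and
retain the pattern id. This is only a sketch, not something I have timed.
It assumes the same data set FOO and character variable VAR as in the log
below. The variable name RE is just an illustrative choice.

```sas
data _null_;
   set foo end=end;
   retain re;                                  /* keep the pattern id across iterations */
   if _n_ = 1 then re = prxparse("/[a-z]/");   /* compile once, on the first row */
   if not prxmatch(re, var) then allup_count + 1;
   if end then put allup_count;
run;
```

This does the same job as the DO UNTIL loop version, but keeps the
ordinary implicit data step loop.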
>
>*** No prxparse or /o;
>
>data _null_;
>   set foo end=end;
>   if not prxmatch("/[a-z]/",var) then allup_count+1;
>   if end then put allup_count;
>run;
>
>754828
>NOTE: There were 5000000 observations read from the data set WORK.FOO.
>NOTE: DATA statement used (Total process time):
> real time 1:02.50
> cpu time 6.12 seconds
>
>
>*** Prxparse;
>
>data _null_;
>   rc = prxparse("/[a-z]/");
>   do until (end);
>      set foo end=end;
>      if not prxmatch(rc,var) then allup_count+1;
>   end;
>   put allup_count;
>   stop;
>run;
>
>754828
>NOTE: There were 5000000 observations read from the data set WORK.FOO.
>NOTE: DATA statement used (Total process time):
> real time 36.46 seconds
> cpu time 6.16 seconds
>
>*** /o;
>
>data _null_;
>   set foo end=end;
>   if not prxmatch("/[a-z]/o",var) then allup_count+1;
>   if end then put allup_count;
>run;
>
>754828
>NOTE: There were 5000000 observations read from the data set WORK.FOO.
>NOTE: DATA statement used (Total process time):
> real time 34.44 seconds
> cpu time 6.07 seconds
>
>Rob Workman
>
We have to anticipate some penalty associated with the underlying
process. The SAS developers have found a fast way of passing the
parsing and comparing off to the regex engine. But it's not as simple
as an INDEX() function.
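
For this particular check, a plain character function may be enough,
with no regex engine involved at all. A minimal sketch, again assuming
data set FOO and character variable VAR from the log above, and assuming
that "uppercase" here means "contains no lowercase letter". I have not
timed it against the PRX versions. Note that ANYLOWER follows the session
encoding and locale, so on non-ASCII data it may not agree exactly with
the [a-z] character class.

```sas
data _null_;
   set foo end=end;
   /* ANYLOWER returns the position of the first lowercase letter, or 0 if there is none */
   if anylower(var) = 0 then allup_count + 1;
   if end then put allup_count;
run;
```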
HTCT,
David
--
David L. Cassell
mathematical statistician
Design Pathways
3115 NW Norwood Pl.
Corvallis OR 97330