LISTSERV at the University of Georgia
Date:         Thu, 19 Mar 2009 06:04:08 -0400
Reply-To:     Jim Groeneveld <jim.1stat@YAHOO.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Jim Groeneveld <jim.1stat@YAHOO.COM>
Subject:      Re: parsing delimited string to variables
Comments: To: jain.vineet.i@GMAIL.COM

Hi Vineet,

For that purpose we have the data step function SCAN and the macro functions %SCAN and %QSCAN. If your * delimited values are part of a dataset (multiple records), there are two situations to distinguish:
1. the number of delimited values in the variable is constant over all records;
2. the number of delimited values in the variable varies between records.

In the first case the number of variables to create is known: the number of delimiters plus 1. But we cannot directly define the necessary ARRAY with a dimension that is only known at run time, because ARRAYs are created at compile time. So the count is determined in a preceding step and passed on in a macro variable.

In the second case the number is initially unknown (the first record only gives the number for that record). It can be determined in a separate preceding pass through the dataset by taking the maximum number of delimiters (+ 1) over all records. This maximum can be stored in a macro variable.

Suppose Val_List is the variable containing the delimited character values;
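To try the code on something concrete, a small test dataset can be made first. This is only a sketch: the dataset name Old and the variable Val_List follow the text, but the values and the length of 40 are made up for illustration.

```sas
* Hypothetical test data, the number of values varies between records;
DATA Old;
  INFILE DATALINES TRUNCOVER;
  INPUT Val_List $CHAR40.;
  DATALINES;
Measure1*Measure2*Measure3
10*20
;
RUN;
```

With these two records case 2 applies, and the longest record has three values.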

* (Untested) Example code for case 1: count delimiters from first record;
DATA _NULL_;
  SET Old;
  * LENGTHN returns 0 when there is no delimiter at all, LENGTH would return 1;
  Elements = LENGTHN ( COMPRESS ( Val_List, '*', 'K' ) ) + 1;
  CALL SYMPUT ('Elements', TRIM ( LEFT ( PUT ( Elements, 8. ) ) ) );
  STOP; * prevent further looping through the dataset;
RUN;

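* (Untested) Example code for case 2: count delimiters from longest record;
DATA _NULL_;
  SET Old END=Last;
  RETAIN Elements;
  * LENGTHN returns 0 when there is no delimiter at all, LENGTH would return 1;
  N_Values = LENGTHN ( COMPRESS ( Val_List, '*', 'K' ) ) + 1;
  Elements = MAX ( Elements, N_Values);
  * store the maximum once, after the last record;
  IF Last THEN CALL SYMPUT ('Elements', TRIM ( LEFT ( PUT ( Elements, 8. ) ) ) );
RUN;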

* (Untested) Example code for both cases where;
* the size of the single dimension is known;
%PUT (Maximum) Number of elements determined = &Elements;

DATA New (DROP=I Val_List);
  SET Old;
  * character array, the length of 200 is an assumption, adjust it to your data;
  ARRAY Element (&Elements) $ 200;
  DO I = 1 TO DIM(Element);
    Element(I) = SCAN ( Val_List, I, '*');
  END;
RUN;

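The results are of type character. If you only have digits and want to convert them into numeric variables, then define a second, numeric array (say Nelement, with the same dimension) and inside the DO loop use: Nelement(I) = INPUT ( Element(I), BEST32. );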

If you have consecutive delimiters, SCAN counts them as one, so the array may be larger than needed, but that is not a problem. You won't get the embedded empty values, though. If you do need those, it requires a lot of additional SAS code that analyzes the string without SCAN (e.g. using INDEX and SUBSTR); I will give some example code if you really need that.
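If you happen to run SAS 9.2 or later, there may be a shorter route than INDEX and SUBSTR: from that release on SCAN accepts a fourth argument with modifiers, and the M modifier makes consecutive delimiters delimit empty values. The following is an untested sketch under that assumption; the dataset names Gaps and Gaps_Split, the test value and the fixed dimension of 3 are made up for illustration (in practice the dimension would be &Elements).

```sas
* Assumes SAS 9.2 or later, the M modifier does not exist in earlier releases;
DATA Gaps;
  INFILE DATALINES TRUNCOVER;
  INPUT Val_List $CHAR40.;
  DATALINES;
A**C
;
RUN;

DATA Gaps_Split (DROP=I);
  SET Gaps;
  ARRAY Element (3) $ 40;
  DO I = 1 TO DIM(Element);
    * with M the empty value between the two delimiters counts as value 2;
    Element(I) = SCAN ( Val_List, I, '*', 'M' );
  END;
RUN;
```

Element1 should then be A, Element2 blank and Element3 C. Without the M modifier Element2 would get C and Element3 would stay blank.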

If your * delimited values are part of a (single) macro variable's value, then the code to split it up into parts needs a macro %DO loop. That cannot be done in open code, so a real macro has to be defined and run. There are two alternative ways:
3. counting the number of delimiters (+ 1) and using an iterative %DO loop;
4. no counting, but continuing to create new macro variables until the values run out, i.e. until a value part is empty. This would exclude non-empty values after two successive delimiters.

Suppose the macro variable Val_List contains the delimited macro values;

%* (Untested) Example code for case 3;
%MACRO DoLoop;
  %LOCAL I;
  %* keep only the delimiters and count them;
  %LET Elements = %EVAL ( %LENGTH ( %SYSFUNC(COMPRESS(&Val_List,*,K)) ) + 1 );
  %DO I = 1 %TO &Elements;
    %GLOBAL Element&I;
    %LET Element&I = %SCAN ( &Val_List, &I, *);
  %END;
%MEND DoLoop;

%DoLoop

If you have consecutive delimiters, %SCAN (or %QSCAN) counts them as a single one, so you'll miss the empty values and end up with fewer elements. If you also need to find the empty values, it would take a lot more macro code using %INDEX and %SUBSTR, of which I won't give an example here.

%* (Untested) Example code for case 4;
%MACRO WhiLoop;
  %LOCAL I Element;
  %LET I = 1;
  %LET Element = %SCAN ( &Val_List, &I, *);
  %DO %WHILE (&Element NE);
    %GLOBAL Element&I;
    %LET Element&I = &Element;
    %LET I = %EVAL ( &I + 1 );
    %LET Element = %SCAN ( &Val_List, &I, *);
  %END;
%MEND WhiLoop;

%WhiLoop
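The same idea as case 4 can also be written with parameters, so that the list, the name prefix and the delimiter are not fixed inside the macro. This is an untested sketch, not a definitive version: the macro name SplitList, its parameters List=, Prefix= and Dlm=, and the extra counter variable (prefix followed by N) are all hypothetical names chosen for illustration. It assumes the list contains no commas.

```sas
%MACRO SplitList (List=, Prefix=Element, Dlm=*);
  %LOCAL I Part;
  %LET I = 1;
  %LET Part = %SCAN(&List, &I, &Dlm);
  %* continue until %SCAN returns an empty value;
  %DO %WHILE (%LENGTH(&Part) > 0);
    %GLOBAL &Prefix&I;
    %LET &Prefix&I = &Part;
    %LET I = %EVAL(&I + 1);
    %LET Part = %SCAN(&List, &I, &Dlm);
  %END;
  %* store the number of values found, e.g. in ElementN;
  %GLOBAL &Prefix.N;
  %LET &Prefix.N = %EVAL(&I - 1);
%MEND SplitList;

%SplitList (List=Measure1*Measure2*Measure3)
%PUT Element1=&Element1 Element2=&Element2 Element3=&Element3 Count=&ElementN;
```

Testing %LENGTH instead of comparing the value with NE is a design choice here: it avoids evaluating the value itself as part of an expression.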

Good luck.

Regards - Jim.
--
Jim Groeneveld, Netherlands
Statistician, SAS consultant
home.hccnet.nl/jim.groeneveld

My computer, my wife and I will attend SGF 2009 near Washington.

On Wed, 18 Mar 2009 12:09:25 -0700, jain.vineet.i@GMAIL.COM wrote:

>I am trying to parse a delimited string into variables. I do not know
>how many values are there in the string.
>e.g. tempDimensionName= Measure1* Measure2* Measure3* Measure4*
>Measure5...Measure<n>
>DimensionCount = Occurances of '*' in the above string
>I want to get: Variable1=Measure1 Variable2=Measure2 ...
>Variable<n>=Measure<n>
>This is what I wrote which is of course wrong.
>
>Data _NULL_ ;
> %LET Y = 0;
> %Global Looping;
> %LET Looping = %EVAL(&DimensionCount+1) ;
> DO I = 1 TO &Looping BY 1;
> %LET y = %eval(&y +1);
> %Let Xt = %SCAN(&tempDimensionName,&Y,*) ;
> CALL SYMPUT (RESOLVE('Variable')||left(I), &Xt);
> END;
>run;
>
>A little help in writing an efficient code for this is greatly
>appreciated.
>
>Thanks,
>
>Vineet.

