LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (November 2002, week 4)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:   Fri, 22 Nov 2002 10:10:58 -0500
Reply-To:   Larry Bertolini <bertolini.1@OSU.EDU>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   Larry Bertolini <bertolini.1@OSU.EDU>
Organization:   Ohio State University
Subject:   Re: Parsing a comma delimited column
Content-Type:   text/plain; charset=us-ascii

One caveat to keep in mind, when working with the SCAN function...

If you don't explicitly specify a length for character variables, SAS assigns a length, based on its "best guess". For example, in the following data step, variable "foo" is assigned a length of 6 bytes:

data; foo = substr("abcdefghijklmnop",1,6);

In my experience, variables created by the SCAN function tend to be assigned a length of 200 bytes.

So you might want to add:

length varA $6 varB $4 varC $2;

Philip_Crane@WORKCOVER.VIC.GOV.AU wrote:

> Michelle, > > You could use the scan function to seperate the variables. Scan returns > character variables so I included some lines to convert varB to numeric > because you refered to '_' as being missing value. > > Hope this gets you started. > > Philip > 1 data test; > 2 length var1 $ 16; > 3 var1 = "abcdab,1984,ca"; > 4 output; > 5 var1 = "abcdab,_,ca"; > 6 output; > 7 run; > 8 > 9 data new(drop=var1); > 10 set test; > 11 varA = scan(var1,1,','); > 12 varB = scan(var1,2,','); > 13 varC = scan(var1,3,','); > 14 if varB = '_' > 15 then varBn = .; > 16 else varBn = input(varB,$4.); > 17 run; > 18 > 19 proc print data=new; > 20 run; > > Michelle > Jellinghaus To: SAS-L@LISTSERV.UGA.EDU > <michelle@EMODE.C cc: > OM> Subject: Parsing a comma delimited column > Sent by: "SAS(r) > Discussion" > <SAS-L@LISTSERV.U > GA.EDU> > > 22/11/2002 09:44 > AM > Please respond to > Michelle > Jellinghaus > > Hi folks, > > I have a datafile, pulled from an oracle database with an SQL query, that > has a few columns which contain comma delimited value strings of varying > length. I would like to read the values within these columns into SAS as a > series of individual variables/columns by splitting the string on the > commas. > > Here's a sample: > > VAR1 is one of many variable of its kind within a single data set. > VAR1 has values like "abcdab,1984,ca" > The objective is to parse this string (housed within the existing VAR1 > column) into 3 separate variables: > varA: "abcdab" > varB: "1984" > varC: "ca" > If a value is missing, it is represented by a single underscore. For > example: > if VAR1 = "abcdab,_,ca", then > varA = "abcdab" > varB = "_" > varC = "ca" > thereby shortening the string from a max possible of 14 characters to > 11 > characters due to a missing value (varB) which could be up to 4 characters > in length. > > Does anyone have any suggestions? > > Thank you, > > Michelle


Back to: Top of message | Previous page | Main SAS-L page