LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (August 2007, week 2)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Sun, 12 Aug 2007 12:23:15 -0700
Reply-To:     Hari <excel_hari@YAHOO.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Hari <excel_hari@YAHOO.COM>
Organization: http://groups.google.com
Subject:      Attribute and transactional level data as single record
Comments: To: sas-l@uga.edu
Content-Type: text/plain; charset="iso-8859-1"

Hi,

I have received customer level data from a party as a flat-file (Character 29 is the delimite) and it has been delivered in a very unweildy format, and I cant ask them to re-do because of constraints in data pulls.

The customer level information like their unique ID, name etc has been unwittingly combined with all the transactional level data (lets say their purchases) and everything is put in a SINGLE record.

Basically, different records of each particular kind of transaction of a customer has been compressed together with a pipe delimiter seperating them and this has been put as a single field value as part of the customer table.

INPUT Sample data (# represents character 29 in the below data, 2 consecutive # represent missing field value)

ID#Name#JoinDate#TransactType1_ProdBought#TransactType2_OperatingSysUsed#LastInteractionDate 223#Foo#23-May-03#Product1|Product99|Product3#Windows XP|Windows Vista#18-Jun-06 56#Weq#30-Jun-02#Product97|Product2#Windows Vista#10-Apr-04 786#Rte#12-Sep-04#Product1|Product2|Product4|Product4##18-Aug-05

I want to parse the each of the pipe delimted values within the transactional kind of fields in to seperate new fields

Output desired within SAS (using # below to represent 2 different fields in SAS)

ID#Name#JoinDate#TransactType1_ProdBought1#TransactType1_ProdBought2#TransactType1_ProdBought3#TransactType1_ProdBought4#TransactType2_OperatingSysUsed1#TransactType2_OperatingSysUsed2#LastInteractionDate 223#Foo#23-May-03#Product1#Product99#Product3#Windows XP|Windows Vista#18-Jun-06 56#Weq#30-Jun-02#Product97#Product2#Windows Vista#10-Apr-04 786#Rte#12-Sep-04#Product1#Product2#Product4#Product4##18-Aug-05

I have 20 different kinds of transactional fields and the maximum number of values within each transaction field is unknown. Some customers might have just 1 or no value and others might have as much as 20 different values (like 20 different products bought). The number of new fields created for each transactional field will depend on the maximum of the count of pipe characters across all records and this will need to be done for every transactional field seperately.

What is an efficient way to do the above parsing/processing?

(After the parsing I intend to use Ron F's mutliple response macro to generate frequencies for the different values of each transaction type)

Please guide me.

regards, hp


Back to: Top of message | Previous page | Main SAS-L page