Date: Tue, 22 May 2001 16:36:28 -0400
Reply-To: "Fehd, Ronald J." <rjf2@CDC.GOV>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Fehd, Ronald J." <rjf2@CDC.GOV>
Subject: Re: Health Care Data, Arrays and Macros
Content-Type: text/plain; charset="iso-8859-1"
so, have you decided what your output structure(s) look like?
that is the key to reading this data.
lrecl= <logical record length>
pad %*pad short records with blanks, so every record is treated as lrecl long;
retain ID/key demographics, etc.
input ... @;%*hold the record until you decide what else to read;
if <condition> then input <other variables>;
else input;%*NOTE: closure of held input statement;
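A minimal sketch of that held-input pattern, with made-up data so it runs on its own. The record-type flag in column 1, the names RecType, ID and Amount, and the column positions are all assumptions for illustration; nothing here comes from the real inpatient layout.

```sas
data DETAIL (keep = ID Amount);
  infile datalines truncover;
  length ID $4;
  retain ID;                    /* carry the key from header rows down to detail rows */
  input @1 RecType $1. @;       /* trailing @ holds the record until the type is known */
  if RecType = 'H' then
    input @2 ID $4.;            /* header row: read the key, which releases the record */
  else if RecType = 'D' then do;
    input @2 Amount 6.;         /* detail row: read the rest, which releases the record */
    output;
  end;
  else input;                   /* anything else: null INPUT releases the held record */
datalines;
H1001
D   250
D    75
X junk
H1002
D  1200
;
run;
```

Because the step contains an explicit OUTPUT statement, only the detail rows are written; the header rows just refresh the retained ID.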
data CLAIMS REVENUE;
  attrib ID1 ...
  retain ID1 ID2 0;
  infile ... lrecl=<longest record length> pad;%*at least 373 for the layout shown, more with repeated sections;
  input @1 <all ID> @;%*hold the record;
  if <claim-1> then do; input Claim-1 data @;%*keep holding the record;
    output CLAIMS; end;
  if <claim-2> then do; input Claim-2 data @;
    output CLAIMS; end;
  if <revenue-1> then do; input Revenue-1 data @;
    output REVENUE; end;
  input;%*remember to release the held record;
run;
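When the number of sections is itself stored on the record, the string of IF blocks can be collapsed into DO loops driven by the counts. The following is only a sketch under assumptions: the field names (N_Claims, N_Rev, ProcCode, RevCode, Charge), their widths, and the sample data are invented, and it assumes each claim's revenue sections immediately follow that claim on the record, which may not match the real file.

```sas
data CLAIMS  (keep = ID ClaimNo ProcCode)
     REVENUE (keep = ID ClaimNo RevCode Charge);
  infile datalines truncover;
  input @1 ID $3.
        @4 N_Claims 1. @;              /* hold the record; the pointer now sits at column 5 */
  do ClaimNo = 1 to N_Claims;
    input ProcCode $4. N_Rev 1. @;     /* read from the current pointer and keep holding */
    output CLAIMS;
    do RevNo = 1 to N_Rev;
      input RevCode $3. Charge 7.2 @;  /* one repeated revenue section */
      output REVENUE;
    end;
  end;
  input;                               /* release the held record */
datalines;
P012A1001250 125.50B2002300  40.004501000.25
P021C3001120   9.99
;
run;
```

With these two sample records CLAIMS gets 3 rows and REVENUE gets 4. Reading without an @n column pointer inside the loops lets each INPUT pick up where the previous one stopped, so no positions have to be computed from the counts.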
Ron Fehd the macro maven CDC Atlanta GA USA RJF2@cdc.gov
OpSys: WinNT Ver: 8.1
---> cheerful provider of UNTESTED SAS code!*! <---
e-mail your SAS improvements to: email@example.com
By using your intelligence you can sometimes make your problems twice as
complicated.
-- Ashleigh Brilliant
> -----Original Message-----
> From: Foy, Thomas M. [mailto:foytho@PARKNICOLLET.COM]
> Sent: Tuesday, May 22, 2001 4:19 PM
> To: SAS-L@LISTSERV.UGA.EDU
> Subject: Health Care Data, Arrays and Macros
> Greetings to everyone. It's me again, with a convoluted
> question centered
> around a health care data set and ways to read and manipulate
> it with SAS.
> Here's the set up: I have a data set that contains inpatient
> data, sent to me in text format, that is of non-standard
> record length.
> Meaning that the individual records have varying numbers of repeating
> variables. The file layout looks like this:
> data inpat;
> infile 'c:\raw_data\inpat.txt';
> input @1   var1 $1.
>       @2   var2 2.
>       @4   var3 9.
>       @231 number_of_claims_on_the_record 2.
>            (this number varies between 1 and 21)
>       @233 claim_number_on_the_record $2.
>       More Variables                           __________
>       @264 procedure_1 $6.                               |
>       @270 date_of_procedure_1 yymmdd8.                  |
>       @278 procedure_2 $6.                               |
>       @284 date_of_procedure_2 yymmdd8.                  |
>       and so on...                                       |
>       @306 number_of_repeated_revenue_sections 2.        |
>       @308 revenue_code $3.                              |
>       @311 sequence_number 3.                            |--Revenue
>       @314 charge_amount 11.2                            |
>       @325 another_code $5.                    __________|
>       @330 revenue_code $3.                              |
>       @333 sequence_number 3.                            |--Revenue
>       @336 charge_amount 11.2                            |
>       @347 another_code $5.                    __________|
>       @352 revenue_code $3.                              |
>       @355 sequence_number 3.                            |--Revenue
>       @358 charge_amount 11.2                            |
>       @369 another_code $5.                    __________|
> All records are the same up through position #374. After
> that, the rest of
> the record is governed by the variable at position #231,
> number_of_claims_on_the_record, and the variable at position #306,
> number_of_repeated_revenue_sections. There is a maximum of
> 21 claims on
> each record, and, for each claim section, there can be up to 42 revenue
> sections.
> For example, one record in the data set has 6 claims on the
> record. The
> first and second claims have 1 repeated revenue section, the
> third claim has
> 5 repeated revenue sections, and the last two have 1 revenue section.
> My question is this: How do I read such a data set when the
> length of the
> record, and the placement of the variables, depend on two
> different and
> independent pieces of information contained in the record?
> I'm not quite sure where to begin with this.
> Any help/assistance with this vexing problem will be greatly
> appreciated.
> If you throw me a bone, I'll certainly bark with appreciation.
> Thanks to all,
> Thomas M. Foy
> Park Nicollet Institute
> Minneapolis, MN