Date: Thu, 15 Jul 2010 14:27:18 -0400
Reply-To: Nat Wooding <nathani@VERIZON.NET>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Nat Wooding <nathani@VERIZON.NET>
Subject: Re: parsing infile from web URL
Or,
data msa_names (drop=flag:);
infile source firstobs=12 truncover ;
input @;
if _infile_ =: "*" then stop ;
input flag2 $ 25-26 CBSA 1-5 @;
if flag2 ne " " ;
input
@25 CBSA_nm $79. ;
run;
Nat Wooding
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Matthew Pettis
Sent: Thursday, July 15, 2010 1:39 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: parsing infile from web URL
This will do what you ask, I believe:
=== SAS code ===
data msa_names (drop=flag:);
infile source firstobs=12 truncover ;
input flag1 $ 1 @;
if flag1 = "*" then stop ;
input flag2 $ 25-26 CBSA 1-5 @;
if flag2 ne " " ;
input
@25 CBSA_nm $79. ;
run;
=== SAS code end ===
=== SAS log ===
57 filename source URL
58
"http://www.census.gov/population/www/metroareas/lists/2008/List4.txt"
59 proxy='http://xxx.xxx.xxx.xxx:80'
60 ;
61
62
63
64 data msa_names (drop=flag:);
65
66 infile source firstobs=12 truncover ;
67
68 input flag1 $ 1 @;
69 if flag1 = "*" then stop ;
70
71 input flag2 $ 25-26 CBSA 1-5 @;
72 if flag2 ne " " ;
73
74 input
75
76 @25 CBSA_nm $79. ;
77
78 run;
NOTE: The infile SOURCE is:
Filename=http://www.census.gov/population/www/metroareas/lists/2008/List4.txt,
Local Host Name=xxxxxxxxxxx,
Local Host IP addr=xxx.xxx.xxx.xxx,
Service Hostname Name=xxx.xxx.xxx.xxx,
Service IP addr=xxx.xxx.xxx.xxx,
Service Name=N/A,Service Portno=80,Lrecl=256,
Recfm=Variable
NOTE: 1948 records were read from the infile SOURCE.
The minimum record length was 0.
The maximum record length was 104.
NOTE: The data set WORK.MSA_NAMES has 374 observations and 2 variables.
NOTE: Compressing data set WORK.MSA_NAMES increased size by 20.00 percent.
Compressed is 6 pages; un-compressed would require 5 pages.
NOTE: DATA statement used (Total process time):
real time 0.56 seconds
cpu time 0.04 seconds
=== SAS log end ===
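The same idea should carry over to your second step (msa_fips_list), where the first variable is numeric. What follows is only a sketch that I have not run against the live file. It assumes the fileref SOURCE and the column positions from your post, and that the footnote line is the first record after FIRSTOBS that begins with "*". The point is to hold the record with a bare INPUT @, test _INFILE_, and only then read any numeric fields, so the "*" line is never parsed as a number:

```sas
/* Sketch only: fileref SOURCE and all column positions are taken from */
/* the original post and are assumptions about the file layout.        */
data msa_fips_list ;
  infile source firstobs=12 truncover ;

  /* Load the record into _INFILE_ without reading any variables. */
  input @ ;

  /* Assumed: the footnote starts in column 1 with "*".            */
  /* Stop before any numeric field is read from that record.       */
  if _infile_ =: "*" then stop ;

  /* Safe to read the numeric fields now; keep holding the record. */
  input FIPS 17-18 CBSA 1-5 CBSA_div 9-13 @ ;

  /* Subsetting IF: skip records with no value in columns 17-18.   */
  if FIPS ne . ;

  input @19 FIPS_C 3.
        @31 COUNTY $68. ;
run ;
```

Because STOP executes before the second INPUT, the "*" record should never reach a numeric informat, which is what produces the invalid data note. The explicit DO UNTIL loop is not needed, since the DATA step already iterates once per record.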
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Miller, Jeremy T. (CDC/OID/NCPDCID)
Sent: Thursday, July 15, 2010 12:12 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: parsing infile from web URL
I'm creating some files from a web URL to create a relational DB. I'm
creating a list of CBSA divisions, CBSA names, then a list of
FIPS/FIPS_C/county_names that I could use.
It's easy to cut off the top of a text document with FIRSTOBS=, but
what is the best way to stop the processing further down the
text? For example, in this URL, beginning on line 1959, there is a
note that I do not want to process, so I would like to STOP there.
Obviously, I could just download the file and strip the offending text
both above and below to make this a non-issue, but I WANT to know how to
do it the other way.
filename source URL
"http://www.census.gov/population/www/metroareas/lists/2008/List4.txt" ;
data msa_names (drop=flag:);
infile source firstobs=12 truncover ;
input flag1 $ 1 flag2 $ 25-26 CBSA 1-5 @;
if flag1 = "*" then stop ;
if flag2 ne " " ;
input
@25 CBSA_nm $79. ;
run;
This "works," but you'll notice in the log a note for invalid data.
Should I do some type of pre-parsing to stop the input before invalid
data can come in?
data msa_fips_list ;
infile source firstobs=12 truncover ;
do until (index(_infile_,'*') = 1 );
input FIPS 17-18 CBSA 1-5 CBSA_div 9-13 @;
if fips ne . ;
input
@19 FIPS_C 3.
@31 COUNTY $68. ;
output ;
end;
run;
Here, because my first var is numeric, I will immediately get an invalid
data stamp when SAS hits the "*" of the text at the bottom.
Again, I don't just want something that works; I would like to know the
appropriate method to stop parsing INPUT when you know that only a certain
portion of the text has data, WITHOUT altering the original text.