LISTSERV at the University of Georgia
Date:         Thu, 15 Jul 2010 14:27:18 -0400
Reply-To:     Nat Wooding <nathani@VERIZON.NET>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Nat Wooding <nathani@VERIZON.NET>
Subject:      Re: parsing infile from web URL
In-Reply-To:  <89C159F45B13A24682D98BDEF58E451F29BCE4C9@TLRUSMNEAGMBX28.ERF.THOMSON.COM>
Content-Type: text/plain; charset="US-ASCII"

Or,

data msa_names (drop=flag:);

infile source firstobs=12 truncover ;

input flag2 $ 25-26 CBSA 1-5 @;

if flag2 ne " " ;

if _infile_ =: "*" then stop ;

input

@25 CBSA_nm $79. ;

run;

Nat Wooding

-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Matthew Pettis
Sent: Thursday, July 15, 2010 1:39 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: parsing infile from web URL

This will do what you ask, I believe:

=== SAS code ===

data msa_names (drop=flag:);

infile source firstobs=12 truncover ;

input flag1 $ 1 @;

if flag1 = "*" then stop ;

input flag2 $ 25-26 CBSA 1-5 @;

if flag2 ne " " ;

input

@25 CBSA_nm $79. ;

run;

=== SAS code end ===

=== SAS log ===
57   filename source URL
58   "http://www.census.gov/population/www/metroareas/lists/2008/List4.txt"
59   proxy='http://xxx.xxx.xxx.xxx:80'
60   ;
61
62
63
64   data msa_names (drop=flag:);
65
66   infile source firstobs=12 truncover ;
67
68   input flag1 $ 1 @;
69   if flag1 = "*" then stop ;
70
71   input flag2 $ 25-26 CBSA 1-5 @;
72   if flag2 ne " " ;
73
74   input
75
76   @25 CBSA_nm $79. ;
77
78   run;

NOTE: The infile SOURCE is:

      Filename=http://www.census.gov/population/www/metroareas/lists/2008/List4.txt,
      Local Host Name=xxxxxxxxxxx,
      Local Host IP addr=xxx.xxx.xxx.xxx,
      Service Hostname Name=xxx.xxx.xxx.xxx,
      Service IP addr=xxx.xxx.xxx.xxx,
      Service Name=N/A,Service Portno=80,Lrecl=256,
      Recfm=Variable

NOTE: 1948 records were read from the infile SOURCE.
      The minimum record length was 0.
      The maximum record length was 104.
NOTE: The data set WORK.MSA_NAMES has 374 observations and 2 variables.
NOTE: Compressing data set WORK.MSA_NAMES increased size by 20.00 percent.
      Compressed is 6 pages; un-compressed would require 5 pages.
NOTE: DATA statement used (Total process time):
      real time           0.56 seconds
      cpu time            0.04 seconds

=== SAS log end ===

-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Miller, Jeremy T. (CDC/OID/NCPDCID)
Sent: Thursday, July 15, 2010 12:12 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: parsing infile from web URL

I'm creating some files from a web URL to create a relational DB. I'm creating a list of CBSA divisions, CBSA names, then a list of FIPS/FIPS_C/county_names that I could use.

It's easy to cut off the top of a text document with FIRSTOBS=, but what is the best way to stop processing further down the text? For example, at this URL, beginning on line 1959, there is a note that I do not want to process, so I would like to STOP there.

Obviously, I could just download the file and strip the offending text both above and below to make this a non-issue, but I WANT to know how to do it the other way.

filename source URL "http://www.census.gov/population/www/metroareas/lists/2008/List4.txt" ;

data msa_names (drop=flag:);

infile source firstobs=12 truncover ;

input flag1 $ 1 flag2 $ 25-26 CBSA 1-5 @;

if flag1 = "*" then stop ;

if flag2 ne " " ;

input

@25 CBSA_nm $79. ;

run;

This "works," but you'll notice in the log a note for invalid data. Should I do some type of pre-parsing to stop the input before invalid data can come in?
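One possible form of that pre-parsing, offered as a minimal sketch and not a definitive answer: a bare INPUT with a trailing @ loads the record into the input buffer without assigning any variables, so _INFILE_ can be tested before any field is read. The sketch assumes that the first record beginning with "*" marks the start of the trailing note and that no data record starts with "*".

```sas
filename source URL
  "http://www.census.gov/population/www/metroareas/lists/2008/List4.txt" ;

data msa_names (drop=flag:);
  infile source firstobs=12 truncover ;

  /* Load the record into the input buffer; no variables are read yet. */
  input @;

  /* Assumption: a leading "*" marks the start of the trailing note. */
  if _infile_ =: "*" then stop ;

  /* The held record is a data line, so the field reads can proceed. */
  input flag2 $ 25-26 CBSA 1-5 @;
  if flag2 ne " " ;

  input @25 CBSA_nm $79. ;
run;
```

Because the numeric CBSA field is never read from the "*" record, that record should not be able to trigger an invalid data note.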

data msa_fips_list ;

infile source firstobs=12 truncover ;

do until (index(_infile_,'*') = 1 );

input FIPS 17-18 CBSA 1-5 CBSA_div 9-13 @;

if fips ne . ;

input

@19 FIPS_C 3.

@31 COUNTY $68. ;

output ;

end;

run;

Here, because my first var is numeric, I will immediately get an invalid data stamp when SAS hits the "*" of the text at the bottom.
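The note appears because the DO UNTIL condition is evaluated only at the bottom of the loop, after the INPUT statement has already tried to read FIPS and CBSA from the "*" record. A hedged sketch of one way around this, using the implicit DATA step loop and checking _INFILE_ before any field is read (again assuming the note begins with the first record that starts with "*"):

```sas
data msa_fips_list ;
  infile source firstobs=12 truncover ;

  /* Hold the record in the input buffer without reading any fields. */
  input @;

  /* Assumption: the first record starting with "*" begins the note. */
  if _infile_ =: "*" then stop ;

  input FIPS 17-18 CBSA 1-5 CBSA_div 9-13 @;
  if fips ne . ;

  input @19 FIPS_C 3.
        @31 COUNTY $68. ;
run;
```

The explicit DO loop and OUTPUT statement are dropped here; the DATA step iterates once per record and writes each observation that passes the subsetting IF.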

Again, I don't just want something that works; I would like to know the appropriate way to stop parsing with INPUT when only a known portion of the text contains data, WITHOUT altering the original text.
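When the last data line is known in advance, the OBS= option on INFILE is another route: it names the last record to read, just as FIRSTOBS= names the first. The sketch below assumes the note really does begin on line 1959, so line 1958 is the last record wanted; the hard-coded number would need revisiting if the file at the URL ever changes.

```sas
filename source URL
  "http://www.census.gov/population/www/metroareas/lists/2008/List4.txt" ;

data msa_names (drop=flag:);
  /* Read records 12 through 1958 only (1958 is an assumption based on
     the note starting at line 1959). */
  infile source firstobs=12 obs=1958 truncover ;

  input flag2 $ 25-26 CBSA 1-5 @;
  if flag2 ne " " ;

  input @25 CBSA_nm $79. ;
run;
```

This leaves the source file untouched, but it trades the content-based test for a fixed line number.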

