LISTSERV at the University of Georgia
Date:         Thu, 15 Jul 2010 14:17:42 -0400
Reply-To:     Chang Chung <chang_y_chung@HOTMAIL.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Chang Chung <chang_y_chung@HOTMAIL.COM>
Subject:      Re: parsing infile from web URL
Comments: To: Jeremy Miller <zyp9@CDC.GOV>

On Thu, 15 Jul 2010 13:12:15 -0400, Miller, Jeremy T. (CDC/OID/NCPDCID) <zyp9@CDC.GOV> wrote:

>I'm creating some files from a web URL to create a relational DB. I'm
>creating a list of CBSA divisions, CBSA names, then a list of
>FIPS/FIPS_C/county_names that I could use.
>
>It's easy to cut off the top of a text document, with FIRSTOBS=, but,
>what is the best way to use for stopping the processing further down the
>text. For example, in this URL, beginning on line 1959, there is a
>note that I do not want to process, so I would like to STOP there.
>
>Obviously, I could just download the file and strip the offending text
>both above and below to make this a non-issue, but I WANT to know how to
>do it the other way.
>
>filename source URL
>"http://www.census.gov/population/www/metroareas/lists/2008/List4.txt" ;
>
>data msa_names (drop=flag:);
>  infile source firstobs=12 truncover ;
>  input flag1 $ 1 flag2 $ 25-26 CBSA 1-5 @;
>  if flag1 = "*" then stop ;
>  if flag2 ne " " ;
>  input
>    @25 CBSA_nm $79. ;
>run;
>
>This "works," but you'll notice in the log a note for invalid data.
>Should I do some type of pre-parsing to stop the input before invalid
>data can come in?
...
>Again, I just don't want something that works, I would like to know the
>appropriate method to stop parsing INPUT if you know that only a certain
>portion of text has data WITHOUT altering the original text.

Hi, Jeremy:

I think it is nicer to download the file once and work on it locally,
instead of asking the remote server to send you the data over and over.

To avoid the invalid-data notes, I find it easier to read the fields into
character variables and convert them to numeric later. It also helps to
take advantage of the structure of the input file: the first three fields
all sit in fixed columns, so if you give the three variables parallel
names, your subsetting IF statement becomes more readable.

HTH.

Cheers,
Chang

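%let metroareas = http://www.census.gov/population/www/metroareas;
filename source url "&metroareas/lists/2008/List4.txt";

data cbsa(keep=cbsa name);
  infile source firstobs=12 truncover;
  /* read the three fixed-column fields as character and hold the line */
  input cbsa_ $ 1-5  div_ $ 9-13  fips_ $ 17-21 @;
  /* keep only lines with a CBSA code but no division or FIPS code */
  if not missing(cbsa_) and missing(div_) and missing(fips_);
  /* convert to numeric only after the line has passed the filter */
  cbsa = input(cbsa_, 5.0);
  input @25 name $79.;
  keep cbsa name;
run;

filename source clear;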

/* check */
proc print data=cbsa;
run;
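
The suggestion above to download the file once and work on it locally could look roughly like the following. This is only a sketch, not code from the original message: the local file name List4_local.txt is a hypothetical choice, and it assumes the SAS session can write to its current working directory.

```sas
/* Sketch only: copy the remote text file to a local file, once.     */
/* Assumption: List4_local.txt is a hypothetical local file name.    */
filename source url
  "http://www.census.gov/population/www/metroareas/lists/2008/List4.txt";
filename local "List4_local.txt";

data _null_;
  infile source lrecl=32767;   /* read each remote line as-is        */
  file local lrecl=32767;      /* write it unchanged to the local copy */
  input;                       /* load the line into _infile_        */
  put _infile_;
run;

filename source clear;
```

Later DATA steps can then use "infile local firstobs=12 truncover;" in place of "infile source ...", so the remote server is contacted only once and the line numbering of the original file is preserved.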

