LISTSERV at the University of Georgia
Date:         Tue, 7 Feb 2006 14:11:33 -0500
Reply-To:     "Howard Schreier <hs AT dc-sug DOT org>" <nospam@HOWLES.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "Howard Schreier <hs AT dc-sug DOT org>" <nospam@HOWLES.COM>
Subject:      Re: Reading messy space delimited files

On Tue, 7 Feb 2006 04:22:06 -0800, ajs2004@BIGFOOT.COM wrote:

>> Let suppose I have the following
>> data stored in a file called tre2.txt
>>
>> 234.563 345.675 89.789
>> 123.678 8.099 76.678
>
>... and, as you've also said, you know there are going to be no missing
>values - so there will always be three numbers on the line.
>
>This is the simplest possible case of list input. All you need is:
>
>   data dsname;
>   infile 'filename';
>   input number1 - number3;
>   run;
>
>It doesn't matter what the spacing is or whether all the numbers have
>the same number of decimal places. (It doesn't even matter if the data
>wrap onto the next line).
>
>If you use an informat such as 12.3, you're attempting to impose a
>requirement that the field will be exactly 12 characters wide and will
>be interpreted as having exactly three decimal places. That doesn't
>apply with the data you're telling us.

Of course it is possible to have it both ways: use informats but employ list-type input to detect field boundaries.
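A minimal sketch of that combination, assuming the colon (:) format modifier is used: the field is still delimited by blanks as in list input, but the named informat is applied to whatever was scanned. The data set name, variable names, and the choice of COMMA12. are only for illustration.

```sas
data both_ways;
    /* The colon modifier keeps list-style scanning (blank-delimited   */
    /* fields) and applies the informat to each scanned field.         */
    /* No decimal qualifier is given, so values are read as written.   */
    input word1 :$20. number1 :comma12. number2 :12.;
    datalines;
atyu 12,345.6789 8.099
a 345.67 89
;
run;

proc print data=both_ways;
run;
```

Here COMMA12. strips the embedded comma from the first record, which plain list input would not do, while the differing field widths cause no trouble. Naming the informats in an INFORMAT statement and then using simple list input in the INPUT statement should have a similar effect.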

>
>If you have strings as well as numbers, as in your next example,
>
>> a s d 345.67 89 yui
>> atyu wers id 12345.6789 8.099 76
>
>it's just as easy. (Again, assuming there are no missing values, and no
>spaces within the character values, so each blank-separated item is one
>value):
>
>   data dsname;
>   infile 'filename';
>   input word1 $ word2 $ number1-number3 word3 $;
>   run;
>
>Sample 55: Reading delimited data using simple list input
>http://support.sas.com/ctx/samples/index.jsp?sid=55
>
>INPUT Statement
>http://support.sas.com/onlinedoc/913/getDoc/en/lrdict.hlp/a000146292.htm
>
>List Input
>http://support.sas.com/onlinedoc/913/getDoc/en/lrdict.hlp/a000146292.htm#a000146620
>
>
>Hari wrote:
>> Ian,
>>
>> Thanx a lot for your following statement
>> >12.3 adds divide by 1000 when there is no decimal point, so it is better to leave out the
>> >decimal qualifier unless it is needed for reading 40 year old files that
>> >saved space by assuming the decimal point.
>>
>> I wasnt aware of it and tried it with some test data.
>>
>> So I want to double-confirm this. In case my files are of recent origin
>> then is it ALWAYS safe for me to OMIT decimal places in the INFORMAT
>> statement (and use it only in FORMAT statement as needed)

"ALWAYS"? There is no way that anyone can provide such blanket assurance.

The point is that use of implicit decimal points was common in the past and is uncommon today.

But here's a scenario: You are told that a data file was prepared with the convention that distance measures are recorded as integers without decimal points if expressed in meters or with explicit decimals if expressed as kilometers. Then the w.3 informat is just what you need.
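A small sketch of that scenario, with made-up values and a hypothetical variable name. With a w.3 informat, a field containing no decimal point is divided by 1000, while a field with an explicit decimal point is taken as written:

```sas
data distances;
    /* 12.3 applied through the colon modifier: blank-delimited scanning, */
    /* with an implied three decimal places only when no point is present. */
    input dist_km :12.3;
    datalines;
1500
1.5
89
8.099
;
run;

proc print data=distances;
run;
```

The first two records both come out as 1.5 (1500 meters and 1.5 kilometers). The last two come out as 0.089 and 8.099, which is the same effect reported for Number2 in the test data quoted below.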

You have to know what's in your data file.

>>
>> ****Test data and code
>>
>> a s d 345.67 89 yui
>> atyu wers id 12345.6789 8.099 76
>>
>>
>> data Tre1_4;
>> %let _EFIERR_ = 0; /* set the ERROR detection macro variable */
>> infile "C:\Documents and Settings\Hari Prasadh\Local
>> Settings\Temp\SAS try\tre1.txt"
>> delimiter = ' ' MISSOVER DSD lrecl=32767
>> firstobs=1 ;
>> informat Media $4. ;
>> informat Site_Name $4. ;
>> informat Impressions $2. ;
>> informat Number1 10.4 ;
>> informat Number2 7.3 ;
>> informat Space $3. ;
>> format Media $4. ;
>> format Site_Name $4. ;
>> format Impressions $2. ;
>> format Number1 10.4 ;
>> format Number2 7.3 ;
>> format Space $3. ;
>> input
>> Media $
>> Site_Name $
>> Impressions $
>> Number1
>> Number2
>> Space $
>> ;
>> if _ERROR_ then call symputx('_EFIERR_',1); /* set ERROR detection
>> macro variable
>> */
>> run;
>>
>>
>> When I export out the above file (tre1_4) using the following syntax:
>> proc export data= Tre1_4
>> outfile="C:\Documents and Settings\Hari Prasadh\Local
>> Settings\Temp\SAS try\tre1Out.txt"
>> dbms=' ' replace;
>>
>> run;
>>
>> Then the result I get in the exported file tre1Out.txt is as follows
>>
>>
>> Media Site_Name Impressions Number1 Number2 Space
>> a s d 345.6700 0.089 yui
>> atyu wers id 12345.6789 8.099 76
>>
>> Number2 was originally read by SAS as 0.089 (It is actually 89).
>>
>> regards,
>> Hari
>> India

