LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (March 2011, week 1)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Sun, 6 Mar 2011 10:01:08 -0500
Reply-To:     Arthur Tabachneck <art297@ROGERS.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Arthur Tabachneck <art297@ROGERS.COM>
Subject:      Re: how to treat consecutive delimiters as one when importing txt
              files
Content-Type: text/plain; charset=ISO-8859-1

It's too early. I, of course, meant to say:

>dropping that combined with your use of the MISSOVER option on your infile >statement, may have corrected the problems.

not

>dropping that combined with your use of the missing option on your infile >statement, may have corrected the problems.

Art ------- On Sun, 6 Mar 2011 09:52:03 -0500, Arthur Tabachneck <art297@ROGERS.COM> wrote:

>Nuria, > >I'm still waiting to hear from others who are more familiar with how SAS >treats a mixed combination of spaces and tab characters. However, while I >only manually looked at a couple of your records, I did misspeak a little. > >Of course, you don't only have six variables. You have 4 numeric variables, >followed by a date, and then followed by 26 more numeric variables. And, >yes, every field is separated by a tab, but may also be separated by spaces >as well. > >I did, though, see some extra problems that you will have to account for. >You don't only have two hyphens representing missing values but, rather, two >hyphens followed by two tab characters (at least in the first record). That >seemed to how true for every variable except for the last variable. It had >2 hyphens, followed by 3 spaces, a tab character, and then a carriage return >and line feed. > >If tabs are treated as spaces when one doesn't specify dlm, then your >dropping that combined with your use of the missing option on your infile >statement, may have corrected the problems. > >I, though, would definitely hand inspect at least some randomly selected >records, from at least some of your files, to gain some validation that the >data is being read as anticipated. > >Art >-------- >On Sun, 6 Mar 2011 03:40:11 -0800, =?iso-8859-1?Q?N=FAria_Chapinal?= ><nchapinal@YAHOO.COM> wrote: > >>I hadn't realized that some fields where separated by just spaces and no >tabs. The file looks good to me as it is now, but I could be glossing over >some flaws, indeed.... Is there any way to specified the delimiter as either >spaces or tabs, so that if in any instance there�are spaces only separating >columns, those�2 columns are not merged in the SAS dataset? >>� >>thanks! >> >> >>� >>N�ria Chapinal >>Postdoctoral fellow >>Animal Welfare Program, University of British Columbia >>Department of Population Medicine, University of Guelph >> >> >> >>--- On Sat, 3/5/11, Arthur Tabachneck <art297@ROGERS.COM> wrote: >> >> >>From: Arthur Tabachneck <art297@ROGERS.COM> >>Subject: Re: how to treat consecutive delimiters as one when importing txt >files >>To: SAS-L@LISTSERV.UGA.EDU >>Received: Saturday, March 5, 2011, 11:06 PM >> >> >>I have seen a sample of Nuria's raw data, but can't find anything specific >>in the documentation. >> >>Nuria has six fields, four numeric, followed by a date, and then one >numeric >>field.� Actual missing data are represented by two hyphens. >> >>Each field is delimited by either one or more spaces, one or more spaces >and >>a tab character, or just a tab character. >> >>I was going to suggest the infile extendtabs option, but then Nuria >>mentioned that removing the dsd option solved the problem. >> >>Her code was: >> >>filename FT46F001 'C:\Users\nuria\Desktop\am\fc\*.txt' lrecl=256; >>data fresh; >>� length fname $256; >>� infile FT46F001 /*dsd*/ dlm='09'x eov=eov filename=fname noexpandtabs >>missover; >>� input @; >>� if _n_ eq 1 or eov then do; >>� � *filename=fname; >>� � date = input(scan(fname,-2,' '),mmddyy10.); >>� � retain filename date; >>� � eov = 0; >>� � input #5; >>� � delete; >>� end; >>� input (index cow lact group)(:??) cdate :mmddyy10. (dim c1-c26 )(:??); >>� format date cdate mmddyy10.; >>run; >> >>In the case where she is using noexpandtabs, are tabs considered the same >as >>blanks? >> >>Art >>------ >>On Sat, 5 Mar 2011 18:31:55 -0600, Data _null_; <iebupdte@GMAIL.COM> wrote: >> >>>Nat, >>> >>>Nuria, is use ?? in input statement to suppress the invalid data >>>reports.� From what I now understand use DLM without DSD should solve >>>the problem. >>> >>>Nuria,� you might what to investigate just which lines have two tabs >>>in a row.� Something like this untested code. >>> >>>input; >>>if index(_infile_,'0909'x) then list; >>> >>>On Sat, Mar 5, 2011 at 6:19 PM, Nat Wooding <nathani@verizon.net> wrote: >>>> Nuria >>>> >>>> If SAS is seeing -- in a numeric field and turns it into a decimal, you >>>> should see a note saying that there was invalid data between column ... >>and >>>> column ... >>>> >>>> Could you supply a few sample lines of input data and show what the >>result >>>> should look like? >>>> >>>> Nat Wooding >>>> >>>> -----Original Message----- >>>> From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of >>N�ria >>>> Chapinal >>>> Sent: Saturday, March 05, 2011 1:49 PM >>>> To: SAS-L@LISTSERV.UGA.EDU >>>> Subject: Re: how to treat consecutive delimiters as one when importing >>txt >>>> files >>>> >>>> >>>> >>>> I deleted the DSD option and it looks like it is working fine. I do >thave >>>> missing values, but they appear as '--' in the txt files and then as >>periods >>>> in the final SAS dataset. So I guess I can use that to differentiate >them >>>> from 2 consecutive tabulations ,right? I think it is what the code is >>doing >>>> right now.� Is there any specific piece in my code that is specifying >>>> "change -- into ."? I do not think so. Is it because I am specifying >>>> numerical format and then anything that is not numerical becomes a .? >>>> I am still worried that missing values and consecutive tabulations can >>get >>>> mixed up. Can we add something else to make use of the -- and avoid >>>> confusion? >>>> >>>> >>>> >>>> N�ria Chapinal >>>> Postdoctoral fellow >>>> Animal Welfare Program, University of British Columbia >>>> Department of Population Medicine, University of Guelph >>>> >>>> >>>> >>>> --- On Sat, 3/5/11, Arthur Tabachneck <art297@ROGERS.COM> wrote: >>>> >>>> >>>> From: Arthur Tabachneck <art297@ROGERS.COM> >>>> Subject: Re: how to treat consecutive delimiters as one when importing >>txt >>>> files >>>> To: SAS-L@LISTSERV.UGA.EDU >>>> Received: Saturday, March 5, 2011, 1:37 PM >>>> >>>> >>>> Nuria, >>>> >>>> Ignoring multiple delimiters also implies that you don't have any >missing >>>> values.� If you don't have any missing values, then you could probably >do >>>> what you want by including code to change the tabs into blanks and then >>use >>>> the & modifier. >>>> >>>> Art >>>> -------- >>>> On Sat, 5 Mar 2011 13:17:35 -0500, Nuria Chapinal <nchapinal@YAHOO.COM> >>>> wrote: >>>> >>>>>On Tue, 1 Mar 2011 11:54:33 -0500, Nuria Chapinal <nchapinal@YAHOO.COM> >>>>>wrote: >>>>> >>>>>>Thanks a lot for all your help. >>>>>> >>>>>>This is one of the codes for one of the batches: >>>>>> >>>>>> >>>>>>filename FT46F001 'C:\Users\nuria\Desktop\am\fc\*.txt' lrecl=256; >>>>>>data fresh; >>>>>>���length fname $256; >>>>>>���infile FT46F001 dsd dlm='09'x eov=eov filename=fname noexpandtabs >>>>>missover; >>>>>>���input @; >>>>>>���if _n_ eq 1 or eov then do; >>>>>>� � � *filename=fname; >>>>>>� � � date = input(scan(fname,-2,' '),mmddyy10.); >>>>>>� � � retain filename date; >>>>>>� � � eov = 0; >>>>>>� � � input #5; >>>>>>� � � delete; >>>>>>� � � end; >>>>>>input (index cow lact group)(:??) cdate :mmddyy10. (dim c1-c26 )(:??); >>>>>>format date cdate mmddyy10.; >>>>>>���run; >>>>> >>>>>Hi, >>>>> >>>>>I ran into a small trouble when using the codes above. Some times there >>>>>are 2 tabulations in a row separating columns. I would need an option >>that >>>>>did the same as the option "treat consecutive delimiters as one" in >>>>>Excel... >>>>> >>>>>Thanks in advance! >>>> >>


Back to: Top of message | Previous page | Main SAS-L page