LISTSERV at the University of Georgia
Date:         Wed, 3 Oct 2001 15:16:20 -0400
Reply-To:     muon33@nyc.rr.com
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Michael Stuart <muon33@HOTMAIL.COM>
Subject:      Re: Text File Import Problem
Comments: To: TerjeMW@dshs.wa.gov
Content-Type: text/plain; format=flowed

Mark (and SAS-L) -- thanks, I'm still having the same problem. I've adopted some of your coding. First, here's the program:

filename epilist "H:\Vendors\Ffx\MailingTest\EPI-REG-OPEN-100K.TXT" ;

data ffx.epilist (keep = email domain) ;
  length email $ 50 ;
  infile epilist length=lenvar ;
  input @1 email varying. lenvar ;

email = left(trim(lowcase(email))) ;

domain = substr(email,(index(email,'@')+1)) ;

run ;

proc print data=ffx.epilist (obs=10) ;

run ;

THIS GENERATES THE FOLLOWING LOG:

85   data ffx.epilist (keep = email domain) ;
86     length email $ 50 ;
87     infile epilist length=lenvar ;
88     input @1 email varying. lenvar ;
89
90   email = left(trim(lowcase(email))) ;
91
92   domain = substr(email,(index(email,'@')+1)) ;
93
94   run ;

NOTE: The infile EPILIST is:
      File Name=H:\Vendors\Ffx\MailingTest\EPI-REG-OPEN-100K.TXT,
      RECFM=V,LRECL=256

NOTE: 7771 records were read from the infile EPILIST.
      The minimum record length was 43.
      The maximum record length was 256.
      One or more lines were truncated.
NOTE: The data set FFX.EPILIST has 7771 observations and 2 variables.
NOTE: Compressing data set FFX.EPILIST increased size by 6.19 percent.
      Compressed is 103 pages; un-compressed would require 97 pages.
NOTE: DATA statement used:
      real time 3.68 seconds

95
96   proc print data=ffx.epilist (obs=10) ;
97
98   run ;

NOTE: There were 10 observations read from the data set FFX.EPILIST.
NOTE: PROCEDURE PRINT used:
      real time 0.10 seconds

A NOTE ABOUT THIS LOG -- there are 100K records in this file, not 7771.
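
The log above reports a maximum record length of 256, which equals the LRECL in effect, along with truncated lines. One way to check whether SAS is finding the line ends where expected is to raise LRECL and print the length of the first few records. This is only a diagnostic sketch: it assumes the EPILIST fileref defined earlier, and the LRECL=32767 and OBS=10 values are arbitrary choices, not part of the original program.

```sas
* Diagnostic sketch: report the length SAS assigns to each record ;
* LRECL=32767 and OBS=10 are arbitrary values chosen for this check ;
data _null_ ;
  infile epilist lrecl=32767 length=lenvar obs=10 ;
  input ;               * read the record without assigning variables ;
  put _n_= lenvar= ;    * write the record number and its length to the log ;
run ;
```

If the reported lengths are much longer than a single email address, SAS is treating several addresses as one record.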

Here's the output from the print step:

The SAS System 14:56 Wednesday, October 3, 2001 2

Obs email

  1  tabernathy@andonet.com mjkarv@pacifier.com chrismc
  2  kelbriney@earthlink.net justineschubert@hotmail.co
  3  walterj@adelphia.net sheila_dubin@hotmail.com phca
  4  dbparagon@msn.com rita@simalfa.com etambrose@yahoo
  5  christmas@paradise.net.nz yvonne.graser@foodbrands
  6  churchill25@rcn.com gabo@enter.net.mx nanzo@datasy
  7  theresa_kuhlman@agilent.com zelgroup1@mindspring.c
  8  nurse37@carolina.net gsirett@tiaa-cref.org htamvad
  9  arouge@ait-applied.com bonneau@vvm.com perelk@eart
 10  linhem@epix.net desertdawn@281.com jcsimo@netzero.

Obs domain

  1  andonet.com mjkarv@pacifier.com chrismc
  2  earthlink.net justineschubert@hotmail.co
  3  adelphia.net sheila_dubin@hotmail.com phca
  4  msn.com rita@simalfa.com etambrose@yahoo
  5  paradise.net.nz yvonne.graser@foodbrands
  6  rcn.com gabo@enter.net.mx nanzo@datasy
  7  agilent.com zelgroup1@mindspring.c
  8  carolina.net gsirett@tiaa-cref.org htamvad
  9  ait-applied.com bonneau@vvm.com perelk@eart
 10  epix.net desertdawn@281.com jcsimo@netzero.

A NOTE ABOUT THE OUTPUT -- when I view it in the SAS output window, there are no line breaks within each observation -- I see a series of email addresses 'glommed' together, separated by a hollow square box (something non-printable). When I cut & paste the output into any other application (like IE/Hotmail), the hollow square boxes become line breaks.

I looked at the input file with a hex editor: each record on each line is followed by 0D (CR) 0A (LF). I've tried the INFILE statement with both the TRUNCOVER and MISSOVER options, and I'm getting the same results.
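
To cross-check the hex editor against what SAS itself reads, the start of the file can be dumped as raw bytes from a DATA step. The sketch below assumes the EPILIST fileref from the program above; the 64-byte chunk size and OBS=5 are arbitrary. RECFM=F makes SAS read fixed-size chunks and ignore line terminators, so any CR (0D), LF (0A), or other separator byte shows up in the hex written to the log.

```sas
* Diagnostic sketch: dump the first bytes of the file in hex ;
* RECFM=F with LRECL=64 reads fixed 64-byte chunks, ignoring line ends ;
data _null_ ;
  infile epilist recfm=f lrecl=64 obs=5 ;
  input ;                     * load the chunk into the input buffer ;
  put _infile_ $hex128. ;     * 64 bytes written as 128 hex digits ;
run ;
```

In the log, an address should be followed by 0D0A if the file really uses CR LF throughout; a lone 0D, a lone 0A, or some other byte between addresses would point to a different delimiter.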

This is driving me nuts! Any other ideas?

Thanks ...

>From: "Terjeson, Mark" <TerjeMW@dshs.wa.gov>
>To: "'Mike Stuart'" <muon33@NYC.RR.COM>, SAS-L@LISTSERV.UGA.EDU
>Subject: RE: Text File Import Problem
>Date: Wed, 3 Oct 2001 10:57:47 -0700
>
>Hi Mike,
>
>
>   * make sample data ;
>data _null_;
>  file 'c:\temp\abc.txt';
>  put 'test1@dkadk.com';
>  put '3234dk@efg.com';
>  put 'jieuw@lmdkadk.com';
>  put '3234dk@efg.com';
>run;
>
>
>   * Read variable line length flat file ;
>data table1(keep=myline);
>  length myline $ 200;
>  infile 'c:\temp\abc.txt' length=lenvar;
>  input @1 myline $varying. lenvar;
>run;
>
>
>   * adding your goodies ;
>data table1(keep=email domain);
>  length email $ 50;
>  infile 'c:\temp\abc.txt' length=lenvar;
>  input @1 email $varying. lenvar;
>  email = left(trim(lowcase(email))) ;
>  domain = substr(email,(index(email,'@')+1)) ;
>run;
>
>
>   * another variation using SCAN() ;
>   * to replace INDEX() and SUBSTR() ;
>data table1(keep=email domain);
>  length email $50;
>  infile 'c:\temp\abc.txt' length=lenvar;
>  input @1 email $varying. lenvar;
>  email = left(trim(lowcase(email))) ;
>  domain = scan(email,2,'@') ;
>run;
>
>
>Hope this is helpful,
>Mark Terjeson
>Washington State Department of Social and Health Services
>Division of Research and Data Analysis (RDA)
>mailto:terjemw@dshs.wa.gov
>
>
>
>-----Original Message-----
>From: Mike Stuart [mailto:muon33@NYC.RR.COM]
>Sent: Wednesday, October 03, 2001 9:29 AM
>To: SAS-L@LISTSERV.UGA.EDU
>Subject: Text File Import Problem
>
>
>Having problems with a straight-forward import, I think the problem
>has to do with non-viz characters, but I'm not sure. When viewed, the
>file I'm trying to read in is relatively straight-forward, one email
>address per line. I'm using the code below to read in this file.
>
>data ffx.epilist ;
>  infile epilist truncover ;
>  input @1 email $50. ;
>
>  email = left(trim(lowcase(email))) ;
>
>  domain = substr(email,(index(email,'@')+1)) ;
>
>  run ;
>
>The output however looks like this:
>
>
>  obs email domain
>  1 test1@dkadk.com 3234dk@efg.com jieuw@lm dkadk.com
>3234dk@efg.com
>
>etc.
>
>It looks like the line delimiter is missing. Suggestions on how to
>fix? I've tried the import wizard using a number of different
>delimiter option and am getting the same result.
>
>Thanks -
>
