Date: Sat, 29 May 2010 17:29:34 -0400
Reply-To: Arthur Tabachneck <art297@NETSCAPE.NET>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Arthur Tabachneck <art297@NETSCAPE.NET>
Subject: Re: 1 GB data sets=very messy!!!
Anna,
I never saw your original post, or Tom's response, so I may be repeating
things that have already been suggested.
First, while you indicate that your file is a GB in size, the note in your
SAS log reports a much smaller file: File Size (bytes)=3074.
Also, judging from your log, every other field is getting a missing
value, and the pattern appears to be quite consistent.
What kind of operating system are you on? Which version of SAS? Do you
know what kind of system the data file came from?
The first thing I typically do when I have to input a text file is to see
what the file really looks like. If you search the list archives for
reading hex data, you will find a number of examples of how to use SAS to
actually see what the data look like. If those 09 characters (a 0 in the
ZONE line over a 9 in the NUMR line of your log) are actually tabs, you
may have a tab-delimited file and have to add the dsd and delimiter
options to your infile statement.
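As a rough sketch of that kind of hex check (not taken from the list
archives): the LIST statement writes each record to the log and adds hex
ZONE/NUMR lines when the record holds unprintable characters such as tabs.
The path comes from your log; recfm=f, lrecl=80 and obs=5 are arbitrary
choices of mine, picked only so that the first 400 bytes show up in fixed
chunks with any CR/LF bytes (0D, 0A) left visible.

```sas
/* Dump the first 5 chunks of 80 bytes each to the log.            */
/* recfm=f reads fixed-length blocks, so line-end bytes are kept.  */
data _null_;
  infile 'f:\Stark\first.txt' recfm=f lrecl=80 obs=5;
  input;   /* load the next block into the input buffer, no variables */
  list;    /* print the buffer; hex is added for unprintable bytes    */
run;
```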
Also, given the discrepancy in file size, you may have to use the
ignoredoseof option on the infile statement.
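Putting those options together, here is a minimal sketch of what the data
step might look like, assuming the file really is tab delimited. The array
size of 244 is copied from your code and may not match the real number of
columns; lrecl=32767 is my guess at a length big enough to avoid the "One
or more lines were truncated" note in your log (LRECL=256 there). The input
statement uses list input with no informat, because formatted input such
as 1. or 5. reads fixed column widths and does not honor the delimiter.

```sas
data one;
  /* '09'x = tab; dsd treats two tabs in a row as a missing value;   */
  /* ignoredoseof (Windows) reads past a Ctrl-Z ('1A'x) in the data. */
  infile 'f:\Stark\first.txt' dlm='09'x dsd truncover
         lrecl=32767 ignoredoseof;
  array x{244};
  input x{*};   /* list input: one delimited field per array element */
run;
```

If the log then reports far fewer or far more values per record than 244,
that is a sign the array size needs to change to match the real layout.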
Unfortunately, there are many, many possibilities and you can only know
which to apply once you know what the data "really" look like.
HTH,
Art
---------
On Sat, 29 May 2010 17:12:22 -0400, Anna Supady <statistics2020@GMAIL.COM>
wrote:
>Hi Tom,
>
>Thanks for email. Here is more info:
>
>
>So now I see that I got to read only one row of the data and it reads one
>variable at the time. The dlm=09 means it is tabulating the spaces.
>
>How can I get it to read actual amount of characters in variable, not just
>one.
>
>I am getting somewhere...not sure how to get it work correctly.
>
>thanks a lot for your help
>
>Ania
>
>
>messy data 10:18 Saturday, May 29, 2010 7975
>
> OUTPUT:
>
>O x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x
>
>b x x x x x x x x x 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3
3
>3 3 4 4
>
>s 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6
7
>8 9 0 1
>
>1 0 . 0 . 0 . 0 . 0 . 0 . 0 . 0 . 0 . 0 . 0 . 0 . 0 . 0 . 0 . 0 . 0 . 0 .
0
>. 0 . 5
>
>
>
>O x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x
x
>x x x x
>
>b 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 7 7 7 7 7 7 7 7
7
>7 8 8 8
>
>s 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7
8
>9 0 1 2
>
>1 1 0 6 2 . 2 1 7 6 2 4 . 0 . 0 . 0 . 0 . 0 . 2 5 9 2 0 0 6 . 0 . 0 . 0 .
0
>. 0 . 0
>
>x x x x x x x x x x x x x x x x x x x x x x x x
>
>O x x x x x x x x x x x x x x x x x 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1
>1 1 1 1
>
>b 8 8 8 8 8 8 8 9 9 9 9 9 9 9 9 9 9 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1
1
>2 2 2 2
>
>s 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8
9
>0 1 2 3
>
>1 . 0 . 0 . 0 . 0 . 0 . 0 . 0 . 0 . 0 . 0 . 0 . 0 . 6 7 2 1 8 . 0 . 0 . 6
2
>9 9 0 6
>
>x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x
x
>x x x
>
>O 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1
>1 1 1 1
>
>b 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5
6
>6 6 6 6
>
>s 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9
0
>1 2 3 4
>
>1 0 . 0 . . 0 . 0 . 0 . 0 . 0 . 0 . 0 . 0 . 0 . 0 . 0 . 0 . 0 . 0 . 0 .
0 .
>0 . 0 .
>
>x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x
x
>x x x
>
>O 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2
2
>2 2 2 2
>
>b 6 6 6 6 6 7 7 7 7 7 7 7 7 7 7 8 8 8 8 8 8 8 8 8 8 9 9 9 9 9 9 9 9 9 9 0
0
>0 0 0 0
>
>s 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0
1
>2 3 4 5
>
>1 8 . 0 . 0 . 0 . 0 . 0 . 0 . 2 0 . 0 . 0 . 0 . 0 . 1 5 6 . 0 . 0 . 2 . 2
8
>. 0 . 0
>
>x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x
x
>x
>
>O 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2
>2 2
>
>b 0 0 0 0 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 4 4
4
>4 4
>
>s 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2
>3 4
>
>1 . 0 . 0 . 0 . 1 0 3 . 4 2 8 4 . 0 . 0 . 0 . 0 . 0 . 0 . 0 . 0 . 0 . 0 .
0
>. 0
>and the LOG:
>
>
>279 data one;
>
>280 infile 'f:\Stark\first.txt';
>
>281 array x{244};
>
>282 input x(*) 1.;
>
>NOTE: The infile 'f:\Stark\first.txt' is:
>
>Filename=f:\Stark\first.txt,
>
>RECFM=V,LRECL=256,File Size (bytes)=3074,
>
>Last Modified=29May2010:11:44:22,
>
>Create Time=29May2010:11:44:21
>
>NOTE: Invalid data for x2 in line 1 2-2.
>
>NOTE: Invalid data for x4 in line 1 4-4.
>
>NOTE: Invalid data for x6 in line 1 6-6.
>
>NOTE: Invalid data for x8 in line 1 8-8.
>
>NOTE: Invalid data for x243 in line 1 243-243.
>
>RULE:
>----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+---
>
>1 CHAR
>0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.51062.217624.0.0.0.0.0.2592006.0.0.0.0.0.0.0.0.0
>
>ZONE
>3030303030303030303030303030303030303030333330333333030303030303333333030303030303030303
>
>NUMR
>0909090909090909090909090909090909090909510629217624909090909092592006909090909090909090
>
>89
>.0.0.0.0.0.0.0.0.0.67218.0.0.6299060.0..0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.8.0.0.0.0.0.
>
>ZONE
>0303030303030303030333330303033333330300303030303030303030303030303030303030303030303030
>
>NUMR
>9090909090909090909672189090962990609099090909090909090909090909090909090909890909090909
>
>177
>0.20.0.0.0.0.156.0.0.2.28.0.0.0.0.0.103.4284.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0
>
>ZONE
>30330303030303330303030330303030303033323333030303030303030303030303030303030303
>
>NUMR
>092090909090915690909292890909090909103E4284909090909090909090909090909090909090
>
>x1=0 x2=. x3=0 x4=. x5=0 x6=. x7=0 x8=. x9=0 x10=. x11=0 x12=. x13=0 x14=.
>x15=0 x16=. x17=0 x18=.
>
>x19=0 x20=. x21=0 x22=. x23=0 x24=. x25=0 x26=. x27=0 x28=. x29=0 x30=.
>x31=0 x32=. x33=0 x34=.
>
>x35=0 x36=. x37=0 x38=. x39=0 x40=. x41=5 x42=1 x43=0 x44=6 x45=2 x46=.
>x47=2 x48=1 x49=7 x50=6
>
>x51=2 x52=4 x53=. x54=0 x55=. x56=0 x57=. x58=0 x59=. x60=0 x61=. x62=0
>x63=. x64=2 x65=5 x66=9
>
>x67=2 x68=0 x69=0 x70=6 x71=. x72=0 x73=. x74=0 x75=. x76=0 x77=. x78=0
>x79=. x80=0 x81=. x82=0
>
>x83=. x84=0 x85=. x86=0 x87=. x88=0 x89=. x90=0 x91=. x92=0 x93=. x94=0
>x95=. x96=0 x97=. x98=0
>
>x99=. x100=0 x101=. x102=0 x103=. x104=0 x105=. x106=0 x107=. x108=6 x109=7
>x110=2 x111=1 x112=8
>
>x113=. x114=0 x115=. x116=0 x117=. x118=6 x119=2 x120=9 x121=9 x122=0 x123=6
>x124=0 x125=. x126=0
>
>x127=. x128=. x129=0 x130=. x131=0 x132=. x133=0 x134=. x135=0 x136=. x137=0
>x138=. x139=0 x140=.
>
>x141=0 x142=. x143=0 x144=. x145=0 x146=. x147=0 x148=. x149=0 x150=. x151=0
>x152=. x153=0 x154=.
>
>x155=0 x156=. x157=0 x158=. x159=0 x160=. x161=0 x162=. x163=0 x164=. x165=8
>x166=. x167=0 x168=.
>
>x169=0 x170=. x171=0 x172=. x173=0 x174=. x175=0 x176=. x177=0 x178=. x179=2
>x180=0 x181=. x182=0
>
>x183=. x184=0 x185=. x186=0 x187=. x188=0 x189=. x190=1 x191=5 x192=6 x193=.
>x194=0 x195=. x196=0
>
>x197=. x198=2 x199=. x200=2 x201=8 x202=. x203=0 x204=. x205=0 x206=. x207=0
>x208=. x209=0 x210=.
>
>x211=0 x212=. x213=1 x214=0 x215=3 x216=. x217=4 x218=2 x219=8 x220=4 x221=.
>x222=0 x223=. x224=0
>
>x225=. x226=0 x227=. x228=0 x229=. x230=0 x231=. x232=0 x233=. x234=0 x235=.
>x236=0 x237=. x238=0
>
>x239=. x240=0 x241=. x242=0 x243=. x244=0 _ERROR_=1 _N_=1
>
>NOTE: 1 record was read from the infile 'f:\Stark\first.txt'.
>
>The minimum record length was 256.
>
>The maximum record length was 256.
>
>One or more lines were truncated.
>
>NOTE: The data set WORK.ONE has 1 observations and 244 variables.
>
>NOTE: DATA statement used (Total process time):
>
>real time 0.03 seconds
>
>cpu time 0.01 seconds
>
>
>
>
>
>
>data one;
>infile 'f:\Stark\first.txt' dlm='09'x truncover;
>array x{244};
>input x(*) 5.;
>proc iml;
>use one;
>read all into x;
>proc print data= one;
>run;
>quit;
>
>283 proc iml;
>
>NOTE: IML Ready
>
>284 use one;
>
>285 read all into x;
>
>NOTE: Exiting IML.
>
>NOTE: PROCEDURE IML used (Total process time):
>
>real time 0.01 seconds
>
>cpu time 0.01 seconds
>
>
>
>286 proc print data= one;
>
>287 run;
>
>NOTE: There were 1 observations read from the data set WORK.ONE.
>
>NOTE: PROCEDURE PRINT used (Total process time):
>
>real time 0.00 seconds
>
>cpu time 0.00 seconds
>
>
>
>288 quit;
>
>
>
>or this one:
>
>
>379 data one;
>
>380 infile 'f:\Stark\first.txt' dlm='09'x truncover;
>
>381 array x{244};
>
>382 input x(*) 1.;
>
>NOTE: The infile 'f:\Stark\first.txt' is:
>
>Filename=f:\Stark\first.txt,
>
>RECFM=V,LRECL=256,File Size (bytes)=3074,
>
>Last Modified=29May2010:11:44:22,
>
>Create Time=29May2010:11:44:21
>
>NOTE: Invalid data for x2 in line 1 2-2.
>
>NOTE: Invalid data for x4 in line 1 4-4.
>
>NOTE: Invalid data for x6 in line 1 6-6.
>
>NOTE: Invalid data for x241 in line 1 241-241.
>
>NOTE: Invalid data for x243 in line 1 243-243.
>
>RULE:
>----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+---
>
>1 CHAR
>0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.51062.217624.0.0.0.0.0.2592006.0.0.0.0.0.0.0.0.0
>
>ZONE
>3030303030303030303030303030303030303030333330333333030303030303333333030303030303030303
>
>NUMR
>0909090909090909090909090909090909090909510629217624909090909092592006909090909090909090
>
>89
>.0.0.0.0.0.0.0.0.0.67218.0.0.6299060.0..0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.8.0.0.0.0.0.
>
>ZONE
>0303030303030303030333330303033333330300303030303030303030303030303030303030303030303030
>
>NUMR
>9090909090909090909672189090962990609099090909090909090909090909090909090909890909090909
>
>177
>0.20.0.0.0.0.156.0.0.2.28.0.0.0.0.0.103.4284.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0
>
>ZONE
>30330303030303330303030330303030303033323333030303030303030303030303030303030303
>
>NUMR
>092090909090915690909292890909090909103E4284909090909090909090909090909090909090
>
>x1=0 x2=. x3=0 x4=. x5=0 x6=. x7=0 x8=. x9=0 x10=. x11=0 x12=. x13=0 x14=.
>x15=0 x16=. x17=0 x18=.
>
>x19=0 x20=. x21=0 x22=. x23=0 x24=. x25=0 x26=. x27=0 x28=. x29=0 x30=.
>x31=0 x32=. x33=0 x34=.
>
>x35=0 x36=. x37=0 x38=. x39=0 x40=. x41=5 x42=1 x43=0 x44=6 x45=2 x46=.
>x47=2 x48=1 x49=7 x50=6
>
>x51=2 x52=4 x53=. x54=0 x55=. x56=0 x57=. x58=0 x59=. x60=0 x61=. x62=0
>x63=. x64=2 x65=5 x66=9
>
>x67=2 x68=0 x69=0 x70=6 x71=. x72=0 x73=. x74=0 x75=. x76=0 x77=. x78=0
>x79=. x80=0 x81=. x82=0
>
>x83=. x84=0 x85=. x86=0 x87=. x88=0 x89=. x90=0 x91=. x92=0 x93=. x94=0
>x95=. x96=0 x97=. x98=0
>
>x99=. x100=0 x101=. x102=0 x103=. x104=0 x105=. x106=0 x107=. x108=6 x109=7
>x110=2 x111=1 x112=8
>
>x113=. x114=0 x115=. x116=0 x117=. x118=6 x119=2 x120=9 x121=9 x122=0 x123=6
>x124=0 x125=. x126=0
>
>x127=. x128=. x129=0 x130=. x131=0 x132=. x133=0 x134=. x135=0 x136=. x137=0
>x138=. x139=0 x140=.
>
>x141=0 x142=. x143=0 x144=. x145=0 x146=. x147=0 x148=. x149=0 x150=. x151=0
>x152=. x153=0 x154=.
>
>x155=0 x156=. x157=0 x158=. x159=0 x160=. x161=0 x162=. x163=0 x164=. x165=8
>x166=. x167=0 x168=.
>
>x169=0 x170=. x171=0 x172=. x173=0 x174=. x175=0 x176=. x177=0 x178=. x179=2
>x180=0 x181=. x182=0
>
>x183=. x184=0 x185=. x186=0 x187=. x188=0 x189=. x190=1 x191=5 x192=6 x193=.
>x194=0 x195=. x196=0
>
>x197=. x198=2 x199=. x200=2 x201=8 x202=. x203=0 x204=. x205=0 x206=. x207=0
>x208=. x209=0 x210=.
>
>x211=0 x212=. x213=1 x214=0 x215=3 x216=. x217=4 x218=2 x219=8 x220=4 x221=.
>x222=0 x223=. x224=0
>
>x225=. x226=0 x227=. x228=0 x229=. x230=0 x231=. x232=0 x233=. x234=0 x235=.
>x236=0 x237=. x238=0
>
>x239=. x240=0 x241=. x242=0 x243=. x244=0 _ERROR_=1 _N_=1
>
>NOTE: 1 record was read from the infile 'f:\Stark\first.txt'.
>
>The minimum record length was 256.
>
>The maximum record length was 256.
>
>One or more lines were truncated.
>
>NOTE: The data set WORK.ONE has 1 observations and 244 variables.
>
>NOTE: DATA statement used (Total process time):
>
>real time 0.03 seconds
>
>cpu time 0.01 seconds
>
>
>
>383 proc iml;
>
>NOTE: IML Ready
>
>384 use one;
>
>385 read all into x;
>
>NOTE: Exiting IML.
>
>NOTE: PROCEDURE IML used (Total process time):
>
>real time 0.01 seconds
>
>cpu time 0.01 seconds
>
>
>
>386 proc print data= one;
>
>387 run;
>
>NOTE: There were 1 observations read from the data set WORK.ONE.
>
>NOTE: PROCEDURE PRINT used (Total process time):
>
>real time 0.00 seconds
>
>cpu time 0.00 seconds
>
>
>
>388 quit;
>
>
>
>and its output:
>
>messy data 10:18 Saturday, May 29, 2010 7980
>
>
>
>O x x x x x x x x x x x x x x x x x x x x x x x x x x x x
>
>b x x x x x x x x x 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3
3
>
>s 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6
7
>
>1 . . . . . . . . 51062 2176 . . .
92006 . . . . . . . . . . . . . . . . . .
>. . . . .
>
>
>
>O x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x
x
>x x x
>
>b 3 3 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 7 7 7 7
7
>7 7 7
>
>s 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3
4
>5 6 7
>
>1 . . . . . . 0.4284 . . . . . . .
0 . . . . . . . . . . . . . . . . . . . .
>. . . . .
>
>x x x x x x x x x x x x x x x x x x x x
>
>O x x x x x x x x x x x x x x x x x x x x x x 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1
>1 1 1 1 1
>
>b 7 7 8 8 8 8 8 8 8 8 8 8 9 9 9 9 9 9 9 9 9 9 0 0 0 0 0 0 0 0 0 0 1 1 1 1
1
>1 1 1 1 1
>
>s 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3
4
>5 6 7 8 9
>
>1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
.
>. . . . .
>
>x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x
x
>x x x x
>
>O 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1
>1 1 1 1 1
>
>b 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5
5
>5 5 5 6 6
>
>s 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
6
>7 8 9 0 1
>
>1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
.
>. . . . .
>
>x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x
x
>x x x x
>
>O 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1
>1 2 2 2 2
>
>b 6 6 6 6 6 6 6 6 7 7 7 7 7 7 7 7 7 7 8 8 8 8 8 8 8 8 8 8 9 9 9 9 9 9 9 9
9
>9 0 0 0 0
>
>s 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7
8
>9 0 1 2 3
>
>1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
.
>. . . . .
>
>x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x
x
>x x x
>
>O 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2
>2 2 2 2
>
>b 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3
4
>4 4 4 4
>
>s 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9
0
>1 2 3 4
>
>1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
.
>. . . .
>
>
>
>
>
>On Tue, May 18, 2010 at 11:07 PM, Tom Robinson <barefootguru@gmail.com> wrote:
>
>> SAS handles large data sets fine and is the most adept language I've seen
>> for processing messy files.
>>
>> Can you post a sample of what you're trying to read and the code you've
>> written to read the data in?
>>
>> Cheers
>>
>>
>> On 2010-05-19, at 14:12, Anna Supady wrote:
>>
>> > Hi guys,
>> >
>> > I am trying to learn how to handle large data sets, like 1 GB. I have one
>> > project that I am working on right now from the website:
>> > www.kddcup-orange.com
>> > I am new to it. We tried to read data into SAS and it just doesn't read any
>> > variable. Any suggestions how to read messy data? Any maybe simpler examples
>> > helpful.
>> >
>> > Thanks a lot,
>> >
>> > Ania,
>>