Date: Tue, 21 Sep 2010 07:41:54 -0400
Reply-To: Arthur Tabachneck <art297@NETSCAPE.NET>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Arthur Tabachneck <art297@NETSCAPE.NET>
Subject: Re: Subsampling a Large Dataset
Joey,
Not directly related to your request, but make sure that you add a
semicolon onto the end of your initial filename statement as well.
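
For what it's worth, here is a minimal, untested sketch of how the corrected
FILENAME statement and the OBS= option described in the quoted reply below
might fit together. The value 1000 is taken from your "first 1000 lines"
example; the FIRSTOBS=/OBS= values in the comment are purely illustrative.
Everything else is your own code.

```sas
/* Sketch only: assumes the file path and layout from the original post. */
filename FT77F001 "D:\Temp\BigFile";   /* note the closing semicolon */

data MyDataset;
   length v1-v100 $100;
   /* OBS=1000 stops reading after input record 1000, so the rest of   */
   /* the file is never processed.                                     */
   /* To sample a later slice instead, combine the two options, e.g.   */
   /* FIRSTOBS=5001 OBS=6000 reads records 5001 through 6000, because  */
   /* OBS= names the last record to read, not a count.                 */
   infile FT77F001 missover encoding="unicode" obs=1000;
   input v1-v100;
run;
```

Once the code behaves as expected on the subsample, dropping OBS= (or setting
OBS=MAX) should read the whole file.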
Art
--------
On Tue, 21 Sep 2010 06:25:31 -0500, Data _null_; <iebupdte@GMAIL.COM>
wrote:
>There are INFILE statement options that can be helpful to you.
>
>FIRSTOBS=record-number
>specifies a record number that SAS uses to begin reading input data
>records in the input file.
>
>OBS=record-number | MAX
>
>record-number specifies the record number of the last record to read
>in an input file that is read sequentially.
>
>MAX specifies the maximum number of observations to process, which
>will be at least as large as the largest signed, 32-bit integer. The
>absolute maximum depends on your host operating environment.
>
>On Tue, Sep 21, 2010 at 6:19 AM, Joey Engelberg
><j-engelberg@kellogg.northwestern.edu> wrote:
>> I have a very large (150+ GB) flat file that I would like to read in with
>> the infile statement. Before I go through the trouble of reading in the
>> entire thing I would like to work on perfecting my code on a small subsample
>> of the data. Is there any way to have SAS work on a small subsample without
>> looping through the entire 150 GB dataset? Right now my infile statement
>> looks like this:
>>
>> Filename FT77F001 "D:\Temp\BigFile"
>>
>> Data MyDataset;
>> length v1-v100 $100;
>> Infile FT77F001 missover encoding="unicode";
>> input v1-v100;
>>
>> run;
>>
>>
>>
>> For example, can I add a WHERE statement? I know an IF statement will loop
>> through the entire dataset. Any help so that I can, for example, read in
>> the first 1000 lines without reading in the rest would be very helpful.
>>