Date: Mon, 25 Oct 1999 06:38:10 +0000
Reply-To: kmself@ix.netcom.com
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Karsten M. Self" <kmself@IX.NETCOM.COM>
Organization: Self Analysis
Subject: Re: Pipelined SAS? Has anyone tried it?
Content-Type: text/plain; charset=us-ascii
> Date: Sat, 23 Oct 1999 21:17:15 -0400
> From: "Martin H. Rusoff" <mrusoff@COLUMBUS.RR.COM>
> Subject: Pipelined SAS? Has anyone tried it?
>
> Has anyone heard of or tried running multiple SAS steps in parallel with
> UNIX pipes between them to avoid the writing of temp files? (I realize
> this would not work for things requiring indices or sorting) Also, has
> anyone compared using an optimized data reader filling named pipes to
> allow multiple SAS programs to feed from the same file without
> contention?
If you're referring to what I think you are, you can accomplish this
fairly simply using SQL and/or DATA step views.
Note that while you can create views with SQL or a DATA step, other
procs, AFAIK, can only output a data set.
E.g.:
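proc sql;
    /* A view stores only the query; no data is written here */
    create view foo as select /* stuff */ from bar;
quit;

/* Write to a new data set: bar cannot be replaced while foo reads it */
data baz;
    set foo;
    /* stuff */
run;

/* Procs can read the view directly */
proc report data=foo;
    /* stuff */
run;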
Effectively, you can do most of what SORT, SUMMARY, MEANS and, of
course, the DATA step do by using SQL and DATA step views.
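
As a rough sketch of the point above, a single SQL view can stand in
for a PROC SORT followed by a PROC SUMMARY. The data set and variable
names (sales, region, amount, region_totals) are made up for
illustration and are not from any real application.

```sas
/* Hypothetical sample data, only so the sketch runs on its own */
data sales;
    input region $ amount;
    datalines;
East 10
West 25
East 5
;
run;

proc sql;
    /* The view stores only the query; the grouping and ordering
       run each time the view is read */
    create view region_totals as
        select region,
               count(*)     as n_sales,
               sum(amount)  as total_amount,
               mean(amount) as mean_amount
        from sales
        group by region
        order by region;
quit;

/* Any proc can read the view as if it were a data set */
proc print data=region_totals;
run;
```

The trade-off is that the query is re-evaluated on every read, so a
view that is read many times may be better stored as a data set.
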
In my experience, it's usually sufficient to perform some level of
subsetting and summarization prior to more expensive processing to
achieve very high levels of processing efficiency. See a series of
efficiency posts I made to SAS-L ~Dec 1998.
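
A minimal sketch of subsetting before an expensive step, assuming a
hypothetical data set named big with variables id, region, amount and
extra. The KEEP= and WHERE= data set options filter the input as it is
read, so the sort handles less data.

```sas
/* Hypothetical sample data, only so the sketch runs on its own */
data big;
    input id region $ amount extra;
    datalines;
3 East 10 0
1 West 25 0
2 East 5 0
;
run;

/* KEEP= and WHERE= are applied as the data set is read, so the sort
   only ever sees the rows and columns that survive the filter */
proc sort data=big(keep=id region amount where=(region = 'East'))
          out=east_sorted;
    by id;
run;
```
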
If you are referring to pipelining SAS via Unix processes, you can use
the -stdio option to read from stdin and write to stdout. I haven't
experimented with other file descriptors, or with pipelining between
multiple SAS processes. I imagine that the need to write raw data input
and output routines, the efficiency loss in converting to and from raw
data, and the overhead of multiple SAS instances would tend to minimize
any possible gains from pipelining between multiple SAS sessions.
Doable, yes. Advisable -- you'd have to sell me on it.
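
To illustrate the raw-data conversion cost at a pipe boundary, here is
a sketch that uses the PIPE device type of the FILENAME statement under
Unix. This is a related mechanism, not the -stdio route described
above, and the fileref name (upstream) and the echo command are
placeholders for a real upstream program.

```sas
/* The command is a stand-in for any upstream program that writes
   text to stdout; SAS starts it and reads its output through a pipe */
filename upstream pipe 'echo 1 10; echo 2 25';

data parsed;
    /* Every variable has to be re-parsed from text on this side */
    infile upstream truncover;
    input id amount;
run;

filename upstream clear;
```

No temporary file is written, but each value passes through a text
representation on the way in, which is the overhead in question.
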
If you are trying to spawn off multiple SAS processes to handle
different parts of processing, you could do this via system calls (X,
%SYSEXEC, or CALL SYSTEM, IIRC). However, you'd have to use temporary
files for inter-process communication.
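
A sketch of what spawning might look like under Unix, assuming the sas
command is on the PATH. The program and log names (part1.sas,
part2.sas, part3.sas and their logs) are hypothetical.

```sas
/* Each X statement hands its command to the shell; the trailing &
   asks the shell to background it so the batch sessions can run
   side by side */
x 'sas part1.sas -log part1.log &';
x 'sas part2.sas -log part2.log &';

/* The same kind of launch from inside a DATA step */
data _null_;
    call system('sas part3.sas -log part3.log &');
run;
```

Nothing here waits for the child sessions to finish, so the parent
would have to check for their output files before using them.
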
For what I *think* you're looking for, more traditional Unix tools
(perl, R, awk, etc.) might be better suited.
What's your application and what do you hope to gain by whatever it is
you're trying to do?
--
Karsten M. Self (kmself@ix.netcom.com)
What part of "Gestalt" don't you understand?
SAS for Linux: http://www.netcom.com/~kmself/SAS/SAS4Linux.html
Mailing list: "subscribe sas-linux" to
mailto:majordomo@cranfield.ac.uk