Date: Thu, 20 May 2010 10:31:50 -0400
Reply-To: "W. Matthew Wilson" <matt@TPLUS1.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "W. Matthew Wilson" <matt@TPLUS1.COM>
Subject: Tools to visualize dataset dependencies?
Content-Type: text/plain; charset=ISO-8859-1
I inherited some REALLY long SAS programs that use lots and lots of
data steps and I'm having a hard time keeping it all in my brain.
I'm a big fan of dot (http://graphviz.org) and I would like to use it
to graph the dependencies. Has anyone done anything like this?
For example, I want to translate the SAS code below:
    data b;
        set a;
        /* skip lots of variable assignments here */
    run;

    proc summary data=b;
        /* skip various options here */
        output out=c;
    run;

    data e;
        merge c d;
    run;
Into something like this dot syntax:
    digraph G {
        a -> b [label="data step"];
        b -> c [label="proc summary"];
        c -> e [label="data step"];
        d -> e [label="data step"];
    };
And then dot will make a purty picture, like this one:
http://scratch.tplus1.com/scratch.png
When I look at that picture, it is obvious to me that the two input
datasets that must already exist for this code are a and d. That fact
is NOT obvious when I read the code, especially since I really have more
than 50 intermediate data steps in this program and at least a dozen
prerequisite datasets.
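Once the dependencies are written down as edges, the prerequisite datasets can be computed rather than read off the picture: they are the datasets that appear as an input somewhere but are never produced by any step. A minimal sketch in Python, assuming the edges are already available as (input, output) pairs; the function name prerequisite_datasets is hypothetical:

```python
def prerequisite_datasets(edges):
    """Return datasets that are read by some step but never created by one."""
    sources = {src for src, dst in edges}   # everything used as an input
    targets = {dst for src, dst in edges}   # everything produced by a step
    return sorted(sources - targets)


# Edges taken from the example graph above.
edges = [("a", "b"), ("b", "c"), ("c", "e"), ("d", "e")]
print(prerequisite_datasets(edges))  # ['a', 'd']
```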
Is there already a tool to visualize dependencies like this? Does
anyone have any other ideas for how to attack this problem?
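One possible starting point is a small script that scans the SAS source and emits the dot edges. The following is only a rough sketch in Python under strong assumptions: every step ends with "run;", data steps read their inputs only through "set" or "merge" with plain dataset names (no dataset options, no macros), and procs name their input with "data=" and their output with "out=". The function name sas_to_dot is hypothetical, and real programs (proc sql, macro code, library-qualified names that need quoting in dot) would need much more care.

```python
import re


def sas_to_dot(sas_code):
    """Translate a small, regular subset of SAS into Graphviz dot text."""
    # Remove /* ... */ comments so their contents are never parsed as code.
    code = re.sub(r"/\*.*?\*/", " ", sas_code, flags=re.S)
    edges = []
    # Assumption: each step is terminated by "run;".
    for step in re.split(r"\brun\s*;", code, flags=re.I):
        statements = [s.strip() for s in step.split(";") if s.strip()]
        if not statements:
            continue
        first = statements[0].split()
        kind = first[0].lower()
        if kind == "data":
            label = "data step"
            outputs = first[1:]
            inputs = []
            for stmt in statements[1:]:
                words = stmt.split()
                if words[0].lower() in ("set", "merge"):
                    inputs.extend(words[1:])
        elif kind == "proc" and len(first) > 1:
            label = "proc " + first[1].lower()
            m_in = re.search(r"\bdata\s*=\s*([\w.]+)", statements[0], flags=re.I)
            inputs = [m_in.group(1)] if m_in else []
            outputs = re.findall(r"\bout\s*=\s*([\w.]+)", step, flags=re.I)
        else:
            continue  # anything else is ignored in this sketch
        for src in inputs:
            for dst in outputs:
                edges.append((src, dst, label))
    lines = ["digraph G {"]
    lines += ['    %s -> %s [label="%s"];' % edge for edge in edges]
    lines.append("}")
    return "\n".join(lines)


SAMPLE = """
data b;
set a;
/* skip lots of variable assignments here */
run;
proc summary data=b;
/* skip various options here */
output out=c;
run;
data e;
merge c d;
run;
"""

print(sas_to_dot(SAMPLE))
```

Run on the example program, this prints the same four edges as the hand-written graph, and the output can be saved to a file and passed to dot (for example with dot -Tpng) to draw the picture.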
Thanks in advance.
--
W. Matthew Wilson
http://tplus1.com