LISTSERV at the University of Georgia
Date:         Thu, 20 May 2010 10:31:50 -0400
Reply-To:     "W. Matthew Wilson" <matt@TPLUS1.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "W. Matthew Wilson" <matt@TPLUS1.COM>
Subject:      Tools to visualize dataset dependencies?
Content-Type: text/plain; charset=ISO-8859-1

I inherited some REALLY long SAS programs that use lots and lots of data steps and I'm having a hard time keeping it all in my brain.

I'm a big fan of dot (http://graphviz.org) and I would like to use it to graph the dependencies. Has anyone done anything like this?

For example, I want to translate the SAS code below:

data b;
    set a;
    /* skip lots of variable assignments here */
run;

proc summary data=b;
    /* skip various options here */
    output out=c;
run;

data e;
    merge c d;
run;

Into something like this dot syntax:

digraph G {
    a -> b [label="data step"];
    b -> c [label="proc summary"];
    c -> e [label="data step"];
    d -> e [label="data step"];
};

And then dot will make a purty picture, like this one: http://scratch.tplus1.com/scratch.png

When I look at that picture, it is obvious to me that the two input datasets that must already exist for this code are a and d. That fact is NOT obvious when I read the code, especially since the real program has more than 50 intermediate data steps and at least a dozen prerequisite datasets.
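
To make the idea concrete, here is a rough Python sketch of the translation described above. It is only an illustration under strong assumptions: steps end with "run;", data steps name exactly one output dataset, inputs appear only in set/merge statements or in data=/out= options, and there are no macros, libname prefixes, or dataset options. The function names (sas_edges, to_dot, prerequisites) are made up for this example and do not come from any existing tool.

```python
import re


def sas_edges(sas_code):
    """Return (source, target, label) triples for a very small subset of SAS."""
    # Drop /* ... */ comments so they cannot be mistaken for statements.
    code = re.sub(r"/\*.*?\*/", " ", sas_code, flags=re.DOTALL)
    edges = []
    # Assumption: every step is terminated by "run;".
    for step in re.split(r"\brun\s*;", code, flags=re.IGNORECASE):
        step = step.strip()
        data_step = re.match(r"data\s+(\w+)\s*;", step, flags=re.IGNORECASE)
        proc_step = re.match(r"proc\s+(\w+)", step, flags=re.IGNORECASE)
        if data_step:
            target = data_step.group(1)
            # Every dataset named in a set or merge statement feeds the target.
            for stmt in re.findall(r"\b(?:set|merge)\s+([^;]+);", step,
                                   flags=re.IGNORECASE):
                for source in stmt.split():
                    edges.append((source, target, "data step"))
        elif proc_step:
            label = "proc " + proc_step.group(1).lower()
            sources = re.findall(r"\bdata\s*=\s*(\w+)", step, flags=re.IGNORECASE)
            targets = re.findall(r"\bout\s*=\s*(\w+)", step, flags=re.IGNORECASE)
            for source in sources:
                for target in targets:
                    edges.append((source, target, label))
    return edges


def to_dot(edges):
    """Render the edge list as a dot digraph."""
    lines = ["digraph G {"]
    for source, target, label in edges:
        lines.append(f'    {source} -> {target} [label="{label}"];')
    lines.append("}")
    return "\n".join(lines)


def prerequisites(edges):
    """Datasets that are read somewhere but never written by any step."""
    written = {target for _, target, _ in edges}
    return sorted({source for source, _, _ in edges} - written)


if __name__ == "__main__":
    sample = """
    data b; set a; /* skip lots of variable assignments here */ run;
    proc summary data=b; /* skip various options here */ output out=c; run;
    data e; merge c d; run;
    """
    found = sas_edges(sample)
    print(to_dot(found))
    print("must already exist:", prerequisites(found))
```

Saving the printed digraph to a file and running it through dot (for example, dot -Tpng deps.dot -o deps.png) should give a picture like the one linked above. Regular expressions will miss plenty of real-world SAS, so this is a starting point for exploration rather than a reliable parser.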

Is there already a tool to visualize dependencies like this? Does anyone have any other ideas for how to attack this problem?

Thanks in advance.

--
W. Matthew Wilson
http://tplus1.com

