LISTSERV at the University of Georgia
Date:         Wed, 23 Oct 2002 16:11:45 -0400
Reply-To:     Quentin McMullen <QuentinMcMullen@WESTAT.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Quentin McMullen <QuentinMcMullen@WESTAT.COM>
Subject:      Re: data dictionary in SAS?
Comments: To: William Kossack <kossackw@NJC.ORG>
Content-Type: text/plain; charset="iso-8859-1"

William Kossack [mailto:kossackw@NJC.ORG] asks:

> I'm working on several large studies with large datasets and I've been
> asked to setup a data dictionaries for them.
>
> In SAS?
>
> Where do I start? Please give me a clue?

I think you need to start by thinking about what information you want to have in the dictionary, how you want it organized, and what output format(s) you want (ASCII? PDF? HTML?).

These questions take a lot of thought, and it's nice to have them answered before starting to code. When I think data dictionary, I think descriptive information on each variable (what could that mean? perhaps frequencies, perhaps mean/sd/min/max, perhaps UNIVARIATE... ). But I think in terms of a data dictionary that I would give to an analyst. If I were making data dictionaries for programmers, I would want to include information on the type of each variable, length, format, etc. In any case, a good data dictionary is a great asset to a project/division/company. It's worth spending time thinking about who the users of the data dictionary will be, and what their needs are.

Once you have designed the layout and content for the dictionary, coding should flow from that. I'd think in terms of FREQ, MEANS, and UNIVARIATE to get summary statistics for each variable, with plenty of use of the dictionary tables (especially dictionary.tables and dictionary.columns) to get the metadata. And then ODS gives you plenty of options in terms of file formats.
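
As a starting point, here is a minimal sketch of those pieces working together. It assumes the study data sit in a library called STUDY with a dataset called VISITS; both names, the directory path, and the output file name are hypothetical placeholders, not anything from William's setup.

```sas
/* Assumption: hypothetical library STUDY pointing at the study data. */
libname study "/path/to/study/data";

/* Programmer-oriented metadata: one row per variable per dataset. */
proc sql;
  create table work.dict_columns as
  select memname, varnum, name, type, length, format, label
    from dictionary.columns
    where libname = 'STUDY'      /* libnames are stored in upper case */
    order by memname, varnum;
quit;

/* Route the listing and some summary statistics to an HTML file. */
ods html file="dictionary.html";   /* hypothetical output file */

title "Variables in library STUDY";
proc print data=work.dict_columns noobs;
run;

title "Summary statistics for numeric variables in STUDY.VISITS";
proc means data=study.visits n nmiss mean std min max;
  var _numeric_;                   /* hypothetical dataset VISITS */
run;

ods html close;
title;
```

Swapping the ODS HTML statements for another ODS destination changes the output format without touching the reporting steps, which is one reason to settle the content first and the file format second.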

There are lots of fun challenges, for example, assuming you want to give FREQs on character variables, what do you want to do when a character variable has 200 different levels? What's a good way to summarize a SAS date variable (probably don't want to summarize BirthDate as average number of days since 1/1/1960)?
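
One possible answer to each question, offered only as a sketch: for a character variable with many levels, report the number of distinct levels and list just the most frequent ones; for a date, report the earliest and latest values in a date format plus counts by year. The datasets STUDY.VISITS and STUDY.SUBJECTS, the variables SITE and BIRTHDATE, and the cutoff of 10 levels are all hypothetical, and the STUDY library is assumed to be assigned already.

```sas
/* Character variable: how many distinct (non-missing) levels are there? */
proc sql;
  select count(distinct site) as n_levels
    from study.visits;
quit;

/* Save the full frequency table, then print only the 10 most common levels. */
proc freq data=study.visits noprint;
  tables site / missing out=work.site_freq;
run;

proc sort data=work.site_freq;
  by descending count;
run;

title "Ten most frequent values of SITE";
proc print data=work.site_freq(obs=10) noobs;
run;

/* Date variable: show the range as dates, not as days since 1/1/1960. */
title "Range of BIRTHDATE";
proc sql;
  select count(birthdate)                as n,
         nmiss(birthdate)                as n_missing,
         min(birthdate) format=date9.    as earliest,
         max(birthdate) format=date9.    as latest
    from study.subjects;
quit;

/* PROC FREQ groups by formatted value, so YEAR4. gives counts per year. */
title "BIRTHDATE by year";
proc freq data=study.subjects;
  tables birthdate / missing;
  format birthdate year4.;
run;
title;
```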

Were I to start working on a %MakeDictionary macro, I think I would approach it as a structured-programming exercise. There are lots of little tasks, and each one would get its own little macro (%FindVarType, %DescribeNumVar, %DescribeCharVar, %DescribeDateVar... ) [My thanks to Ed Heaton for introducing me to such organization.] Without that, I'd fear the entire macro would get out of hand.
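
To make that organization concrete, here is a rough skeleton, not a finished macro. The macro names come from the list above, but their parameters (data=, var=, maxlevels=, lib=, mem=) and bodies are my own assumptions for illustration. Date handling is left out, and the variable type is read straight from dictionary.columns rather than through a separate %FindVarType.

```sas
/* Hypothetical helper: describe one numeric variable. */
%macro DescribeNumVar(data=, var=);
  title "&var (numeric) in &data";
  proc means data=&data n nmiss mean std min max;
    var &var;
  run;
  title;
%mend DescribeNumVar;

/* Hypothetical helper: list the most frequent levels of a character variable. */
%macro DescribeCharVar(data=, var=, maxlevels=20);
  proc freq data=&data noprint;
    tables &var / missing out=work._freq;
  run;
  proc sort data=work._freq;
    by descending count;
  run;
  title "&var (character) in &data: up to &maxlevels most frequent levels";
  proc print data=work._freq(obs=&maxlevels) noobs;
  run;
  title;
%mend DescribeCharVar;

/* Driver: loop over the variables of one dataset and dispatch on type. */
%macro MakeDictionary(lib=, mem=);
  %local i nvars;

  /* NAME and TYPE ('num' or 'char') for each variable, in dataset order. */
  proc sql noprint;
    select name, type
      into :var1 - :var9999, :type1 - :type9999
      from dictionary.columns
      where libname = "%upcase(&lib)" and memname = "%upcase(&mem)"
      order by varnum;
  quit;
  %let nvars = &sqlobs;   /* number of rows returned by the SELECT */

  %do i = 1 %to &nvars;
    %if &&type&i = num %then %do;
      %DescribeNumVar(data=&lib..&mem, var=&&var&i)
    %end;
    %else %do;
      %DescribeCharVar(data=&lib..&mem, var=&&var&i)
    %end;
  %end;
%mend MakeDictionary;

/* Example call; STUDY and VISITS are hypothetical names. */
%MakeDictionary(lib=study, mem=visits)
```

Each helper can be tested by itself on one variable before the driver is written, which is most of the appeal of splitting things up this way.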

I hope this hasn't been so general as to be unhelpful. I'll look forward to seeing other people's thoughts, and perhaps references to existing macros???

Kind Regards,
--Quentin
