| Date: | Fri, 10 Aug 2007 09:13:38 -0500 |
| Reply-To: | "Hoyle, Larry" <larryhoyle@KU.EDU> |
| Sender: | "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU> |
| From: | "Hoyle, Larry" <larryhoyle@KU.EDU> |
| Subject: | Re: Exporting metadata to XML |
| Content-Type: | text/plain; charset="US-ASCII" |
Unlike SPSS, where value labels are an attribute of the variable, in SAS
a variable may or may not have a permanent association with a set of
labels for its values. A format may be assigned to a variable in a DATA
step or in SQL or in PROC DATASETS, or as needed in a PROC. As you
noted, this merely assigns the name of the associated format as a
property of the variable. User-defined formats are stored separately,
in a format catalog, not in the dataset itself.
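As a rough sketch of the three kinds of association mentioned above (the
format AGEGRP and the dataset WORK.SURVEY are made-up names for
illustration, not anything from the original thread):

```sas
/* Hypothetical user-defined format */
proc format;
  value agegrp
    low -<  18  = 'Under 18'
    18  -<  65  = '18 to 64'
    65  -  high = '65 and over';
run;

/* Permanent association made in a DATA step: only the format NAME
   is stored with the variable, not the value labels themselves */
data work.survey;
  input id age;
  format age agegrp.;
  datalines;
1 12
2 40
3 71
;
run;

/* Remove the association later with PROC DATASETS,
   without rewriting the data */
proc datasets library=work nolist;
  modify survey;
  format age;          /* a FORMAT statement with no format name removes it */
run;
quit;

/* Temporary association, in effect only for this one PROC */
proc freq data=work.survey;
  tables age;
  format age agegrp.;
run;
```

After the PROC DATASETS step the variable AGE carries no format at all,
yet the last step can still display the labels because the format is
applied as needed.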
To associate the values from the format with the variables that have
assigned formats one could join DICTIONARY.COLUMNS with a cntlout
dataset on the format name.
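For reference, a minimal way to produce such a cntlout dataset might look
like the following. It assumes the user formats live in the default
WORK.FORMATS catalog; the dataset name myformats is the one used later in
this message.

```sas
/* Write every format in WORK.FORMATS to a dataset, one row per value/range */
proc format library=work cntlout=work.myformats;
run;

/* Look at the columns that carry the value labels */
proc print data=work.myformats;
  var fmtname type start end label;
run;
```

Each row holds one value or range (START, END) and its LABEL, so this
dataset contains the actual category labels, not just the format names.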
In the cntlout dataset the format name is stored without the period or
field width(s). In DICTIONARY.COLUMNS it is stored with the period and
any field width information. The following snippet would find the
assigned user formats for the dataset &dataset in the library &lib:
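/* Assumes myformats was created beforehand with
   proc format cntlout=myformats; run; */
proc sql;
  /* Columns of the dataset plus the bare format name: strip the
     width/period and a leading $ (character formats), since the
     cntlout dataset is expected to store names without either */
  create table myColumns as
    select *,
           prxchange('s/^\$|\d*\.\d*//',-1,format) as fmtName
    from dictionary.columns
    where libname=upcase("&lib") and memname=upcase("&dataset")
    order by name;
  /* Rows of the cntlout dataset for formats assigned to those columns */
  create table userFormats as
    select myFormats.*
    from myformats,myColumns
    where myColumns.fmtName=myFormats.fmtname and
          myFormats.type in ("N","C");
quit;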
I expect it is common practice, though, for SAS programmers to not
permanently associate any user format with a variable, especially in the
case where there are several forms of labels to be associated with the
values of a variable. Some output, like axis labels for a chart, might
need short forms of labels while other output, like row headings for a
table, might use a longer form. Multiple formats may also be used to
group values in alternative ways.
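To illustrate, here is a small sketch with three formats for the same
variable. All of the names (REGSHORT, REGLONG, REGGRP, WORK.SALES) and the
data are invented for the example.

```sas
proc format;
  value regshort 1='NE' 2='MW' 3='S' 4='W';                  /* short labels */
  value reglong  1='Northeast' 2='Midwest'
                 3='South'     4='West';                     /* long labels  */
  value reggrp   1,2='Northeast or Midwest'
                 3,4='South or West';                        /* regrouping   */
run;

data work.sales;
  input region amount;
  datalines;
1 100
2 250
3 175
4 300
;
run;

/* Long labels for table rows */
proc freq data=work.sales;
  tables region;
  format region reglong.;
run;

/* Same variable, grouped differently: CLASS levels are formed
   from the formatted values */
proc means data=work.sales sum;
  class region;
  var amount;
  format region reggrp.;
run;
```

None of these formats is stored with WORK.SALES; each step picks the one
it needs (REGSHORT would be the natural choice for chart axis labels).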
Another inducement to not create permanent associations is the fact that
opening a dataset that has user formats assigned when the formats are
not accessible produces an error unless the NOFMTERR option is in
effect.
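A sketch of the two usual ways around that error follows. The libref
mylib and the dataset name are placeholders, and the libref is assumed to
be assigned already.

```sas
/* Option 1: tolerate missing formats. SAS substitutes the default
   w. or $w. format, writes a note to the log, and continues */
options nofmterr;

proc print data=mylib.survey;
run;

options fmterr;      /* restore the default behavior */

/* Option 2: make the formats accessible, assuming they are stored
   in the catalog MYLIB.FORMATS */
options fmtsearch=(mylib work);
```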
All of this means that archiving all of the associations between user
formats and SAS variables will require metadata about the metadata, such
as a table listing the associations between variables and formats.
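One possible shape for such a table, as a sketch only: the macro variable
values and the output name varFormats are assumptions, and the result
lists every assigned format, SAS-supplied as well as user-defined.

```sas
%let lib=work;          /* hypothetical library */
%let dataset=survey;    /* hypothetical dataset */

proc sql;
  /* One row per variable that has a format assigned */
  create table work.varFormats as
    select libname, memname, name, type, format
    from dictionary.columns
    where libname=upcase("&lib")
      and memname=upcase("&dataset")
      and format is not missing
    order by name;
quit;
```

Archived together with the data and a cntlout dataset, a table like this
records which format was meant to go with which variable even when no
permanent association is kept in the dataset itself.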
Larry Hoyle
Associate Scientist
Institute for Policy & Social Research
University of Kansas
Blake Hall
1541 Lilac Lane
Lawrence, KS 66044-3177
http://www.ipsr.ku.edu
>> -----Original Message-----
Larry,
Thanks for your response.
I was not aware that, with the XML LIBNAME engine, XMLTYPE also
references a tag set. So the only difference between XMLTYPE and
TAGSET seems to be this: some default tag sets are available through
XMLTYPE, while additional and user-defined tag sets can be specified
with TAGSET.
> proc format cntlout=myformats;
Yes, this seems to be the appropriate basis for exporting user-defined
formats (this is what I meant by "output dataset from PROC FORMAT").
An approach using PROC SQL and the DICTIONARY tables seems to provide
only the names of the user-defined formats, not the formats themselves
(such as the category labels).
I don't see a way to get the details of the user-defined formats
with PROC SQL.
Achim