Date: Fri, 5 Nov 2004 14:24:50 -0500
Reply-To: "Chang Y. Chung" <chang_y_chung@HOTMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Chang Y. Chung" <chang_y_chung@HOTMAIL.COM>
Subject: Re: ask help for output xml file from Chang Y. Chung or Hung Ya
On Thu, 4 Nov 2004 09:47:27 -0800, eric <ezhou@SHAW.CA> wrote:
...
>I would like to output an xml file from SAS.
>my source dataset is test as following:
...
Hi,
It is Friday, so... I am just reporting another version of the xquery
solution. This time, I am exporting the data into two xml documents -- one
for parents (p.xml) and another for children (c.xml). This can be done
easily using a simple data step and the xml libname engine.
Now, here is the fun part. XQuery, so far, does not scale well. The
current implementations (at least saxon8) keep the whole document in
memory. But xquery is very good at "joining" xml documents together, and
it can do this over the network. In order to demonstrate this, I put the
two documents on a web site as http://changchung.com/g/p.xml and
http://changchung.com/g/c.xml
Now, we can xquery from these two documents, as long as we are connected
to the internet. The final output is the same as before, and is reported in
the comments below. (xhtml is an xml application, so you can generate an
html page directly from xquery, too. This is left as an exercise for the
interested.)
I think the xquery itself is simpler than before (when I queried a single
xml document) and, I hope, runs faster, since I am reading each document
only once. Happy Friday!
Cheers,
Chang
/* ran on sas for windows 9.1.2 */
%let pwd=%sysfunc(pathname(WORK));
%put NOTE: pwd=&pwd.;
x cd "&pwd.";
/* test data */
data test;
input parent_id child_id name $;
datalines;
1000 1001 c1
1000 1000 p1
2000 2000 p2
2000 2001 c2
2000 2002 c3
3000 3000 p3
;
run;
/* Divide the input data set into two and export them
separately into the parents' and the children's documents,
renaming variables along the way. pid in the children's data set
is the "foreign key."
*/
libname p xml "p.xml";
libname c xml "c.xml";
data p.p(keep=pid nm)
c.c(keep=pid cid nm)
;
set test;
length pid cid 8 nm $2;
pid = parent_id;
nm = name;
if parent_id = child_id then do;
output p.p;
end; else do;
cid = child_id;
output c.c;
end;
run;
libname p clear;
libname c clear;
/* I ftp'ed them to a web site */
/* now we write out an xquery to "join" the parent and her children */
/* create XQuery */
data _null_;
infile cards truncover;
file "pc.xq" lrecl=100;
length line $100;
input line $char100.;
L = length(line);
put line $varying. L;
cards4;
(: xquery using saxon8.1.1 :)
(: by chang y. chung on 2004-11-04 :)
<parents>
{
for $p in (doc("http://changchung.com/g/p.xml")//P)
let $cs := doc("http://changchung.com/g/c.xml")//C[pid=$p/pid]
order by $p/pid
return
<parent>
<id>{ $p/pid/text() }</id>
<name>{ $p/nm/text() }</name>
{
for $c in $cs
order by $c/cid
return
<child>
<id>{ $c/cid/text() }</id>
<name>{ $c/nm/text() }</name>
</child>
}
</parent>
}
</parents>
;;;;
run;
/* run the query using saxon8.1.1.
on my pc, the saxon8.jar is in c:\program files\saxon8 directory.
*/
options xsync noxwait;
x ' set classpath="c:\program files\saxon8\saxon8.jar";%classpath%
& java net.sf.saxon.Query pc.xq !encoding=windows-1252 > pcs.xml
& ver';
/* ver (the dos version command) is there just to make sas sync with the
slow java vm execution. any dos command will do.
& is the dos command separator.
*/
/* contents of pcs.xml file
<?xml version="1.0" encoding="windows-1252"?>
<parents>
<parent>
<id> 1000 </id>
<name> p1 </name>
<child>
<id> 1001 </id>
<name> c1 </name>
</child>
</parent>
<parent>
<id> 2000 </id>
<name> p2 </name>
<child>
<id> 2001 </id>
<name> c2 </name>
</child>
<child>
<id> 2002 </id>
<name> c3 </name>
</child>
</parent>
<parent>
<id> 3000 </id>
<name> p3 </name>
</parent>
</parents>
*/
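/* For comparison only, and not part of the run above: a minimal
   sketch of the same parent-to-children "join" done inside sas with
   proc sql, assuming the work.test data set from the test data step.
   It returns flat rows rather than nested xml. The aliases par and
   chd and the output column names pname and cname are made up for
   this illustration.
*/
proc sql;
  select par.parent_id as pid,
         par.name      as pname,
         chd.child_id  as cid,
         chd.name      as cname
  from test(where=(parent_id = child_id))  as par
       left join
       test(where=(parent_id ne child_id)) as chd
       on par.parent_id = chd.parent_id
  order by pid, cid;
quit;
/* The left join keeps a parent that has no children (3000 here),
   just as the xquery still returns a <parent> element when $cs is
   empty.
*/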