Date: Mon, 8 Sep 2008 14:44:23 -0600
Reply-To: Alan Churchill <savian001@GMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Alan Churchill <savian001@GMAIL.COM>
Subject: Re: SAS data file format ?
In-Reply-To: <48c58c35$0$874$ba4acef3@news.orange.fr>
Content-Type: text/plain; charset="us-ascii"
Look at the SAS OLE DB providers for Windows. They are free and make SAS files
appear as an OLE DB-compliant data source. SAS does not need to be installed
to use them.
Alan
Alan Churchill
Savian
www.savian.net
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of Capra
Hircus
Sent: Monday, September 08, 2008 2:34 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Re: SAS data file format ?
Mary wrote:
> Capra,
>
> One trend I've certainly seen in the health care industry, not just in
> SAS but also in moving data between health care systems such as IDX/GE
> to EPIC, is to write data out as XML and then have the receiving system
> read data in as XML. In SAS, one might write it out as follows:
>
> libname trans xml 'c:\temp\analysis_set.xml' xmltype=generic
>   tagset=sasxmxsd
>   XMLMETA=SCHEMADATA;
> data trans.analysis_set;
> set work.analysis_set;
> run;
>
>
> It would seem, if you are willing to devote your time to open code,
I'd like to be useful.
> that perhaps a more valuable use of your time would be to ensure that the
> packages your users use, such as R, are able to utilize XML with
> embedded XSD; where it is the XSD that I've found that contains the
> metadata that defines the informats of the variables; thanks to Alan
> Churchill for explaining this.
It's an interesting idea.
> Then you could simply export your SAS datasets as above.
XML is certainly a good thing, but:
* at work I have datasets as large as 1.4G (I said 900M in another post,
but after checking it's 1.4G). That may yield a rather large XML file.
* if I write a program for personal use, and hopefully for other
people, I cannot assume that everyone has SAS to convert files.
That's a point I don't understand: you are not the first to
tell me to convert my files with SAS, but that's precisely the problem:
reading a SAS file without SAS. That's a problem I had before getting
this job, where I do have SAS now. If I can help someone by working out
the file format, why not? It's probably not easy though, otherwise R
would already have that. (After a first look, the data is rather easy
to see in hexadecimal dumps, but its exact location in the file
does not look very deterministic for the moment :-))
Oh, and btw, R is really out of the question for handling 1.4G
files. Or even 60M. That's a problem often seen with programs that
load entire files into memory.
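To illustrate the alternative to loading a whole file, here is a minimal
Python sketch that walks a file in fixed-size chunks, so memory use stays
near the chunk size whatever the file size. The function name, the chunk
size and the demo file are assumptions made for illustration; nothing here
depends on the SAS file format.

```python
import os
import tempfile


def count_bytes_in_chunks(path, chunk_size=1 << 20):
    """Read a file chunk by chunk and return (total_bytes, chunk_count)."""
    total = 0
    chunks = 0
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)  # at most chunk_size bytes in memory
            if not chunk:  # empty bytes object means end of file
                break
            total += len(chunk)
            chunks += 1
    return total, chunks


# Demo on a small throwaway file (hypothetical data, 2500 zero bytes).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x00" * 2500)
    demo_path = tmp.name

demo_total, demo_chunks = count_bytes_in_chunks(demo_path, chunk_size=1024)
os.remove(demo_path)
print(demo_total, demo_chunks)
```

A real reader would parse each chunk instead of just counting bytes, but
the loop structure stays the same.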
Just to show one limitation (useful when trying to find
information with no specs at hand):
SAS can write a 5-billion-line file, though it seems to
have a problem storing the number of rows: as far as I can tell,
the row count is limited to a signed 32-bit integer.
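For scale, a signed 32-bit integer tops out at 2^31 - 1 = 2,147,483,647,
well below 5 billion. The Python sketch below shows what a count looks like
if only its low 32 bits are kept and read back as signed. That SAS would
wrap the count in exactly this way is an assumption for illustration, not
something verified against the file format, and the helper name is
hypothetical.

```python
import struct

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def as_signed_int32(value):
    """Keep the low 32 bits of value and reinterpret them as a signed integer."""
    low_bits = value & 0xFFFFFFFF
    return struct.unpack("<i", struct.pack("<I", low_bits))[0]


rows = 5_000_000_000
print(rows > INT32_MAX)       # the true count does not fit
print(as_signed_int32(rows))  # what a wrapped 32-bit field would report
```

When looking at hex dumps, a candidate row-count field that shows a
plausible but too-small number for a very large dataset would be consistent
with this kind of truncation.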