Date: Tue, 18 Jul 2006 20:07:08 -0400
Reply-To: Sigurd Hermansen <HERMANS1@WESTAT.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Sigurd Hermansen <HERMANS1@WESTAT.COM>
Subject: Re: Create Table Showing Number of Observations in Each Dataset
In-Reply-To: <200607182250.k6IM7PBb025355@mailgw.cc.uga.edu>
Content-Type: text/plain; charset="us-ascii"
Paul:
Note that you have only partial queries in parentheses. SQL requires
in-line queries to follow the syntax of other SELECT queries. To make
union query syntax less confusing, I arrange all UNION queries as a
series of union'd subqueries:
proc sql noprint;
create table work.nobs_in_datasets as
select * from
((select count(*) as nobs, 'D1' as dataset from work.D1)
outer union corr
(select count(*) as nobs, 'D2' as dataset from work.D2)
outer union corr
(select count(*) as nobs, 'D3' as dataset from work.D3)
)
;
quit;
With this construct you can test all of the subqueries to see that they
work as intended.
For counts of observations in ordinary datasets (not views), one can
extract different counts (one counting deleted observations) from a
metadata table, Dictionary.tables. That does not require reading all
observations in the datasets.
Sig
-----Original Message-----
From: owner-sas-l@listserv.uga.edu [mailto:owner-sas-l@listserv.uga.edu]
On Behalf Of Paul Walker
Sent: Tuesday, July 18, 2006 6:51 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Create Table Showing Number of Observations in Each Dataset
I would like to create a table that looks like the following:
dataset nobs
------- ---------
D1 10000
D2 12000
D3 7500
The column 'dataset' represents the name of a dataset, and nobs is the
number of observations. Suppose there are 3 datasets as indicated, D1,
D2, and D3. There are several ways to create such a table using macro
variables or dictionary tables. However, I would like to use SQL to
build this table in the following sort of way.
proc sql noprint;
create table work.nobs_in_datasets as
union a (count(*) as nobs, 'D1' as dataset from work.D1)
b (count(*) as nobs, 'D2' as dataset from work.D2)
c (count(*) as nobs, 'D3' as dataset from work.D3); quit;
However, I know the above code does not work and wondered if any PROC
SQL experts could help correct my syntax? The basic problem is how to
'stack' the rows of data created by the code:
count(*) as nobs, 'D1' as dataset from work.D1;
count(*) as nobs, 'D2' as dataset from work.D2;
count(*) as nobs, 'D3' as dataset from work.D3;
TIA,
-PW