Date: Tue, 23 Dec 2003 09:46:13 +0000
Reply-To: peter@CRAWFORDSOFTWARE.DEMON.CO.UK
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Peter Crawford <peter@CRAWFORDSOFTWARE.DEMON.CO.UK>
Subject: Re: How to SUM a Range of Variables in PROC MEANS
Content-Type: text/plain
On Mon, 22 Dec 2003 17:34:57 -0500, Sheila J Gross <sheilaandhercats@EROLS.COM> wrote:
>Stephen wrote:
>In the original data set below, I have 25 variables (V1S1-V1S25), each of
>which contains 1's and 0's. I'm attempting to sum each variable for each
>type of aggregate and create a new variable in the output VQ1, VQ2, etc
>containing that sum. The code below appears to be working - except for the
>fact that the tallies that are being returned are negative. If I were to
>take the absolute value of each VQ variable for each type of aggregation,
>the results would be correct. I would appreciate it if someone could tell
>me what I'm doing wrong, or if the problem should be approached
>differently.
>Thanks in advance.
>
>PROC MEANS data=rdpn.ORIGINAL_SET NOPRINT;
> CLASS DTPN VRCN;
> TYPES DTPN * VRCN;
> VAR V1S1-V1S25;
> OUTPUT OUT=NEW_SET SUM(V1S1-V1S25)=VQ1-VQ25 / NOINHERIT;
>RUN;
>
>
>Not that this addresses your problem directly, but I've always used the NWAY
>option in the PROC MEANS statement, together with the CLASS statement, to
>generate means or sums for only the highest-level interaction. (I'm not
>familiar with the "noinherit" option.)
>
>Also, in the SUM portion of the OUTPUT statement, I don't think you need to
>list the variable names.
>Thus, I would write the step as follows:
>
>PROC MEANS DATA=RDPN.ORIGINAL_SET NWAY NOPRINT;
> CLASS DTPN VRCN;
> VAR V1S1-V1S25;
> OUTPUT OUT=NEW_SET SUM=VQ1-VQ25;
>RUN;
>
>Sheila
>mailto:sheilaandhercats@erols.com
Neither form of the step seemed likely to turn positives into negatives,
so I ran both variations of the PROC MEANS code and followed them with
PROC COMPARE. The log:
182 PROC MEANS data=rdpn.ORIGINAL_SET NOPRINT;
183 CLASS DTPN VRCN;
184 TYPES DTPN * VRCN;
185 VAR V1S1-V1S25;
186 OUTPUT OUT=NEW_SET1 SUM(V1S1-V1S25)=VQ1-VQ25 / NOINHERIT;
187 RUN;
NOTE: There were 25 observations read from the data set RDPN.ORIGINAL_SET.
NOTE: The data set WORK.NEW_SET1 has 25 observations and 29 variables.
NOTE: PROCEDURE MEANS used:
real time 0.03 seconds
cpu time 0.03 seconds
188
189 /*
190 Not that this addresses your problem directly, but I've always used
... .......snipped
197 Thus, I would write the step as follows:
198 */
199 PROC MEANS DATA=RDPN.ORIGINAL_SET NWAY NOPRINT;
200 CLASS DTPN VRCN;
201 VAR V1S1-V1S25;
202 OUTPUT OUT=NEW_SET2 SUM=VQ1-VQ25;
203 RUN;
NOTE: There were 25 observations read from the data set RDPN.ORIGINAL_SET.
NOTE: The data set WORK.NEW_SET2 has 25 observations and 29 variables.
NOTE: PROCEDURE MEANS used:
real time 0.01 seconds
cpu time 0.01 seconds
204
205 proc compare base=new_set1 compare=new_set2; run;
NOTE: There were 25 observations read from the data set WORK.NEW_SET1.
NOTE: There were 25 observations read from the data set WORK.NEW_SET2.
NOTE: PROCEDURE COMPARE used:
real time 0.01 seconds
cpu time 0.01 seconds
The PROC COMPARE output finished with:
Number of Observations with Some Compared Variables Unequal: 0.
Number of Observations with All Compared Variables Equal: 25.
NOTE: No unequal values were found. All values compared are exactly equal.
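Since the complaint was about negative tallies, the output can also be
checked for them directly. The step below is only a sketch: it assumes the
output set is WORK.NEW_SET1 with sums in VQ1-VQ25, as in the log above, and
it writes one line to the log for each class combination that has a
negative sum.

```sas
/* Sketch: report any class combination whose smallest sum is negative. */
/* Assumes WORK.NEW_SET1 with VQ1-VQ25, as created in the log above.    */
data _null_;
  set new_set1;
  array vq{*} vq1-vq25;
  /* ".z <" keeps missing values out, because missing sorts below 0 */
  if .z < min(of vq{*}) < 0 then
    put 'Negative sum found: ' DtPn= Vrcn=;
run;
```

If nothing is written to the log, none of the 25 sums is negative for any DTPN * VRCN combination.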
At this point, perhaps the input data becomes relevant. Mine looks like this:
+FSVIEW: RDPN.ORIGINAL_SET (B)-----------------------------------------+
| Obs DtPn Vrcn v1s1 v1s2 v1s3 v1s4 v1s5 v1s6 v1s7 v1s8 |
| |
| 1 US OK 0 1 0 0 1 1 0 0 |
| 2 US no 1 0 0 0 0 0 0 1 |
| 3 US NO 0 0 0 0 0 0 0 0 |
| 4 US Ja 0 0 0 0 0 0 0 0 |
| 5 US Yo 1 1 1 0 0 0 0 1 |
| 6 UK OK 0 0 0 0 0 0 0 0 |
| 7 UK no 0 0 0 0 0 0 0 0 |
| 8 UK NO 0 0 0 0 0 0 0 0 |
| 9 UK Ja 0 1 0 0 1 0 1 0 |
| 10 UK Yo 0 0 0 0 0 0 1 0 |
| 11 NL OK 1 0 1 0 0 1 0 0 |
+-----------------------------------------------------------------------+
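For anyone who wants to repeat the test, a set of this shape can be built
roughly as follows. This is a sketch and not necessarily how the set above
was made: the name WORK.TEST_SET, the seed, the 25% rate of 1's and the
class values DE and FR are all assumptions. It gives 25 observations, one
per DtPn * Vrcn combination.

```sas
/* Sketch: build a small test set of 0/1 flags.                     */
/* WORK.TEST_SET, the seed, the 0.25 rate and the values DE and FR  */
/* are assumptions, chosen only to resemble the listing above.      */
data work.test_set;
  length DtPn Vrcn $2;
  array v{25} v1s1-v1s25;
  do DtPn = 'US', 'UK', 'NL', 'DE', 'FR';
    do Vrcn = 'OK', 'no', 'NO', 'Ja', 'Yo';
      do i = 1 to dim(v);
        v{i} = (ranuni(12345) < 0.25);  /* 1 about a quarter of the time */
      end;
      output;                           /* one row per class combination */
    end;
  end;
  drop i;
run;
```

Class values are case-sensitive, so 'no' and 'NO' stay separate levels of Vrcn.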
Perhaps we might have a snip of the "real original"?
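A quick look at the range of the inputs would help too. The step below is a
sketch that assumes the data set and variable names from the original post.
A sum of values that are all 0 or 1 cannot be negative, so if any MIN comes
back below zero, the variables hold something other than 0's and 1's.

```sas
/* Sketch: confirm the inputs really are 0/1 before summing.       */
/* Data set and variable names are taken from the original post.   */
proc means data=rdpn.original_set n nmiss min max sum;
  var v1s1-v1s25;
run;
```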
Regards
Peter Crawford