Date: Thu, 11 Sep 2008 11:49:15 -0500
Reply-To: Mary <mlhoward@avalon.net>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Mary <mlhoward@AVALON.NET>
Subject: Re: Question on Transpose with two columns needed
I just tried it; the explicitly subscripted array approach actually beat the
do over in this attempt!
data affymetrix_long;
   set affymetrix_set4;
   informat snp_name $30. numeric_coding $30. allele_coding $30.;
   /* explicitly subscripted arrays built from macro variable lists */
   array snp_names_array &snp_names;
   array allele_names_array &allele_names;
   do i=1 to dim(snp_names_array);
      snp_name = vname(snp_names_array[i]);
      numeric_coding = snp_names_array[i];
      allele_coding = allele_names_array[i];
      output;   /* one output row per sample per SNP */
   end;
   keep sample_name &non_snp_names snp_name numeric_coding allele_coding;
run;
NOTE: There were 1086 observations read from the data set
WORK.AFFYMETRIX_SET4.
NOTE: The data set WORK.AFFYMETRIX_LONG has 3547962 observations and 19
variables.
NOTE: DATA statement used (Total process time):
real time 51.73 seconds
cpu time 9.49 seconds
data affymetrix_long_test;
   set affymetrix_set4;
   informat snp_name $30. numeric_coding $30. allele_coding $30.;
   /* same arrays, referenced implicitly (no subscript) inside do over */
   array snp_names_array &snp_names;
   array allele_names_array &allele_names;
   do over snp_names_array;
      snp_name = vname(snp_names_array);
      numeric_coding = snp_names_array;
      allele_coding = allele_names_array;
      output;   /* one output row per sample per SNP */
   end;
   keep sample_name &non_snp_names snp_name numeric_coding allele_coding;
run;
NOTE: There were 1086 observations read from the data set
WORK.AFFYMETRIX_SET4.
NOTE: The data set WORK.AFFYMETRIX_LONG_TEST has 3547962 observations and 19
variables.
NOTE: DATA statement used (Total process time):
real time 1:48.18
cpu time 10.45 seconds
----- Original Message -----
From: Mary
To: SAS-L@LISTSERV.UGA.EDU
Sent: Thursday, September 11, 2008 11:18 AM
Subject: Re: Re: Question on Transpose with two columns needed
I didn't try the do over, but the subscripted array approach was certainly
very fast; I can't see that the do over would have saved substantial time.
One principle I would fall back on is the understandability of the code;
most programmers would understand a subscripted array, as it occurs in most
programming languages, including PL/1, which is the language upon which SAS
was based, whereas "do over" does not occur in PL/1 or, to my knowledge, any
other major language, so that would be a reason to prefer explicitly
subscripted arrays. Declaring the array as "array arrayname {*}" and using
"dim(arrayname)" as the upper limit of the loop solves the problem of not
knowing how many variables are in the array, which I believe was the original
advantage of "do over" in early versions of SAS; equivalent constructs also
exist in other languages, which again helps understandability.
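For anyone following along, here is a minimal sketch of the {*} and dim()
idiom on toy data. The data set and variable names (wide, long, snp1-snp3)
are made up for illustration and are not from the Affymetrix job above.

```sas
/* Hypothetical toy data: three SNP columns per sample */
data wide;
   input sample_name $ snp1 snp2 snp3;
   datalines;
A 0 1 2
B 2 1 0
;
run;

data long;
   set wide;
   length snp_name $32;
   /* {*} tells SAS to count the elements from the variable list */
   array snps {*} snp1-snp3;
   /* dim() returns that count, so the loop bound is never hard-coded */
   do i = 1 to dim(snps);
      snp_name = vname(snps[i]);   /* name of the current variable */
      numeric_coding = snps[i];    /* value of the current variable */
      output;                      /* one row per sample per SNP */
   end;
   keep sample_name snp_name numeric_coding;
run;
```

Adding or removing SNP columns in the variable list requires no change to
the loop, which is the same convenience "do over" was offering.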
-Mary
----- Original Message -----
From: ./ ADD NAME=Data _null_,
To: SAS-L@LISTSERV.UGA.EDU
Sent: Thursday, September 11, 2008 11:00 AM
Subject: Re: Question on Transpose with two columns needed
On 9/10/08, Muthia Kachirayan <muthia.kachirayan@gmail.com> wrote:
> The use of name-range is a good choice which I forgot. Your preference for
> explicitly subscripted arrays is fine, but 'DO OVER' is handy to me. I
> would like to know your reasons against its use (other than that it is
> undocumented but widely used).
I like an undocumented feature as well as the next guy. This is just
not one of them. There may be a performance difference, but I would
not expect one; we should test.
> Muthia Kachirayan