Date: Thu, 3 Jul 2003 16:29:29 -0700
Reply-To: "Huang, Ya" <yhuang@AMYLIN.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Huang, Ya" <yhuang@AMYLIN.COM>
Subject: Re: A computation question
Content-Type: text/plain; charset="iso-8859-1"
Ken,
You want to try format:
data xx;
input OBS C1 $ C2 $ C3 $ HOW_MANY TEMP;
cards;
1 1 0 0 1 4
2 0 1 0 1 1
3 0 0 1 1 3
4 1 1 0 2 16
5 1 0 1 2 13
6 0 1 1 2 9
7 1 1 1 3 22
;
proc sql noprint;
select "'"||compress(c1||c2||c3)||"'="||put(temp,best.)
into :tmlst separated by ' '
from xx
;
proc format;
invalue tmp &tmlst;
run;
data xx (keep=obs nc: temp);
set xx;
allc=compress(c1||c2||c3);
temp=input(allc,tmp.);
array sub(3) $sub1-sub3;
array nc(3) nc1-nc3;
do i=1 to 3;
sub(i)=allc;
end;
do i=1 to 3;
substr(sub(i),i,1)='0';
end;
do i=1 to 3;
nc(i)=temp-input(sub(i),tmp.);
end;
run;
proc print;
run;
--------------
OBS TEMP nc1 nc2 nc3
1 4 4 0 0
2 1 0 1 0
3 3 0 0 3
4 16 15 12 0
5 13 10 0 9
6 9 0 6 8
7 22 13 9 6
The proc sql step is to generate a list for proc format,
which is like:
proc format;
invalue tmp
'100'=4
'010'=1
'001'=3
'110'=16
'101'=13
'011'=9
'111'=22
;
I used array, so it should be very easy to expand to
20 components. Also, since it is a one pass solution,
even for a one million obs data, it should not take
too long to process, in a few minutes at most.
Kind regards,
Ya Huang
-----Original Message-----
From: Ken Keung [mailto:1800okla@HANMAIL.NET]
Sent: Thursday, July 03, 2003 2:23 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: A computation question
Hello,
I have a question.
Here is a dataset with 6 variables.
Let me explain the variables first.
OBS : Observation number
C1: Component 1 (dummy variable: 1="In the pot", and 0="NOT in the pot")
C2: Component 2 (dummy variable: 1="In the pot", and 0="NOT in the pot")
C3: Component 3 (dummy variable: 1="In the pot", and 0="NOT in the pot")
HOW_MANY: How many components in the pot.
TEMP : Temperature
OBS C1 C2 C3 HOW_MANY TEMP
1 1 0 0 1 4
2 0 1 0 1 1
3 0 0 1 1 3
4 1 1 0 2 16
5 1 0 1 2 13
6 0 1 1 2 9
7 1 1 1 3 22
As you can see, all the combinations of 3 components are listed.
If we put the Component 1 into a empty pot, then the temperature goes to 4
from 0.
Now if I add the Component 2 to the pot (now Two components in the pot:
Component 1 and 2),
the temperature increases to 16.
The increment of temperature is 12 (=16 - 4), which is due to the addition
of Component 2 to the existing pot.
One thing to note is that the temperature always goes up when a component
is added.
Now, I want to produce the sas output like this.
OBS C1 C2 C3 HOW_MANY
1 4 . . 1
2 . 1 . 1
3 . . 3 1
4 15 12 . 2
5 10 . 9 2
6 . 6 8 2
7 13 9 6 3
For clarification, C1 in OBS 7 above output table shows 13.
That is, the temperature increases by 13 (from 9 to 22),
if we add Component 1 to the pot which already contained the Component 2
and 3.
Now the challenge is that I have 20 components, NOT 3 as shown above.
That means there are 1,048,575 observations
(2 to the power of 20 minus 1) in the dataset.
After approximately 20 hours (YES! hours) waiting, my computer (1.7GHZ and
512MB memory) couldn't produce the output.
I know that I should not blame the computer.
Anyway, I was able to obtain the sas output with 15 Components case in an
hour.
Would you kindly help me to code it more efficient way ?
Thank you very much for your help in advance.