Date: Tue, 11 Mar 2003 16:28:21 -0800
Reply-To: "Huang, Ya" <yhuang@AMYLIN.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Huang, Ya" <yhuang@AMYLIN.COM>
Subject: Re: grouping and creating datasets
Content-Type: text/plain; charset="iso-8859-1"
Here is my sql based solution. SQL step flags the
student id as 1-4, then the data step split the raw
data into four data sets. I added student E as a
decreasing grade for testing, you can see the result
data set x1 has two students, A & E:
data xx;
input St $ Grd @@;
cards;
A 5 A 3 A 3
B 4 B 4 B 4
B 4 C 3 C 5
C 5 D 5 D 4
D 5 D 4 D 5
E 5 E 4 E 3
;
data xx;
set xx;
ord=_n_;
run;
proc sql;
create table flag as
select distinct a.st,
case when sum(a.grd > b.grd) + sum(a.grd = b.grd) = count(*)
and sum(a.grd > b.grd) > 0 then 1
when sum(a.grd < b.grd) + sum(a.grd = b.grd) = count(*)
and sum(a.grd < b.grd) > 0 then 2
when sum(a.grd = b.grd)=count(*) then 3
when sum(a.grd > b.grd) > 0 and sum(a.grd < b.grd) > 0 then 4
else . end as flag
from xx a, xx b
where a.st=b.st and a.ord = b.ord -1
group by a.st
order by a.st
;
data x1 x2 x3 x4;
merge xx flag;
by st;
if flag=1 then output x1;
if flag=2 then output x2;
if flag=3 then output x3;
if flag=4 then output x4;
run;
HTH
Ya Huang
-----Original Message-----
From: leyla yerlikaya [mailto:leylay@HOTMAIL.COM]
Sent: Tuesday, March 11, 2003 3:23 PM
To: SAS-L@LISTSERV.UGA.EDU
Subject: grouping and creating datasets
Hello all,
I would like to create four different datasets. One of the datasets includes all of the students whose grades only improved over the time (dataset 1). The other one will include all of the students whose grades decreased (dataset 2), another one will include all the students whose records stayed the same (dataset 3), and the last one will include the students whose grades both increased and decreased.
For instance, if I use the following sample dataset student A should be in the dataset1, student B should be in the dataset 3, student C should be in the dataset 2, and the student D should be in the dataset 4.
Student Grade
A 5
A 3
A 3
B 4
B 4
B 4
B 4
C 3
C 5
C 5
D 5
D 4
D 5
D 4
D 5
I wrote a code that calculates the 'difference' of the first and second, second and the third, ... grades for each student. I also generated dummy variables using this 'difference' variable which help to understand if that grade increased or decreased.
Now my dataset looks like this:
Student Grade difference increase decrease nochange
A 5 -2 0 1 0
A 3 0 0 0 1
A 3 .
B 4 0 0 0 1
B 4 0 0 0 1
B 4 0 0 0 1
B 4 .
C 3 +2 1 0 0
C 5 0 0 0 0
C 5 .
D 5 -1 0 1 0
D 4 +1 1 0 0
D 5 -1 0 1 0
D 4 +1 1 0 0
D 5 .
I will appreciate if you could give me a couple of hints on creating my datasets after this point.
Thanks,
leyla
Add photos to your messages with MSN 8. Get 2 months FREE*.