Date: Tue, 11 Sep 2007 10:23:32 -0700
Reply-To: "ajs2004@bigfoot.com" <ajs2004@BIGFOOT.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "ajs2004@bigfoot.com" <ajs2004@BIGFOOT.COM>
Organization: http://groups.google.com
Subject: Re: Constructing Variables
In-Reply-To: <200709111110.l8BAlc1l009529@malibu.cc.uga.edu>
Content-Type: text/plain; charset="us-ascii"
Result from program below:
Obs VarA Time VarB abgroup varc
1 A 11:55:04 . 0 11
2 A 11:55:04 . 0 11
3 A 11:55:05 . 0 11
4 A 11:55:06 . 0 11
5 A 11:55:06 . 0 11
6 B 11:55:06 10 0 .
7 B 11:55:06 12 0 .
8 A 11:55:07 . 1 13
9 A 11:55:08 . 1 13
10 B 11:55:09 13 1 .
________________________________________
data data1;
infile datalines missover;
input VarA $ Time :time. VarB;
format time time.;
datalines;
A 11:55:04
A 11:55:04
A 11:55:05
A 11:55:06
A 11:55:06
B 11:55:06 10
B 11:55:06 12
A 11:55:07
A 11:55:08
B 11:55:09 13
run;
/*
We assume that the data are already correctly sorted.
(We have to assume that, because they can't be uniquely
ordered based on the values shown.)
When VarA changes from 'B' to 'A', that marks the start
of a new group of observations.
Define those groups, calculate the mean value of VarB
in each group, then merge them back with the original data.
*/
data data1;
set data1;
abgroup + (vara='A' and lag(vara)='B');
run;
proc summary data=data1 nway;
class abgroup;
var varb;
output out=varc (keep=abgroup varc) mean=varc;
run;
data data1;
merge data1 varc;
by abgroup;
if (vara='B') then varc = .; /* Based on your illustration */
run;
On Sep 11, 12:10 pm, randista...@HOTMAIL.COM (Randy) wrote:
> Jim:
> I apologize for not being clearer. Your guess was correct. For every B
> whose time is le the time of the preceeding VarA's (Var A only consists of
> A's and B's); VarC takes the value of that VarB. But there is a slight
> variation that I also noticed in the dataset. Let me explain using the
> original example
>
> VarA Time VarB
> A 11:55:04
> A 11:55:04
> A 11:55:05
> A 11:55:06
> A 11:55:06
> B 11:55:06 10
> B 11:55:06 12
> A 11:55:07
> A 11:55:08
> B 11:55:09 13
> and so on. VarC now takes on the value of the mean of VarB.
> So the data set I want is to construct is
>
> VarA Time VarB VarC
> A 11:55:04 11
> A 11:55:04 11
> A 11:55:05 11
> A 11:55:06 11
> A 11:55:06 11
> B 11:55:06 10
> B 11:55:06 12
> A 11:55:07 13
> A 11:55:08 13
> B 11:55:09 13
>
> Can you help? Does the code change?
>
> On Tue, 11 Sep 2007 02:45:00 -0400, Jim Groeneveld <jim4s...@YAHOO.CO.UK>
> wrote:
>
>
>
> >Hi Randy,
>
> >WHAT SHOULD THE CRITERION BE TO ADD VARC?
> >You omitted to indicate the criterion.
> >Now we just have to guess.
>
> >Furthermore, what is your overall problem?
> >Maybe you don't need such a (partial) solution at all,
> >but quite another overall solution.
> >Please tell us the purpose of this problem.
>
> >Finally, I guessed about the criterion
> >and developed a solution that fits your test data.
> >But it may not fit your real data at all!
>
> >DATA OldData;
> > INPUT VarA $CHAR1. Time : TIME8. @25 VarB $CHAR1.;
> > FORMAT Time TIME8.;
> > CARDS;
> >A 11:55:04
> >A 11:55:04
> >A 11:55:05
> >A 11:55:06
> >A 11:55:06
> >B 11:55:06 X
> >A 11:55:07
> >A 11:55:08
> >B 11:55:09 Y
> >;
> >RUN;
>
> >PROC PRINT DATA=OldData; RUN;
>
> >* Create Group variable;
> >DATA GrpData;
> > SET OldData;
> > BY VarA NOTSORTED;
> >* RETAIN Group 1; * not necessary because of sum statement below;
> > IF (VarA EQ 'A' AND FIRST.VarA) THEN Group + 1;
> >RUN;
>
> >PROC PRINT DATA=GrpData; RUN;
>
> >DATA NewData (DROP=Group);
> > MERGE GrpData /* with itself */
> > GrpData (RENAME=(VarB=VarC) WHERE=(VarC NE '') DROP=VarA Time);
> > BY Group;
> > IF (VarA EQ 'B') THEN VarC = '';
> >RUN;
>
> >PROC PRINT DATA=NewData; RUN;
>
> >Of course there are other coding solutions possible.
> >If you want more explanation on this code then please ask us.
>
> >Regards - Jim.
> >--
> >Jim Groeneveld, Netherlands
> >Statistician, SAS consultant
> >home.hccnet.nl/jim.groeneveld
>
> >On Mon, 10 Sep 2007 23:24:39 -0400, Randy <randista...@HOTMAIL.COM> wrote:
>
> >>Dear Sas Users Group:
>
> >>My data set is as follows:
>
> >>VarA Time VarB
> >>A 11:55:04
> >>A 11:55:04
> >>A 11:55:05
> >>A 11:55:06
> >>A 11:55:06
> >>B 11:55:06 X
> >>A 11:55:07
> >>A 11:55:08
> >>B 11:55:09 Y
> >>and so on.
> >>The data set I want is to construct another VarC and
>
> >>VarA Time VarB VarC
> >>A 11:55:04 X
> >>A 11:55:04 X
> >>A 11:55:05 X
> >>A 11:55:06 X
> >>A 11:55:06 X
> >>B 11:55:06 X
> >>A 11:55:07 Y
> >>A 11:55:08 Y
> >>B 11:55:09 Y
>
> >>any suggestions?
>
> >> Randy- Hide quoted text -
>
> - Show quoted text -