Date: Fri, 12 Mar 2004 16:33:22 -0500
Reply-To: Howard Schreier <Howard_Schreier@ITA.DOC.GOV>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Howard Schreier <Howard_Schreier@ITA.DOC.GOV>
Subject: Re: lag Return upto 12 time for unbalanced panel
Content-Type: text/plain; charset=ISO-8859-1
I'm going to assume that Dale's second option is appropriate.
The LAG function is messy, especially when one gets into deep lags and BY
group boundaries.
Here's a way which avoids the LAG function. I'll illustrate with a length
of 3 rather than 12.
Test data:
data have;
input FirmID $ Time Return;
cards;
A 1 1.2
A 2 1.3
A 3 2.4
A 4 1.7
B 1 4.3
B 2 4.9
B 3 4.5
B 4 4.0
C 1 3.1
C 2 3.2
C 3 3.3
C 5 3.5
;
Notice the hole (Dale's issue) in the last series.
Generate series. These are not the lagged values for each observation, but
rather the places where each RETURN value may appear as a lagged value.
data fortranspose;
set have;
do lagnumber = 1 to 3;
time + 1;
_name_ = 'ReturnLag' || put(lagnumber,z2.);
output;
end;
run;
Then sort and transpose:
proc sort data=fortranspose;
by FirmID Time _name_;
run;
proc transpose data=fortranspose out=transposed(drop=_name_);
by FirmID Time;
var Return;
run;
Fianlly merge with the original and discard mismatches which arise at the
end of each series or where there are holes:
data want;
merge have(in=original_obs) transposed;
by FirmID Time;
if original_obs;
run;
Result:
Firm Return Return Return
Obs ID Time Return Lag01 Lag02 Lag03
1 A 1 1.2 . . .
2 A 2 1.3 1.2 . .
3 A 3 2.4 1.3 1.2 .
4 A 4 1.7 2.4 1.3 1.2
5 B 1 4.3 . . .
6 B 2 4.9 4.3 . .
7 B 3 4.5 4.9 4.3 .
8 B 4 4.0 4.5 4.9 4.3
9 C 1 3.1 . . .
10 C 2 3.2 3.1 . .
11 C 3 3.3 3.2 3.1 .
12 C 5 3.5 . 3.3 3.2
On Thu, 11 Mar 2004 14:11:26 -0800, Dale McLerran
<stringplayer_2@YAHOO.COM> wrote:
>Sonia,
>
>Do you have missing times in the middle of a sequence? That is,
>do you have data as follows:
>
> FirmID time
> C 1
> C 2
> C 3
> C 4
> C 5
> C 6
> C 7
> C 8
> C 9
> C 10
> C 11
> C 12
> C 13
> C 15 /* note missing time=14 */
>
>In this case, what lags should be formed for time 15? Should lag1
>be missing and lag2 be the value of return at time=13? Or should
>lag1 be the value of return at time 13?
>
>Dale
>
>
>--- sonia afroz <afroz_sonia@HOTMAIL.COM> wrote:
>> Hi,
>>
>> I have unbalanced panel data. I have observations for several
>> thousand firms
>> for 229 time periods, but not all the firms have data for all the
>> time
>> periods.
>> Say my dataset looks like:
>>
>> FirmID Time Return
>> A 1 1.2
>> A 2 1.3
>> A 3 2.4
>> A 4 1.7
>> B 1 4.3
>> B 2 4.9
>> B 3 4.5
>> B 4 4.0
>> I want to create new columns in my dataset where is Return(t-1) will
>> have
>> one preiod lagged return and return (t-1) will have two period lagged
>> return
>> and so on and so forth for 12 lags for each firm. I tthink we need to
>> use a
>> loop for each firm, so that the lagged returns are for each firm.
>>
>> The dataset I want will look like the following:
>>
>> FirmID Time Return return(t-1) return(t-2)
>> A 1 1.2 . .
>> A 2 1.3 1.2 .
>> A 3 2.4 1.3 1.2
>> A 4 1.7 2.4 1.3
>> B 1 4.3 . .
>> B 2 4.9 4.3 .
>> B 3 4.5 4.9 4.3
>> B 4 4.0 4.5 4.9
>>
>> Please let me know what sas code I should use for this
>> purpose.
>> Thanks,
>> Sonia Afroz
>>
>> _________________________________________________________________
>> FREE pop-up blocking with the new MSN Toolbar – get it now!
>> http://clk.atdmt.com/AVE/go/onm00200415ave/direct/01/
>
>
>=====
>---------------------------------------
>Dale McLerran
>Fred Hutchinson Cancer Research Center
>mailto: dmclerra@fhcrc.org
>Ph: (206) 667-2926
>Fax: (206) 667-5977
>---------------------------------------
>
>__________________________________
>Do you Yahoo!?
>Get better spam protection with Yahoo! Mail.
>http://antispam.yahoo.com/tools