LISTSERV at the University of Georgia
Date:         Tue, 24 May 2005 20:21:43 -0700
Reply-To:     shiling99@YAHOO.COM
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         shiling99@YAHOO.COM
Organization: http://groups.google.com
Subject:      Re: proc mixed model with ar(1) processes inter and intra blocks
Comments: To: sas-l@uga.edu
In-Reply-To:  <OF2618244F.73E2D889-ON8825700B.008372B1-8825700C.00014136@epamail.epa.gov>
Content-Type: text/plain; charset="iso-8859-1"

David & Dale,

Thanks for all comments and suggestions.

Here is a link for the AR(1) process:

http://www1.elsevier.com/hes/books/02/04/050/0204050.htm

and the SAS online help for the SAS code:

Example 20.16: AR(1) Process in ETS.

In terms of small samples, I said in my last post, "I agree that there is a difference when the sample size is small; the initial value becomes non-trivial."

I used proc arima only to understand what is the same and what is different compared with proc mixed; cross-checking that way helps me understand. David's comments about higher-order ARIMA and complex time series models seem tangential here, but are always welcome.

"David L. Cassell" wrote:
> Dale McLerran <stringplayer_2@yahoo.com> replied:
> > I still must disagree. If you were to allow a burn-in of, say,
> > 50 iterations before you start outputting u or e values, then
> > your approach would generate data which follow an AR(1) process.
> > Note that your code produces
> >
> >   u0 ~ N(0,1)
> >
> >   /* First iteration */
> >   u1 = rho*u0 + err
> >   err ~ N(0,1)
> >   u1 ~ N(0, 1 + rho**2)
> >   output u1
> >   assign u0 = u1
> >
> >   /* Second iteration */
> >   u1 = rho*u0 + err
> >   err ~ N(0,1)
> >   u1 ~ N(0, 1 + rho**2 + rho**4)
> >   output u1
> >   assign u0 = u1
> >
> >   /* Third iteration */
> >   u1 = rho*u0 + err
> >   err ~ N(0,1)
> >   u1 ~ N(0, 1 + rho**2 + rho**4 + rho**6)
> >   output u1
> >   assign u0 = u1
> >
> >   ...
> >
> >   /* K-th iteration */
> >   u1 = rho*u0 + err
> >   err ~ N(0,1)
> >   u1 ~ N(0, 1 + rho**2 + rho**4 + rho**6 + ... + rho**(2K))
> >   output u1
> >   assign u0 = u1
> >
> > The difference between the variance of u1 at iteration m and the
> > variance of u1 at iteration m+1 is
> >
> >   V(u1{m+1}) - V(u1{m}) = rho**(2*(m+1))
> >
> > For m sufficiently large, the difference in variances does become
> > negligible. However, for the first two iterations (m=1), the
> > difference in variance is rho**4. That is not small unless
> > rho is near zero. The following code shows the variance function
> > by iteration for rho=0.7.
> >
> >   data var;
> >     rho=0.7;
> >     V=1;
> >     do iter=1 to 50;
> >       V = 1 + (rho**2)*V;
> >       output;
> >     end;
> >   run;
> >
> >   proc print data=var;
> >     by rho;
> >     id rho;
> >     var iter V;
> >     format V 10.8;
> >   run;
> >
> > The variance approaches a stable value by about the 5th
> > iteration. However, for iterations 1 through 3, the variance
> > is quite far from the stable value.
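Dale's variance recursion can be cross-checked outside SAS as well. Here is a minimal Python transcription of his DATA step (Python rather than SAS, purely for illustration), showing the variance settling near the stationary value 1/(1-rho**2) within a handful of iterations for rho=0.7:

```python
# Dale's variance recursion for the naive generator u{t} = rho*u{t-1} + err:
# V(k) = 1 + rho**2 * V(k-1), with V(0) = 1  (a Python transcription of the
# quoted DATA step, for illustration only).

rho = 0.7
stationary = 1.0 / (1.0 - rho**2)   # limiting variance, ~1.9608 for rho = 0.7

V = 1.0
for k in range(1, 11):
    V = 1.0 + rho**2 * V            # k = 1 gives 1.49 = 1 + rho**2
    print(k, round(V, 6))
```

By about k = 5 the printed values are already close to 1.9608, matching Dale's observation that only the first few iterations are far from the stable value.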
> > Now, to demonstrate the validity of your AR(1) generation process
> > you use the code
> >
> >   data test;
> >     seed=1234579;
> >     u0 = rannor(seed);
> >     do iter=1 to 100000;
> >       /* u{t} = rho*u{t-1} + err */
> >       u1 = 0.7*u0 + rannor(seed);
> >       output;
> >       u0=u1;
> >     end;
> >   run;
> >
> >   title "Covariance matrix: errors constructed as u{t}=rho*u{t-1} + err";
> >   proc corr data=test cov;
> >     var u:;
> >   run;
> >
> > This is a very long process, and certainly the variance is stable
> > for most of the 100000 iterations. If you were using a long
> > sequence like this, then I would have little concern with assuming
> > that ALL the data follow the same AR(1) process. However, you have
> > relatively short processes (20 or 40 iterations). The first few
> > terms of a short process could have a strong effect on your parameter
> > estimates.
> >
> > Note that the simulation which I previously posted shows the
> > variance structure generated by your data generation process
> > through five iterations. That is, the generation model
> >
> >   u0 = rannor(seed);
> >   u1 = 0.7*u0 + rannor(seed);
> >   u2 = 0.7*u1 + rannor(seed);
> >   u3 = 0.7*u2 + rannor(seed);
> >   u4 = 0.7*u3 + rannor(seed);
> >   u5 = 0.7*u4 + rannor(seed);
> >
> > is exactly the same as
> >
> >   u0 ~ N(0,1)
> >
> >   /* First iteration */
> >   u1 = rho*u0 + err
> >   err ~ N(0,1)
> >   u1 ~ N(0, 1 + rho**2)
> >   output u1
> >   assign u0 = u1
> >
> >   /* Second iteration */
> >   u1 = rho*u0 + err
> >   err ~ N(0,1)
> >   u1 ~ N(0, 1 + rho**2 + rho**4)
> >   output u1
> >   assign u0 = u1
> >
> >   ...
> >
> >   /* Fifth iteration */
> >   u1 = rho*u0 + err
> >   err ~ N(0,1)
> >   u1 ~ N(0, 1 + rho**2 + rho**4 + rho**6 + rho**8 + rho**10)
> >   output u1
> >   assign u0 = u1
> >
> > It most certainly does demonstrate that for the first few
> > iterations, your process is NOT AR(1). But the alternative
> > process in which we have u{i} = rho*u{i-1} + sqrt(1-rho*rho)*err
> > DOES FOLLOW THE AR(1) PROCESS FROM ITERATION 1.
> > IF YOU WANT YOUR DATA GENERATION MODEL TO BE AR(1) FOR ALL
> > OBSERVATIONS, THEN YOU NEED TO MODIFY YOUR CODE. As I stated
> > previously, one way to modify your code so that all the data
> > follow an AR(1) process is to allow a burn-in period of some M
> > iterations before you start accepting data as following the
> > AR(1) process. Better yet is to use
> >
> >   u0 = err{0};
> >   do iter=1 to n;
> >     u1 = rho*u0 + sqrt(1-rho*rho)*err{iter};
> >     output;
> >     u0=u1;
> >   end;
> >
> > where err{j} ~ N(0,V). This follows the AR(1) process from the
> > very first iteration.
>
> I'm going to side with Dale on this one. Mostly. :-)
>
> I agree with Dale that if you want the generated stream to be correct
> from the first iteration, you need to do this:
>
>   u0 = rannor(seed);
>   do i=1 to k;
>     u1 = rho*u0 + sqrt(1-rho*rho)*rannor(seed);
>     output;
>     u0 = u1;
>   end;
>
> This is the formulation that I learned in intro time series courses,
> back when dinosaurs still ruled the earth. Shiling is writing the
> time series in a traditional manner, using the duality of AR processes
> as infinite-order MA processes. But that's not particularly efficient
> for process generation. It *is* really useful once you start using
> the backshift operator and you want to do a little theory. I think
> that Dale's formulation will make the generation process a lot easier
> once Shiling moves on up to more complex time series models, with
> higher-order ARIMA(p,d,q) models.
>
> But the idea of 'burn-in' is not new. It's a fundamental part of
> some statistical tools. The Gibbs sampler comes to mind, for one.
> So Shiling could make this work. I would recommend a burn-in period
> of at least M=10 for this case, although the computer cost is so
> small that there's really no point in not using a burn-in period
> of, say, M=1000. If Shiling builds a much more complex time series
> model, he may want to use an M of 100 or so, just to address any
> long-term hysteresis of his models.
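The effect of the sqrt(1-rho*rho) scaling can be checked by simulation. A Monte Carlo sketch in Python (not SAS; `first_value` is an illustrative helper, not code from the thread): with the scaling, the very first generated value already has unit variance; without it, the first value has variance 1 + rho**2.

```python
import random

# Monte Carlo check of the first-iteration variance under the two generators.
# scaled=True  : u1 = rho*u0 + sqrt(1 - rho**2)*err  -> Var(u1) = 1
# scaled=False : u1 = rho*u0 + err                    -> Var(u1) = 1 + rho**2

rho = 0.7
rng = random.Random(1234579)

def first_value(scaled):
    u0 = rng.gauss(0.0, 1.0)        # u0 ~ N(0,1), as in the quoted code
    innov = rng.gauss(0.0, 1.0)
    if scaled:
        return rho * u0 + (1.0 - rho * rho) ** 0.5 * innov
    return rho * u0 + innov

n = 100_000
var_scaled = sum(first_value(True) ** 2 for _ in range(n)) / n
var_naive = sum(first_value(False) ** 2 for _ in range(n)) / n
print(var_scaled)   # close to 1.0
print(var_naive)    # close to 1 + rho**2 = 1.49
```

This matches Dale's algebra: the naive generator only reaches the stationary variance after several iterations, while the scaled generator is stationary from iteration 1.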
> So here's my version of Shiling's code:
>
>   u0 = rannor(seed);
>   M = 1000; /* just pick something big enough */
>   do i=1 to k+M;
>     u1 = rho*u0 + rannor(seed);
>     if i > M then output;
>     u0 = u1;
>   end;
>
> With this sort of minor mod, Shiling's code will do just what he
> wants.
>
> Just sticking my nose in,
> David
> --
> David Cassell, CSC
> Cassell.David@epa.gov
> Senior computing specialist
> mathematical statistician
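For completeness, here is a Python transcription of David's burn-in version (illustrative only; `ar1_with_burnin` is my naming, not from the thread). After discarding the first M warm-up draws, the kept series behaves like AR(1) throughout, with lag-1 autocorrelation near rho and variance near 1/(1-rho**2):

```python
import random

# Burn-in version: run the naive recursion u{t} = rho*u{t-1} + err for M
# warm-up iterations, then keep the next k values (David's "if i > M" trick).

def ar1_with_burnin(rho, k, M=1000, seed=1234579):
    rng = random.Random(seed)
    u0 = rng.gauss(0.0, 1.0)
    kept = []
    for i in range(1, k + M + 1):
        u1 = rho * u0 + rng.gauss(0.0, 1.0)
        if i > M:                     # discard the first M draws (burn-in)
            kept.append(u1)
        u0 = u1
    return kept

series = ar1_with_burnin(0.7, 100_000)
# lag-1 autocorrelation should be near 0.7; variance near 1/(1 - 0.49) = 1.9608
acf1 = sum(a * b for a, b in zip(series, series[1:])) / sum(x * x for x in series)
var = sum(x * x for x in series) / len(series)
print(acf1, var)
```

The computational cost of M=1000 warm-up draws is negligible next to the 100000 kept values, which is David's point about there being no reason to skimp on burn-in.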

