LISTSERV at the University of Georgia
Date:         Sat, 21 Jun 2008 13:05:17 -0700
Reply-To:     Ryan <Ryan.Andrew.Black@GMAIL.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Ryan <Ryan.Andrew.Black@GMAIL.COM>
Organization: http://groups.google.com
Subject:      Re: GLIMMIX Question - Dependent Observations
Comments: To: sas-l@uga.edu
Content-Type: text/plain; charset=ISO-8859-1

On Jun 20, 2:54 pm, Ryan <Ryan.Andrew.Bl...@gmail.com> wrote:
> As usual, thank you.
>
> On Jun 20, 1:27 pm, stringplaye...@yahoo.com (Dale McLerran) wrote:
> > --- On Thu, 6/19/08, Ryan <Ryan.Andrew.Bl...@GMAIL.COM> wrote:
> >
> > > From: Ryan <Ryan.Andrew.Bl...@GMAIL.COM>
> > > Subject: Re: GLIMMIX Question - Dependent Observations
> > > To: SA...@LISTSERV.UGA.EDU
> > > Date: Thursday, June 19, 2008, 7:29 PM
> > >
> > > Thank you, Dale! You've helped me with so many questions already. I hope it's okay if I ask you two more...
> > >
> > > 1. The dichotomous variable in my model was collected at the subjects level (not city level), and the categories are not mutually exclusive--there were people who fit into both categories. I'm not sure how to handle this issue--one option I thought was to raise it to the city level, and code the city as a particular category based on the higher rate (by the way, DV (rate) and the continuous IV are functions of data at the city level). So if the rate is higher in category one, then that city is assigned category one. Would that work? Would you recommend an alternative approach that can maintain the variable at the city level?
> > >
> > > 2. As mentioned above, the DV (rate) and the continuous IV in my model are functions of aggregated data. After you mentioned that a city with less observations would be weighted less, I realized that all cases would actually have equal weights at the city level. Is there a way to deal with unequal Ns per case while maintaining city as the unit of analysis for all variables?
> > >
> > > Anyway, I realize I've asked much of you. I completely understand if you're too busy to respond. I appreciate your help. It's been a true learning experience!
> > >
> > > Ryan
> >
> > Ryan,
> >
> > I'm confused now. I don't know how your dependent variable (collected at the individual level) can take on two values and those two values are not mutually exclusive. It sounds to me as though there are two boxes that the respondent can check off, and that there are no constraints that if they check box 1 then they cannot check box 2 (and vice versa).
>
> Yes! The DV is a rate of obtaining a category in the dichotomous IV (when the dichotomous IV is raised to the city level).
>
> Concretely, Rate = # of people who contracted disease A or B / total number of people at risk of the respective disease within a city.
>
> The dichotomous IV, which was collected at the subjects level, reflects two diseases, and people can have one or both--most only have one. I want to compare the relative risk of contracting disease A to contracting disease B (Poisson type regression). As a result, I thought of raising the dichotomous variable, disease type, to the city level, and if more people have disease A than disease B that city would be categorized as disease A--bad idea, I know.
>
> If the dichotomous variable were in fact mutually exclusive, this analysis would be fairly straightforward (after your help with spatial analysis!). The primary goal is to run a statistical test comparing the risk of contracting disease A to the risk of contracting disease B, after controlling for a continuous variable. The challenge is that participant A could have contracted both diseases, and when you raise it to the city level (which you have to do to obtain the rate), certainly no city has a diagnosis of only one disease.
>
> I know I keep saying this, but just the fact that you've talked through some of this stuff with me has been invaluable.
>
> > To me, that would represent two (almost certainly correlated) binary responses. I would be looking at modeling the binary responses at the individual level with the person-specific IV as a predictor. At the same time, you can allow for variation across cities in the proportion who respond positively. In addition to allowing for the person-specific IV to relate directly to the person-specific response, this analysis preserves information about differences in number of subjects in the different cities. A city with only 10 respondents will have a city random effect estimate which has a much larger standard error than a city with 1000 respondents.
> >
> > If I am correct that there are two check boxes and hence two binary responses, then an appropriate model for check box 1 would be something like:
> >
> > proc glimmix data=muydata;
> >   model box1 = x / s dist=binary;
> >   random intercept / subject=city
> >                      type=sp(pow)(lat long)
> >                      group=region;
> > run;
>
> I'm not sure if this would answer my question regarding relative risk of contracting one disease versus another.
>
> > A similar model could be fit for check box 2 as a response. One could model check box 1 and check box 2 responses together as correlated within individuals. There may be quite a few ways that such an analysis could be constructed. It is not clear given the spatial covariance structure assumed for the city random effects along with correlated responses within individuals just what the appropriate code would be for such a model.
>
> Yes. I think this is where I need to be headed.
>
> > Statisticians have the habit of adding confusion to seemingly simple problems, don't we? Are you more or less confused than at the start of this dialogue?
>
> This model is particularly confusing. Although I haven't finalized the model, you have certainly moved me along tremendously!
>
> > Dale
> >
> > ---------------------------------------
> > Dale McLerran
> > Fred Hutchinson Cancer Research Center
> > mailto: dmclerra@NO_SPAMfhcrc.org
> > Ph: (20...
> > Fax: (206) 667-5977
> > ----------------------------------------

Sorry for the double post, but I think I've solved the problem (at least one way), and wanted to share it.

I will run GLIMMIX with a repeated measures factor (disease) and a covariate, and every variable in the model will be at the city level. Each city will have the rate for disease A and the rate for disease B.

Dataset

City  Disease  Rate  Covariate  Lat  Long
1     1        .04   23
1     2        .07   34
2     1        .45   23
2     2        .01   22
3     1        .02   45
3     2        .36   11
. . .

where,

-->"1" reflects disease A and "2" reflects disease B under "Disease"
-->values under "Rate" reflect the rate for that disease for that particular city, which is the DV
-->values under "Covariate" are continuous and will adjust for disease rate per city
-->values under lat and long will be in degrees and will be based on the centroid of each city
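
For concreteness, the layout above could be read into SAS with a DATA step along these lines. This is only a sketch: the dataset name cityrates and the lowercase variable names are placeholders, and Lat and Long are left out because no coordinate values are listed above.

```sas
/* Sketch only: dataset and variable names are placeholders. */
/* One row per city-by-disease combination (long format).    */
data cityrates;
  input city disease rate covariate;
  datalines;
1 1 .04 23
1 2 .07 34
2 1 .45 23
2 2 .01 22
3 1 .02 45
3 2 .36 11
;
run;
```

Once the centroids are known, lat and long would be added as two further columns on the INPUT statement and in each data row.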

****I'll also include a covariance structure that can account for correlations among cities...

type=sp(pow)(lat long)

If the disease effect is significant, this will answer my research question of whether or not there is a significant difference in rates between disease A and B, after controlling for the covariate.

I realize cities are being weighted equally in this model. At some point, I may consider an adjustment based on the number of observations per city.
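
A rough, untested sketch of how the repeated-measures part of this plan might look in PROC GLIMMIX, assuming a long-format dataset (called cityrates here, a placeholder name) with one row per city and disease. It treats the rate as Gaussian, which is the procedure's default and only a simplification for a rate. It also leaves out the type=sp(pow)(lat long) piece, since how to combine spatial correlation among cities with correlation between the two rates within a city is exactly what remains unclear in this thread.

```sas
/* Untested sketch; cityrates is a hypothetical dataset name. */
proc glimmix data=cityrates;
  class city disease;
  /* Rate modeled with the default normal distribution for simplicity */
  model rate = disease covariate / solution;
  /* R-side structure: the two disease rates from one city are correlated */
  random disease / subject=city type=un residual;
  /* Covariate-adjusted comparison of disease A (1) with disease B (2) */
  lsmeans disease / diff;
run;
```

The test of the disease effect in the output is what would address the research question. If a per-city count of people at risk were available, a WEIGHT statement might be one way to keep cities from being weighted equally, though that variable does not exist in the dataset as described.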

I'm not sure how the syntax will look, but I'll get to research my books/online guides on Monday.

Thanks again to everyone, and particularly Dale, for guidance.

Ryan

