```
Date: Fri, 19 Dec 2008 14:41:58 -0600
Reply-To: Mary
Sender: "SAS(r) Discussion"
From: Mary
Subject: Re: Aggregate and individual-level data analysis
Comments: To: Rieza Soelaeman
Content-Type: text/plain; charset="iso-8859-1"

I'm not an expert, but I'll give this a college try...

I don't think you can say that people who buy food are obese, because you
don't know the actual weights of those who went to the football game. I
suppose you might hypothesize that the proportion of obesity in someone's
zip code affects whether people from that zip code buy food at the football
game.

Do all zip codes have the same populations? I would guess that they do not,
so if they don't, convert your aggregate files into rates (such as 0.10,
i.e. 10 percent of the total population in the zip code being obese).

Then you could join dataset 1 and 2:

proc sql;
   /* zipcode is in both tables; SAS keeps the first copy and logs a warning */
   create table newtable as
   select table1.*, table2.*
   from table1 left outer join table2
   on table1.zipcode=table2.zipcode;
quit;

Then you could do a logistic regression to predict whether people bought
food or not:

proc logistic data=newtable;
   /* DESC models the higher value of bought_food as the event */
   model bought_food(DESC)= overweight_rate_in_zip_code;
run;

But note, even if you are doing this, you are not predicting whether
overweight *people* buy food, only whether the rate of obesity in the area
people live in affects whether they buy food.

-Mary

----- Original Message -----
From: Rieza Soelaeman
To: SAS-L@LISTSERV.UGA.EDU
Sent: Friday, December 19, 2008 1:12 PM
Subject: Aggregate and individual-level data analysis

Dear SAS-Lers,

Yet another question from me. Suppose I have 2 datasets:

1. Dataset1--Contains individual-level data on who bought food at a
   concession stand during a football game
2. Dataset2--Contains aggregate data on prevalence of obesity (bmi >=30)
   and overweight (bmi >=25) by zip code

Dataset1 looks roughly like this:

name     zip code
John     78530
Jane     78531
Angie    78532
Eileen   78530
Tim      78530
Bob      78532

et cetera...let's say there are 3000 people in this dataset, all of these
people bought food.

Dataset2 looks roughly like this:

zip code   overwt   obese
78530      500      200
78531      600      500
78532      100      50

Supposing I wanted to know if there was a correlation between buying food
and obesity, what procedure can I run? Notice that overweight and obese are
BMI classifications, so really, Dataset2 represents data from 1950
respondents.

I get a feeling that I need to disaggregate Dataset2, because I was kicking
myself in the head when I tried to turn Dataset1 into an aggregate dataset,
and finding it impossible (and stupid) to try to plot the data...

As always, I welcome and appreciate any suggestions on how to tackle this.

--
Rieza H Soelaeman, MPH
```
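
The rate conversion suggested in the reply could be sketched roughly as follows. This is only a sketch under assumptions: `n_total` (the number of respondents surveyed in each zip code) is a hypothetical column that Dataset2 as posted does not contain, and `zip_rates` and `obese_rate_in_zip_code` are illustrative names. The names `table2`, `zipcode`, `overwt`, `obese`, and `overweight_rate_in_zip_code` follow the message itself.

```sas
/* Sketch only: n_total is an ASSUMED column holding the number of     */
/* respondents per zip code. The posted Dataset2 has counts only, so   */
/* this denominator would have to come from another source.            */
proc sql;
   create table zip_rates as
   select zipcode,
          overwt / n_total as overweight_rate_in_zip_code,
          obese  / n_total as obese_rate_in_zip_code
   from table2
   where n_total > 0;   /* skip zip codes with no respondents */
quit;
```

With rates computed this way, the join in the reply would presumably read from `zip_rates` in place of `table2`, so that the logistic model sees a proportion per zip code and not a raw count.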
