LISTSERV at the University of Georgia
Date:         Fri, 19 Dec 2008 14:41:58 -0600
Reply-To:     Mary <mlhoward@avalon.net>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Mary <mlhoward@AVALON.NET>
Subject:      Re: Aggregate and individual-level data analysis
Comments: To: Rieza Soelaeman <rsoelaeman@GMAIL.COM>
Content-Type: text/plain; charset="iso-8859-1"

I'm not an expert, but I'll give this a college try...

I don't think you can say that people who buy food are obese, because you don't know the actual weights of those who went to the football game.

I suppose you might hypothesize that the proportion of obesity in someone's zip code affects whether people from that zip code buy food at the football game.

Do all zip codes have the same population? I would guess that they do not; if so, convert your aggregate counts into rates (such as 0.10, meaning 10 percent of the zip code's total population is obese).
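
A minimal sketch of that conversion, assuming table2 carries a total count per zip code. The variable `population` is hypothetical: the Dataset2 described in the question has only the overwt and obese counts, so the denominator would have to come from another source.

```sas
/* Sketch only: `population` is an assumed variable (total residents or
   respondents per zip code); it is not part of the original Dataset2. */
data table2_rates;
    set table2;
    /* Guard against a missing or zero denominator */
    if population > 0 then do;
        overweight_rate_in_zip_code = overwt / population;
        obese_rate_in_zip_code      = obese  / population;
    end;
run;
```

If you build the rates this way, join against table2_rates rather than table2.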

Then you could join dataset 1 and 2:

proc sql;
  create table newtable as
  select table1.*, table2.*
  from table1 left outer join table2
    on table1.zipcode=table2.zipcode;
quit;
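
To try the join on something concrete, here is a small self-contained sketch built from the sample rows in the original question. The dataset names (table1, table2) and the variable names (name, zipcode) are assumptions. Only the count columns are selected from table2, so zipcode appears once in the result.

```sas
/* Sample rows copied from the original question; variable names are assumed. */
data table1;
    input name $ zipcode $;
    datalines;
John 78530
Jane 78531
Angie 78532
Eileen 78530
Tim 78530
Bob 78532
;
run;

data table2;
    input zipcode $ overwt obese;
    datalines;
78530 500 200
78531 600 500
78532 100 50
;
run;

/* Left join keeps every individual, even if a zip code has no aggregate row */
proc sql;
    create table newtable as
    select table1.*, table2.overwt, table2.obese
    from table1 left outer join table2
        on table1.zipcode = table2.zipcode;
quit;

proc print data=newtable;
run;
```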

Then you could do a logistic regression to predict whether people bought food or not:

proc logistic data=newtable; model bought_food(DESC)= overweight_rate_in_zip_code; run;

But note that even if you do this, you are not predicting whether overweight *people* buy food, only whether the rate of obesity in the area where people live affects whether they buy food.

-Mary

----- Original Message -----
From: Rieza Soelaeman
To: SAS-L@LISTSERV.UGA.EDU
Sent: Friday, December 19, 2008 1:12 PM
Subject: Aggregate and individual-level data analysis

Dear SAS-Lers,

Yet another question from me.

Suppose I have 2 datasets:

1. Dataset1--Contains individual-level data on who bought food at a concession stand during a football game
2. Dataset2--Contains aggregate data on prevalence of obesity (bmi >=30) and overweight (bmi >=25) by zip code

Dataset1 looks roughly like this:

name     zip code
John     78530
Jane     78531
Angie    78532
Eileen   78530
Tim      78530
Bob      78532

Et cetera. Let's say there are 3000 people in this dataset; all of these people bought food.

Dataset2 looks roughly like this:

zip code   overwt   obese
78530      500      200
78531      600      500
78532      100      50

Supposing I wanted to know if there was a correlation between buying food and obesity, what procedure can I run? Notice that overweight and obese are BMI classifications, so really, Dataset2 represents data from 1950 respondents. I get a feeling that I need to disaggregate Dataset2, because I was kicking myself in the head when I tried to turn Dataset1 into an aggregate dataset and found it impossible (and stupid) to try to plot the data...

As always, I welcome and appreciate any suggestions on how to tackle this.

-- Rieza H Soelaeman, MPH

