LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (April 2005, week 2)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Thu, 14 Apr 2005 21:57:35 -0400
Reply-To:     "Howard Schreier <hs AT dc-sug DOT org>" <nospam@HOWLES.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "Howard Schreier <hs AT dc-sug DOT org>" <nospam@HOWLES.COM>
Subject:      Re: merging data sets problem (potential proc sql?)

Please indicate the scale of the problem, in particular the number of stations and the number of distinct postal codes which appear in the subjects' file.

On Wed, 13 Apr 2005 20:30:00 -0400, Cornel Lencar <clencar@INTERCHANGE.UBC.CA> wrote:

>Hi, > >I am facing a problem that I suspect can be solved with proc sql, >on which I am not proficient. I cannot work the problem in a data >step /or several data step due to the large files that I have to work with. > >Basically I have three files: >1. a distance file that contains all distances between a set of ambient stations >and a much larger set of postal codes > >PCODE STATION_ID DISTANCE >A0A0A0 E3245 4567.7 >A0A0A0 E4567 34567.9 >A0A0A0 M456 34.7 >B0B0B0 E3245 56987 >B0B0B0 .. .. > >2. a file with subjects' data > >SUBJ_ID PCODE DOB EOFU >1 A0A0A0 11/29/1999 12/31/2002 >2 A0A0A0 10/09/2000 12/31/2002 >3 B0B0B0 09/27/2000 11/18/2000 >4 .. .. .. > >3. A file with daily ambient exposure information, for each ambient station, >between >1/01/1998 and 12/31/2002. It is important to note that for some days (few) >measurements are missing (not collected or insuficient hourly measurements -no >need to go in detail here). Also, not all ambient stations cover the whole >window interval mentioned. Some started monitoring earlier and closed in 1999 >for instance, or others started in 2000 on. > >STATION_ID DATE CO_LEVEL >E3245 1/01/1998 125.98 >E3245 1/02/1998 132.90 >E3245 1/03/1998 125.98 >E3245 1/04/1998 . >E3245 1/04/1998 117.34 > >So for each station I will have 365*5 observations > >I need to put together these three file something like this: > >SUBJ_ID PCODE DOB EOFU STATION_ID >1 A0A0A0 11/29/1999 12/31/2002 M456 > > >DISTANCE DATE CO_LEVEL >34.7 11/29/1999 123.45 > >in such a way that for each individual I would have the corresponding >"valid"=non-missing measurement of CO_LEVEL from the closest >ambient station (if the value for the closest ambient station is >missing, the value for the next closest station should be taken), in the >window interval given by DOB and EOFU. > >Any help or suggestion with this problem would be greately appreciated. > >Sincerely > >Cornel Lencar


Back to: Top of message | Previous page | Main SAS-L page