Date: Thu, 29 Dec 2005 10:59:31 -0800
Reply-To: "Ram ." <marketprobe1@GMAIL.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Ram ." <marketprobe1@GMAIL.COM>
Subject: Re: Question on data manipulation
In-Reply-To: <200512291202.jBTBkuuS022562@mailgw.cc.uga.edu>
Content-Type: text/plain; charset=ISO-8859-1
Thanks for your reply. I am sorry if my earlier post was not clear. Another
try:
I have a question on data manipulation. I have two data sets:
1) Dataset A which has three columns: zip code, la and lo
where zip code is a five digit code and la and lo are the latitudes and
longitudes of the zipcodes
2) Data set B has 5 columns: custid, Zipid, zip1 zip2 and zip3 where custid
is the customer id #, Zipid is the zip code where the particular customer
lives and zip1, zip2 and zip3 are the zipcodes of the three relatives of the
customer.
*
Data* A*;*
input Zip la lo;datalines;
90007 204 22
90008 104 23
90009 305 23
93456 206 24
;
*
run*;
*
Data* B;*input* *Custid* Zipid Zip1 Zip2 Zip3;
datalines;
1 90007 90008 90007 90009
2 90008 93456 90007 90009
;
From these two data sets, I would like to create a data set with 4 columns
say custid Zipid Dist1 Dist2 Dist3
where as before custid is the customer id #, Zipid is the zipcode where the
particular customer lives, and Dist 1, Dist 2 and Dist3 are the distances
between Zipid and Dist1, Zipid and Dist2, Zipid and Dist3 respectively.
The distance between two zip codes say i and j can be calculated from the
latitudes and longitudes which is given by the formula:
sqrt(((la(i)-la(j))**2) + ((lo(i)-lo(j))**2)) .
Can anyone suggest an code that can accomplish this?
thanks,
-Ram
On 12/29/05, Gerhard Hellriegel <ghellrieg@t-online.de> wrote:
>
> I'm not sure if I understand your example data: what is a, b, c, d, e, f,
> g
> ??? You use a, b in your formula - what about the other values?
> In your example, I'd expect pairs of numerics (latitude and longitude) for
> each zipcode...
> If that is right, you could create 2 informats lo and la to convert each
> zipcode to numerics which you can use in your formula.
> Something like:
>
> data a;
> zip="90007"; la=1; lo=7; output;
> zip="90009"; la=2; lo=2; output;
> zip="93456"; la=3; lo=9; output;
> run;
> data lo;
> set a;
> fmtname="lo";
> type="I";
> if _n_=1 then do;
> hlo="O";
> label=".";
> output;
> end;
> hlo=" ";
> start=zip;
> label=lo;
> output;
> run;
> data la;
> set a;
> fmtname="la";
> type="I";
> if _n_=1 then do;
> hlo="O";
> label=".";
> output;
> end;
> hlo=" ";
> start=zip;
> label=la;
> output;
> run;
> proc format cntlin=lo;
> run;
> proc format cntlin=la;
> run;
>
>
>
>
> data xx;
> a="90007";
> x=input(a,la.);
> y=input(a,lo.);
> z=x*y;
> put x= y= z=;
> run;
>
> You now have informats to all your zips which you can use as
> numeric-substitutes where you want, e.g. in your formula.
>
>
>
>
> On Thu, 29 Dec 2005 01:07:37 -0800, Ram . <marketprobe1@GMAIL.COM> wrote:
>
> >Dear SAS-ers,
> >
> >I have a question on data manipulation. I have two data sets:
> >1) Dataset A which has three columns: zip code, la and lo
> >
> >where zip code is a five digit code and la and lo are the latitudes and
> >longitudes of the zipcodes
> >
> >2) Data set B has 5 columns: custid, Zipid, zip1 zip2 and zip3 where
> custid
> >is the customer id #, Zipid is the zip code where the particular customer
> >lives and zip1, zip2 and zip3 are the zipcodes of the three relatives of
> the
> >customer.
> >
> >From these two data sets, I would like to create a data set with 4
> columns
> >say custid Zipid Dist1 Dist2 Dist3
> >where as before custid is the customer id #, Zipid is the zipcode where
> the
> >particular customer lives, and Dist 1, Dist 2 and Dist3 are the distances
> >between Zipid and Dist1, Zipid and Dist2, Zipid and Dist3 respectively.
> >
> >The distance between two zip codes say *a and b *can be calculated from
> the
> >latitudes and longitudes which is given by the formula:
> >sqrt(((a.latitude-a.latitude)**2) + ((b.longitude-a.longitude)**2)) .
> >
> >Can anyone suggest any code that can accomplish this? To be further
> clear,
> >here is sample of the two datsets
> >
> >Data set A:
> >
> >Zip la lo
> >---------------------
> >90007 a b
> >90008 c d
> >90009 e f
> >93456 g h
> >
> >
> >Data set B:
> >
> > Custid Zipid Zip1 Zip2 Zip3
> >-------------------------------------------------
> >1 90007 90008 90007 90009
> >2 90008 93456 90007 90009
> >
> >Thanks,
> >-Ram
>
|