Date: Wed, 31 Jan 2007 22:37:12 -0500
Reply-To: "Howard Schreier <hs AT dc-sug DOT org>" <nospam@HOWLES.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Howard Schreier <hs AT dc-sug DOT org>" <nospam@HOWLES.COM>
Subject: Re: Updating data from a distance table
On Wed, 31 Jan 2007 09:36:36 -0800, Terjeson, Mark <Mterjeson@RUSSELL.COM>
wrote:
>Hi Julien,
>
>Here is one approach:
>
>(proc transpose -- the unsung hero)
>
>
>
>data cities;
> input city1 $ city2 $;
>cards;
>v01000 v13000
>v38000 v69000
>v01000 v38000
>;
>run;
>
>
>
>data distances;
> input id $ v01000 v13000 v38000 v69000;
>cards;
>v01000 0 125 24 312
>v13000 125 0 76 94
>v38000 24 76 0 215
>v69000 312 94 215 0
>;
>run;
>
>
>proc transpose data=distances
> out=dist2(rename=(
> id=city1
> _name_=city2
> col1=distance
> ));
> by id;
>run;
The transpose is indeed the critical thing.
Here's a variation. First insert a WHERE filter so that the PROC TRANSPOSE
excludes the trivial and repetitious cells from the symmetric distance matrix:
proc transpose data=distances
out=dist2(rename=(
id=city1
_name_=city2
col1=distance
)
where=(city2>city1)
);
by id;
run;
Sort the other data set:
proc sort data=cities;
by city1 city2;
run;
Then it's a rather simple merge:
data result;
merge cities(in=wanted)
dist2;
by city1 city2;
if wanted;
run;
>
>
>proc sql;
> create table result as
> select
> a.city1,
> a.city2,
> b.distance
> from
> cities as a
> left join
> dist2 as b
> on
> a.city1 eq b.city1
> and
> a.city2 eq b.city2
> ;
>quit;
>
>
>
>
>Hope this is helpful.
>
>
>Mark Terjeson
>Senior Programmer Analyst, IM&R
>Russell Investment Group
>
>
>Russell
>Global Leaders in Multi-Manager Investing
>
>
>
>
>
>
>-----Original Message-----
>From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of
>Julien Barnier
>Sent: Wednesday, January 31, 2007 9:04 AM
>To: SAS-L@LISTSERV.UGA.EDU
>Subject: Updating data from a distance table
>
>Hi to all,
>
>I am quite new to SAS, and here is a problem I can't figure how to
>solve.
>
>I have two datasets. The first one has two columns which represent city
>codes :
>
>city1 city2
>v01000 v13000
>v38000 v69000
>v01000 v38000
>
>The second one is a sort of distance table which contains distance
>between cities identified by their codes :
>
>id v01000 v13000 v38000 v69000
>v01000 0 125 24 312
>v13000 125 0 76 94
>v38000 24 76 0 215
>v69000 312 94 215 0
>
>What I'd like to do is to add a "distance" variable to the first dataset
>with the distance between the two cities :
>
>city1 city2 distance
>v01000 v13000 125
>v38000 v69000 215
>v01000 v38000 24
>
>I've looked at macros, at proc sql, but I can't manage to get what I
>want.
>
>If you have an idea or a link to a tutorial which could explain this, it
>would be greatly appreciated !
>
>Thanks in advance,
>
>--
>Julien