Date: Thu, 11 Jul 2002 09:59:55 -0700
Reply-To: Dale McLerran <stringplayer_2@YAHOO.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Dale McLerran <stringplayer_2@YAHOO.COM>
Subject: Re: excluding records on the fly
In-Reply-To: <agk5ek$acu$1@pulp.srv.ualberta.ca>
Content-Type: text/plain; charset=us-ascii
N,
How about storing the data in a two-dimensional array and stepping
through the array once it is completely filled? Something like
the following:
/* First, read in your data */
data mydata;
input ID NID ValueN;
cards;
1 2 65
1 6 26
2 1 26
2 3 30
2 7 2
3 2 65
3 4 34
3 8 88
;
data neighbor;
  set mydata end=lastrec;
  /* Place all data into a 2-dimensional array. _temporary_ array  */
  /* elements are retained, so the array fills as records are read. */
  array adjacency {8,8} _temporary_;
  adjacency{id,nid} = ValueN;
  /* When we reach the last record in the dataset, start the algorithm */
  if lastrec then do;
    /* Initialize ID to start the chaining process */
    id=1;
    /* Initialize var to signal no more neighbors for this chain */
    endchain=0;
    do while(endchain=0);
      /* For each ID, start with maximum ValueN=0. Any NID which  */
      /* neighbors ID and has a value greater than the current    */
      /* value of ValueN is a candidate for the chain. (This      */
      /* assumes that all ValueN in the data are positive.)       */
      ValueN=0;
      /* Loop over NID values */
      do j=1 to 8;
        if j^=ID then do;
          if adjacency{id,j}>ValueN then do;
            /* Candidate neighbor found */
            ValueN=adjacency{id,j};
            nid=j;
          end;
          /* Set adjacencies for the current ID to missing, both */
          /* as ID (row) and as NID (column), so that this ID    */
          /* will not be reused.                                 */
          adjacency{id,j}=.;
          adjacency{j,id}=.;
        end;
      end;
      /* If ValueN is still 0, then no neighbor was found. */
      if ValueN=0 then endchain=1;
      else do;
        /* Output record for ID with NID maximizing ValueN */
        output;
        /* Set the ID to the value of NID and continue chaining */
        id=nid;
      end;
    end;
  end;
  keep id nid ValueN;
run;
proc print data=neighbor;
run;
Of course, you will have to modify the size of the temporary array
and the upper bound of the do loop to accommodate data with more
IDs and NIDs.
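One way to avoid hard-coding the 8 is to read the largest ID or NID
from the data into a macro variable and use that for both the array
dimension and the loop bound. The following is only a sketch: it assumes
the mydata dataset created above already exists, and the macro variable
name maxid is my own choice for illustration.

```sas
/* Find the largest identifier appearing as either ID or NID.      */
/* The inner max() is the row-wise function, the outer max() is    */
/* the SQL aggregate over all rows.                                */
proc sql noprint;
  select max(max(ID, NID)) into :maxid
  from mydata;
quit;

/* Strip the leading blanks that INTO leaves in the value */
%let maxid = &maxid;

/* Check that the macro variable sizes the array as intended */
data _null_;
  array adjacency {&maxid,&maxid} _temporary_;
  nrows = dim(adjacency, 1);
  ncols = dim(adjacency, 2);
  put 'Adjacency array is ' nrows 'by ' ncols;
run;
```

In the neighbor step, the array statement would then read
array adjacency {&maxid,&maxid} _temporary_; and the loop would read
do j=1 to &maxid;. Keep in mind that the array holds maxid squared
numeric cells of 8 bytes each, so for a very large grid this approach
may need more memory than is available.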
Dale
--- N Yiannakoulias <nwy@SRV.UALBERTA.CA> wrote:
> Hi all,
>
> I have a large data set which represents a 2 dimensional grid over a
> geographic area. I've built an adjacency file of this grid that looks
> like this:
>
> ID NID ValueN
> 1 2 65
> 1 6 26
> 2 1 26
> 2 3 30
> 2 7 2
> 3 2 65
> 3 4 34
> 3 8 88
> and so on...
>
> ID is an identifier for each grid
> NID is a neighbour's ID
> ValueN is a value associated with a neighbour
>
> Now, I want to group adjacent records such that starting with area 1
> (represented by ID=1) the adjacent area with the largest ValueN is
> grouped with this area (ID=1). Once a neighbour is grouped, it can no
> longer be a part of any other group. _This is where I get stuck_. For
> example, NID=2 meets the condition for area 1. As a result, records
> with NID=2 AND ID=2 should be removed from the dataset as I proceed
> with grouping. I haven't been able to do this successfully. Can
> anyone offer any advice? It seems pretty trivial, but maybe it isn't?
>
> N
=====
---------------------------------------
Dale McLerran
Fred Hutchinson Cancer Research Center
mailto: dmclerra@fhcrc.org
Ph: (206) 667-2926
Fax: (206) 667-5977
---------------------------------------