```
Date:         Thu, 11 Jul 2002 09:59:55 -0700
Reply-To:     Dale McLerran
Sender:       "SAS(r) Discussion"
From:         Dale McLerran
Subject:      Re: excluding records on the fly
Comments:     To: N Yiannakoulias
In-Reply-To:
Content-Type: text/plain; charset=us-ascii

N,

How about storing the data in a two-dimensional array and stepping
over the array once you have the entire array filled?  Something
like the following:

/* First, read in your data */
data mydata;
  input ID NID ValueN;
cards;
1 2 65
1 6 26
2 1 26
2 3 30
2 7 2
3 2 65
3 4 34
3 8 88
;

data neighbor;
  set mydata end=lastrec;

  /* place all data into 2 dimensional array */
  array adjacency {8,8} _temporary_;
  adjacency{id,nid} = ValueN;

  /* When we reach last record in dataset, then start algorithm */
  if lastrec then do;
    /* Initialize ID to start the chaining process */
    id=1;
    /* Initialize var to signal no more neighbors for this chain */
    endchain=0;
    do while(endchain=0);
      /* For each ID, start with maximum ValueN=0.  Any NID which  */
      /* neighbors ID and has value greater than the current value */
      /* of ValueN is a candidate for the chain.                   */
      ValueN=0;
      /* Loop over NID values */
      do j=1 to 8;
        if j^=ID then do;
          if adjacency{id,j}>ValueN then do;
            /* Candidate neighbor found */
            ValueN=adjacency{id,j};
            nid=j;
          end;
          /* Set adjacencies for current ID to missing so that */
          /* this ID will not be reused.                       */
          adjacency{id,j}=.;
        end;
      end;
      /* If ValueN is still 0, then no neighbor was found. */
      if ValueN=0 then endchain=1;
      else do;
        /* Output record for ID with NID maximizing ValueN */
        output;
        /* Set the ID to the value of NID and continue chaining */
        id=nid;
      end;
    end;
  end;
  keep id nid ValueN;
run;

proc print data=neighbor;
run;

Of course, you will have to modify the size of the temporary array
and the upper range for the do loop to accommodate data with more
IDs and NIDs.

Dale

--- N Yiannakoulias wrote:
> Hi all,
>
> I have a large data set which represents a 2 dimensional grid over a
> geographic area. I've built an adjacency file of this grid that looks
> like this:
>
> ID NID ValueN
> 1 2 65
> 1 6 26
> 2 1 26
> 2 3 30
> 2 7 2
> 3 2 65
> 3 4 34
> 3 8 88
> and so on...
>
> ID is an identifier for each grid
> NID is a neighbour's ID
> ValueN is a value associated with a neighbour
>
> Now, I want to group adjacent records such that starting with area 1
> (represented by ID=1) the adjacent area with the largest ValueN is
> grouped with this area (ID=1). Once a neighbour is grouped, it can no
> longer be a part of any other group. _This is where I get stuck_. For
> example, NID=2 meets the condition for area 1. As a result, records
> with NID=2 AND ID=2 should be removed from the dataset as I proceed
> with grouping. I haven't been able to do this successfully. Can
> anyone offer any advice? It seems pretty trivial, but maybe it isn't?
>
> N

=====
---------------------------------------
Dale McLerran
Fred Hutchinson Cancer Research Center
mailto: dmclerra@fhcrc.org
Ph:  (206) 667-2926
Fax: (206) 667-5977
---------------------------------------
```
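
The closing remark about resizing the array and the loop bound by hand could be handled by reading the dimension from the data. The following is only a sketch of that idea, not part of the original reply. It assumes ID and NID are positive integers, that the chain still starts at ID=1, and that a square array of the largest identifier fits in memory. The macro variable name `maxid` is hypothetical.

```sas
/* Sample data as in the message above */
data mydata;
  input ID NID ValueN;
cards;
1 2 65
1 6 26
2 1 26
2 3 30
2 7 2
3 2 65
3 4 34
3 8 88
;

/* Assumption: the array must be large enough to index every ID and NID. */
/* The inner max() compares the two columns row by row; the outer max()  */
/* is the SQL aggregate over all rows.                                   */
proc sql noprint;
  select max(max(id, nid)) into :maxid
  from mydata;
quit;
/* Reassign to strip the leading blanks PROC SQL leaves in the value */
%let maxid=&maxid;

data neighbor;
  set mydata end=lastrec;
  /* Dimensions now come from the data instead of a hard-coded 8 */
  array adjacency {&maxid,&maxid} _temporary_;
  adjacency{id,nid} = ValueN;

  if lastrec then do;
    id=1;
    endchain=0;
    do while(endchain=0);
      ValueN=0;
      /* dim2() returns the size of the second array dimension */
      do j=1 to dim2(adjacency);
        if j^=id then do;
          if adjacency{id,j}>ValueN then do;
            ValueN=adjacency{id,j};
            nid=j;
          end;
          adjacency{id,j}=.;
        end;
      end;
      if ValueN=0 then endchain=1;
      else do;
        output;
        id=nid;
      end;
    end;
  end;
  keep id nid ValueN;
run;
```

The chaining logic is unchanged from the reply; only the array size and loop bound differ. Because the array has `maxid` squared cells, memory use grows quadratically with the largest identifier, so this form is likely to suit only moderately sized grids.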
