```
Date:         Thu, 22 Sep 2005 09:21:36 +0200
Reply-To:     Spousta Jan
Sender:       "SPSSX(r) Discussion"
From:         Spousta Jan
Subject:      Re: 2nd Attempt: Grouping Zip Codes
Content-Type: text/plain; charset="us-ascii"

Yes Simon, something like this can indicate which cases are "close
enough". But how to create the clusters? Even in your simple case you
have two possible candidate solutions:

(1,2);(3);(4) and
(1);(2);(3);(4)

And you should find that the first one is better in Deepak's sense.

If you have n ZIPs, then the maximum number of possible solutions is
given by the Bell number Bn (see
http://mathworld.wolfram.com/BellNumber.html and
http://www.research.att.com/cgi-bin/access.cgi/as/njas/sequences/eisA.cgi?Anum=A000110)
- numbers growing approximately as quickly with n as n**n. For example:

B10 = 115,975,
B20 = 51,724,158,235,372 - here it becomes unsearchable in a reasonable
time even for today's best computers using a brute-force search
B30 = 846,749,014,511,809,332,450,147 - here I am no longer able to
pronounce the number correctly even in my mother tongue :-)
B100 =
47585391276764833658790768841387207826363669686825611466616334637559114497892442622672724044217756306953557882560751

In an average country, there are thousands of ZIPs...

***

Regarding Richard's suggestion to use SQL - I think, Richard, that
C/C++/... would be much quicker and Python or Ruby much more convenient.
All these languages are better suited for loops and branching of
algorithms. But I am not an SQL expert - if you are, you will do your
best in SQL. In my opinion, the programmer is more important for the
solution than the tool.

Best regards

Jan

-----Original Message-----
From: SPSSX(r) Discussion [mailto:SPSSX-L@LISTSERV.UGA.EDU] On Behalf Of
Simon Freidin
Sent: Wednesday, September 21, 2005 11:36 PM
To: SPSSX-L@LISTSERV.UGA.EDU
Subject: Re: 2nd Attempt: Grouping Zip Codes

data list list /origin (a5) Zip1 Zip2 Zip3 .
begin data.
Zip1 0 5 15
Zip2 5 0 12
Zip3 15 12 0
Zip4
end data.
casestovars/separator='_'.
flip.
sel if index(case_lbl,'ORIGIN')=0.
compute lt10=(var001<10).
match files file=*/drop=var001.
formats lt10 (f1).
list.

CASE_LBL LT10

ZIP1_1    1
ZIP1_2    1
ZIP1_3    0
ZIP1_4    .
ZIP2_1    1
ZIP2_2    1
ZIP2_3    0
ZIP2_4    .
ZIP3_1    0
ZIP3_2    0
ZIP3_3    1
ZIP3_4    .

Number of cases read:  12    Number of cases listed:  12

On 21/09/2005, at 11:19 PM, Deepak Jethwani wrote:

> Hi Listers,
> This is my second mail to the list. I am still struggling with the
> following issue.
> We have a list of zip codes and a table that lists out the drive time
> distance from one zip code to the other in the following format.
>
>        Zip1  Zip2  Zip3  .....
> Zip1      0     5    15
> Zip2      5     0    12
> Zip3     15    12     0
> Zip4
>
>
>
> Now, we need to group the zip codes which fall within 10 minutes of
> drive time distance from a zip code into a group.
> The problem is actually about identifying starting zip codes around
> which to build these groupings.
>
> I would really welcome any comments from anyone who has faced a
> similar problem or any suggested approaches that come to mind for a
> possible solution or any suggested texts which I can use to tackle
> this problem.
>
> Best regards
> Deepak Jethwani
>
```
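For anyone who wants to check the Bell numbers quoted in the message, here is a minimal Python sketch (Python being one of the languages Jan mentions). It is not part of the original thread. The function name `bell_numbers` is made up for illustration, and the Bell triangle recurrence is just one of several ways to compute these values.

```python
def bell_numbers(n_max):
    """Return the list [B0, B1, ..., B(n_max)] built with the Bell triangle."""
    bells = [1]   # B0 = 1: the empty set has exactly one partition
    row = [1]     # first row of the Bell triangle
    for _ in range(n_max):
        # Each new row starts with the last entry of the previous row;
        # every further entry is its left neighbour plus the entry above that neighbour.
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        # The first entry of the k-th new row is Bk.
        bells.append(new_row[0])
        row = new_row
    return bells


bells = bell_numbers(30)
for n in (10, 20, 30):
    # Python integers have arbitrary precision, so nothing overflows here.
    print(f"B{n} = {bells[n]:,}")
```

Counting the partitions this way is cheap. What blows up is enumerating and scoring each one, which is why a brute-force search over groupings is hopeless for thousands of ZIP codes.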
