LISTSERV at the University of Georgia
Date:         Wed, 31 Mar 2004 23:03:17 -0500
Reply-To:     "Chang Y. Chung" <chang_y_chung@HOTMAIL.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         "Chang Y. Chung" <chang_y_chung@HOTMAIL.COM>
Subject:      Re: How to clustering the data variables instead of data points
Comments: To: fzh113@HECKY.IT.NORTHWESTERN.EDU

On Wed, 31 Mar 2004 20:31:52 -0600, Fred <fzh113@HECKY.IT.NORTHWESTERN.EDU> wrote:

>Dear SAS users,
>
>Do you know if there are some specific functions under SAS
>to do clustering?
>In the general case, given a set of sample data, we just use
>a clustering algorithm to classify these data samples into some groups.
>
>Now my problem is different from the above.
>
>To be specific, suppose I have a d-dimensional vector x = [x1,x2, ..., xd]',
>and wish to cluster these d variables of x into some finite groups
>using a user-defined distance measure a.
>The distance measure a was defined to measure the similarity between
>any two data variables xi and xj (1 <= i, j <= d).

Hi, Fred,

SAS/STAT has proc cluster, which accepts a data set of type=distance as its input. You can readily make such a data set in many different ways; here is a simple example that uses a data step. HTH.

Cheers, Chang

data xes;
  x1=0.73902;
  x2=0.27248;
  x3=0.70953;
  x4=0.31916;
  x5=0.36785;
  x6=0.10449;
run;

data ds(type=distance keep=d1-d6 i);
  set xes;
  array x[1:6] x1-x6;
  array d[1:6] d1-d6;
  do i = 1 to 6;
    do j = 1 to i;
      /* assumes that the distance measure d is simply the absolute difference */
      d[j] = abs(x[i] - x[j]);
    end;
    output;
  end;
run;

proc print data=ds;
  format d1-d6 6.4;
run;

/* on lst
Obs      d1        d2        d3        d4        d5        d6     i

 1     0.0000     .         .         .         .         .       1
 2     0.4665    0.0000     .         .         .         .       2
 3     0.0295    0.4371    0.0000     .         .         .       3
 4     0.4199    0.0467    0.3904    0.0000     .         .       4
 5     0.3712    0.0954    0.3417    0.0487    0.0000     .       5
 6     0.6345    0.1680    0.6050    0.2147    0.2634    0.0000   6
*/

/* average link -- this part from sas/stat online doc */
proc cluster data=ds method=average pseudo;
  id i;
run;

proc tree horizontal spaces=2;
  id i;
run;
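
The proc tree step draws the dendrogram, but the original question asks for a finite number of groups. Here is a minimal sketch of one way to get explicit group labels, assuming the ds data set built by the data step in this message. The choice of two clusters and the data set names tree and groups are illustrative assumptions, not part of the original post.

```sas
/* Save the cluster history in a named data set rather than the default one. */
proc cluster data=ds method=average outtree=tree noprint;
  id i;
run;

/* Cut the tree at a chosen number of clusters.
   nclusters=2 is an arbitrary, illustrative choice. */
proc tree data=tree nclusters=2 out=groups noprint;
run;

/* One row per clustered object, with its CLUSTER number and CLUSNAME. */
proc print data=groups;
run;
```

Since each "object" in ds stands for one of the variables x1-x6, the CLUSTER column in groups can be read as a grouping of the variables. In practice the number of clusters would be chosen by looking at the dendrogram or at the pseudo statistics that proc cluster prints.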

