LISTSERV at the University of Georgia
Date:   Wed, 22 Feb 2006 11:24:19 -0500
Reply-To:   "Droogendyk, Harry" <harry.droogendyk@RBC.COM>
Sender:   "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:   "Droogendyk, Harry" <harry.droogendyk@RBC.COM>
Subject:   Re: Collapsing data into meaningful categories
Content-Type:   text/plain; charset="iso-8859-1"

Create a format to categorize the house values. Read up on specifying value ranges, paying particular attention to the "low", "high", "-<" stuff.
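To see how the range endpoints behave, here is a small sketch on a throwaway format. The format name "edge", the variable "bucket" and the cut points 10 and 20 are made up for illustration and are not part of the house value problem.

```sas
/* Demo format (hypothetical name "edge") to show range endpoints. */
proc format;
    value edge
        low -< 10   = 'below 10'     /* everything up to, but not including, 10 */
        10  -< 20   = '10 to <20'    /* includes 10, excludes 20                */
        20  -  high = '20 and up';   /* includes 20 and everything above        */
run;

/* Write the formatted value of a few boundary cases to the log. */
data _null_;
    do x = 9.99, 10, 19.99, 20;
        bucket = put(x, edge.);
        put x= bucket=;
    end;
run;
```

The "-<" operator excludes the upper endpoint, so 10 lands in '10 to <20' rather than 'below 10', and 20 lands in '20 and up'. That is what keeps adjacent ranges from overlapping.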

There was a value range missing in your specs. If that was intentional (i.e. you don't care to categorize the houses in the 100-150K range), remove that range from the PROC FORMAT stuff below and add "other = 'Not categorized'" to the value specification.
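If the gap was intentional, that change might look something like the sketch here. The format name "housegap" is made up so that it does not overwrite the house format defined below.

```sas
/* Variant that leaves the 100-150K range uncategorized.      */
/* "housegap" is a hypothetical name chosen for this sketch.  */
proc format;
    value housegap
        low     -<   50000 = '<50K'
        50000   -<  100000 = '50-100K'
        150000  -<  200000 = '150-200K'
        200000  -<  300000 = '200-300K'
        300000  -<  500000 = '300-500K'
        500000  -< 1000000 = '500-1,000K'
        1000000 -  high    = '1m+'
        other              = 'Not categorized';   /* catches anything not listed above */
run;
```

Note that for a numeric format "low" does not pick up missing values, so a missing house value would also fall under "other" here.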


proc format;
    value house
        low     -<   50000 = '<50K'
        50000   -<  100000 = '50-100K'
        100000  -<  150000 = '100-150K'
        150000  -<  200000 = '150-200K'
        200000  -<  300000 = '200-300K'
        300000  -<  500000 = '300-500K'
        500000  -< 1000000 = '500-1,000K'
        1000000 -  high    = '1m+';
run;

data a;
    input value freq;
    category = put(value,house.);
cards;
100000 2
200000 7
299999.99 7
102934 2
98328 1
400831 5
1039482 2
50000 3
;
run;

proc print data=a; run;
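To actually collapse the frequencies by category, one option is PROC FREQ with a WEIGHT statement. This is only a sketch: it assumes data set a and format house from the steps above, and the output data set name "housefreq" is made up.

```sas
/* Sum freq within each house. category.                          */
/* Assumes data set a and format house already exist (see above). */
proc freq data=a;
    tables value / nocum out=housefreq;   /* "housefreq" is a hypothetical name   */
    weight freq;                          /* each row counts freq times           */
    format value house.;                  /* PROC FREQ groups by formatted value  */
run;
```

With the eight sample rows above, the totals should come out as 4 for 50-100K, 4 for 100-150K, 14 for 200-300K, 5 for 300-500K and 2 for 1m+. Because the format is applied inside the procedure, the category variable created in the data step is not needed for this.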

-----Original Message-----
From: [] On Behalf Of Ron
Sent: Wednesday, February 22, 2006 11:00 AM
To:
Subject: Collapsing data into meaningful categories


I'm new to the SAS world. Please forgive my elementary question.

I have data on customer home values (around 5000 obs). The values are listed as actual values. There is another column of frequencies. So there are something like 3500 unique values.

I would like to collapse the actual home values into more tractable categories while maintaining the frequencies for each category.

The category breakdown would be something like:

<50K
50-100k
150-200k
200-300k
300-500k
500-1,000k
1m+

Sample data

value    freq
102934   2
98328    1
400831   5
1039482  2
50000    3

If someone could point me in the right direction I would be grateful.


RON

