LISTSERV at the University of Georgia
Date:         Thu, 7 Jul 2011 17:14:48 -0500
Reply-To:     Joe Matise <snoopy369@GMAIL.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Joe Matise <snoopy369@GMAIL.COM>
Subject:      Re: alpha-numeric data split into two tables?
Comments: To: Chris Bat <darth.pathos@gmail.com>
In-Reply-To:  <201107072201.p67LkS6i027116@waikiki.cc.uga.edu>
Content-Type: text/plain; charset=ISO-8859-1

Assuming the dataset 'old_data' has some variable "datavar", which either has numeric data in it or 'comment' data in it:

data old_data;
  input @1 datavar $CHAR20.;
  datalines;
3
4
5
6
Hello World
784
745
;;;;
run;

data text_data num_data;
  set old_data;
  if anyalpha(datavar) then output text_data;
  else output num_data;
run;

The ANYALPHA function returns the position of the first alphabetic character it finds (a nonzero value, which SAS treats as true), or 0 (false) if no alphabetic characters are found. This will work if you explicitly expect alphabetic characters in the text data.
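
To see what ANYALPHA returns for a few kinds of values, a small sketch like the following may help. The dataset name anyalpha_demo, the variable first_alpha, and the sample values are made up for illustration; they are not from the original data.

```sas
/* Hypothetical demo: the values below are illustrative only. */
data anyalpha_demo;
  length datavar $20;
  do datavar = '784', '12abc', 'Hello World';
    /* Position of the first letter, or 0 when there is none */
    first_alpha = anyalpha(datavar);
    output;
  end;
run;

proc print data=anyalpha_demo;
run;
```

Here first_alpha should come out as 0 for 784, 3 for 12abc, and 1 for Hello World, so any nonzero result sends the row to text_data.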

If you might have something like '3 5' (i.e., a value with no letters in it that still is not legal to use in numeric evaluations), then you need something slightly stricter. COMPRESS is the better tool here:

data text_data num_data;
  set old_data;
  if missing(compress(datavar,,'d')) and not(find(trim(datavar),' '))
    then output num_data;
  else output text_data;
run;

That should do it. compress(datavar,,'d') removes all digits from the variable and leaves everything else, so the result is missing only when the value held nothing but digits. It fails only when there are embedded spaces [as in my '3 5' example], since a string of blanks is equivalent to missing in SAS; the FIND call on the trimmed value should handle those cases.
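
To check how the two parts of that test behave, here is a rough sketch that writes each piece to its own variable. The dataset name compress_demo, the helper variables, and the sample values are hypothetical and only there for illustration.

```sas
/* Hypothetical demo: breaks the COMPRESS/FIND test into separate flags. */
data compress_demo;
  length datavar $20 leftover $20;
  do datavar = '784', '3 5', '12abc', '3.5';
    /* What is left once every digit is removed */
    leftover = compress(datavar, , 'd');
    /* 1 when nothing but blanks remains */
    only_digits_or_blank = missing(leftover);
    /* 1 when a space remains after trailing blanks are trimmed */
    has_inner_space = (find(trim(datavar), ' ') > 0);
    /* Same condition as the IF statement above */
    goes_to_num = only_digits_or_blank and not has_inner_space;
    output;
  end;
run;

proc print data=compress_demo;
run;
```

With these values goes_to_num should be 1 only for 784. A value such as 3.5 is worth checking against your own data: the decimal point is not a digit, so it survives the COMPRESS and the row would be routed to text_data.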

-Joe

On Thu, Jul 7, 2011 at 5:01 PM, Chris Bat <darth.pathos@gmail.com> wrote:

> Hi all, happy thursday! I don't think this is to terribly difficult to
> accomplish but I've dug around and can't find anything; it may also be that
> after spending 10 hours in front of my computer today, I'm a little tired.
> I have received over 102,000 rows of lab data, with patient ID, result date
> / time, result, test, and a pile of other stuff. What I've noticed is that
> there are many more alpha-based results than I was expecting (I was told to
> expect 15%, and based on a random sample I'm seeing closer to 30%). What
> I'd like to do is somehow split the comment-type results (which would be
> alpha, at least in part) into a new table, and the numeric results into a
> third (thus leaving my original table untouched). I tried proc sort, but
> because some comments start with a number, didn't work too well. I thought
> of using a length(result), but there are some results of "unk" and "n/a". I
> tried to pull out anything with something in the Comments field, but that
> didn't work because there are valid (although anomalous) results with
> detailed comments.
> So, I am stumped, and would appreciate any thoughts....
> TIA
> Chris

