LISTSERV at the University of Georgia
Menubar Imagemap
Home Browse Manage Request Manuals Register
Previous messageNext messagePrevious in topicNext in topicPrevious by same authorNext by same authorPrevious page (December 2005, week 1)Back to main SAS-L pageJoin or leave SAS-L (or change settings)ReplyPost a new messageSearchProportional fontNon-proportional font
Date:         Tue, 6 Dec 2005 17:30:25 -0500
Reply-To:     Sigurd Hermansen <HERMANS1@WESTAT.COM>
Sender:       "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From:         Sigurd Hermansen <HERMANS1@WESTAT.COM>
Subject:      Re: Eliminating rows, based on date values
Comments: To: Tomas <tomas.corveleyn@gmail.com>
Content-Type: text/plain; charset="us-ascii"

Tomas: Good idea to include an example of data and of what you want a program operating on those data to yield ... Your question also includes a statement of what you are trying to achieve. Now all you need to do is to covert that statement of a constraint to a program and check results of executing the program.

This type of problem has appeared in many forms on SAS-L. A quick check of SAS-L archives may give a variety of solutions, and, more important, ideas for extending solutions to this type of problem to related problems.

I'll suggest a SQL solution because we can declare a solution in a relatively simple form. A procedural program may eliminate rows 3 to 6 yet not solve your problem except in the specific case of your example of data. A logical solution will work for any group of ID's and any cardinal sequence of numbers, such as SAS date values.

Given any one SASValue in a row of your example, how do you know that you want to exclude it? As you have stated the condition for excluding a row, it must have a SASValue less than 90 days prior to the SASValue in the given row. So for a possible exclusion, 'row2', that we can compare to a different 'row1', we have three conditions to check: ID[row1]=ID[row2], SASValue[row1]>SASValue[row2], and SASValue[row2] between SASValue[row1] -89 and SASValue[row1]. (The second Boolean in the condition guarantees that the values of SASValue come from different rows.) This program implements that logic:

data test; input obs: var_a: Date: $char9. Sasvalue: ; cards; 1 1 2/apr/02 15432 2 1 12/jun/03 15868 3 1 13/jun/03 15869 4 1 14/jun/03 15870 5 1 15/jun/03 15871 6 1 16/jun/03 15872 7 2 26/mar/02 15425 8 2 11/sep/03 15959 9 3 1/jan/03 15706 10 4 1/jan/03 15706 11 4 1/apr/03 15796 12 4 30/jun/03 15886 ; run; /* With var_a being the id for a person. Based on the variable "Date" or the Sasvalue of that variable, I want to eliminate observations 3 to 6. I do not want to end up with only one observation per person, I want to keep the observations between which there are at least 90 days! So, my end-file has to be the following:

obs var_a Date Sasvalue(date) 1 1 2/apr/02 15432 2 1 12/jun/03 15868 7 2 26/mar/02 15425 8 2 11/sep/03 15959 9 3 1/jan/03 15706 */ %let dif=89; proc sql; create table results as select * from test as t1 where not exists(select t2.SASValue from test as t2 where t1.var_a=t2.var_a and t1.SASValue>t2.SASValue and t2.SASValue between (t1.SASValue-&dif.) and t1.SASValue ) ; quit;

This solution works in general for different required intervals, and will work with any series of cardinal numbers. The 'correlated subquery' essentially joins two copies of a dataset on an indexable key. More critical for database programming, the query has a logical foundation that lines up with the conditions you stated. Sig

-----Original Message----- From: owner-sas-l@listserv.uga.edu [mailto:owner-sas-l@listserv.uga.edu] On Behalf Of Tomas Sent: Tuesday, December 06, 2005 10:29 AM To: sas-l@uga.edu Subject: Eliminating rows, based on date values

Dear,

I'm trying desperately to eliminate rows from the following type of file:

obs var_a Date Sasvalue(date) 1 1 2/apr/02 15432 2 1 12/jun/03 15868 3 1 13/jun/03 15869 4 1 14/jun/03 15870 5 1 15/jun/03 15871 6 1 16/jun/03 15872 7 2 26/mar/02 15425 8 2 11/sep/03 15959 9 3 1/jan/03 15706

With var_a being the id for a person. Based on the variable "Date" or the Sasvalue of that variable, I want to eliminate observations 3 to 6. I do not want to end up with only one observation per person, I want to keep the observations between which there are at least 90 days! So, my end-file has to be the following:

obs var_a Date Sasvalue(date) 1 1 2/apr/02 15432 2 1 12/jun/03 15868 7 2 26/mar/02 15425 8 2 11/sep/03 15959 9 3 1/jan/03 15706

I've tried "retain" statements working on the Sasvalue of the datevariable, but I haven't been able to compute the correct formula which eliminates only observations 3 to 6. Or I eliminate to much, or I eliminate to little. Can someone help me with this problem?

Many thanks,

Tomas Corveleyn.


Back to: Top of message | Previous page | Main SAS-L page