Date: Fri, 4 Nov 2005 16:59:37 -0600
Reply-To: John Norton <email@example.com>
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: John Norton <firstname.lastname@example.org>
Subject: Evaluating dates
Content-Type: text/plain; charset=US-ASCII
I'd attempted to respond directly to the original post, but for some reason my email server won't let me, so I am posting to the general list instead.
I posted the following solution to the original request to identify cases where the pretest date follows the post test date:
COMPUTE flag = predate > (NUMBER(SUBSTR(postdate,1,INDEX(postdate,' ')-1),ADATE10)).
When I used to teach command language for SPSS, this type of solution frequently resulted in head-scratching. Hopefully I can clarify if there is any confusion. Essentially, I use a logical expression to evaluate whether predate is greater than, and therefore follows, postdate. Because it is a logical evaluation, the result is either true (the condition is met) or false (the condition is not met). If the condition is met, the target variable is assigned the value 1; if not, it is assigned 0. (If the evaluation cannot be done, e.g. because one or both of the dates are missing, then the target variable is assigned no value at all; it is left system-missing.)
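To see the true/false/missing behavior in isolation, here is a minimal sketch with two made-up numeric variables, a and b, which are not part of the original problem. Going by the rules just described, flag should come out as 1 for the first case, 0 for the second, and system-missing for the third.

```spss
* Sketch only: a and b are made-up variables, not from the original post.
DATA LIST LIST /a b.
BEGIN DATA
5 3
2 7
4 9
END DATA.
* Blank out b in the third case so the comparison cannot be evaluated.
IF (a = 4) b = $SYSMIS.
COMPUTE flag = a > b.
LIST.
```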
The form of the argument is:
COMPUTE x = y > z.
x is the target variable, here called 'flag'.
y is the value of the date variable, here called 'predate'.
z is the rather complex computation to determine the date value of 'postdate'.
It's important to remember that SPSS stores date and time values as the number of seconds from a specified starting point. For a date, that starting point is midnight on October 14, 1582, the start of the Gregorian calendar. So, with that in mind, we're really assessing whether one numeric value is greater than another, and if so, then the argument is true, and the target variable, "flag", is assigned a value of 1, for true.
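If you want to look at the number behind a date, one way is to copy the date into a second variable and give the copy a plain numeric format. This is only a sketch: the date is made up, and secs is a name I chose for illustration.

```spss
* Sketch only: the date below is made up for illustration.
DATA LIST LIST /predate (ADATE10).
BEGIN DATA
11/04/2005
END DATA.
* Copy the date and display the copy as a plain number instead of a date.
COMPUTE secs = predate.
FORMATS secs (F12.0).
LIST.
```

Both columns hold the same value; only the display format differs.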
That complex computation starts with the INDEX() function, which locates the space in the string value between the date and the time and returns its position as a number. Subtracting 1 from that position gives the number of characters we want to extract from "postdate", and we use the SUBSTR() function for that. The SUBSTR() function takes 3 arguments. They are, first, the variable from which information will be extracted; second, the starting position from which SPSS will begin extracting; and third, the length of information (number of characters) to extract. So if I write
SUBSTR(str_var,1,4)
I'm telling SPSS to extract 4 characters from the variable called str_var, starting in the first position. But what if the number of characters is variable, as is the case with your string variable? Then the solution is to locate a common character, using the INDEX() function I just explained, and use the position of that character in a computation (position of the blank minus 1) which defines the total number of characters to extract from the string variable.
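Here is a small sketch of the two functions working together. I am assuming the string looks like a date, one blank, then a time; the sample values and the names blankpos and datepart are made up for illustration. With these two cases, blankpos should be 11 and 9, so the extracted pieces are 10 and 8 characters long.

```spss
* Sketch only: sample values and the names blankpos and datepart are made up.
DATA LIST LIST /postdate (A20).
BEGIN DATA
'10/28/2005 14:30:00'
'1/5/2005 9:15:00'
END DATA.
* Position of the first blank, which separates the date from the time.
COMPUTE blankpos = INDEX(postdate,' ').
* A new string variable has to be declared before it can be computed.
STRING datepart (A10).
COMPUTE datepart = SUBSTR(postdate,1,blankpos-1).
LIST.
```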
But after extracting that information from the string source variable, it is still a string. In order for the evaluation to be successful, we need to convert that string into a number which reflects the number of seconds since the starting date (in 1582). So, we embed all this within the NUMBER() function, as the first argument. The second argument of the NUMBER() function is the format used to read the string, and here it's ADATE10, which is "American Date, 10 characters" or mm/dd/yyyy.
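The conversion step on its own can be sketched like this. The sample value and the names datepart, postnum and postshow are made up; postshow is just a copy so the same value can be listed once as raw seconds and once as a date.

```spss
* Sketch only: the sample value and variable names are made up.
DATA LIST LIST /datepart (A10).
BEGIN DATA
'10/28/2005'
END DATA.
* Read the string as an American date and store the result as a number.
COMPUTE postnum = NUMBER(datepart,ADATE10).
* Same value twice: once as raw seconds, once displayed as a date.
COMPUTE postshow = postnum.
FORMATS postnum (F12.0) /postshow (ADATE10).
LIST.
```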
Now we have a date value (or more specifically, a numeric value that can be formatted as a date for our legibility) which we can compare with another date value, here predate, to determine whether that other value is greater or not. If predate is greater, then the argument is true, and flag is assigned a value of 1. If predate is not greater (if it is equal or less), then the evaluation returns false, and flag is assigned 0.
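Putting it all together on two made-up cases, and assuming predate is already a date variable while postdate is a string holding a date, a blank, and a time: the first case has the pretest after the post test, so flag should be 1; the second should be 0.

```spss
* Sketch only: both cases are made up for illustration.
DATA LIST LIST /predate (ADATE10) postdate (A20).
BEGIN DATA
11/04/2005 '10/28/2005 14:30:00'
10/01/2005 '10/28/2005 09:00:00'
END DATA.
COMPUTE flag = predate > NUMBER(SUBSTR(postdate,1,INDEX(postdate,' ')-1),ADATE10).
LIST.
```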
I hope this helps explain the logic.
Loyola University Medical Center
"Absence of evidence
is not evidence of absence"