Date: Wed, 6 Jul 2005 12:40:20 -0400
Reply-To: "Howard Schreier <hs AT dc-sug DOT org>" <nospam@HOWLES.COM>
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: "Howard Schreier <hs AT dc-sug DOT org>" <nospam@HOWLES.COM>
Subject: Re: validating dates
Content-Type: text/plain; charset=ISO-8859-1
Here are a couple of examples where it *does* make a difference:
1.1.2006
02.02.19999
The first is interpreted as the 11th day of the 20th month of 2006, the
second as the 20th day of the 21st month of 1999.
Consider also that both are plausible data entry errors or inconsistencies
(omitting leading zeroes in the first case; punching 4 9's instead of 3 in
the second case).
I suppose it's a good thing to trap the 19999 date as invalid. What that
really demonstrates though is that data validation which goes no further
than seeing what SAS will swallow is not adequate.
On Wed, 6 Jul 2005 08:17:00 +0200, Rune Runnestø <rune@FASTLANE.NO> wrote:
>As far as I have experienced, it doesn't matter if you use
>datevalue=input(date_values, ddmmyy11.);
>or
>dataevalue=input(compress(date_values, '.'), ddmmyy10.);
>Anyway, not as long as the only purpose it do decide which values are valid
>and which are not.
>data dates;
> attrib date_values length=$12.;
> infile cards;
> input @1 date_values $12.;
>datalines;
>0000000000
>01.05.2001
>31.12.2002
>
>01.02.2002
>31.09.1999
>40.10.1998
>20.04.1800
>22644840
>01.01.21000
>01.01.19900
>01.01.19899
>01.01.19999
>01.01.20000
>01.01.20100
>run;
>
>
>
>data not_valid (drop = datevalue)
> is_valid (drop = datevalue);
> set dates;
> datevalue=input(compress(date_values,'.'), ddmmyy10.);
> if date_values ne '' and datevalue = . then
> output not_valid;
> else output is_valid;
>run;
>
>
>
>data not_valid2 (drop = datevalue)
> is_valid2 (drop = datevalue);
> set dates;
> datevalue=input(date_values, ddmmyy11.);
> if date_values ne '' and datevalue = . then
> output not_valid2;
> else output is_valid2;
>run;
>
>The datasteps not_valid and not_valid2 has the same contents. That's also
so
>for the datasets valid and valid2. According to this, '01.01.20000' is
>valid, but '01.01.21100' is not valid. As far as I don't need the numeric
>value of '01.01.19900', it does not have any practical implications whether
>I use ddmmyy10. or ddmmyy11., does it ?
>
>Rune
>
>
>""Howard Schreier <hs AT dc-sug DOT org>"" <nospam@HOWLES.COM> skrev i
>melding news:200507051348.j65DmmAD003192@listserv.cc.uga.edu...
>> See code below. I think you are mistaken about 31.09.2005.
>>
>> 01.01.21000 is a complicated case. When you compress out the dots you are
>> left with 9 characters (010121000). So if you process that with DDMMYY8.,
>> it will ignore the last zero and extract 1 January 2100, which is valid.
>>
>> When you increase the width of the informat, SAS has to guess where the
>> implicit delimiters are (010 12 1000 vs 01 01 21000). In this case, both
>> are out of range and yield missing values for the date.
>>
>> To better see what's going on, try 01.01.19900. It's valid, but the
>> extracted date is 10 November 9900, telling us that 010119900 is
>> interpreted as 010 11 9900 rather than as 01 01 19900. This leads to what
>> is probably the most important point: Why eliminate the dots? They can
>only
>> help. In other words, get rid of the COMPRESS call and widen the informat
>> to accommodate:
>>
>> datevalue=input(date_values, ddmmyy11.);
>>
>> Then you will get the date value 1 January 19900. Interestingly, you
>cannot
>> display it with the DATE format, which has a maximum width of 9. So even
>> though SAS can store date values out to the year 19900, not all of the
>> tools support that full range. SAS is not fully Y10K compliant.
>>
>> data _null_;
>> date_values = '31.09.2005 ' ;
>> datevalue=input(compress(date_values,'.'), ddmmyy10.);
>> put date_values = datevalue= date9.;
>> date_values = '01.01.21000 ' ;
>> datevalue=input(compress(date_values,'.'), ddmmyy10.);
>> put date_values = datevalue= date9.;
>> date_values = '01.01.19900 ' ;
>> datevalue=input(compress(date_values,'.'), ddmmyy10.);
>> put date_values = datevalue= date9.;
>> datevalue=input( date_values , ddmmyy11.);
>> put date_values = datevalue= date9.;
>> put date_values = datevalue= weekdate.;
>> run;
>>
|