| Date: | Tue, 17 Dec 1996 11:01:39 -0600 |
| Reply-To: | "Simon, Steve, PhD" <ssimon@CMH.EDU> |
| Sender: | "SPSSX(r) Discussion" <SPSSX-L@UGA.CC.UGA.EDU> |
| From: | "Simon, Steve, PhD" <ssimon@CMH.EDU> |
| Subject: | Missing vs. Sysmis |
| Content-Type: | multipart/mixed; |
|---|
David Kurzman writes:
>As I create a new data set and use a previous one I began to wonder
>whether it is better to use a code for missing data (e.g., -99) or
>simply leave it blank and later add the
>
>set blanks = -99
>missing values all (-99)
>
>is there any reason why I would want to initially code it as -99
>rather than leave it blank?
Blanks in a data entry field are (in my humble opinion) bad form.
First, blanks often cause problems when exporting to other formats.
Second, there will come a time when you wish to explicitly analyze the
pattern of missingness. When you do, a special code gives you more
flexibility. Third, data are often missing for more than one reason.
When you have more than one reason for missing data (e.g., drop-outs,
refused to answer, not applicable, below detectable limits, sample
lost), then you should have a separate code for each. It is very
possible that each type of missing value will be handled differently.
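As a rough sketch of the third point, separate codes might be declared as below. The variable names q1 to q5 and the particular code assignments are assumptions made for illustration only; they do not come from the data set under discussion.

```spss
* Hypothetical variables q1 TO q5 with illustrative missing codes.
* A range is used because only three discrete missing values are allowed.
MISSING VALUES q1 TO q5 (9995 THRU 9999).
VALUE LABELS q1 TO q5
  9995 'Sample lost'
  9996 'Below detectable limits'
  9997 'Not applicable'
  9998 'Refused to answer'
  9999 'Dropped out'.
* Tabulate with the missing codes included to inspect the pattern.
FREQUENCIES VARIABLES=q1 TO q5 /MISSING=INCLUDE.
```

Because each reason keeps its own code, a later analysis can treat, say, "not applicable" differently from "refused to answer" simply by changing the MISSING VALUES declaration.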
>This is especially true of a data set that I have been given, in
>which the data entry involves fields of different lengths, such that the
>missing data were entered as 99, 999, 9999 .... and there are now
>many, many lines simply to define the missing data.
Gee, why not plan ahead? If the length of the longest field is four
characters, then use 9999, 9998, etc. as missing codes for all of your
variables. It may waste a bit of space in your one-digit fields, but
with computers as inexpensive as they are, it's better to waste
computer resources than human resources.
>Also, when new data were entered, they decided to now enter the
>additional data by simply leaving missing data fields blank. Does
>this combination of defining missing data create problems?
See my third point above. Also, the word "they" implies that more than
one person is doing data entry. In this case, the standardization
imposed by implementing missing value codes is extremely important.
>Lastly, is there a way to recode all missing data to a single code in
>one step (e.g., recode all missing to -9)?
Don't know the answer to this one. Sorry.
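For what it is worth, one possible approach is sketched below; it is untested against this data set. RECODE accepts the keyword MISSING as an input value, which matches both user-missing and system-missing values. The variable range v1 TO v20 is hypothetical, and the sketch assumes that the variables are numeric and that the old codes have already been declared as missing.

```spss
* Hypothetical variable range v1 TO v20, assumed numeric.
* MISSING matches user-missing and system-missing values alike.
RECODE v1 TO v20 (MISSING=-9).
* Declare the new single code as missing, replacing earlier declarations.
MISSING VALUES v1 TO v20 (-9).
EXECUTE.
```

Note that collapsing everything to one code discards the distinction between reasons for missingness, so it is probably best done on a working copy of the file.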
Steve Simon, ssimon@cmh.edu, Standard Disclaimer.
|