LISTSERV at the University of Georgia
Date:         Wed, 28 Sep 2005 10:32:19 -0400
Reply-To:     Kathryn Gardner <KJGARDNER10@HOTMAIL.COM>
Sender:       "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From:         Kathryn Gardner <KJGARDNER10@HOTMAIL.COM>
Subject:      data screening help
Content-Type: text/plain; charset=ISO-8859-1

Hi all,

I have a number of questions relating to data screening (i.e., outlier, normality, and linearity checks) that I am hoping people can help me out with, as I have exhausted all my resources. Some of the questions are practical, others technical. I realize that there are quite a few Qs here, but a simple "yes"/"no" response (where possible) will be more than enough! Or if people can only answer one or some of the Qs, that will be just as helpful. I'd be really grateful for any help at all.

1) Practical question: I've been trying to figure out how to make my boxplots so that I can actually see the case/ID numbers next to the outliers. It's OK when I have one outlier, but when I have a bunch of them the labels end up on top of each other and I can't see which ID/case no. each one is. Is there another way SPSS can show me the case numbers of the outliers, or a way I can visually inspect the case numbers on the boxplots? I've tried blowing the boxplot up to full screen size, but even this doesn't help.
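
For readers with the same labelling problem, one workaround is to list the outlying cases in a table instead of reading them off the plot. The following is only a minimal sketch in Python, not SPSS syntax: it applies the usual boxplot rule (values more than 1.5 IQR beyond the quartiles) and returns the case IDs. The function name, the case IDs and the scores are all made up for illustration, and the quartile method used here is an assumption that may differ slightly from the hinges SPSS computes.

```python
from statistics import quantiles


def boxplot_outliers(scores, k=1.5):
    """Return sorted (case_id, value) pairs lying outside the k*IQR fences.

    `scores` maps a case ID to its value. The quartiles use Python's
    "inclusive" method, which is an assumption; SPSS boxplots are based
    on Tukey's hinges and can give slightly different fences.
    """
    values = list(scores.values())
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return sorted(
        (case_id, value)
        for case_id, value in scores.items()
        if value < low or value > high
    )


# Hypothetical case IDs and subscale scores
scores = {101: 12, 102: 14, 103: 13, 104: 15, 105: 14,
          106: 13, 107: 40, 108: 41, 109: 12, 110: 15}

outliers = boxplot_outliers(scores)
for case_id, value in outliers:
    print(f"case {case_id}: {value}")
```

Passing k=3 instead of the default would list only the cases a boxplot marks as extreme values.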

2) The procedure for detecting outliers depends on whether data is continuous or categorical. If it is continuous, this means screening the sample as a whole; if categorical, it means screening by group. I am using analyses that will involve both continuous and categorical data, so how should I screen my data? I've been screening the sample as a whole up until now. Besides, if I decided to screen by group, at what point do I decide not to split the data into groups? E.g., I could split my data according to gender, age, ethnicity, education, occupation, country, etc.

3) Related to the above Q, does the idea that screening for outliers depends on whether data is continuous or categorical apply to all data screening procedures, e.g., normality analyses?

4) I have screened my data according to subscales rather than full-scale scores, e.g., checked the normality of each individual subscale on each questionnaire (some questionnaires don't produce full-scale scores). I don't know whether this is standard practice, but to me it makes sense to screen by subscale. I do, however, have a variable that does produce subscales, but I have had to use the full-scale score in my data screening because I can't split it into subscales until I've factor analysed it. Is this OK?

5) I've been using logarithm and square root transformations etc. to reduce skew and kurtosis, but these transformations don't appear to be effective in improving normality when there is only high or low kurtosis (i.e., when skew is OK). Any suggestions?

6) In some cases I've transformed a variable to reduce skew so it is less than 1, but this has sometimes also inflated kurtosis by about 0.6. Is it best to have a variable with skew of about 1.3 and kurtosis of 0.15, or a variable with skew of about 0.6 and kurtosis of about 0.6?
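
To make the trade-off in questions 5 and 6 concrete, here is a rough sketch in Python (not SPSS syntax) that computes skew and excess kurtosis before and after a transformation, so both can be compared side by side. It uses the simple moment-based formulas, which is an assumption: SPSS reports bias-adjusted sample versions, so its values will differ somewhat, especially in small samples. The data and the function name are hypothetical.

```python
import math


def skew_kurtosis(xs):
    """Return (skew, excess kurtosis) from the central moments.

    These are the plain moment estimators m3 / m2**1.5 and
    m4 / m2**2 - 3, not the bias-adjusted ones SPSS reports.
    """
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


# Hypothetical positively skewed scores (all > 0, so log is defined)
raw = [1, 2, 2, 3, 3, 3, 4, 5, 8, 20]
candidates = {
    "raw": raw,
    "sqrt": [math.sqrt(x) for x in raw],
    "log": [math.log(x) for x in raw],
}

results = {name: skew_kurtosis(xs) for name, xs in candidates.items()}
for name, (skew, kurt) in results.items():
    print(f"{name:>4}: skew={skew:6.2f}  kurtosis={kurt:6.2f}")
```

Printing both statistics for every candidate transformation shows whether a drop in skew was bought with a rise in kurtosis.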

7) When computing Mahalanobis distance I am assuming that I move "all" of my variables into the "independents" box. I have about 40 variables because, as well as some full-scale variables, I also have some variables that assess subscales and do not combine to produce a full-scale score. However, what about the variable for which I only have full-scale scores (because I've not yet factor analysed the scale)? Am I OK to simply put this variable across even though I will at some point be breaking the scale down into subscales?
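
For reference, the quantity in question 7 is the squared Mahalanobis distance of each case from the centroid of all the variables entered, D² = (x - m)' S⁻¹ (x - m), where m is the vector of variable means and S the sample covariance matrix. The sketch below, in plain Python, shows that calculation on a tiny made-up data set. The helper names and data are hypothetical, and it assumes the covariance matrix is non-singular (no variable is an exact linear combination of the others).

```python
def covariance(rows, means):
    """Sample covariance matrix (divisor n - 1) as a list of lists."""
    n, p = len(rows), len(means)
    return [
        [sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
         for j in range(p)]
        for i in range(p)
    ]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def mahalanobis_sq(rows):
    """Squared Mahalanobis distance of every case from the centroid."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    cov = covariance(rows, means)
    out = []
    for r in rows:
        d = [r[j] - means[j] for j in range(p)]
        z = solve(cov, d)  # z = S^-1 d, without forming the inverse
        out.append(sum(dj * zj for dj, zj in zip(d, z)))
    return out


# Hypothetical data: 6 cases measured on 2 variables
rows = [(2, 3), (3, 4), (4, 6), (5, 5), (6, 8), (3, 12)]
d2 = mahalanobis_sq(rows)
for case, value in enumerate(d2, start=1):
    print(f"case {case}: D2 = {value:.3f}")
```

Each D² is conventionally compared against a chi-square critical value with degrees of freedom equal to the number of variables entered, so the choice of which variables go in changes both the distances and the cutoff.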

"THANK YOU" to anyone who has taken the time to read this e-mail. I really do appreciate the help.

Kathryn

