Date: Mon, 3 Feb 2003 16:33:38 -0500
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: "Arthur J. Kendall" <Art@DrKendall.org>
Organization: Social Research Consultants
Subject: Re: Inquiry of Method
What statistical techniques are useful in answering questions depends
on 1) the questions themselves, 2) the nature of the variables
available, and 3) the nature of the set of cases available.
Even if you cannot change the selection, knowing how and why the set
of cases was selected gives some idea of what it is reasonable to do with
the data. The strongest source of bias you need to worry about is
that the set of cases selected may not be representative of the pop.
If you are going to do anything meaningful with the data, you'll need to
be able to argue that you have reason to believe that you have some
rough equivalent to a sample.
Is there ANY way you can argue that you have the equivalent of a
complex sample? (stratification by year or other variables? cluster
sampling? systematic sampling with a random start with different
intervals in different years? etc.)
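As a rough sketch of what treating the data as stratified by period would
imply, the following Python computes a caseweight for each stratum as the
number of population cases divided by the number of sampled cases. The
sample counts follow from the figures in this thread (20% and 80% of 1,000
cases). The population split between the two periods is NOT given anywhere
in the thread and is purely hypothetical, as are the stratum names. The
sketch also assumes that cases within each period were chosen roughly at
random, which is exactly what would have to be argued.

```python
# Hypothetical population counts per period; the thread gives only the total (5,600).
population = {"first_6_years": 2600, "latest_4_years": 3000}

# Sample counts implied by the thread: 20% and 80% of 1,000 cases.
sample = {"first_6_years": 200, "latest_4_years": 800}


def stratum_weights(population, sample):
    """Return one caseweight per stratum: population cases per sampled case."""
    return {name: population[name] / sample[name] for name in population}


weights = stratum_weights(population, sample)

for name, weight in weights.items():
    print(f"{name}: each sampled case stands for {weight:.2f} population cases")

# The weighted sample size should reproduce the population total.
weighted_total = sum(weights[name] * sample[name] for name in sample)
print(f"weighted total: {weighted_total:.0f}")
```

If the selection within a period depended on something else (e.g. how
complete the record was), these weights would not remove that bias.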
You also need to be able to fashion caseweights. In a meaningful
analysis, there are always caseweights. In simple random samples these
are ones and they "disappear" from the formula, but they are always
there.
Unweighted analysis of cases with an unknown selection method can
produce almost no useful information. (You can argue that in this set
these sets of circumstances existed, but there are not even meaningful
percentages. E.g., you can say that "Some injuries involved males and
some involved females, but we have no meaningful info about absolute or
relative frequency." "Some injuries were to noses, toes, fingers, or
elbows, but we don't know if there were injuries to other body sites or
the absolute or relative frequency.")
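To illustrate how much a percentage can move once caseweights enter, here
is a small Python sketch comparing an unweighted and a weighted share. Every
count in it is made up for illustration (the population split by period, the
number of sampled injuries involving males, and the stratum names); none of
it comes from the actual data. It assumes weights of the form population
cases divided by sampled cases within each period.

```python
# All counts are hypothetical and for illustration only.
# Each stratum: (population size, sample size, sampled cases involving males)
strata = {
    "first_6_years": (2600, 200, 150),
    "latest_4_years": (3000, 800, 400),
}


def unweighted_share(strata):
    """Share of sampled cases with the characteristic, ignoring selection."""
    hits = sum(k for _, _, k in strata.values())
    n = sum(n_h for _, n_h, _ in strata.values())
    return hits / n


def weighted_share(strata):
    """Share after weighting each sampled case by population / sample size."""
    weighted_hits = sum(pop / n_h * k for pop, n_h, k in strata.values())
    weighted_n = sum(pop / n_h * n_h for pop, n_h, _ in strata.values())
    return weighted_hits / weighted_n


print(f"unweighted share: {unweighted_share(strata):.3f}")
print(f"weighted share:   {weighted_share(strata):.3f}")
```

With these made-up counts the two shares differ (0.55 versus about 0.62),
only because the earlier years are underrepresented in the sample. Without a
defensible selection mechanism there is no way to know which weights, if
any, are the right ones.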
What do you mean by "the medical utilization of injuries"?
What statistical techniques can be used to answer questions depends on
the questions themselves and the nature of the variables available and
the nature of the set of cases available.
Hope this helps.
Social Research Consultants
University Park, MD USA
Nan Li wrote:
> Thanks to Art, Dale, and Dorothy for answering my question. Actually, my
> current study is to link the fishery and rescues data with injury data and
> analyze the medical utilization of injuries using the non-random sample
> of 1,000 cases. The population is all injury cases over ten years. Because
> the records are not good for the first few years, more cases were selected
> from the latest few years. We are assigned only to perform the analysis
> based on this information and cannot change the sample selection. We
> would like to estimate particular variables using statistics such as the
> mean, confidence intervals, etc., whatever statistical methods or statistics
> are used. This is the general idea of my current study. Dale and Dorothy
> suggested that I examine all types of bias. I am wondering what kind of
> method I could use to examine the bias. If there is no bias, could I use the
> statistical methods suitable for random samples to analyze my data and make
> inferences about the pop?
> Thank you very much for your assistance!
> -----Original Message-----
> From: Arthur J. Kendall [mailto:Art@DrKendall.org]
> Sent: Monday, February 03, 2003 10:44 AM
> To: Nan Li
> Cc: SPSSX-L@VM.MARIST.EDU
> Subject: Re: Inquiry of Method
> What was the nature of the pop? What constitutes a case in the pop?
> Is this a population that was very small 10 years ago and is growing
> rapidly in size? Were some cases in the pop at more than one point in
> time? You say "estimate the pop" but say that it is composed of 5600
> cases. Does this mean that you want counts or percentages of cases in
> the pop that have particular characteristics? Does it mean that you
> want estimates of particular means in the pop? Does it mean that you
> want to know the relation(s) of particular variables in the pop?
> How and why were the cases in your data subset chosen? Are there ways
> that the cases in the subset are systematically different from those in
> the pop?
> Hope this helps.
> Social Research Consultants
> University Park, MD USA
> (301) 864-5570
> Nan Li wrote:
>>Dear list members,
>>I have just started a project and I'm not sure which statistical method
>>fits my study, so I would like to get some help from you. I have 10 years
>>of sample data, which is not a random sample: 80% of the cases are from the
>>latest 4 years and 20% from the first six years. The total population is
>>5,600, and the sample size is 1,000. I am wondering whether I could use this
>>non-random sample to estimate the population and what kind of method I could
>>use to analyze these data. As far as I know, most statistical methods are
>>based on random samples. Does anybody have any ideas about analyzing
>>a non-random sample? I would appreciate it. Thanks in advance!