Date: Wed, 9 Feb 2005 12:01:08 -0300
Reply-To: Hector Maletta <hmaletta@fibertel.com.ar>
Sender: "SPSSX(r) Discussion" <SPSSX-L@LISTSERV.UGA.EDU>
From: Hector Maletta <hmaletta@fibertel.com.ar>
Subject: Time to event
Content-Type: text/plain; charset="us-ascii"
I am trying to estimate the interval between births from Bolivian census data
on the date of birth of the latest child born to women over 15, and
predict the hazard rate of having a new baby (for women of various ages and
various numbers of children already born, including childless women) from
other variables in the census. Besides ordinary census questions, this
census asked every woman over 15 how many children she has given birth to,
and the date of birth of the latest one (month and year). So I have the
birth order of the latest child, and the interval elapsed between the birth
of the latest child and the census date, but not the interval between the
births of any two children. For some women the latest child is the first; for
some it will turn out to be the last; for others it is just another one. Some
women may still bear another baby at some time after the census, while others
will bear no more children in their lives. If I could figure out how to estimate the
probability of having a baby so many months or years after the latest child,
and considering that I also know the birth order of the latest child, I
could potentially estimate the hazard rates for new births and the change in
the interval between births according to birth order, mother's age and other
variables.
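
To make the data setup concrete, here is a minimal sketch (in Python, just for
illustration) of the one duration I can actually compute: the number of months
from the latest birth to the census. The census reference date and the three
records are made up, and the function name is my own invention.

```python
from datetime import date

# Hypothetical census reference date, used for illustration only.
CENSUS_DATE = date(2001, 9, 1)

def months_since_last_birth(birth_year, birth_month, census=CENSUS_DATE):
    """Whole months elapsed between the latest birth and the census date."""
    return (census.year - birth_year) * 12 + (census.month - birth_month)

# Made-up records: (birth order of latest child, year of birth, month of birth)
women = [(1, 2000, 3), (3, 1998, 11), (2, 2001, 8)]

# One open interval per woman, paired with the birth order of her latest child.
open_intervals = [(order, months_since_last_birth(y, m)) for order, y, m in women]
print(open_intervals)  # [(1, 18), (3, 34), (2, 1)]
```

Each of these intervals is still open at the census date: I know how long it
has lasted so far, but not when (or whether) it ends with a new birth.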
To attack this problem I have thought of the following possible strategies:
1. Consider all cases as right-censored, because by definition no other
birth has occurred between the latest one and the census date. In other
words, no event (new birth) is ever observed, and the censored intervals
are of different lengths. I do not know whether one can work with a dataset
in which all cases are censored and no event ever occurs.
2. Consider the opposite of giving birth as the event, i.e. the event would
be defined as NOT having a baby at each time point after the latest birth.
Not having a baby in each month or year since the latest child is, of
course, observed for all women. It is a repeatable event, repeated every
month or year from the latest birth until the census date, when all cases
are censored. Repeatable events imply some difficulties in time-to-event
models.
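
My worry about strategy 1 can be illustrated with a small sketch, assuming I
understand the product-limit (Kaplan-Meier) estimator correctly. The code below
is a hand-written version in Python applied to invented durations in which
every case is censored; it is not census data and not SPSS output.

```python
def kaplan_meier(durations, events):
    """Product-limit survival estimate.

    durations: observed time for each case (e.g. months since latest birth)
    events:    1 if the event (a new birth) was observed, 0 if censored
    Returns a list of (time, survival) pairs, one per distinct observed time.
    """
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        # Cases still under observation at time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Events observed exactly at time t.
        n_events = sum(1 for d, e in zip(durations, events) if d == t and e == 1)
        survival *= 1.0 - n_events / at_risk
        curve.append((t, survival))
    return curve

# Invented open intervals in months; every case is censored (no new birth seen).
durations = [1, 18, 34, 34, 52]
all_censored = [0, 0, 0, 0, 0]
curve = kaplan_meier(durations, all_censored)
print(curve)
```

With no observed events the estimated survival never drops below 1, whatever
the lengths of the censored intervals. If I understand it correctly, a Cox
model would face the same problem, since its partial likelihood is built only
from observed events.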
The data being from a census, cases are aplenty and statistical significance
is not a big constraint. Even for a small country like Bolivia, there are
nearly 800,000 women with a child born in the last 5 years, and the total
number of women of childbearing age is about 2.5 million. I prefer using births in the last
5 years only, to avoid recall errors, which are reportedly very frequent with
longer recall intervals (i.e. for women who had their latest child more than
5 years ago), and also because many potential explanatory variables (like
place of residence, marital status, and others) refer to the current status
of the women, and may not be applicable to births that occurred many
years ago.
I am not a demographer and have only a limited acquaintance with
time-to-event models. I suspect demographers may have a standard strategy
for this kind of situation, but somehow I haven't found it so far. I'd like
to know how I could model this problem in order to arrive at a solution,
possibly using SPSS Cox regression to introduce predictors.
Any help appreciated.
Hector