Date: Thu, 31 Oct 2002 17:24:32 -0500
Sender: "SAS(r) Discussion" <SAS-L@LISTSERV.UGA.EDU>
From: Steve Albert <SAlbert@AOL.COM>
Subject: Re: Figuring distances - thanks and comments
> Certainly there are many people here who inject and don't go to
> exchanges. And very few go any distance to get needles. But we are
> looking at a different question: Our key question is what influences when
> a city gets a syringe exchange program. It doesn't take many people to
> start a program, so even if only one is influenced by having one 100
> miles away, that's important to know.
> wish it to be. Therefore, many programs are 'underground', and some
> operate illegally but with the forbearance of the local authorities.
You might want to consider whether legal and illegal programs have different effects on the probability of starting new programs. I'd suspect they might, at least if the new program is an official one requiring some political support or approval. Similarly, there might be different cross-influences between public and private programs -- for example, an existing public program might have a bigger effect on the chance of starting a new public program, etc.
> As to 'N within 100 miles' being arbitrary, I agree. And it is
> problematic that having 5 within 10 miles would count the same as 5
> within 100. I am not set on this solution, by any means. To provide a
> little more context, we basically have two areas of the country where
> there are clusters of cities with programs (the West coast from San
> Francisco north to Canada, and the Northeast).
1. You might try something like the AAA road atlas to get estimated auto travel times between locations, and try that as an alternative distance metric.
2. Is NYC treated as one location? It's probably too big to do that, and travel times even within the city limits can be quite significant. (Same would probably apply to LA, if that's in your sample; consider other cities as appropriate.)
3. To measure political effects, you might want to look at the number of programs in-state, or perhaps either in-state or within a closer distance. That is, a program counts if it's in your state, or if it's in another state but within, say, 40 miles. That would, for example, allow northern NJ or western CT to influence NYC; you need some judgment and perhaps some exploration here to see what seems reasonable for the cities you're dealing with.
4. I suspect the effects will be nonlinear in the number of sites within 100 miles -- the jump from 0 to 1 will matter much more than the jump from 9 to 10, etc. You could consider something like a Box-Cox approach, or if you just want to experiment a bit, perhaps look at something like sqrt(n) as an alternative to n.
5. There seem to be two possible effects to consider, startup and continuing. Startup -- influence via example -- may be less sensitive to distance. Continuing support (needing regular visits to exchange needles, for example) might be more sensitive.
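As a rough sketch of points 3 and 4 above: the snippet below counts a program if it is in the same state or within a cutoff distance, then applies a square-root or Box-Cox transform to that count. The record layout, the function names, the made-up distances, and the shift of 1 inside the Box-Cox call (needed because a count can be 0) are all assumptions for illustration, not anything taken from the actual data.

```python
import math

def count_nearby(home_state, programs, cutoff_miles=40.0):
    """Count programs in the same state, or within cutoff_miles if out of state.

    programs is a list of (state, distance_in_miles) pairs (hypothetical layout).
    """
    return sum(
        1
        for state, miles in programs
        if state == home_state or miles <= cutoff_miles
    )

def box_cox(x, lam):
    """Box-Cox transform of a positive value x; lam = 0 gives log(x)."""
    if x <= 0:
        raise ValueError("Box-Cox needs x > 0")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam

# Made-up example: a city in NY with programs at various distances.
programs = [("NY", 5.0), ("NY", 150.0), ("NJ", 12.0), ("CT", 35.0), ("PA", 95.0)]
n = count_nearby("NY", programs)   # the PA site is out of state and beyond 40 miles
sqrt_n = math.sqrt(n)              # simple concave alternative to n
bc_half = box_cox(n + 1, 0.5)      # shift by 1 so n = 0 is allowed (assumption)
```

With sqrt(n), going from 0 to 1 program adds 1.0 to the predictor, while going from 9 to 10 adds only about 0.16, which is the kind of diminishing effect point 4 has in mind.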
> I would be open to other indexes, but I am not sure your idea of one
> based on mean distance is better; mostly because of the effect of large
> values on the mean. Perhaps something based on the median would be
I'd look for a convenience metric, one which increases (or at least doesn't decrease) whenever you add another site, even if it's more distant. Similarly, it should increase whenever a site moves closer. An example, though not necessarily a good one, would be sum(1/distance) -- e.g. if you have a site 5 miles away and a site 2 miles away, you get (1/5 + 1/2) = .7 as your convenience metric. I don't really like this one in detail -- it scores ten sites 100 miles away the same as one site ten miles away, and the ten distant sites are surely not as convenient -- but it might be a step in the right direction. I'm slightly happier with using a square root, for example: sum(1/sqrt(distance)) -- but that's still arbitrary, and you may want to think about this a bit. (Could you apply Box-Cox again?)
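To make the convenience metric concrete, here is a minimal sketch that computes sum(1/distance**power) for a list of site distances. The function name and the `power` parameter are my own illustrative choices; power=1 gives sum(1/distance) and power=0.5 gives sum(1/sqrt(distance)).

```python
def convenience(distances, power=1.0):
    """Sum of 1 / distance**power over all sites; distances in miles, all > 0."""
    return sum(1.0 / d ** power for d in distances)

near_pair = convenience([5, 2])        # 1/5 + 1/2
ten_far = convenience([100] * 10)      # ten sites 100 miles away
one_near = convenience([10])           # one site 10 miles away

# Same comparison with the square-root version.
ten_far_sqrt = convenience([100] * 10, power=0.5)
one_near_sqrt = convenience([10], power=0.5)
```

With power=1 the ten distant sites and the single site ten miles away both score about 0.1. With power=0.5 the scores are about 1.0 and about 0.32, so the distant cluster scores higher still, and the exponent clearly needs some thought. Either way the metric rises whenever a site is added or moved closer.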
> In an earlier post Steve asked about multiple exchanges in one city.
> Right now, we are not looking at multiple events, but we
> may do so later.
Actually, my thought there wasn't about multiple events, but about the influence of multiple existing sites in another city. If City A has five sites, it should have a bigger effect than if it has only one site.
I'd suspect that in this case there may not be enough data to fit a really strong statistical model, particularly since there isn't a good a priori theory of effects and appropriate scales of measurement, so some data snooping may be in order. I'd definitely try to get some information from people involved in starting some of these programs to get an idea of what was actually going on, which might help you pin the model down more tightly.
Hope this helps.
Director of Biostatistics
Spectrum Pharmaceutical Research Corp.
San Antonio, TX
SAlbert at SpectrumCRO dot com