Deriving ERF from summary data

From Testiwiki

Question

How to derive exposure-response functions from summary data?

Answer

If you want to and can assume linearity with your particular data, you can estimate the slope b from the three ORs in the publication (the non-reference exposure groups), weighting them by 1/SE. This gives something like b = 0.12 / μT.

Rationale

See the email from Jouni Tuomisto on 15 October 2011 at 06:42.

Odds ratios at different exposure levels
Type of study | < 0.1 μT | 0.1 to < 0.2 μT | 0.2 to < 0.3 μT | ≥ 0.3 μT
All studies | 1.00 (1.00, 1.00) | 1.07 (0.81, 1.41) | 1.16 (0.69, 1.93) | 1.44 (0.88, 2.36)

For reference, you might be interested in this book: [1]. It is not a systematic description of the problem, but several experts explain in fair detail a number of methods for estimating dose responses and discuss their problems. You may have to buy the book to actually get access to it, but there is also another link that contains a selection of pages from it: [2]

Whether you should force zero risk at zero exposure is not the right question in my thinking. OR is about relative differences in risk between two exposure levels. Which exposure level has OR 1 is just a matter of taste, because OR 1 only tells you that the risk is the same as at the reference exposure level, and the reference is something that you decide; it is not a property of the world. OR does not tell you anything about the actual risk level, and you must get background risk data from elsewhere. Therefore, forcing the curve to go through the point (0, 1) seems to be a bad idea, although I don't quite understand what the exact mathematical interpretation of such a curve is.
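To illustrate why the choice of reference is a matter of taste, here is a minimal Python sketch that re-expresses the point estimates from the table relative to a different category. The helper names and the choice of the highest category as the new reference are hypothetical, and the sketch only handles point estimates (the intervals would need to be re-derived from the original data).

```python
import math

# Odds ratios from the "All studies" row, relative to the < 0.1 μT group.
odds_ratios = [1.00, 1.07, 1.16, 1.44]

def rereference(ors, new_ref_index):
    """Re-express odds ratios relative to another exposure category."""
    return [value / ors[new_ref_index] for value in ors]

def log_steps(ors):
    """Differences in log(OR) between neighbouring categories."""
    logs = [math.log(value) for value in ors]
    return [later - earlier for earlier, later in zip(logs, logs[1:])]

# Hypothetical choice: use the highest category as the reference instead.
shifted = rereference(odds_ratios, new_ref_index=3)

print([round(value, 3) for value in shifted])
print([round(step, 4) for step in log_steps(odds_ratios)])
print([round(step, 4) for step in log_steps(shifted)])
```

The two lists of log(OR) steps should agree: changing the reference divides every OR by the same constant, so only the level of the curve moves, not its shape.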

Now that I read and write this back and forth, I start seeing what you are trying to do. So, I can give some further comments.

I don't understand how you include or exclude the reference group from the analysis. Exclusion is clear in practical terms but wrong, because all exposure levels are relative to the reference level, which is by definition 1. Note that it is not the zero exposure that is 1 but the reference group. But inclusion is strange: the SE for the reference is 0, and you are using weights 1/SE?! What do you do in practice? If you give the reference SE some arbitrarily small number to avoid 1/0 error, you also force severe bias underestimating the uncertainty, which seems to be the case in some of your graphs.
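The effect of giving the reference group an arbitrarily small SE can be sketched as follows. The SE values for the other groups and the function name are assumptions made for illustration only; the point is just that the 1/SE weight of the reference grows without bound as its SE shrinks.

```python
# Illustrative standard errors of log(OR) for the three non-reference groups.
# ASSUMPTION: these values are made up for the sketch, not taken from the publication.
other_ses = [0.14, 0.26, 0.25]

def reference_weight_share(reference_se, ses):
    """Share of the total 1/SE weight that goes to the reference group."""
    reference_weight = 1.0 / reference_se
    total = reference_weight + sum(1.0 / se for se in ses)
    return reference_weight / total

# Shrink the (arbitrary) SE given to the reference group and watch its share grow.
for reference_se in (0.1, 0.01, 0.001, 1e-6):
    share = reference_weight_share(reference_se, other_ses)
    print(f"SE_ref = {reference_se:g}: weight share = {share:.4f}")
```

With a very small reference SE, nearly all of the weight sits on a point that carries no sampling uncertainty by construction, which is one way the overall uncertainty could end up understated.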

I don't quite understand what you do in the offset part, so I cannot comment on that.

A general comment: it is always useful to try and understand where the estimates came from. In this case, a model like this was assumed:

log(p(x)/(1-p(x))) ~ a + b(x)

where p(x) is the probability of disease at exposure level x, a is the background, and b is the risk coefficient. The OR reported is defined as

log(OR) = log((p(x)/(1-p(x))) / (p(r)/(1-p(r)))) = log(p(x)/(1-p(x))) - log(p(r)/(1-p(r)))

where r is the reference exposure. In other words

log(OR) ~ (a + b(x)) - (a + b(r)) = b(x) - b(r)

and the background risk a cancels out.

If the exposure-response function is linear, this is straightforward, because the function b(x) is simply b*x and the last equation reduces to

log(OR) ~ b*(x - r) 
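As a numerical check of the derivation, the following Python sketch computes log(OR) directly from a linear logistic model for several background levels a. All parameter values and function names are hypothetical and chosen only for illustration.

```python
import math

def disease_probability(x, a, b):
    """Linear logistic model: log(p / (1 - p)) = a + b * x."""
    return 1.0 / (1.0 + math.exp(-(a + b * x)))

def log_odds_ratio(x, r, a, b):
    """log(OR) of exposure x against the reference exposure r."""
    def odds(level):
        p = disease_probability(level, a, b)
        return p / (1.0 - p)
    return math.log(odds(x) / odds(r))

# ASSUMPTION: all numbers below are hypothetical illustration values.
b, x, r = 0.5, 0.35, 0.05
for a in (-8.0, -5.0, -2.0):
    # Columns: background a, log(OR) from the model, b * (x - r).
    print(a, round(log_odds_ratio(x, r, a, b), 6), round(b * (x - r), 6))
```

The second and third columns should match for every a, which is the cancellation of the background risk described above.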

Now if you want to and can assume linearity with your particular data, you can estimate b by simply using the three ORs from the publication and weighting them by 1/SE, and you get something like b = 0.12 / μT. The b is then used to draw the curve OR = exp(b*(x - r)), which gives OR = 1 at the reference dose level. But as I said, what counts as the reference is an arbitrary decision, so you can just as well draw the curve OR = exp(b*x), which passes through the point (0, 1). Note that the reference is selected AFTER the regression, so it does not affect the critical parameter b.
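A rough sketch of such an estimate in Python is given below. It rests on several assumptions that are not in the source: that the intervals in the table are 95% confidence intervals, that each exposure category can be represented by a single exposure value (the values used here are guesses), and that "weighting by 1/SE" means weights of 1/SE in a least-squares fit of log(OR) = b*(x - r) without an intercept. The result depends strongly on the assumed exposure values, so it need not reproduce the figure quoted in the text.

```python
import math

# "All studies" row: (OR, lower bound, upper bound) for the non-reference groups.
reported = [(1.07, 0.81, 1.41), (1.16, 0.69, 1.93), (1.44, 0.88, 2.36)]

# ASSUMPTION: one representative exposure (in μT) per category; not from the source.
reference_exposure = 0.05
exposures = [0.15, 0.25, 0.35]

def standard_error(lower, upper, z=1.96):
    """SE of log(OR), ASSUMING the interval is a 95% confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)

def weighted_slope(distances, log_ors, weights):
    """Weighted least-squares slope of log(OR) = b * (x - r), no intercept."""
    numerator = sum(w * d * y for w, d, y in zip(weights, distances, log_ors))
    denominator = sum(w * d * d for w, d in zip(weights, distances))
    return numerator / denominator

distances = [x - reference_exposure for x in exposures]
log_ors = [math.log(odds_ratio) for odds_ratio, _, _ in reported]
weights = [1.0 / standard_error(lower, upper) for _, lower, upper in reported]

b = weighted_slope(distances, log_ors, weights)
print(f"b = {b:.3f} per μT under these assumptions")
print(f"OR at 0.2 μT above the reference: {math.exp(b * 0.2):.3f}")
```

The reference group does not enter the fit as a data point here; it only defines the origin of x - r, which avoids the 1/0 weight problem. Shifting every exposure and the reference by the same amount leaves b unchanged.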