Difference between revisions of "Attributable risk"

From Testiwiki
(Estimating etiologic fraction)
(text blocks reordered)
Line 6: Line 6:
 
==Question==
 
==Question==
  
How to calculate attributable risk? What different approaches there are, and what are their differences in interpretation and use?
+
How to calculate attributable risk? What different approaches are there, and what are their differences in interpretation and use?
  
 
==Answer==
 
==Answer==
Line 12: Line 12:
 
; Risk ratio (RR): risk among the exposed divided by the risk among the non-exposed  
 
; Risk ratio (RR): risk among the exposed divided by the risk among the non-exposed  
 
:<math>RR = \frac{R_1}{R_0}.</math>
 
:<math>RR = \frac{R_1}{R_0}.</math>
; Attributable fraction: the fraction of cases '''among the exposed''' that would not have occurred if the exposure would not have taken place:  
+
; Attributable fraction: (aka excess fraction) the fraction of cases '''among the exposed''' that would not have occurred if the exposure would not have taken place:  
 
:<math>AF = \frac{RR - 1}{RR}</math>
 
:<math>AF = \frac{RR - 1}{RR}</math>
 
; Population attributable fraction: the fraction of cases '''among the total population''' that would not have occurred if the exposure would not have taken place. The most useful formulas are
 
; Population attributable fraction: the fraction of cases '''among the total population''' that would not have occurred if the exposure would not have taken place. The most useful formulas are
Line 19: Line 19:
 
::<math>PAF = 1- \sum_{i=0}^k \frac{p_{ci}}{RR_i} = \sum_i p_{ci} \frac{p_{ei}(RR_i - 1)}{p_{ei}(RR_i - 1) + 1}</math>
 
::<math>PAF = 1- \sum_{i=0}^k \frac{p_{ci}}{RR_i} = \sum_i p_{ci} \frac{p_{ei}(RR_i - 1)}{p_{ei}(RR_i - 1) + 1}</math>
 
::which produces valid estimates when confounding exists, although the parameters are often not known. p<sub>ci</sub> is the proportion of '''cases''' falling in subgroup i (so that &Sigma;<sub>i</sub>p<sub>ci</sub> = 1), and p<sub>ei</sub> is the proportion of '''exposed''' people within subgroup i (so 1-p<sub>ei</sub> is the fraction of unexposed).
 
::which produces valid estimates when confounding exists, although the parameters are often not known. p<sub>ci</sub> is the proportion of '''cases''' falling in subgroup i (so that &Sigma;<sub>i</sub>p<sub>ci</sub> = 1), and p<sub>ei</sub> is the proportion of '''exposed''' people within subgroup i (so 1-p<sub>ei</sub> is the fraction of unexposed).
; Etiologic fraction: Fraction of cases among the exposed that would have occurred later (if at all) if the exposure would not have taken place. It cannot be calculated without understanding of the biological mechanism, but it is always between
+
; Etiologic fraction: Fraction of cases '''among the exposed''' that would have occurred later (if at all) if the exposure would not have taken place. It cannot be calculated without understanding of the biological mechanism, but there are equations for several different models:
:<math>\frac{RR-1}{RR^{RR/(RR-1)}}</math> and 1.
+
{| {{prettytable}}
 +
! Equation
 +
! Mechanisms and description
 +
|----
 +
|<math>EF = 1</math>
 +
|'''Rank-preserving model'''. The rank of individual deaths is not affected by exposure, i.e. everyone dies in the same order as without exposure, just sooner.
 +
|----
 +
|<math>EF = \frac{RR - 1}{RR}</math>
 +
|'''Competing causes model'''. The hazard rate (occurrence rate of cases in time) in the exposed population is relative to that of the non-exposed population: h<sub>1</sub>(t) = RR h<sub>0</sub>(t). The ratio is constant although the hazard rates are functions of time t.
 +
|----
 +
|<math>EF_l = \frac{RR-1}{RR^{RR/(RR-1)}}</math>
 +
|'''Exponential survival model'''. When the survival times in the population follow an exponential distribution, the lowest possible EF can be calculated from this equation. However, the exponential survival model says nothing about which individuals are affected or how many life years each of them loses, and therefore in this model the actual EF may be anywhere between the lower bound and 1.
 +
|}
 +
 
 +
With this code, you can compare attributable fraction and lower (assuming exponential survival distribution) and upper bounds of etiological fraction.
 +
 
 +
<rcode label="Compare attributable and etiologic fractions" embed=1 variables="name:RR|description:What is (are) the relative risk(s), i.e. RR?|default:c(1, 1.02, 1.3, 1,5, 2, 3)">
 +
library(OpasnetUtils)
 +
AF <- function(x) {return(data.frame(RR = x, AF = (x-1)/x, EF_exp_lower = (x-1)/x^(x/(x-1)), EF_upper = 1))}
 +
 
 +
oprint(AF(RR))
 +
</rcode>
 +
 
 +
This code creates a simulated population of 180 individuals and calculates their survival and attributable and etiologic fractions in different mechanistic settings.
 +
 
 +
<rcode label="Test different etiologic fractions" embed=0 graphics=1 variables="
 +
name:scenario|description:What scenarios do you want to see?|type:checkbox|options:
 +
1;BAU (business as usual, no exposure);
 +
2;Orderly (everyone dies one year earlier because of exposure);
 +
3;Relative (everyone dies 20 % faster because of exposure);
 +
4;Maxloss (those who live longest are killed by exposure as early as possible);
 +
5;Competcause (20 % of people died earlier than others);
 +
6;EF0.25 (25 % random people die at 30 % of their expected life);
 +
7;BAU_expon (Exponential distribution for deaths);
 +
8;Rel_expon (RR 1.2 to reduce everyone's life);
 +
9;Rel_exMaxloss (Lost life comes to those who would have lived the longest)|
 +
default:1;2;4;6|
 +
name:bau|description:Which of the scenarios is used as the reference (BAU)?|type:selection|options:
 +
1;BAU (business as usual, no exposure);
 +
2;Orderly (everyone dies one year earlier because of exposure);
 +
3;Relative (everyone dies 20 % faster because of exposure);
 +
4;Maxloss (those who live longest are killed by exposure as early as possible);
 +
5;Competcause (20 % of people died earlier than others);
 +
6;EF0.25 (25 % random people die at 30 % of their expected life);
 +
7;BAU_expon (Exponential distribution for deaths);
 +
8;Rel_expon (RR 1.2 to reduce everyone's life);
 +
9;Rel_exMaxloss (Lost life comes to those who would have lived the longest)|
 +
default:1
 +
">
 +
#This is code 6211/ on page [[Attributable risk]]
 +
library(OpasnetUtils)
 +
library(reshape2)
 +
library(ggplot2)
 +
 
 +
lifetime <- data.frame(
 +
  Id = 1:180,
 +
  BAU = 10 - 0:179 * 8 / 180
 +
)
 +
 
 +
scenarios <- c(
 +
"BAU",
 +
"Orderly",
 +
"Relative",
 +
"Maxloss",
 +
"Competcause",
 +
"EF0.25",
 +
"BAU_expon",
 +
"Rel_expon",
 +
"Rel_exMaxloss"
 +
)
 +
 
 +
bau <- scenarios[bau]
 +
scenario <- scenarios[scenario]
 +
scenario <- union(bau, scenario)
 +
 
 +
scenario <- c("Id", scenario)
 +
 
 +
lifetime$Orderly <- lifetime$BAU -1
 +
#RR <- sum(lifetime$BAU) / sum(lifetime$Orderly)
 +
lifetime$Relative <- lifetime$BAU / 1.2
 +
lifetime$Maxloss <- lifetime$Orderly[c(156:180, 1:155)]
 +
#lifetime$Submaxloss <- lifetime$Orderly[c((1:30)*6, rep(0:29, each = 5)*6+(1:5))]
 +
temp <- numeric()
 +
for(i in 0:29) {
 +
  temp <- c(temp, i * 5 + 1:5, 151 + i)
 +
}
 +
lifetime$Competcause <- lifetime$Orderly[temp]
 +
a <- 0.3
 +
lifetime$EF0.25 <- ifelse(lifetime$Id/4 == round(lifetime$Id/4), lifetime$BAU * a, lifetime$BAU)
 +
lifetime$BAU_expon <- qexp((180:1)/181, 1/6)
 +
lifetime$Rel_expon <- qexp((180:1)/181, 1.2/6)
 +
lifetime$Rel_exMaxloss <- lifetime$Rel_expon[c(168:180, 1:167)]
 +
lifetime <- lifetime[scenario]
 +
 
 +
objects.latest("Op_en6211", code_name = "EF")
 +
 
 +
metrices <- EvalOutput(metrices)
 +
 
 +
cat("Relative risks.\n")
 +
oprint(RR@output)
 +
 
 +
cat("Different etiologic and attributable fractions.\n")
 +
oprint(unkeep(metrices, sources = TRUE))
 +
 
 +
BS <- 24
 +
ggplot(lif@output, aes(x = Id, y = lifResult, colour = Scenario))+geom_point()+
 +
  coord_cartesian(ylim=c(0,10)) + theme_gray(base_size = BS) + labs(title = "Life expectancies of 180 individuals", y = "Age at death", x = "Individual")
 +
ggplot(fr@output, aes(x = Time, y = frResult, colour = Scenario, group = Scenario))+geom_line() +
 +
  theme_gray(base_size = BS) + labs(title = "fraction of people dying at different time groups")
 +
ggplot(surv@output, aes(x = Time, y = survResult, colour = Scenario, group = Scenario))+geom_line() +
 +
  theme_gray(base_size = BS) + labs(title = "Survival curves in different scenarios")
 +
ggplot(EF_eq9@output, aes(x = Time, y = EF_eq9Result, colour = Scenario, group = Scenario))+geom_line()+
 +
  theme_gray(base_size = BS) + labs(title = "Development of etiologic fraction in time")
 +
</rcode>
  
 
==Rationale==
 
==Rationale==
Line 36: Line 149:
 
; Proportion of population (p<sub>i</sub>): proportion of population in subgroups i among the total population: N(S=i)/N
 
; Proportion of population (p<sub>i</sub>): proportion of population in subgroups i among the total population: N(S=i)/N
 
; Proportion of cases (p<sub>ci</sub>): proportion of cases in subgroups i among the total cases: N(D=c,S=i)/N(D=c)
 
; Proportion of cases (p<sub>ci</sub>): proportion of cases in subgroups i among the total cases: N(D=c,S=i)/N(D=c)
 
=== Etiologic fraction ===
 
 
Etiologic fraction is defined as the fraction of cases that is advanced in time because of exposure.<ref name="robins">Robins JM, Greenland S. Estimability and estimation of excess and etiologic fractions. Statistics in Medicine 1989 (8) 845-859.</ref>{{reslink|Choosing the right fraction}}  It can also be called ''probability of causation'', which has importance in court. It can also be used to calculate ''premature cases'', but that term is ambiguous and sometimes attributable fraction is used instead.{{reslink|Meaning of premature}} Therefore, it is important to explicitly explain what is meant by the word ''premature''.
 
 
The exact value of etiologic fraction cannot be estimated directly from risk ratio (RR) because some knowledge is needed about biological mechanisms (more precisely: timing of disease). In any case, the etiologic fraction always lies between f and 1, where f is
 
 
<math>\frac{RR - 1}{RR^{RR/(RR-1)}}.</math>
 
 
The code below calculates the attributable fraction and lower and upper bounds of the etiological fraction for user-defined RRs.
 
  
 
=== Attributable fraction ===
 
=== Attributable fraction ===
Line 92: Line 195:
 
|}
 
|}
  
=== Impact of confounders ===
+
==== Impact of confounders ====
  
 
[[File:Darrow Steenland AF bias analysis png.png|thumb|400px|Darrow and Steenland<ref name="darrow">Darrow LA, Steenland NK. Confounding and bias in the attributable fraction. Epidemiology 2011: 22 (1): 53-58. [http://www.ncbi.nlm.nih.gov/pubmed/20975564] {{doi|10.1097/EDE.0b013e3181fce49b}}</ref> studied the direction and magnitude of bias in attributable fraction with different confounding situations. For details, see [[Attributable risk#Impact of confounders]]. ]]
 
[[File:Darrow Steenland AF bias analysis png.png|thumb|400px|Darrow and Steenland<ref name="darrow">Darrow LA, Steenland NK. Confounding and bias in the attributable fraction. Epidemiology 2011: 22 (1): 53-58. [http://www.ncbi.nlm.nih.gov/pubmed/20975564] {{doi|10.1097/EDE.0b013e3181fce49b}}</ref> studied the direction and magnitude of bias in attributable fraction with different confounding situations. For details, see [[Attributable risk#Impact of confounders]]. ]]
Line 116: Line 219:
 
|}
 
|}
  
=== Calculations ===
+
=== Population attributable fraction ===
 
 
With this code, you can compare attributable fraction and lower and upper bounds of etiological fraction.
 
 
 
<rcode label="Compare attributable and etiologic fractions" embed=1 variables="name:RR|description:What is (are) the relative risk(s), i.e. RR?|default:c(1, 1.02, 1.3, 1,5, 2, 3)">
 
library(OpasnetUtils)
 
AF <- function(x) {return(data.frame(RR = x, AF = (x-1)/x, EF_lower = (x-1)/x^(x/(x-1)), EF_upper = 1))}
 
 
 
oprint(AF(RR))
 
</rcode>
 
 
 
{{attack|# |UPDATE AF TO REFLECT THE CURRENT IMPLEMENTATION OF ERF [[Exposure-response function]]|--[[User:Jouni|Jouni]] ([[User talk:Jouni|talk]]) 05:20, 13 June 2015 (UTC)}}
 
 
 
<rcode name="AF" label="Initiate AF (only for developers)" embed=1>
 
# This is code Op_en6211/AF on page [[Attributable risk]]
 
# Parameters: none
 
 
 
library(OpasnetUtils)
 
 
 
# AF = attributable fraction
 
# EF = etiologic fraction
 
# PAF = population attributable fraction using
 
EF <- Ovariable("EF",
 
dependencies = data.frame(Name = c(
 
"RR" # Risk ratio
 
)),
 
 
formula = function(...) {
 
 
 
R <- unkeep(RR, sources = TRUE, prevresults = TRUE)
 
EF <- (RR - 1) / R^(R/(R-1))
 
EF <- EF * Ovariable("temp", data = data.frame(
 
EFestimate = c("Low", "High"),
 
Result = 1
 
))
 
result(EF)[EF$EFestimate == "High"] <- 1
 
 
 
return(EF)
 
}
 
)
 
 
 
AF <- Ovariable("AF",
 
dependencies = data.frame(Name = c(
 
"RR" # Risk ratio
 
)),
 
 
formula = function(...) {
 
 
 
AF <- (RR - 1) / unkeep(RR, sources = TRUE, prevresults = TRUE)
 
 
 
return(AF)
 
}
 
)
 
 
 
PAF <- Ovariable("PAF",
 
dependencies = data.frame(Name = c(
 
"RR", # Risk ratio
 
"pci", # proportion of cases falling subgroup i among all cases
 
"pei" # proportion of exposed people within subgroup i
 
)),
 
 
formula = function(...) {
 
 
 
peirri <- pei * (RR - 1)
 
peirri <- unkeep(peirri, sources = TRUE, prevresults = TRUE)
 
 
 
PAF <- pci * peirri / (peirri + 1) # The population subgroup could be summed up.
 
 
 
return(PAF)
 
}
 
)
 
 
 
objects.store(EF, AF, PAF)
 
cat("Ovariables EF, AF, PAF stored.\n")
 
</rcode>
 
 
 
A [http://en.opasnet.org/en-opwiki/index.php?title=Attributable_risk&oldid=39071#Calculations previous version of code] looked at RRs of all exposure agents and summed PAFs up.
 
 
 
=== Derivation of PAF ===
 
 
 
{{attack|# |Do we need this section?|--[[User:Jouni|Jouni]] ([[User talk:Jouni|talk]]) 20:53, 7 April 2016 (UTC)}}
 
 
 
{{defend|# |I think that we do. It clearly shows how to add exposure to PAF calculations and how we are using it in HIA.|--[[User:Arja|Arja]] ([[User talk:Arja|talk]]) 07:31, 8 April 2016 (UTC)}}
 
  
 
The ''population attributable fraction PAF'' is that fraction among the whole cohort:
 
The ''population attributable fraction PAF'' is that fraction among the whole cohort:
Line 254: Line 275:
 
This equation is used in e.g. [[Health impact assessment]].
 
This equation is used in e.g. [[Health impact assessment]].
  
=== Constant background assumption ===
+
==== Constant background assumption ====
  
 
{{attack|# |Is this section necessary?|--[[User:Jouni|Jouni]] ([[User talk:Jouni|talk]]) 20:53, 7 April 2016 (UTC)}}
 
{{attack|# |Is this section necessary?|--[[User:Jouni|Jouni]] ([[User talk:Jouni|talk]]) 20:53, 7 April 2016 (UTC)}}
Line 295: Line 316:
 
<math>= \frac{p e^{ln(ERF)exposure} + (1-p)1}{1} = p e^{ln(ERF)exposure} -p + 1</math>
 
<math>= \frac{p e^{ln(ERF)exposure} + (1-p)1}{1} = p e^{ln(ERF)exposure} -p + 1</math>
  
=== Estimating etiologic fraction ===
+
=== Etiologic fraction ===
 +
 
 +
Etiologic fraction is defined as the fraction of cases that is advanced in time because of exposure.<ref name="robins">Robins JM, Greenland S. Estimability and estimation of excess and etiologic fractions. Statistics in Medicine 1989 (8) 845-859.</ref>{{reslink|Choosing the right fraction}}  It can also be called ''probability of causation'', which has importance in court. It can also be used to calculate ''premature cases'', but that term is ambiguous and sometimes attributable fraction is used instead.{{reslink|Meaning of premature}} Therefore, it is important to explicitly explain what is meant by the word ''premature''.
 +
 
 +
The exact value of etiologic fraction cannot be estimated directly from risk ratio (RR) because some knowledge is needed about biological mechanisms (more precisely: timing of disease). In any case, the etiologic fraction always lies between f and 1, where f is
 +
 
 +
<math>\frac{RR - 1}{RR^{RR/(RR-1)}}.</math>
 +
 
 +
The code below calculates the attributable fraction and lower and upper bounds of the etiological fraction for user-defined RRs.
 +
 
 +
==== Estimating etiologic fraction ====
  
 
'''Etiologic fraction''' (EF) tells which fraction of the population dies earlier because of the exposure considered, compared with the situation in which they were not exposed. Often what is calculated instead is the ''attributable fraction AF'':
 
'''Etiologic fraction''' (EF) tells which fraction of the population dies earlier because of the exposure considered, compared with the situation in which they were not exposed. Often what is calculated instead is the ''attributable fraction AF'':
Line 313: Line 344:
 
It is not clear how they got from the first equation to the second equation, especially because the second gives values that are typically less than half of those given by the first one. Both are used in the code below. In addition, the ''true etiologic fraction'' is calculated for this simulated population, because in the simulation we assume that we know exactly what happens to each individual in each scenario and how much their lengths of life change.
 
It is not clear how they got from the first equation to the second equation, especially because the second gives values that are typically less than half of those given by the first one. Both are used in the code below. In addition, the ''true etiologic fraction'' is calculated for this simulated population, because in the simulation we assume that we know exactly what happens to each individual in each scenario and how much their lengths of life change.
  
'''Some interesting model runs:'''
+
=== Calculations ===
* [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=HiKEL00JBT5himg6 Population with exponentially distributed lifetimes]
+
 
* [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=Kamc6txZdl8R2fy0 Life loss to a fraction of people across the whole population]
+
{{attack|# |UPDATE AF TO REFLECT THE CURRENT IMPLEMENTATION OF ERF [[Exposure-response function]]|--[[User:Jouni|Jouni]] ([[User talk:Jouni|talk]]) 05:20, 13 June 2015 (UTC)}}
* [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=MR9wpPlPiKmsr7mD Give all life loss to the people living the longest]
+
 
+
<rcode name="AF" label="Initiate AF (only for developers)" embed=1>
<rcode label="Test different etiologic fractions" embed=0 graphics=1 variables="
+
# This is code Op_en6211/AF on page [[Attributable risk]]
name:scenario|description:What scenarios do you want to see?|type:checkbox|options:
+
# Parameters: none
1;BAU (business as usual, no exposure);
+
 
2;Orderly (everyone dies one year earlier because of exposure);
 
3;Relative (everyone dies 20 % faster because of exposure);
 
4;Maxloss (those who live longest are killed by exposure as early as possible);
 
5;Competcause (20 % of people died earlier than others);
 
6;EF0.25 (25 % random people die at 30 % of their expected life);
 
7;BAU_expon (Exponential distribution for deaths);
 
8;Rel_expon (RR 1.2 to reduce everyone's life);
 
9;Rel_exMaxloss (Lost life comes to those who would have lived the longest)|
 
default:1;2;4;6|
 
name:bau|description:Which of the scenarios is used as the reference (BAU)?|type:selection|options:
 
1;BAU (business as usual, no exposure);
 
2;Orderly (everyone dies one year earlier because of exposure);
 
3;Relative (everyone dies 20 % faster because of exposure);
 
4;Maxloss (those who live longest are killed by exposure as early as possible);
 
5;Competcause (20 % of people died earlier than others);
 
6;EF0.25 (25 % random people die at 30 % of their expected life);
 
7;BAU_expon (Exponential distribution for deaths);
 
8;Rel_expon (RR 1.2 to reduce everyone's life);
 
9;Rel_exMaxloss (Lost life comes to those who would have lived the longest)|
 
default:1
 
">
 
#This is code 6211/ on page [[Attributable risk]]
 
 
library(OpasnetUtils)
 
library(OpasnetUtils)
library(reshape2)
 
library(ggplot2)
 
  
lifetime <- data.frame(
+
# AF = attributable fraction
  Id = 1:180,
+
# EF = etiologic fraction
  BAU = 10 - 0:179 * 8 / 180
+
# PAF = population attributable fraction using
)
+
EF <- Ovariable("EF",
 +
dependencies = data.frame(Name = c(
 +
"RR" # Risk ratio
 +
)),
 +
 +
formula = function(...) {
 +
 
 +
R <- unkeep(RR, sources = TRUE, prevresults = TRUE)
 +
EF <- (RR - 1) / R^(R/(R-1))
 +
EF <- EF * Ovariable("temp", data = data.frame(
 +
EFestimate = c("Low", "High"),
 +
Result = 1
 +
))
 +
result(EF)[EF$EFestimate == "High"] <- 1
  
scenarios <- c(
+
return(EF)
"BAU",
+
}
"Orderly",
 
"Relative",
 
"Maxloss",
 
"Competcause",
 
"EF0.25",
 
"BAU_expon",
 
"Rel_expon",
 
"Rel_exMaxloss"
 
 
)
 
)
  
bau <- scenarios[bau]
+
AF <- Ovariable("AF",
scenario <- scenarios[scenario]
+
dependencies = data.frame(Name = c(
scenario <- union(bau, scenario)
+
"RR" # Risk ratio
 +
)),
 +
 +
formula = function(...) {
  
scenario <- c("Id", scenario)
+
AF <- (RR - 1) / unkeep(RR, sources = TRUE, prevresults = TRUE)
  
lifetime$Orderly <- lifetime$BAU -1
+
return(AF)
#RR <- sum(lifetime$BAU) / sum(lifetime$Orderly)
+
}
lifetime$Relative <- lifetime$BAU / 1.2
+
)
lifetime$Maxloss <- lifetime$Orderly[c(156:180, 1:155)]
 
#lifetime$Submaxloss <- lifetime$Orderly[c((1:30)*6, rep(0:29, each = 5)*6+(1:5))]
 
temp <- numeric()
 
for(i in 0:29) {
 
  temp <- c(temp, i * 5 + 1:5, 151 + i)
 
}
 
lifetime$Competcause <- lifetime$Orderly[temp]
 
a <- 0.3
 
lifetime$EF0.25 <- ifelse(lifetime$Id/4 == round(lifetime$Id/4), lifetime$BAU * a, lifetime$BAU)
 
lifetime$BAU_expon <- qexp((180:1)/181, 1/6)
 
lifetime$Rel_expon <- qexp((180:1)/181, 1.2/6)
 
lifetime$Rel_exMaxloss <- lifetime$Rel_expon[c(168:180, 1:167)]
 
lifetime <- lifetime[scenario]
 
  
objects.latest("Op_en6211", code_name = "EF")
+
PAF <- Ovariable("PAF",
 +
dependencies = data.frame(Name = c(
 +
"RR", # Risk ratio
 +
"pci", # proportion of cases falling subgroup i among all cases
 +
"pei" # proportion of exposed people within subgroup i
 +
)),
 +
 +
formula = function(...) {
  
metrices <- EvalOutput(metrices)
+
peirri <- pei * (RR - 1)
 +
peirri <- unkeep(peirri, sources = TRUE, prevresults = TRUE)
  
cat("Relative risks.\n")
+
PAF <- pci * peirri / (peirri + 1) # The population subgroup could be summed up.
oprint(RR@output)
 
  
cat("Different etiologic and attributable fractions.\n")
+
return(PAF)
oprint(unkeep(metrices, sources = TRUE))
+
}
 +
)
  
BS <- 24
+
objects.store(EF, AF, PAF)
ggplot(lif@output, aes(x = Id, y = lifResult, colour = Scenario))+geom_point()+
+
cat("Ovariables EF, AF, PAF stored.\n")
  coord_cartesian(ylim=c(0,10)) + theme_gray(base_size = BS) + labs(title = "Life expectancies of 180 individuals", y = "Age at death", x = "Individual")
 
ggplot(fr@output, aes(x = Time, y = frResult, colour = Scenario, group = Scenario))+geom_line() +
 
  theme_gray(base_size = BS) + labs(title = "fraction of people dying at different time groups")
 
ggplot(surv@output, aes(x = Time, y = survResult, colour = Scenario, group = Scenario))+geom_line() +
 
  theme_gray(base_size = BS) + labs(title = "Survival curves in different scenarios")
 
ggplot(EF_eq9@output, aes(x = Time, y = EF_eq9Result, colour = Scenario, group = Scenario))+geom_line()+
 
  theme_gray(base_size = BS) + labs(title = "Development of etiologic fraction in time")
 
 
</rcode>
 
</rcode>
  
 +
A [http://en.opasnet.org/en-opwiki/index.php?title=Attributable_risk&oldid=39071#Calculations previous version of code] looked at RRs of all exposure agents and summed PAFs up.
 +
 +
 +
'''Some interesting model runs:'''
 +
* [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=HiKEL00JBT5himg6 Population with exponentially distributed lifetimes]
 +
* [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=Kamc6txZdl8R2fy0 Life loss to a fraction of people across the whole population]
 +
* [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=MR9wpPlPiKmsr7mD Give all life loss to the people living the longest]
 +
 
<rcode name="EF" label="Initiate ovariables (for developers only)" embed=1>
 
<rcode name="EF" label="Initiate ovariables (for developers only)" embed=1>
 
#This is code Op_en6211/EF on page [[Attributable risk]]
 
#This is code Op_en6211/EF on page [[Attributable risk]]
Line 562: Line 573:
 
== See also ==
 
== See also ==
  
* [[:en:Attributable risk|Attributable risk]].
+
* [[:en:Attributable risk|Attributable risk]] in Wikipedia.
  
 
==References==
 
==References==
  
 
<references/>
 
<references/>

Revision as of 07:02, 25 April 2016

Progression class
In Opasnet many pages are being worked on and are in different classes of progression. Thus, the information on those pages should be treated with caution. The progression class of this page has been assessed:
This page is a full draft
This page has been written through once, so all important content is already where it should be. However, the content has not been thoroughly checked yet, and for example important references might still be missing.
The content and quality of this page is being curated by THL.


Attributable risk is the fraction of total risk that can be attributed to a particular cause. There are a few different ways to calculate it. Population attributable fraction of an exposure agent is the fraction of disease that would disappear if the exposure to that agent disappeared in a population. Etiologic fraction is the fraction of cases that have occurred earlier than they would have occurred (if at all) without exposure. Etiologic fraction cannot typically be calculated based on risk ratio (RR) alone; it requires knowledge about biological mechanisms.

Question

How to calculate attributable risk? What different approaches are there, and what are their differences in interpretation and use?

Answer

Risk ratio (RR)
risk among the exposed divided by the risk among the non-exposed
<math>RR = \frac{R_1}{R_0}.</math>
Attributable fraction
(aka excess fraction) the fraction of cases among the exposed that would not have occurred if the exposure would not have taken place:
<math>AF = \frac{RR - 1}{RR}</math>
Population attributable fraction
the fraction of cases among the total population that would not have occurred if the exposure would not have taken place. The most useful formulas are
<math>PAF = 1 - \frac{1}{\sum_{i=0}^k p_i (RR_i)}</math>
for use with several population subgroups (typically with different exposure levels). Not valid when confounding exists. Subscript i refers to the ith subgroup. pi = proportion of total population in ith subgroup.
<math>PAF = 1- \sum_{i=0}^k \frac{p_{ci}}{RR_i} = \sum_i p_{ci} \frac{p_{ei}(RR_i - 1)}{p_{ei}(RR_i - 1) + 1}</math>
which produces valid estimates when confounding exists, although the parameters are often not known. p<sub>ci</sub> is the proportion of cases falling in subgroup i (so that Σ<sub>i</sub>p<sub>ci</sub> = 1), and p<sub>ei</sub> is the proportion of exposed people within subgroup i (so 1-p<sub>ei</sub> is the fraction of unexposed).
Etiologic fraction
Fraction of cases among the exposed that would have occurred later (if at all) if the exposure would not have taken place. It cannot be calculated without understanding of the biological mechanism, but there are equations for several different models:
{| {{prettytable}}
! Equation
! Mechanisms and description
|----
|<math>EF = 1</math>
|'''Rank-preserving model'''. The rank of individual deaths is not affected by exposure, i.e. everyone dies in the same order as without exposure, just sooner.
|----
|<math>EF = \frac{RR - 1}{RR}</math>
|'''Competing causes model'''. The hazard rate (occurrence rate of cases in time) in the exposed population is relative to that of the non-exposed population: h<sub>1</sub>(t) = RR h<sub>0</sub>(t). The ratio is constant although the hazard rates are functions of time t.
|----
|<math>EF_l = \frac{RR-1}{RR^{RR/(RR-1)}}</math>
|'''Exponential survival model'''. When the survival times in the population follow an exponential distribution, the lowest possible EF can be calculated from this equation. However, the exponential survival model says nothing about which individuals are affected or how many life years each of them loses, and therefore in this model the actual EF may be anywhere between the lower bound and 1.
|}

With this code, you can compare attributable fraction and lower (assuming exponential survival distribution) and upper bounds of etiological fraction.
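
A minimal base-R sketch of this comparison follows. It relies only on the three model equations given above; the function name <code>compare_fractions</code> and the example RR values are illustrative assumptions and are not part of the Opasnet code, which uses OpasnetUtils for printing.

```r
# Hypothetical helper (not part of OpasnetUtils): compare the attributable
# fraction with the lower and upper bounds of the etiologic fraction.
compare_fractions <- function(rr) {
  # The exponent RR/(RR-1) is undefined at RR = 1, so require RR > 1 here.
  stopifnot(is.numeric(rr), all(rr > 1))
  data.frame(
    RR = rr,
    AF = (rr - 1) / rr,                            # attributable (excess) fraction
    EF_exp_lower = (rr - 1) / rr^(rr / (rr - 1)),  # lower bound, exponential survival model
    EF_upper = 1                                   # upper bound, rank-preserving model
  )
}

fractions <- compare_fractions(c(1.02, 1.3, 1.5, 2, 3))
print(fractions)
```

For RR = 2 the attributable fraction is 0.5 while the exponential lower bound is 0.25. The lower bound stays below AF for every RR > 1, because RR^(RR/(RR-1)) is then larger than RR.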


This code creates a simulated population of 180 individuals and calculates their survival and attributable and etiologic fractions in different mechanistic settings.

What scenarios do you want to see?:
BAU (business as usual, no exposure)
Orderly (everyone dies one year earlier because of exposure)
Relative (everyone dies 20 % faster because of exposure)
Maxloss (those who live longest are killed by exposure as early as possible)
Competcause (20 % of people died earlier than others)
EF0.25 (25 % random people die at 30 % of their expected life)
BAU_expon (Exponential distribution for deaths)
Rel_expon (RR 1.2 to reduce everyone's life)
Rel_exMaxloss (Lost life comes to those who would have lived the longest)
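
As a rough illustration of what a "true" etiologic fraction means in such a simulated population, the sketch below rebuilds three of the scenarios (BAU, Orderly and EF0.25) in base R, using the same lifetime definitions as the page's code. The helper <code>true_ef</code> is hypothetical: it assumes that the true etiologic fraction is simply the share of individuals whose death is advanced relative to the reference scenario, which may differ from what the stored Op_en6211/EF code does.

```r
# Simulated lifetimes as defined in the page's code: 180 individuals whose
# ages at death fall linearly from 10 to about 2 in the reference scenario.
lifetime <- data.frame(Id = 1:180, BAU = 10 - 0:179 * 8 / 180)

# Orderly: everyone dies one year earlier because of exposure.
lifetime$Orderly <- lifetime$BAU - 1

# EF0.25: every fourth individual dies at 30 % of the reference lifetime.
lifetime$EF0.25 <- ifelse(lifetime$Id %% 4 == 0, lifetime$BAU * 0.3, lifetime$BAU)

# Assumed definition (hypothetical helper): share of individuals whose death
# is advanced in time relative to the reference scenario. Everyone eventually
# dies in the simulation, so every individual counts as a case.
true_ef <- function(exposed, reference) mean(exposed < reference)

ef_orderly <- true_ef(lifetime$Orderly, lifetime$BAU)
ef_quarter <- true_ef(lifetime$EF0.25, lifetime$BAU)
print(c(Orderly = ef_orderly, EF0.25 = ef_quarter))
```

Under this assumption the Orderly scenario has an etiologic fraction of 1 and the EF0.25 scenario one of 0.25, even though both shorten lives; this is the kind of difference that RR alone cannot reveal.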


Rationale

Definitions of terms

There are several different kinds of proportions that sound alike but are not. Therefore, we explain the specific meaning of several terms.

Total population (N)
The number of people in the total population considered, including cases, non-cases, exposed and non-exposed.
Classifications
There are three classifications, and every person in the total population belongs to exactly one group in each classification.
  • Disease (D): classes case (c) and non-case (nc)
  • Exposure (E): classes exposed (1) and non-exposed (0)
  • Population subgroup (S): classes i = 1, 2, ..., k (typically based on different exposure levels)
Attributable fraction (AF)
The proportion of cases caused by exposure among all cases (in the subgroup)
Proportion exposed (pe, pei)
proportion of exposed among the total population or within subgroup i: pe = N(E=1)/N, pei = N(E=1,S=i)/N(S=i)
Proportion of population (pi)
proportion of population in subgroups i among the total population: N(S=i)/N
Proportion of cases (pci)
proportion of cases in subgroups i among the total cases: N(D=c,S=i)/N(D=c)
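
The definitions above can be made concrete with a small base-R sketch. The counts are hypothetical and chosen only for illustration; the variable names follow the symbols defined in this section (with <code>p_i</code> used for p<sub>i</sub> to avoid masking R's built-in constant <code>pi</code>).

```r
# Hypothetical counts for two population subgroups (illustrative only).
counts <- data.frame(
  subgroup = c(1, 2),
  n        = c(600, 400),  # N(S=i): people in subgroup i
  exposed  = c(300, 80),   # N(E=1,S=i): exposed people in subgroup i
  cases    = c(45, 28)     # N(D=c,S=i): cases in subgroup i
)

N   <- sum(counts$n)                      # total population
pe  <- sum(counts$exposed) / N            # proportion exposed in the total population
pei <- counts$exposed / counts$n          # proportion exposed within subgroup i
p_i <- counts$n / N                       # proportion of the population in subgroup i
pci <- counts$cases / sum(counts$cases)   # proportion of all cases falling in subgroup i

print(data.frame(subgroup = counts$subgroup, pei = pei, p_i = p_i, pci = pci))
print(pe)
```

Note that p<sub>i</sub> and p<sub>ci</sub> each sum to 1 over the subgroups, whereas p<sub>ei</sub> does not; the overall p<sub>e</sub> is the p<sub>i</sub>-weighted average of the p<sub>ei</sub>.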

Attributable fraction

Rockhill et al.[1] give an extensive description of different ways to calculate attributable fraction (AF) and population attributable fraction (PAF) and of the assumptions needed in each approach. Modern Epidemiology [2] is the authoritative source of epidemiology. They first define attributable fraction AF for a cohort of people (pages 295-297). It is the fraction of cases among the exposed that would not have occurred if the exposure had not taken place.

Different ways to calculate population attributable fraction AF and PAF.
# Formula Description
1 Failed to parse (Missing <code>texvc</code> executable. Please see math/README to configure.): AF = \frac{IP_1 - IP_0}{IP_1} = \frac{RR-1}{RR} is empirical approximation of [1]

Failed to parse (Missing <code>texvc</code> executable. Please see math/README to configure.): \frac{P(D) - \sum_C P(D|C, \bar{E}) P(C)}{P(D)}

where IP1 = cumulative proportion of total population developing disease over specified interval; IP0 = cumulative proportion of unexposed persons who develop disease over interval. Valid only when no confounding of exposure(s) of interest exists. If disease is rare over time interval, ratio of average incidence rates I0/I1 approximates ratio of cumulative incidence proportions, and thus formula can be written as (I1 - I0)/I1. Both formulations found in many widely used epidemiology textbooks.

2 <math>\frac{p_e(RR-1)}{p_e(RR-1)+1}</math> Transformation of formula 1.[1] Not valid when there is confounding of the exposure-disease association. pe = proportion of the total population exposed to the factor of interest. RR may be a ratio of two cumulative incidence proportions (risk ratio), of two (average) incidence rates (rate ratio), or an approximation of one of these ratios. Found in many widely used epidemiology texts, but often with no warning that it is invalid when confounding exists.
3 <math>\frac{\sum_{i=0}^k p_i (RR_i - 1)}{1 + \sum_{i=0}^k p_i (RR_i - 1)} = 1 - \frac{1}{\sum_{i=0}^k p_i RR_i}</math> Extension of formula 2 for use with multicategory exposures. Not valid when confounding exists. Subscript i refers to the ith exposure level. pi = proportion of the total population in the ith exposure level, RRi = relative risk comparing the ith exposure level with the unexposed group (i = 0). Derived by Walter[3]; given in Kleinbaum et al.[4] but not in other widely used epidemiology texts.
4 <math>\sum_i p_{ci} \frac{p_{ei}(RR_i - 1)}{p_{ei}(RR_i - 1) + 1}</math> A useful formulation,[5] where
  • pci is the proportion of cases falling in subgroup i (so that Σi pci = 1),
  • pei is the fraction of exposed people within subgroup i (and 1-pei is the fraction of unexposed),
  • RRi is the risk ratio for subgroup i due to the subgroup-specific exposure level (assuming that everyone in that subgroup is either exposed to that level or not exposed at all).
5 <math>p_c \left( \frac{RR-1}{RR} \right)</math> Alternative expression of formula 3.[1] Produces internally valid estimate when confounding exists and when, as a result, adjusted relative risks must be used.[6] pc = proportion of cases exposed to risk factor. In Kleinbaum et al.[4] and Schlesselman.[7]
6 <math>\sum_{i=0}^k p_{ci} \left( \frac{RR_i - 1}{RR_i} \right) = 1 - \sum_{i=0}^k \frac{p_{ci}}{RR_i}</math> Extension of formula 5 for use with multicategory exposures.[1] Produces internally valid estimate when confounding exists and when, as a result, adjusted relative risks must be used. pci = proportion of cases falling into ith exposure level; RRi = relative risk comparing ith exposure level with unexposed group (i = 0). See Bruzzi et al.[8] and Miettinen[6] for discussion and derivations; in Kleinbaum et al.[4] and Schlesselman.[7]
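The formulas in the table can be sketched in Python as follows. This is a minimal illustration, not code from this page: the function names and the example numbers are assumptions chosen for the sketch. It covers formulas 2, 3, 5 and 6, and the usage lines show that the prevalence-based and case-based versions agree when there is no confounding.

```python
from typing import Sequence


def paf_prevalence(p_e: float, rr: float) -> float:
    """Formula 2: PAF from the exposure prevalence p_e in the total population."""
    excess = p_e * (rr - 1.0)
    return excess / (excess + 1.0)


def paf_multilevel(p: Sequence[float], rr: Sequence[float]) -> float:
    """Formula 3: multicategory version; p sums to 1 and level 0 is unexposed (RR = 1)."""
    excess = sum(p_i * (rr_i - 1.0) for p_i, rr_i in zip(p, rr))
    return excess / (1.0 + excess)


def paf_case_based(p_c: float, rr: float) -> float:
    """Formula 5: PAF from the proportion p_c of cases that are exposed."""
    return p_c * (rr - 1.0) / rr


def paf_case_based_multilevel(p_c: Sequence[float], rr: Sequence[float]) -> float:
    """Formula 6: multicategory version of formula 5; p_c sums to 1."""
    return 1.0 - sum(p_ci / rr_i for p_ci, rr_i in zip(p_c, rr))


if __name__ == "__main__":
    # Hypothetical inputs: 30 % of the population exposed, RR = 2, no confounding.
    p_e, rr = 0.3, 2.0
    # Without confounding, the exposed share of cases follows from p_e and RR.
    p_c = p_e * rr / (p_e * rr + 1.0 - p_e)
    print(paf_prevalence(p_e, rr), paf_case_based(p_c, rr))
```

The case-based functions take the distribution of cases as input, which is why they remain usable with adjusted relative risks, whereas the prevalence-based ones rely on the absence of confounding.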

Impact of confounders

Darrow and Steenland[5] studied the direction and magnitude of bias in attributable fraction with different confounding situations. For details, see Attributable risk#Impact of confounders.

The problem with the two PAF equations (see Answer) is that the former has easier-to-collect inputs but is not valid if there is confounding; it is still often mistakenly used. The latter equation would produce an unbiased estimate, but the data it needs are harder to collect. Darrow and Steenland[5] have studied how confounding biases the attributable fraction. This is their summary:

The impact of confounding on the bias in attributable fraction.
Bias in attributable fraction | Confounding in RR | Confounding in inputs
AF bias (-), calculated AF is smaller than true AF | Conf RR (+), crude RR is larger than adjusted (true) RR | Confounder is positively associated with exposure and disease (++), or confounder is negatively associated with exposure and disease (--)
AF bias (+), calculated AF is larger than true AF | Conf RR (-), crude RR is smaller than adjusted (true) RR | Confounder is negatively associated with exposure and positively with disease (-+), or confounder is positively associated with exposure and negatively with disease (+-)
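The directions in the summary can be reproduced with a small numerical sketch. It rests on assumptions that are not stated on this page: the bias is taken to arise from plugging an adjusted RR into the prevalence-based formula p_e(RR-1)/(p_e(RR-1)+1), the adjusted RR is the same in both confounder strata, and all strata, prevalences and risks are made-up numbers. The function and key names are illustrative.

```python
def paf_under_confounding(strata, rr_adj):
    """Compare PAF formulas in a population stratified by a binary confounder.

    strata: list of (population share, exposure prevalence, baseline risk).
    rr_adj: stratum-specific (adjusted) risk ratio, assumed equal in all strata.
    """
    n_exp = n_unexp = cases_exp = cases_unexp = cases_if_unexposed = 0.0
    for share, p_exposed, baseline in strata:
        exposed = share * p_exposed
        unexposed = share * (1.0 - p_exposed)
        n_exp += exposed
        n_unexp += unexposed
        cases_exp += exposed * baseline * rr_adj
        cases_unexp += unexposed * baseline
        cases_if_unexposed += share * baseline  # counterfactual: nobody exposed
    cases = cases_exp + cases_unexp
    p_e = n_exp / (n_exp + n_unexp)  # exposure prevalence in the population
    p_c = cases_exp / cases          # exposure prevalence among cases
    return {
        "true_paf": (cases - cases_if_unexposed) / cases,
        "crude_rr": (cases_exp / n_exp) / (cases_unexp / n_unexp),
        "paf_prevalence_formula": p_e * (rr_adj - 1.0) / (p_e * (rr_adj - 1.0) + 1.0),
        "paf_case_formula": p_c * (rr_adj - 1.0) / rr_adj,
    }


# Confounder raises both exposure prevalence and baseline risk (++).
POSITIVE = [(0.5, 0.2, 0.01), (0.5, 0.6, 0.03)]
# Confounder lowers exposure prevalence but raises baseline risk (-+).
NEGATIVE = [(0.5, 0.6, 0.01), (0.5, 0.2, 0.03)]

if __name__ == "__main__":
    for label, strata in (("++", POSITIVE), ("-+", NEGATIVE)):
        print(label, paf_under_confounding(strata, rr_adj=2.0))
```

In the (++) scenario the crude RR exceeds the adjusted RR and the prevalence-based formula falls below the true PAF; in the (-+) scenario both directions reverse. The case-based formula matches the true PAF in both scenarios.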

Population attributable fraction

The population attributable fraction PAF is that fraction among the whole cohort:

<math>PAF = \frac{N_1 (R_1 - R_0)}{N_1 R_1 + N_0 R_0} = \frac{N_1 (R_1 - R_0)/R_0}{N_1 R_1/R_0 + N_0 R_0/R_0} = \frac{N_1 (RR - 1)}{N_1 RR + N_0}</math>

<math>= \frac{ \frac{N_1 (RR - 1)}{N_1 + N_0} }{ \frac{N_1 RR + N_0}{N_1 + N_0}} = \frac{ p (RR - 1) }{ \frac{N_1 RR - N_1 + (N_1 + N_0)}{N_1 + N_0}} = \frac{p (RR - 1)}{p RR - p + 1} = \frac{p (RR - 1)}{p (RR - 1) + 1},</math>

where

  • N1 and N0 are the numbers of exposed and unexposed people, respectively,
  • R1 and R0 are the risks of disease in the exposed and unexposed group, respectively, and RR = R1 / R0,
  • p is the fraction of exposed people among the whole cohort.

Note that there is a typo in the Modern Epidemiology book: the denominator should be p(RR-1)+1, not p(RR-1)-1.
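The derivation can be checked numerically. The following Python sketch uses hypothetical cohort numbers and illustrative function names; it computes the PAF once from the counts and risks and once from the exposed fraction p and RR.

```python
def paf_from_counts(n1: float, n0: float, r1: float, r0: float) -> float:
    """PAF = N1 (R1 - R0) / (N1 R1 + N0 R0): excess cases over all cases."""
    return n1 * (r1 - r0) / (n1 * r1 + n0 * r0)


def paf_from_fraction(p: float, rr: float) -> float:
    """PAF = p (RR - 1) / (p (RR - 1) + 1), with p the exposed fraction of the cohort."""
    return p * (rr - 1.0) / (p * (rr - 1.0) + 1.0)


if __name__ == "__main__":
    # Hypothetical cohort: 2000 exposed with risk 0.03, 8000 unexposed with risk 0.01.
    n1, n0, r1, r0 = 2000, 8000, 0.03, 0.01
    p = n1 / (n1 + n0)
    rr = r1 / r0
    print(paf_from_counts(n1, n0, r1, r0), paf_from_fraction(p, rr))
```

Both calls give the same value, because the second form is the first with numerator and denominator divided by R0 and by N1 + N0.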

Population attributable fraction can be calculated as a weighted average based on subgroup data:

<math>PAF = \sum_i p_{ci} PAF_i,</math>

where

  • pci is the proportion of cases falling in stratum (subgroup) i,
  • PAFi is the population attributable fraction calculated for the subgroup.

Specifically, we can divide the cohort into subgroups based on exposure (in the simplest case exposed and unexposed), so we get

<math>PAF = p_c \frac{1 \cdot (RR - 1)}{1 \cdot (RR - 1) + 1} + (1 - p_c) \frac{0 \cdot (RR - 1)}{0 \cdot (RR - 1) + 1} = p_c \frac{RR - 1}{RR},</math>

where pc is the proportion of cases in the exposed group among all cases; this is the same as exposure prevalence among cases.
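The weighted average over subgroups, and its reduction to the case-based formula, can be sketched in Python. Function names and numbers are assumptions made for this illustration.

```python
from typing import Sequence


def paf_weighted(p_c: Sequence[float], p_e: Sequence[float], rr: Sequence[float]) -> float:
    """PAF = sum_i p_ci * PAF_i, with PAF_i = p_ei (RR_i - 1) / (p_ei (RR_i - 1) + 1)."""
    total = 0.0
    for p_ci, p_ei, rr_i in zip(p_c, p_e, rr):
        excess = p_ei * (rr_i - 1.0)
        total += p_ci * excess / (excess + 1.0)
    return total


def paf_case_based(p_c_exposed: float, rr: float) -> float:
    """PAF = p_c (RR - 1) / RR, with p_c the exposure prevalence among cases."""
    return p_c_exposed * (rr - 1.0) / rr


if __name__ == "__main__":
    # Two subgroups: everyone exposed (p_e = 1) and nobody exposed (p_e = 0).
    # Hypothetical inputs: 60 % of cases occur among the exposed, RR = 2.5.
    print(paf_weighted([0.6, 0.4], [1.0, 0.0], [2.5, 2.5]))
    print(paf_case_based(0.6, 2.5))
```

The unexposed subgroup contributes nothing to the sum, so only the exposed share of cases multiplied by (RR - 1)/RR remains.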

WHO approach

PAF is [9]

<math>PAF = \frac{\sum_{i=0}^k P_i RR_i - \sum_{i=0}^k P'_i RR_i}{\sum_{i=0}^k P_i RR_i}</math>

where i is a certain exposure level, P is the fraction of population in that exposure level, RR is the relative risk at that exposure level, and P' is the fraction of population in a counterfactual ideal situation (where the exposure is typically lower).

Based on this, we can limit our examination to a situation where there are only two population groups, one exposed to background level (with relative risk 1) and the other exposed to a higher level (with relative risk RR). In the counterfactual situation nobody is exposed. Thus, we get

<math>PAF = \frac{(P \cdot RR + (1-P) \cdot 1) - (0 \cdot RR + 1 \cdot 1)}{P \cdot RR + (1-P) \cdot 1}</math>

<math>PAF = \frac{P \cdot RR - P}{P \cdot RR + 1 - P}</math>

<math>PAF = \frac{P(RR - 1)}{P(RR - 1) + 1}</math>

This equation is used, for example, in Health impact assessment.
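A minimal Python sketch of the general formula with a counterfactual exposure distribution is given below. The function name and all distributions are hypothetical; the first usage line reproduces the two-group special case derived above, the second shows a counterfactual in which only the highest exposure level is removed.

```python
from typing import Sequence


def paf_counterfactual(p: Sequence[float], p_alt: Sequence[float], rr: Sequence[float]) -> float:
    """PAF = (sum P_i RR_i - sum P'_i RR_i) / sum P_i RR_i.

    p:     current population fractions by exposure level (sums to 1)
    p_alt: fractions in the counterfactual situation (sums to 1)
    rr:    relative risk at each exposure level
    """
    current = sum(p_i * rr_i for p_i, rr_i in zip(p, rr))
    counterfactual = sum(q_i * rr_i for q_i, rr_i in zip(p_alt, rr))
    return (current - counterfactual) / current


if __name__ == "__main__":
    # Two groups: background (RR = 1) and exposed (RR = 1.8, 25 % of the population);
    # in the counterfactual nobody is exposed.
    print(paf_counterfactual([0.75, 0.25], [1.0, 0.0], [1.0, 1.8]))
    # Three levels; the counterfactual moves the highest level to background.
    print(paf_counterfactual([0.5, 0.3, 0.2], [0.7, 0.3, 0.0], [1.0, 1.5, 3.0]))
```

Passing the counterfactual distribution explicitly makes the same function usable for partial exposure reductions, not only for complete removal of exposure.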

Constant background assumption

# : Is this section necessary? --Jouni (talk) 20:53, 7 April 2016 (UTC)

pci can be calculated for each subgroup with the following equation if the background risk of disease is equal in all subgroups (and thus cancels out):

<math>p_{ci} = \frac{N_i \prod_j RR_{i,j}}{\sum_i N_i \prod_j RR_{i,j}},</math>

where

  • Ni is the number of people in each subgroup i,
  • RRi,j is the risk ratio in subgroup i due to pollutant j (accounting for the estimated exposure in the subgroup). Note that this assumes that the effects of different pollutants combine multiplicatively.

This page does not contain R code. Instead, it is written as part of the model in Health impact assessment.

pci can be calculated by first calculating the number of cases in each subgroup:

<math>cases_i = N_i \cdot background \cdot \prod_j e^{\ln(ERF_j) \, exposure_{i,j}},</math>

where

  • casesi is the number of cases in subgroup i,
  • Ni is the number of people in subgroup i,
  • background is the background risk of the disease in the unexposed; we assume that it is the same in all subgroups,
  • ERFj is the risk ratio per unit of exposure for each pollutant j (if the exposure-response function ERF takes a form other than this exponential relative risk, other equations must be used),
  • exposurei,j is the amount of exposure in a subgroup i to pollutant j.

Therefore,

<math>p_{ci} = \frac{cases_i}{\sum_i cases_i} = \frac{N_i \cdot background \cdot \prod_j e^{\ln(ERF_j) \, exposure_{i,j}}}{background \cdot \sum_i N_i \prod_j e^{\ln(ERF_j) \, exposure_{i,j}}}</math>

<math>p_{ci} = \frac{N_i \prod_j RR_{i,j}}{\sum_i N_i \prod_j RR_{i,j}},</math>

where RRi,j = exp(ln(ERFj) exposurei,j).
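The calculation of pci under the constant background assumption can be sketched in Python. This is an illustration only: the pollutant names, unit risk ratios, exposures and subgroup sizes are invented, and the function names are not taken from the Health impact assessment model.

```python
import math
from typing import Dict, List, Sequence


def subgroup_rr(erf: Dict[str, float], exposure: Dict[str, float]) -> float:
    """Product over pollutants j of RR_ij = exp(ln(ERF_j) * exposure_ij)."""
    return math.prod(math.exp(math.log(erf[j]) * exposure[j]) for j in erf)


def case_proportions(n: Sequence[float],
                     exposures: Sequence[Dict[str, float]],
                     erf: Dict[str, float]) -> List[float]:
    """p_ci = N_i prod_j RR_ij / sum_i N_i prod_j RR_ij.

    The background risk is assumed equal in all subgroups, so it cancels out
    and does not appear as an argument.
    """
    weights = [n_i * subgroup_rr(erf, e_i) for n_i, e_i in zip(n, exposures)]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Hypothetical unit risk ratios for two pollutants "a" and "b".
    erf = {"a": 1.1, "b": 1.2}
    n = [1000, 3000]
    exposures = [{"a": 0.0, "b": 0.0}, {"a": 2.0, "b": 1.0}]
    print(case_proportions(n, exposures, erf))
```

Multiplying the pollutant-specific risk ratios within a subgroup is exactly the multiplicative assumption mentioned in the text; a different interaction model would need a different `subgroup_rr`.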

In addition, if only fraction p of the population is exposed, for the total population we get

<math>RR = \frac{p \cdot N \cdot background \cdot RR_{exposed} + (1-p) \cdot N \cdot background \cdot RR_{unexposed}}{N \cdot background \cdot RR_{unexposed}}</math>

<math>= \frac{p \, e^{\ln(ERF) \, exposure} + (1-p) \cdot 1}{1} = p \, e^{\ln(ERF) \, exposure} - p + 1</math>
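A short Python sketch of the population-level RR follows, with hypothetical inputs and illustrative function names. It also checks an observation that is added here rather than taken from the text: applying (RR - 1)/RR to this population-level RR gives the same value as p(RR_exposed - 1)/(p(RR_exposed - 1) + 1).

```python
import math


def population_rr(p: float, erf: float, exposure: float) -> float:
    """RR for the total population when only fraction p is exposed.

    erf is the risk ratio per unit exposure; the unexposed have RR = 1.
    """
    rr_exposed = math.exp(math.log(erf) * exposure)
    return p * rr_exposed - p + 1.0


def fraction_from_rr(rr: float) -> float:
    """(RR - 1) / RR."""
    return (rr - 1.0) / rr


if __name__ == "__main__":
    # Hypothetical inputs: 40 % exposed, ERF = 1.05 per unit, 10 units of exposure.
    p, erf, exposure = 0.4, 1.05, 10.0
    rr_pop = population_rr(p, erf, exposure)
    rr_exposed = erf ** exposure
    print(fraction_from_rr(rr_pop))
    print(p * (rr_exposed - 1.0) / (p * (rr_exposed - 1.0) + 1.0))
```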

Etiologic fraction

Etiologic fraction is defined as the fraction of cases that is advanced in time because of exposure.[10] It can also be called probability of causation, which has importance in court. It can also be used to calculate premature cases, but that term is ambiguous and sometimes the attributable fraction is used instead. Therefore, it is important to explain explicitly what is meant by the word premature.

The exact value of the etiologic fraction cannot be estimated directly from the risk ratio (RR), because some knowledge is needed about biological mechanisms (more precisely, about the timing of disease). In any case, the etiologic fraction always lies between f and 1, where f is

<math>f = \frac{RR - 1}{RR^{RR/(RR-1)}}.</math>

The code below calculates the attributable fraction and lower and upper bounds of the etiological fraction for user-defined RRs.
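A minimal Python sketch of such a calculation is shown here. It is not the page's own code; the function names and the example RR values are illustrative, and RR > 1 is assumed because the lower bound is undefined at RR = 1.

```python
from typing import Dict, Iterable, List


def attributable_fraction(rr: float) -> float:
    """AF = (RR - 1) / RR."""
    return (rr - 1.0) / rr


def etiologic_fraction_lower_bound(rr: float) -> float:
    """f = (RR - 1) / RR^(RR / (RR - 1)), defined for RR > 1."""
    if rr <= 1.0:
        raise ValueError("the lower bound is defined only for RR > 1")
    return (rr - 1.0) / rr ** (rr / (rr - 1.0))


def fraction_table(rrs: Iterable[float]) -> List[Dict[str, float]]:
    """AF together with the lower and upper bound of the etiologic fraction."""
    return [
        {
            "RR": rr,
            "AF": attributable_fraction(rr),
            "EF_lower": etiologic_fraction_lower_bound(rr),
            "EF_upper": 1.0,  # the theoretical upper limit is always 1
        }
        for rr in rrs
    ]


if __name__ == "__main__":
    for row in fraction_table([1.1, 1.5, 2.0, 5.0]):
        print(row)
```

For every RR > 1 the lower bound f is smaller than AF, because the exponent RR/(RR - 1) is greater than 1.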

Estimating etiologic fraction

Etiologic fraction (EF) tells which fraction of the population dies earlier because of the exposure considered, compared with the situation in which they were not exposed. Often what is calculated instead is the attributable fraction AF:

<math>AF = \frac{RR - 1}{RR},</math>

where RR is the relative risk: the risk of exposed people divided by the risk of non-exposed people. Robins and Greenland[10] studied the estimability of the etiologic fraction. They concluded that observations are not enough to determine the precise value of EF, because irrespective of the observations, the same amount of observed life years lost may be due to many people losing a short time each, or due to a few losing a long time each. The upper limit in theory is always 1, and they estimated the lower bound by this equation (equation 9 in the article):

<math>\frac{\int_G [f_1(u) - f_0(u)] du}{1 - S_1(t)},</math>

where 1 denotes the exposed group and 0 the non-exposed group, f is the proportion of the population dying at a particular time point, S is the survival function, t is the length of the observation time, u is a time point within the observation time, and G is the set of all u < t such that f1(u) > f0(u).

From this, they derive the following equation (number 11 in the article):

<math>\frac{RR-1}{RR^{RR/(RR-1)}}</math>

It is not clear how they got from the first equation to the second equation, especially because the second gives values that are typically less than half of those of the first one. Both are used in the code below. In addition, the true etiologic fraction is calculated for this simulated population, because in the simulation we assume that we know exactly what happens to each individual in each scenario and how much their lengths of life change.
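One possible route between the two equations is sketched below. It is a sketch under stated assumptions and is not confirmed here as the authors' own derivation: both groups are assumed to have constant hazards, h in the non-exposed and RR·h in the exposed, so that RR is read as a hazard ratio greater than 1, and the observation time t is taken to infinity.

```latex
% Assumption: constant hazards, h (non-exposed) and RR h (exposed), RR > 1.
f_0(u) = h\,e^{-hu}, \qquad f_1(u) = RR\,h\,e^{-RR\,h\,u}

% f_1(u) > f_0(u) exactly when u is below the crossing point u^*:
RR\,e^{-(RR-1)hu} > 1 \iff u < u^* = \frac{\ln RR}{(RR-1)\,h}

% For t \ge u^*, the numerator of equation 9 is the difference of survival functions at u^*:
\int_0^{u^*} [f_1(u) - f_0(u)]\,du = S_0(u^*) - S_1(u^*)
  = RR^{-1/(RR-1)} - RR^{-RR/(RR-1)}
  = \frac{RR - 1}{RR^{RR/(RR-1)}}

% As t \to \infty, the denominator 1 - S_1(t) \to 1, which leaves equation 11.
```

Under these assumptions, a finite observation time gives a denominator 1 - S1(t) smaller than 1, so the first equation evaluated over a limited follow-up yields a larger value than the second, which would be consistent with the difference noted above.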

Calculations

# : UPDATE AF TO REFLECT THE CURRENT IMPLEMENTATION OF ERF Exposure-response function --Jouni (talk) 05:20, 13 June 2015 (UTC)


A previous version of code looked at RRs of all exposure agents and summed PAFs up.


Some interesting model runs:


See also

References

  1. Rockhill B, Newman B, Weinberg C. Use and misuse of population attributable fractions. American Journal of Public Health 1998; 88 (1): 15-19.
  2. Rothman KJ, Greenland S, Lash TL. Modern Epidemiology. Lippincott Williams & Wilkins, 2008. 758 pages.
  3. Walter SD. The estimation and interpretation of attributable fraction in health research. Biometrics 1976; 32: 829-849.
  4. Kleinbaum DG, Kupper LL, Morgenstern H. Epidemiologic Research. Belmont, Calif: Lifetime Learning Publications; 1982: 163.
  5. Darrow LA, Steenland NK. Confounding and bias in the attributable fraction. Epidemiology 2011; 22 (1): 53-58. doi:10.1097/EDE.0b013e3181fce49b
  6. Miettinen O. Proportion of disease caused or prevented by a given exposure, trait, or intervention. Am J Epidemiol 1974; 99: 325-332.
  7. Schlesselman JJ. Case-Control Studies: Design, Conduct, Analysis. New York, NY: Oxford University Press Inc; 1982.
  8. Bruzzi P, Green SB, Byar DP, Brinton LA, Schairer C. Estimating the population attributable risk for multiple risk factors using case-control data. Am J Epidemiol 1985; 122: 904-914.
  9. WHO: Health statistics and health information systems. Accessed 16 Nov 2013.
  10. Robins JM, Greenland S. Estimability and estimation of excess and etiologic fractions. Statistics in Medicine 1989; 8: 845-859.