Attributable risk

**Progression class**
In Opasnet, many pages are works in progress and are in different classes of progression, so the information on them should be read with due consideration. The progression class of this page has been assessed: this page is a full draft. It has been written through once, so all important content is already where it should be. However, the content has not been thoroughly checked yet, and, for example, important references might still be missing. The content and quality of this page is being curated by THL.

This page is a method. The page identifier is Op_en6211
Moderator:Jouni (see all)

Attributable risk is the fraction of total risk that can be attributed to a particular cause. There are a few different ways to calculate it. The population attributable fraction of an exposure agent is the fraction of disease that would disappear from a population if exposure to that agent were removed. The etiologic fraction is the fraction of cases that have occurred earlier than they would have occurred (if at all) without exposure. The etiologic fraction typically cannot be calculated from the risk ratio (RR) alone; it also requires knowledge of biological mechanisms.

Question

How is attributable risk calculated? What different approaches are there, and how do they differ in interpretation and use?

Answer

Risk ratio (RR)

risk among the exposed divided by the risk among the non-exposed

$$RR = \frac{R_1}{R_0}$$

Attributable fraction

the fraction of cases among the exposed that would not have occurred had the exposure not taken place:

$$AF = \frac{RR - 1}{RR}$$
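As a minimal sketch of the two definitions above, the following R snippet computes RR and AF from the risks in the exposed and the unexposed. The risk values and the helper names `risk_ratio` and `attributable_fraction` are illustrative assumptions and are not part of the Opasnet model.

```r
# Hypothetical risks, chosen only for illustration.
risk_exposed   <- 0.03  # R_1, risk among the exposed
risk_unexposed <- 0.01  # R_0, risk among the non-exposed

# RR = R_1 / R_0
risk_ratio <- function(r1, r0) r1 / r0

# AF = (RR - 1) / RR, the attributable fraction among the exposed
attributable_fraction <- function(rr) (rr - 1) / rr

rr <- risk_ratio(risk_exposed, risk_unexposed)
af <- attributable_fraction(rr)
print(c(RR = rr, AF = af))
```

With these inputs RR is 3 and AF is about 0.67, i.e. about two thirds of the cases among the exposed are attributed to the exposure.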

Population attributable fraction

the fraction of cases among the whole population that would not have occurred had the exposure not taken place. The most useful formulas are

$$PAF = 1 - \frac{1}{\sum_{i=0}^k p_i RR_i}$$

for use with several population subgroups (typically with different exposure levels). It is not valid when confounding exists. Subscript i refers to the ith subgroup; p_i is the proportion of the source population in the ith subgroup, and RR_i is the risk ratio of the ith subgroup compared with the unexposed group (i = 0).
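The subgroup formula can be sketched in a few lines of R. The function name `paf_population` and the three exposure levels are assumptions made for illustration; the function simply evaluates 1 - 1/Σ p_i RR_i and, like the formula, ignores confounding.

```r
# PAF = 1 - 1 / sum(p_i * RR_i); p must sum to 1 over the subgroups.
paf_population <- function(p, rr) {
  stopifnot(length(p) == length(rr), abs(sum(p) - 1) < 1e-8)
  1 - 1 / sum(p * rr)
}

# Hypothetical exposure distribution: unexposed, medium, high.
p  <- c(0.5, 0.3, 0.2)   # proportion of the population in each subgroup
rr <- c(1.0, 1.5, 2.5)   # risk ratio of each subgroup vs. the unexposed

paf_multi <- paf_population(p, rr)

# With one exposed and one unexposed group the formula reduces to
# p(RR - 1) / (p(RR - 1) + 1).
paf_binary <- paf_population(c(0.8, 0.2), c(1, 3))
print(round(c(multi = paf_multi, binary = paf_binary), 3))
```

Here Σ p_i RR_i = 1.45, so the multi-level PAF is about 0.31.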

$$PAF = 1 - \sum_{i=0}^k \frac{p_{ci}}{RR_i} = \sum_i p_{ci} \frac{p_i(RR_i - 1)}{p_i(RR_i - 1) + 1}$$

which produces valid estimates when confounding exists, but its parameters are often not known. Here p_ci is the proportion of cases falling in subgroup i (so that Σ_i p_ci = 1). In the last expression, p_i is the fraction of exposed people within subgroup i (and 1 - p_i is the fraction of unexposed).
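A short R sketch of the case-based form PAF = 1 - Σ p_ci/RR_i follows. The function name `paf_cases` and all numbers are hypothetical. To have something to compare against, the example constructs p_ci for a population without confounding, where cases are distributed in proportion to p_i RR_i; in that special case the case-based and population-based forms should agree.

```r
# PAF = 1 - sum(p_ci / RR_i); p_c must sum to 1 over the exposure levels.
paf_cases <- function(p_c, rr) {
  stopifnot(length(p_c) == length(rr), abs(sum(p_c) - 1) < 1e-8)
  1 - sum(p_c / rr)
}

# Hypothetical exposure levels: unexposed, medium, high.
p  <- c(0.5, 0.3, 0.2)   # proportion of the population at each level
rr <- c(1.0, 1.5, 2.5)   # risk ratio of each level vs. the unexposed

# Assumption: no confounding, so the share of cases at level i
# is proportional to p_i * RR_i.
p_c <- p * rr / sum(p * rr)

paf_from_cases      <- paf_cases(p_c, rr)
paf_from_population <- 1 - 1 / sum(p * rr)
print(c(cases = paf_from_cases, population = paf_from_population))
```

In real data p_ci would be observed among the cases rather than derived from p_i, which is what makes this form usable with adjusted risk ratios.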

Etiologic fraction: the fraction of cases among the exposed that would have occurred later (if at all) had the exposure not taken place. It cannot be calculated without understanding the biological mechanism, but it always lies between $\frac{RR-1}{RR^{RR/(RR-1)}}$ and 1.

Rationale

Etiologic fraction

Etiologic fraction is defined as the fraction of cases that is advanced in time because of exposure.^[1] It can also be called the probability of causation, which has importance in court. Its exact value cannot be estimated directly from the risk ratio (RR), because some knowledge is needed about biological mechanisms (more precisely, the timing of disease). In any case, the etiologic fraction always lies between f and 1, where f is

$$f = \frac{RR - 1}{RR^{RR/(RR-1)}}.$$
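As a worked illustration of the lower bound (the RR values are arbitrary examples), take RR = 2 and RR = 1.5:

$$RR = 2: \quad AF = \frac{2 - 1}{2} = 0.5, \qquad f = \frac{2 - 1}{2^{2/(2-1)}} = \frac{1}{4} = 0.25$$

$$RR = 1.5: \quad AF = \frac{1.5 - 1}{1.5} \approx 0.333, \qquad f = \frac{1.5 - 1}{1.5^{1.5/(1.5-1)}} = \frac{0.5}{3.375} \approx 0.148$$

So for RR = 2 the etiologic fraction lies somewhere between 0.25 and 1, while the attributable fraction is 0.5. In both examples the lower bound f is smaller than AF.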

The code in the Calculations section below calculates the attributable fraction and the lower and upper bounds of the etiologic fraction for user-defined RRs.

Attributable fraction

Rockhill et al.^[2] give an extensive description of the different ways to calculate the attributable fraction (AF) and the population attributable fraction (PAF), and of the assumptions each approach needs. Modern Epidemiology^[3] is an authoritative textbook of epidemiology. Its authors first define the attributable fraction AF for a cohort of people (pages 295-297): it is the fraction of cases among the exposed that would not have occurred had the exposure not taken place.

$$AF = \frac{IP_1 - IP_0}{IP_1} = \frac{RR-1}{RR}$$

is an empirical approximation of

$$\frac{P(D) - \sum_C P(D \mid C, \bar{E}) P(C)}{P(D)}$$

where IP₁ is the cumulative proportion of the total population developing disease over a specified interval and IP₀ is the cumulative proportion of unexposed persons who develop disease over the interval. The formula is valid only when there is no confounding of the exposure(s) of interest. If the disease is rare over the time interval, the ratio of average incidence rates I₀/I₁ approximates the ratio of cumulative incidence proportions IP₀/IP₁, and thus the formula can be written as (I₁ - I₀)/I₁. Both formulations are found in many widely used epidemiology textbooks.

**Different ways to calculate population attributable fraction PAF**
⇤# : Which is formula 1? Same question for the other formula numbers in the descriptions below. --Arja (talk) 07:01, 8 April 2016 (UTC)

| Formula | Description |
|---|---|
| $\frac{p_e(RR-1)}{p_e(RR-1)+1}$ | Transformation of formula 1. Not valid when there is confounding of the exposure-disease association. p_e = proportion of source population exposed to the factor of interest. RR may be the ratio of two cumulative incidence proportions (risk ratio), two (average) incidence rates (rate ratio), or an approximation of one of these ratios. Found in many widely used epidemiology texts, but often with no warning that it is invalid when confounding exists. |
| $\frac{\sum_{i=0}^k p_i (RR_i - 1)}{1 + \sum_{i=0}^k p_i (RR_i - 1)} = 1 - \frac{1}{\sum_{i=0}^k p_i RR_i}$ | Extension of formula 2 for use with multicategory exposures. Not valid when confounding exists. Subscript i refers to the ith exposure level. p_i = proportion of source population in ith exposure level; RR_i = relative risk comparing ith exposure level with unexposed group (i = 0). Derived by Walter^[4]; given in Kleinbaum et al.^[5] but not in other widely used epidemiology texts. |
| $\sum_i p_{ci} \frac{p_i(RR_i - 1)}{p_i(RR_i - 1) + 1}$ | A useful formulation where p_ci is the proportion of cases falling in subgroup i (so that Σ_i p_ci = 1), p_i is the fraction of exposed people within subgroup i (and 1 - p_i is the fraction of unexposed), and RR_i is the risk ratio for subgroup i due to the subgroup-specific exposure level (assuming that everyone in that subgroup is exposed either to that level or not at all). |
| $p_c \left( \frac{RR-1}{RR} \right)$ | Alternative expression. Produces an internally valid estimate when confounding exists and when, as a result, adjusted relative risks must be used.^[6] p_c = proportion of cases exposed to the risk factor. In Kleinbaum et al.^[5] and Schlesselman.^[7] |
| $\sum_{i=0}^k p_{ci} \left( \frac{RR_i - 1}{RR_i} \right) = 1 - \sum_{i=0}^k \frac{p_{ci}}{RR_i}$ | Extension of formula 4 for use with multicategory exposures. Produces an internally valid estimate when confounding exists and when, as a result, adjusted relative risks must be used. p_ci = proportion of cases falling into ith exposure level; RR_i = relative risk comparing ith exposure level with unexposed group (i = 0). See Bruzzi et al.^[8] and Miettinen^[6] for discussion and derivations; in Kleinbaum et al.^[5] and Schlesselman.^[7] |

Impact of confounders

The problem with the two PAF equations (see Answer) is that the former needs inputs that are easier to collect but is not valid if there is confounding; it is still often used by mistake. The latter equation would produce an unbiased estimate, but the data it needs are harder to collect. Darrow and Steenland^[9] have studied how confounding biases the attributable fraction. This is their summary:

**The impact of confounding on the bias in attributable fraction.**

| Bias in attributable fraction | Confounding in RR | Confounding in inputs |
|---|---|---|
| AF bias (-), calculated AF is smaller than true AF | Conf RR (+), crude RR is larger than adjusted (true) RR | Confounder is positively associated with exposure and disease (++) |
| AF bias (-), calculated AF is smaller than true AF | Conf RR (+), crude RR is larger than adjusted (true) RR | Confounder is negatively associated with exposure and disease (--) |
| AF bias (+), calculated AF is larger than true AF | Conf RR (-), crude RR is smaller than adjusted (true) RR | Confounder is negatively associated with exposure and positively with disease (-+) |
| AF bias (+), calculated AF is larger than true AF | Conf RR (-), crude RR is smaller than adjusted (true) RR | Confounder is positively associated with exposure and negatively with disease (+-) |
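The first row of the summary (a confounder positively associated with both exposure and disease) can be illustrated with a small constructed population. Everything in this R sketch is hypothetical: a binary confounder C, the stratum sizes, exposure prevalences, baseline risks, and a causal RR of 2 that is the same in both strata. It compares the true PAF with the estimate that uses the population exposure prevalence p_e and with the estimate that uses the exposure prevalence among cases p_c.

```r
# Hypothetical population stratified by a binary confounder C.
strata <- data.frame(
  C         = c(0, 1),
  N         = c(5000, 5000),  # people per stratum
  p_exposed = c(0.2, 0.6),    # exposure prevalence (higher when C = 1)
  base_risk = c(0.01, 0.04)   # risk in the unexposed (higher when C = 1)
)
rr_true <- 2  # assumed causal (adjusted) RR, identical in both strata

n_exp       <- strata$N * strata$p_exposed
n_unexp     <- strata$N - n_exp
cases_exp   <- n_exp * strata$base_risk * rr_true
cases_unexp <- n_unexp * strata$base_risk
cases_total <- sum(cases_exp + cases_unexp)

# True PAF: share of cases that disappear if nobody is exposed.
cases_if_unexposed <- sum(strata$N * strata$base_risk)
paf_true <- (cases_total - cases_if_unexposed) / cases_total

p_e <- sum(n_exp) / sum(strata$N)   # exposure prevalence in the population
p_c <- sum(cases_exp) / cases_total # exposure prevalence among cases
crude_rr <- (sum(cases_exp) / sum(n_exp)) / (sum(cases_unexp) / sum(n_unexp))

paf_pe <- p_e * (rr_true - 1) / (p_e * (rr_true - 1) + 1)
paf_pc <- p_c * (rr_true - 1) / rr_true

print(round(c(crude_RR = crude_rr, true = paf_true,
              with_p_e = paf_pe, with_p_c = paf_pc), 3))
```

In this constructed case the crude RR (3.25) exceeds the adjusted RR (2), the p_e-based estimate (about 0.29) falls below the true PAF (about 0.34), and the p_c-based estimate reproduces the true value.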

Calculations

With this code, you can compare attributable fraction and lower and upper bounds of etiological fraction.


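library(OpasnetUtils)
# RR is user-defined; the fallback values here are placeholders (an assumption) for running the code without that input.
if (!exists("RR")) RR <- c(1.2, 2, 5)
# Attributable fraction and the bounds of the etiologic fraction for risk ratios x > 1.
AF <- function(x) data.frame(RR = x, AF = (x - 1) / x, EF_lower = (x - 1) / x^(x / (x - 1)), EF_upper = 1)

oprint(AF(RR))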

⇤# : UPDATE AF TO REFLECT THE CURRENT IMPLEMENTATION OF ERF Exposure-response function --Jouni (talk) 05:20, 13 June 2015 (UTC)


library(OpasnetUtils)

# UPDATE AF TO REFLECT THE CURRENT IMPLEMENTATION OF ERF [[Exposure-response function]]
### ESTIMATES OF ATTRIBUTABLE CASES BASED ON ATTRIBUTABLE FRACTION
# We estimate the number of cases and their attributable causes based on [[Population attributable fraction]].

AF <- Ovariable("AF", # Cases attributed to specific (combinations of) causal exposures.
	dependencies = data.frame(Name = c(
		"ERF", # Exposure-response function
		"exposure", # Total exposure to an agent or pollutant
		"frexposed", # fraction of population that is exposed
		"bgexposure" # Background exposure to an agent (a level below which you cannot get in practice)
	)),
	
	formula = function(...) {

		# First calculate risk ratio and remove redundant columns because they cause harm when operated with itself.
		RR <- frexposed * exp(log(ERF) * (exposure - bgexposure)) - frexposed + 1
		PAF <- (RR - 1) / unkeep(RR, sources = TRUE, prevresults = TRUE)

		# pollutants is a vector of pollutants considered.
		pollutants <- as.character(unique(exposure@output$Pollutant))

		expname <- paste(exposure@name, "Result", sep = "")

		out <- 1
		for(i in seq_along(pollutants)) {
			
			# Attributable fraction of a particular pollutant is combined with all pollutant AFs.
			# The combination has 2^n rows (n = number of pollutants). Pollutant is either + or - depending on
			# whether it caused the disease or not.
			temp <- Ovariable("temp", data = data.frame(
				Pollutant = pollutants[i], 
				Temp1 = c(paste(pollutants[i], "-", sep = ""), paste(pollutants[i], "+", sep = "")), 
				Result = c(-1, 1) # Non-causes are temporarily marked with negative numbers.
			))
			temp <- temp * PAF

			# Non-causes are given the remainder (1-AF) of temporary attributable fraction AF.
			result(temp) <- ifelse(result(temp) > 0, result(temp), 1 + result(temp))
			# Causes with 0 AF are marked 1. This must be corrected.
			result(temp) <- ifelse(result(temp) == 1 & grepl("\\+", temp@output$Temp1), 0, result(temp))

			#If exists, the exposureResult is renamed so that it can be kept without side effects.
			#These should not be marginals but there seems to be problems in this respect.
			if(expname != "Result"){
				colnames(temp@output)[colnames(temp@output) == expname] <- paste("expo", pollutants[i], sep = "")
			}
			
			out <- out * temp
			out <- unkeep(out, cols = "Pollutant", sources = TRUE, prevresults = TRUE)

			# Combine and rename columns.
			if(i == 1) {
				colnames(out@output)[colnames(out@output) == "Temp1"] <- "Causes"
			} else {
				out@output$Causes <- paste(out@output$Causes, out@output$Temp1)
				out@output$Temp1 <- NULL
			}
		}
		return(out)
	}
)

objects.store(AF)
cat("Ovariable AF stored.\n")
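The combination step described in the comments of the code above (2^n rows for n pollutants, each pollutant marked + or -) can be sketched in base R without OpasnetUtils. This is not the ovariable implementation; the function name `combine_causes` and the PAF values are hypothetical, and the sketch assumes that the pollutant-specific fractions combine multiplicatively, as the `out * temp` step appears to do.

```r
# For each +/- combination of pollutants, multiply PAF_j for causes (+)
# and 1 - PAF_j for non-causes (-).
combine_causes <- function(paf) {
  grid <- expand.grid(lapply(paf, function(x) c("-", "+")),
                      stringsAsFactors = FALSE)
  fraction <- apply(grid, 1, function(s) prod(ifelse(s == "+", paf, 1 - paf)))
  causes <- apply(grid, 1, function(s) paste(paste0(names(paf), s), collapse = " "))
  data.frame(Causes = causes, Fraction = fraction, row.names = NULL)
}

# Hypothetical pollutant-specific attributable fractions.
res <- combine_causes(c(PM = 0.10, NO2 = 0.05))
print(res)
```

The fractions over all combinations sum to 1, so every case is assigned to exactly one combination of causes, including the combination in which no pollutant is a cause.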

Derivation of PAF

⇤# : Do we need this section? --Jouni (talk) 20:53, 7 April 2016 (UTC)

The population attributable fraction PAF is that fraction among the whole cohort:

$$
\begin{aligned}
PAF &= \frac{N_1 (R_1 - R_0)}{N_1 R_1 + N_0 R_0} = \frac{N_1 (R_1 - R_0)/R_0}{N_1 R_1/R_0 + N_0 R_0/R_0} = \frac{N_1 (RR - 1)}{N_1 RR + N_0} \\
&= \frac{ \frac{N_1 (RR - 1)}{N_1 + N_0} }{ \frac{N_1 RR + N_0}{N_1 + N_0}} = \frac{ p (RR - 1) }{ \frac{N_1 RR - N_1 + (N_1 + N_0)}{N_1 + N_0}} = \frac{p (RR - 1)}{p RR - p + 1} = \frac{p (RR - 1)}{p (RR - 1) + 1},
\end{aligned}
$$

where

N₁ and N₀ are the numbers of exposed and unexposed people, respectively,
R₁ and R₀ are the risks of disease in the exposed and unexposed group, respectively, and RR = R₁ / R₀,
p is the fraction of exposed people among the whole cohort.

Note that there is a typo in the Modern Epidemiology book: the denominator should be p(RR-1)+1, not p(RR-1)-1.
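The derivation can be checked numerically. The cohort sizes and risks in this R sketch are hypothetical; it evaluates the count-based starting point and the final expression in p and RR, and also the variant with -1 in the denominator to show why that form cannot be right.

```r
# Hypothetical cohort.
n1 <- 2000; n0 <- 8000   # numbers of exposed and unexposed people
r1 <- 0.03; r0 <- 0.01   # risks in the exposed and unexposed

rr <- r1 / r0            # RR = R_1 / R_0
p  <- n1 / (n1 + n0)     # fraction of exposed people in the cohort

paf_counts <- n1 * (r1 - r0) / (n1 * r1 + n0 * r0)
paf_p      <- p * (rr - 1) / (p * (rr - 1) + 1)
paf_minus  <- p * (rr - 1) / (p * (rr - 1) - 1)  # denominator with -1

print(round(c(counts = paf_counts, p_form = paf_p, minus_form = paf_minus), 3))
```

The first two values agree (about 0.286), whereas the form with -1 in the denominator gives a negative number here, which is not a meaningful fraction of cases for a harmful exposure.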

Population attributable fraction can be calculated as a weighted average based on subgroup data:

$$PAF = \sum_i p_{ci} PAF_{i},$$

where

p_ci is the proportion of cases falling in stratum (subgroup) i,
PAF_i is the population attributable fraction calculated for the subgroup.

Specifically, we can divide the cohort into subgroups based on exposure (in the simplest case exposed and unexposed), so we get

$$PAF = p_c \frac{1(RR - 1)}{1(RR - 1) + 1} + (1 - p_c) \frac{0(RR - 1)}{0(RR - 1) + 1} = p_c \frac{RR - 1}{RR},$$

where p_c is the proportion of cases in the exposed group among all cases; this is the same as exposure prevalence among cases.

WHO approach

According to WHO,^[10] PAF is

$$PAF = \frac{\sum_{i=0}^k P_i RR_i - \sum_{i=0}^k P'_i RR_i}{\sum_{i=0}^k P_i RR_i}$$

where i is a certain exposure level, P is the fraction of population in that exposure level, RR is the relative risk at that exposure level, and P' is the fraction of population in a counterfactual ideal situation (where the exposure is typically lower).

Based on this, we can limit our examination to a situation where there are only two population groups, one exposed to background level (with relative risk 1) and the other exposed to a higher level (with relative risk RR). In the counterfactual situation nobody is exposed. Thus, we get

$$
\begin{aligned}
PAF &= \frac{(P \cdot RR + (1-P) \cdot 1) - (0 \cdot RR + 1 \cdot 1)}{P \cdot RR + (1-P) \cdot 1} \\
&= \frac{P \cdot RR - P}{P \cdot RR + 1 - P} \\
&= \frac{P(RR - 1)}{P(RR - 1) + 1}
\end{aligned}
$$

This equation is used in e.g. Health impact assessment.
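The general formula with a counterfactual exposure distribution can be sketched in R as follows. The function name `paf_who`, the three exposure levels, and the counterfactual distribution are illustrative assumptions; the second call reproduces the two-group special case derived above.

```r
# PAF = (sum(P_i RR_i) - sum(P'_i RR_i)) / sum(P_i RR_i)
paf_who <- function(p, p_cf, rr) {
  stopifnot(length(p) == length(rr), length(p_cf) == length(rr))
  (sum(p * rr) - sum(p_cf * rr)) / sum(p * rr)
}

# Hypothetical exposure levels: background, medium, high.
rr   <- c(1.0, 1.5, 2.5)
p    <- c(0.5, 0.3, 0.2)  # current distribution
p_cf <- c(0.7, 0.3, 0.0)  # counterfactual: the high level moved to background

paf_shift <- paf_who(p, p_cf, rr)

# Two groups, nobody exposed in the counterfactual: P(RR - 1) / (P(RR - 1) + 1).
P <- 0.2; RR <- 3
paf_two_groups <- paf_who(c(1 - P, P), c(1, 0), c(1, RR))
print(round(c(shift = paf_shift, two_groups = paf_two_groups), 3))
```

Unlike the two-group formula, the general form also covers counterfactuals in which exposure is only reduced, as in the first call.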

Constant background assumption

⇤# : Is this section necessary? --Jouni (talk) 20:53, 7 April 2016 (UTC)

p_ci can be calculated for each subgroup with the following equation if the background risk of disease is equal in all subgroups (and thus cancels out):

$$p_{ci} = \frac{N_i \prod_j RR_{i,j}}{\sum_i N_i \prod_j RR_{i,j}},$$

where

N_i is the number of people in each subgroup i,
RR_i,j is the risk ratio in subgroup i due to pollutant j (accounting for the estimated exposure in the subgroup). Note that this assumes that the effects of different pollutants combine multiplicatively.

This page does not contain R code. Instead, it is written as part of the model in Health impact assessment.

p_c can be calculated by first calculating number of cases in each subgroup:

$$\text{cases}_i = N_i \cdot \text{background} \cdot \prod_j e^{\ln(ERF_j) \, \text{exposure}_{i,j}},$$

where

cases_i is the number of cases in subgroup i,
N_i is the number of people in subgroup i,
background is the background risk of the disease in the unexposed; we assume that it is the same in all subgroups,
ERF_j is the risk ratio per unit exposure for pollutant j (if the exposure-response function ERF takes a form other than a relative risk, i.e. exponential, then other equations must be used),
exposure_i,j is the amount of exposure in a subgroup i to pollutant j.

Therefore,

$$
\begin{aligned}
p_{ci} &= \frac{\text{cases}_i}{\sum_i \text{cases}_i} = \frac{N_i \cdot \text{background} \cdot \prod_j e^{\ln(ERF_j) \, \text{exposure}_{i,j}}}{\text{background} \sum_i N_i \prod_j e^{\ln(ERF_j) \, \text{exposure}_{i,j}}} \\
&= \frac{N_i \prod_j RR_{i,j}}{\sum_i N_i \prod_j RR_{i,j}},
\end{aligned}
$$

where RR_i,j = exp(ln(ERF_j) exposure_i,j).
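As a rough sketch of this calculation in base R, the snippet below computes RR_i,j, the case counts, and p_ci for three subgroups and two pollutants. The subgroup names, population sizes, ERF values, exposures, and background risk are all hypothetical, and the pollutant effects are assumed to be multiplicative as stated above.

```r
# Hypothetical inputs.
n   <- c(A = 1000, B = 3000, C = 6000)  # N_i, people per subgroup
erf <- c(PM = 1.10, NOx = 1.02)         # ERF_j, risk ratio per unit exposure
exposure <- matrix(c(5, 10, 2,          # exposure_{i,j}: PM column
                     0,  1, 3),         # NOx column
                   nrow = 3, dimnames = list(names(n), names(erf)))
background <- 0.01                      # same background risk in every subgroup

# RR_{i,j} = exp(ln(ERF_j) * exposure_{i,j}); sweep() scales column j by ln(ERF_j).
rr_ij <- exp(sweep(exposure, 2, log(erf), "*"))
rr_i  <- apply(rr_ij, 1, prod)          # product over pollutants j

cases <- n * background * rr_i          # cases_i
p_c   <- n * rr_i / sum(n * rr_i)       # p_ci, with the background cancelled

print(round(cbind(cases = cases, p_c = p_c, p_c_from_cases = cases / sum(cases)), 3))
```

The last two columns are identical, which shows that the constant background risk cancels out of p_ci.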

In addition, if only a fraction p of the population is exposed, for the whole population we get

$$
\begin{aligned}
RR &= \frac{p \cdot N \cdot \text{background} \cdot RR_{exposed} + (1-p) \cdot N \cdot \text{background} \cdot RR_{unexposed}}{N \cdot \text{background} \cdot RR_{unexposed}} \\
&= \frac{p \, e^{\ln(ERF) \, \text{exposure}} + (1-p) \cdot 1}{1} = p \, e^{\ln(ERF) \, \text{exposure}} - p + 1
\end{aligned}
$$
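A small R sketch, with hypothetical values for ERF, exposure, and p, shows how this population-level RR leads to the same PAF as the formula p(RR - 1)/(p(RR - 1) + 1). It has the same form as the `RR` and `PAF` lines of the AF ovariable in the Calculations section, with background exposure taken as zero here.

```r
# Hypothetical inputs.
erf      <- 1.08  # risk ratio per unit exposure
exposure <- 10    # exposure level among the exposed
p        <- 0.3   # fraction of the population that is exposed

rr_exposed <- exp(log(erf) * exposure)  # RR among the exposed
rr_pop     <- p * rr_exposed - p + 1    # RR of the whole population vs. nobody exposed

paf_from_rr_pop <- (rr_pop - 1) / rr_pop
paf_direct      <- p * (rr_exposed - 1) / (p * (rr_exposed - 1) + 1)

print(c(RR_population = rr_pop, PAF = paf_from_rr_pop, PAF_direct = paf_direct))
```

The two PAF values coincide because rr_pop - 1 equals p(RR_exposed - 1).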

References

[1] Robins JM, Greenland S. Estimability and estimation of excess and etiologic fractions. Statistics in Medicine. 1989;8:845-859.
[2] Rockhill B, Newman B, Weinberg C. Use and misuse of population attributable fractions. American Journal of Public Health. 1998;88(1):15-19.
[3] Rothman KJ, Greenland S, Lash TL. Modern Epidemiology. Lippincott Williams & Wilkins; 2008. 758 pages.
[4] Walter SD. The estimation and interpretation of attributable fraction in health research. Biometrics. 1976;32:829-849.
[5] Kleinbaum DG, Kupper LL, Morgenstern H. Epidemiologic Research. Belmont, Calif: Lifetime Learning Publications; 1982:163.
[6] Miettinen O. Proportion of disease caused or prevented by a given exposure, trait, or intervention. Am J Epidemiol. 1974;99:325-332.
[7] Schlesselman JJ. Case-Control Studies: Design, Conduct, Analysis. New York, NY: Oxford University Press Inc; 1982.
[8] Bruzzi P, Green SB, Byar DP, Brinton LA, Schairer C. Estimating the population attributable risk for multiple risk factors using case-control data. Am J Epidemiol. 1985;122:904-914.
[9] Darrow LA, Steenland NK. Confounding and bias in the attributable fraction. Epidemiology. 2011;22(1):53-58. doi:10.1097/EDE.0b013e3181fce49b
[10] WHO: Health statistics and health information systems. Accessed 16 Nov 2013.
