Value of information

<section begin=glossary />
<section begin=glossary />Value of information (VOI) in decision analysis is the amount a decision maker would be willing to pay for information prior to making a decision.[1] Value of information is specific to a combination of a particular decision with several options, a particular objective (i.e., an outcome of interest that can be quantitatively estimated), and a particular issue that is affected by the decision and is relevant for the objective. If all such issues are considered at the same time, we talk about the expected value of perfect information.<section end=glossary />

Question

How can value of information be calculated in an assessment in such a way that

• it helps in understanding the impacts of uncertainties on conclusions and
• it helps to direct further assessment efforts to improve guidance to decision making?

The code below is an example of how the VOI (value of information) function can be used. The function itself is defined in the code under the title Input.

```
library(OpasnetUtils) # The package of Opasnet functionalities
objects.latest("Op_en2480", code_name = "Initiate") # Load VOI functions

# Define the example ovariable.
ova <- EvalOutput(Ovariable(
  name = "ova",
  data = data.frame(C1 = c("A", "B"), Result = c("1-3", 2))
))

# Define decisions to be applied
decisions <- data.frame(
  Decision = "D1",
  Option = c("BAU", "D1b", "D1b"),
  Variable = "ova",
  Cell = c("C1:A", "C1:A", "C1:B"),
  Change = "Add",
  Result = c(0, "-0.8-0.4", "-0.6-0.9")
)

# Apply decisions
DecisionTableParser(decisions)
ova <- CheckDecisions(ova)

# Define stakeholder preferences
stakeholders <- data.frame(
  Decision = "Stakeholder",
  Option = c("General", "A-promoters", "A-promoters"),
  Variable = "ova",
  Cell = c("C1:A", "C1:A", "C1:B"),
  Change = "Multiply",
  Result = c(1, 2, 0.5)
)

# Apply stakeholder preferences
DecisionTableParser(stakeholders)
ova <- CheckDecisions(ova)

cat("Data for making ova, the ovariable in this example,\n")
oprint(ova@data)
cat("Decision applied on ova.\n")
oprint(decisions)
cat("Stakeholder preferences applied on ova.\n")
oprint(stakeholders)
cat("The structure of ova output. D1 is a decision, C1 an index, Iter is the number of random sample,",
    " and ovaResult is the actual result of the ovariable.\n")
oprint(head(ova@output))

voi <- VOI(ova, decision = "D1", indices = "C1", stakeholder = "Stakeholder") # Actual function
cat("VOI table. Result column shows the VOI, Var shows what was calculated, and other columns are intermediate results.\n")
oprint(voi) # Print the output.
```

Rationale

Input

To calculate value of information, you need

• a decision to be made with at least two different options a decision maker can choose from,
• an objective (i.e., outcome of interest or indicator) that can be quantitatively estimated and optimised,
• an optimising function to be used as the criterion for the best decision,
• an uncertain variable of interest (optional, needed only if partial VOI is calculated for the variable; if omitted, combined value of information is estimated for all uncertain variables in the assessment model).

```
library(OpasnetUtils)
library(ggplot2)

Optimize <- function(dat, indices = NULL, keepindex = TRUE, bins = 5) {
  # dat is an ovariable.
  # indices is a vector of indices to save in the output. If NULL, only one value is kept.
  # keepindex = TRUE keeps the indices in the output.
  # bins is the number of bins used when a continuous numeric index is converted to a factor.
  # If bins == NA, indices are not binned. Iter is never binned.
  # Optimize finds the smallest value of dat for each unique group defined by indices.
  dat@output <- dropall(dat@output[!is.na(result(dat)), ]) # Drop redundant rows and locations
  if (is.null(indices)) {
    ind <- rep(1, length(result(dat)))
  } else {
    ind <- dat@output[indices]
    for (i in 1:length(ind)) {
      if (is.numeric(ind[[i]]) & colnames(ind)[i] != "Iter" & !is.na(bins)) {
        ind[[i]] <- cut(ind[[i]], bins)
      }
    }
  }
  minrow <- tapply(result(dat), ind, which.min) # Find the row with the smallest value of each unique group
  rownum <- tapply(1:nrow(dat@output), ind, list) # Order row numbers according to unique groups.
  temp <- NULL
  for (i in 1:length(rownum)) {
    temp <- c(temp, rownum[[i]][minrow[i]]) # Pick the row with the minimum of each unique group
  }
  temp <- temp[!is.na(temp)]
  keep <- colnames(dat@output)
  if (!keepindex) {
    keep <- keep[!keep %in% indices]
    if (is.null(indices)) {keep <- comment(result(dat))}
  }
  dat@output <- dat@output[temp, ]
  dat@output <- dat@output[keep]
  dat@marginal <- colnames(dat@output) %in% indices
  return(dat)
}

Aggregate <- function(dat, indices = NULL, fun = mean, name = NULL, bins = 5) {
  # dat is an ovariable.
  # indices is a vector of indices to save in the output.
  # fun is the function to use with aggregation.
  # name is the name to be used with the new ovariable.
  # bins is the number of bins used when a continuous numeric index is converted to a factor.
  # If bins == NA, indices are not binned. Iter is never binned.
  # Aggregate collapses an ovariable in a similar way as the generic function aggregate.
  dat@output <- dropall(dat@output[!is.na(result(dat)), ]) # Drop redundant rows and locations
  if (is.null(indices)) {
    ind <- data.frame(Temp = rep(1, length(result(dat))))
  } else {
    ind <- dat@output[indices]
    for (i in 1:length(ind)) {
      if (is.numeric(ind[[i]]) & colnames(ind)[i] != "Iter" & !is.na(bins)) {
        ind[[i]] <- cut(ind[[i]], bins)
      }
    }
  }
  temp <- as.data.frame(as.table(tapply(result(dat), ind, fun)))
  temp <- temp[colnames(temp) != "Temp"]
  if (is.null(name)) {
    colnames(temp)[colnames(temp) == "Freq"] <- comment(result(dat))
  } else {
    colnames(temp)[colnames(temp) == "Freq"] <- paste(name, "Result", sep = "")
    dat@name <- name
  }
  dat@output <- temp
  dat@marginal <- colnames(dat@output) %in% indices
  return(dat)
}

VOI <- function(
  ova,
  decision,
  indices = colnames(ova@output)[ova@marginal],
  bins = 5,
  stakeholder = NULL
) {
  # ova is an ovariable.
  # decision is an index that is used as the decision to be considered.
  # indices are the indices whose partial values of information are estimated.
  # bins is the number of bins used when a continuous numeric index is converted to a factor.
  # If bins == NA, indices are not binned. Iter is never binned.
  # VOI calculates
  # a) expected value of perfect information,
  # b) expected value of including an option, for all options of the decision,
  # c) expected value of partial perfect information, for all marginal indices or indices given by the user.

  # ncuu = net cost under uncertainty
  ncuu <- Optimize(Aggregate(ova, c(decision, stakeholder), name = "ncuu", bins = bins), stakeholder) #, keepindex = FALSE)
  # ncpi = net cost with perfect information
  ncpi <- Aggregate(
    Optimize(
      Aggregate(ova, c("Iter", decision, stakeholder), name = "ncpi", bins = bins),
      c("Iter", stakeholder)),
    stakeholder
  )
  evpi <- ncpi - ncuu
  # evio = expected value of including an option
  evio <- data.frame()
  # ncpieo = net cost with perfect information but excluding an option
  for (i in levels(ova@output[, decision])) {
    temp <- ova
    temp@output <- temp@output[temp@output[, decision] != i, ]
    ncpieo <- ncpi - Aggregate(
      Optimize(
        Aggregate(temp, c("Iter", decision, stakeholder), name = "ncpieo", bins = bins),
        c("Iter", stakeholder)),
      stakeholder)
    evio <- rbind(evio, data.frame(Var = paste("Excluding", i), ncpieo@output))
  }
  out <- orbind(data.frame(Var = "EVPI", evpi@output), evio)
  if (length(indices) == 0) {return(out)}
  evppi <- data.frame()
  # ncppi = net costs of partial perfect information
  for (i in 1:length(indices)) {
    temp <- Aggregate(
      Optimize(
        Aggregate(ova, c(decision, indices[i], stakeholder), name = "ncppi", bins = bins),
        c(indices[i], stakeholder)
      ),
      stakeholder
    ) - ncuu
    evppi <- rbind(evppi, data.frame(Var = paste("EVPPI:", indices[i]), temp@output))
  }
  out <- orbind(out, evppi)
  return(out)
}

objects.store(Optimize, Aggregate, VOI)
cat("Functions Optimize, Aggregate, and VOI stored.\n")
```

Output

Value of information, i.e., the amount of money that the decision-maker is, in theory, willing to pay to obtain a piece of information. Value of information can also be measured in units other than money, e.g. in disability-adjusted life years if only health impacts are considered.

• See Decision theory, Value of information, and Expected value of perfect information in Wikipedia.
• There are different kinds of indicators under value of information, depending on what level of information is compared with the current situation:
EVPI
Expected value of perfect information (everything is known perfectly)
EVPPI
Expected value of partial perfect information (one variable is known perfectly, otherwise current knowledge)
EVII
Expected value of imperfect information (things are known better but not perfectly)
EVPII
Expected value of partial imperfect information (one variable is known better but not perfectly, otherwise current knowledge)
EVIU
Expected value of including uncertainty (a decision analysis can ignore uncertainties and go with expected value of each variable, or include uncertainty and propagate that through the model. There is a difference especially if the uncertainty distributions are skewed.)
EVIO
Expected value of including an option (is there any value of including a non-optimal decision option in the final assessment?)
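To make these indicators concrete, here is a minimal Monte Carlo sketch (the option names, cost functions, and distribution are hypothetical, chosen only for illustration; the article's own R implementation is given under Input; Python is used here for a self-contained example). Because the cost is nonlinear in a skewed variable, both EVPI and EVIU come out positive:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical two-option decision: net costs depend on an uncertain,
# right-skewed quantity X; smaller cost is better.
x = rng.lognormal(mean=-0.5, sigma=1.0, size=100_000)  # E[X] = 1, skewed
cost = {"BAU": np.full_like(x, 2.0),  # fixed cost, unaffected by X
        "act": 0.5 + x**2}            # nonlinear in X

# Net cost under uncertainty: choose the option with the best expectation.
ncuu = min(c.mean() for c in cost.values())

# Net cost with perfect information: choose the best option in every iteration.
ncpi = np.minimum(cost["BAU"], cost["act"]).mean()
evpi = ncuu - ncpi  # these are costs, so information shows up as savings

# EVIU: ignoring uncertainty (plugging in E[X]) makes "act" look cheaper
# (0.5 + 1^2 = 1.5 < 2), but the skewed distribution makes it worse on average.
plug_in = {"BAU": 2.0, "act": 0.5 + x.mean() ** 2}
best_ignoring = min(plug_in, key=plug_in.get)
eviu = cost[best_ignoring].mean() - ncuu

print(f"EVPI = {evpi:.2f}, EVIU = {eviu:.2f}")
```

Note how EVIU appears only because the distribution of X is skewed and the cost is nonlinear; with a symmetric, linear model the plug-in decision and the expectation-optimising decision would coincide.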

An example output from an ovariable ova with a decision index D1 (with options BAU and D1b) and one index C1. Ncuu means net cost under uncertainty. The other values in the table are also costs, so negative values are savings. Syntax used:

```VOI(ova, "D1", indices = "C1")
```
```
Var   evpiResult  ncuuResult  Result         evioResult  evppiResult
EVPI  2.395021    2.482161    -8.714023e-02  NA          NA
BAU   2.395021    NA          -1.542299e-01  2.549250    NA
D1b   2.395021    NA          -8.714023e-02  2.482161    NA
C1    NA          2.482161    4.440892e-16   NA          2.482161
```

Management

A previous Analytica version of VOI calculation is archived. The related model file is File:VOI analysis.ANA.

Impact of a strong correlation between the decision and a variable

There is a problem with the approach that uses the decision as a random variable. The problem occurs with variables that are strongly correlated with the decision variable. The iterations are categorised into "VOI bins" based on the variable to be studied. In addition, iterations are categorised into "decision bins" based on the value of the decision variable. The idea is to study one VOI bin at a time and find the best decision bin within that VOI bin. If the best decision differs between VOI bins, there is some value in knowing to which VOI bin the true value of the variable belongs. However, if the variable correlates strongly with the decision, it may happen that all iterations in a particular VOI bin are also in a particular decision bin. Then it is impossible to compare different decision bins to find out which decision is best in that VOI bin.
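This failure mode can be demonstrated with a small simulation (the variable and decision rule below are hypothetical; Python is used for brevity). When the simulated decision is a deterministic function of the variable, every VOI bin contains iterations from only one decision option, so no within-bin comparison is possible:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
x = rng.normal(size=n)                     # uncertain variable of interest
decision = (x > np.median(x)).astype(int)  # decision strongly correlated with x

# Categorise iterations into four "VOI bins" by the value of x.
edges = np.quantile(x, [0.25, 0.5, 0.75])
voi_bin = np.digitize(x, edges)

# Count iterations per (VOI bin, decision bin) cell.
counts = np.array([[np.sum((voi_bin == b) & (decision == d)) for d in (0, 1)]
                   for b in range(4)])
print(counts)  # every VOI bin has one empty decision bin
```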

This problem can be overcome by assessing counter-factual worlds, because then there is always the same number of iterations in every decision bin. The conclusion is that VOI analysis using decisions as random variables is a simple and quick screening method, but it cannot be reliably used for a final VOI analysis. In contrast, the counter-factual assessment is the method of choice for that.

Originally developed by Jouni Tuomisto and Marko Tainio, National Public Health Institute (KTL), Finland, 2005. The screening version was developed by Jouni Tuomisto, National Institute for Health and Welfare (THL), 2009. (c) CC-BY-SA.

(Figure not shown: thumbnail unavailable.) This test run shows that the VOI estimates stabilise only when more than 17 bins are used. The number of iterations was 10000.

Value of information score

The VOI score is the current expected value of perfect information (EVPI) for that variable in an assessment where it is used. If the variable is used in several assessments, the score is the sum of EVPIs across all assessments.

Background

Value of information (VOI) is a decision analysis method that estimates the benefits of collecting additional information. Yokota and Thompson (2004a) described the VOI method as "…a decision analytic technique that explicitly evaluates the benefits of collecting additional information to reduce or eliminate uncertainty." [2] The term value of information covers a number of different analyses with different requirements and objectives.

To be able to perform a value of information analysis, the researcher needs to define the possible decision options, the consequences of each option, and the uncertainty of each input variable. With the VOI method, the researcher can estimate the effect of additional information on decision making and guide the further development of the model. Thus, VOI analysis can be used as a sensitivity analysis tool.

This review briefly considers the different VOI methods, the requirements of the analysis, the mathematical background, and applications. At the end, a short summary of the previously published VOI reviews by Yokota and Thompson (2004a, 2004b) [2] [3] is provided.

A family of analyses

The term value of information analysis covers a number of different decision analyses. The expected value of perfect information (EVPI) analysis estimates the value of completely eliminating uncertainty from a particular decision. The EVPI analysis does not consider the sources of uncertainty, but how much the decision would benefit if uncertainty were removed. The VOI of a particular input variable X can be analysed with an expected value of perfect X information (EVPXI) (or expected value of partial perfect information, EVPPI) analysis. The sum of all individual EVPXIs from all input variables is always less than the EVPI.

Situations where the uncertainty of a decision can be reduced to zero are exceptional, especially in the field of environmental health. Therefore, the results of EVPI and EVPXI analyses should be treated as the maximum gain that could be achieved by reducing uncertainty. For a more realistic approach, the expected value of sample information (EVSI) and expected value of sample X information (EVSXI) (or partial imperfect, i.e. EVII and EVPII, respectively) analyses can be used to estimate the value of reducing the uncertainty of the model to a certain level, or of reducing the uncertainty of a certain input variable to a certain level, respectively. The use of these two analyses increases the requirements of the model, since the targeted uncertainty level must be defined. The expected value of including uncertainty (EVIU) evaluates the effect of uncertainty in specific decision problems and is outside the scope of this review.

Estimating the value of information

VOI analyses estimate the difference between the expected utility of the optimal decision given new information and the expected utility of the optimal decision given current information. A complete review of the different mathematical solutions is beyond the scope of this review, and thus only the EVPI is presented here. For those interested in knowing more, the book Uncertainty: A guide to dealing with uncertainty in quantitative risk and policy analysis (Morgan and Henrion 1992) [4] and the methodological review by Yokota and Thompson (2004b) [3] describe in more detail the mathematical background of the different VOI analyses and the solutions used in past analyses.

EVPI is calculated using the following equation:

```EVPI = E(Max(U(di,θ))) - Max(E(U(di,θ))),
```

where E = expectation over the uncertain parameters θ, Max = maximum over the decision options i, and U = utility of decision di (i.e., the value of the outcome after a particular decision option i is chosen, measured in money, DALYs, or another quantitative metric covering all relevant impacts).
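The equation can be checked with a minimal Monte Carlo sketch (a hypothetical two-option utility model, in Python for brevity): the first term optimises within each realisation of θ, the second optimises the expectations.

```python
import numpy as np

rng = np.random.default_rng(0)
theta = rng.normal(size=50_000)            # uncertain parameter
U = np.column_stack([np.ones_like(theta),  # U(d1, theta): constant utility 1
                     2.0 * theta])         # U(d2, theta): depends on theta

evpi = U.max(axis=1).mean() - U.mean(axis=0).max()  # E(Max(U)) - Max(E(U))
print(round(evpi, 2))
```

EVPI is non-negative by construction: knowing θ before choosing can never make the optimal choice worse on average.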

The general formula for EVPII is:

```EVPII = Eθ2(U(Max(Eθ2(U(di,θ2))),θ2)) - Eθ2(U(Max(Eθ1(U(di,θ1))),θ2)),
```

where θ1 is the prior information and θ2 is the posterior (improved) information. EVPPI can be calculated with the same formula in the case where P(θ2)=1 if and only if θ2=θ1. If θ includes all variables of the assessment, the formula gives total, not partial, value of information.

The interpretation of the formula is the following (starting from the innermost parenthesis). The utility of each decision option di is estimated in the world of uncertain variables θ. Expectation over θ is taken (i.e. the probability distribution is integrated over θ), and the best option i of d is selected. The point is that in the first part of the formula, θ is described with the better posterior information, while the latter part is based on the poorer prior information. Once the decision has been made, the expected utility is estimated again based on the better posterior information in both the first and second part of the formula. Finally, the difference between the utility after the better and poorer information, respectively, gives the value of information.
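The partial case can also be sketched numerically (hypothetical variables, Python for brevity): the variable of interest is binned, the inner expectation is taken within each bin — mirroring the bin-based approach of the VOI function under Input — and the result approximates the value of learning that variable perfectly.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
a = rng.normal(size=n)   # variable whose partial value of information is sought
b = rng.normal(size=n)   # remaining uncertainty
U = np.column_stack([np.zeros(n), a + b])  # utilities of two options

# Current information: optimise the overall expectation.
baseline = U.mean(axis=0).max()

# Perfect information on `a` only: bin a, optimise within each equal-count bin
# (the expectation over b is approximated by the within-bin average).
edges = np.quantile(a, np.linspace(0, 1, 21))
idx = np.digitize(a, edges[1:-1])  # 20 bins, labelled 0..19
evppi = np.mean([U[idx == k].mean(axis=0).max() for k in range(20)]) - baseline
print(round(evppi, 2))
```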

The set-up of the analyses

To be able to perform a VOI analysis, a modeller needs information on (i) the available decision options, (ii) the consequences of each option, and (iii) the uncertainties and reliability of the data. In addition, both the gains and losses of the options must be quantified with a common metric (monetary or non-monetary). In the following, these requirements are discussed in more detail.

The first requirement for a VOI analysis is that the available options have been defined. In the economic literature, the decision is usually e.g. whether or not to invest. In the field of environmental health, the decisions could be e.g. choices between different control technologies or between available regulations. In the ideal case, the possible options have been defined explicitly by the authorities or by the customer of the study. More often, the available options are defined during the risk assessment process, and risk communication plays a crucial part in identifying the different options. In purely academic research, the possible options can be defined by the modeller or the modelling team.

The second requirement is that the consequences of each possible option must be defined (e.g. the effect of a control technology on emissions and consequently on human health). A number of methods, such as DPSEEA or IEHIA, have been used in the field of environmental health to identify and define the causal connections.

The third requirement is that the uncertainties and reliability of the data have been defined explicitly in the model. Again, in the ideal case the uncertainties of the data have been defined, or the data are available so that the modeller can assess the uncertainties. In reality, the data are sparse, and the uncertainties must be assessed based on e.g. two different point estimates reported in different studies. Expert elicitation [5] and similar methods are available for defining the uncertainties explicitly. In the absence of data, the modeller's choice (author judgement) can be used to estimate the uncertainties.

The outcomes of the actions must be quantified with a monetary or non-monetary metric. In economic analyses, the common metric is by definition monetary. In the environmental field, the common metric could also be a health effect or some summary metric of health effects (e.g. life expectancy, QALY, DALY). Of course, the use of e.g. QALYs increases the complexity and uncertainty of the model.

Applications for risk assessment

Value of information analyses can be used to guide information gathering and model building. In decision making, a decision can be made based on the available information, or one can wait and collect more information. A VOI analysis can estimate the value of additional information for the decision and guide the choice between immediate action and data collection. In the economic literature, this is often seen as the main value of VOI analysis. However, in the field of environmental health and risk assessment, situations where the decision maker can allocate more funding for additional research and data collection are rare, and this kind of use of VOI analysis is more the exception than the rule.

Another way to use VOI analyses is to guide the process of model building. In this case, the decision maker is the modeller or the modelling team who makes the decisions about the modelling work. Thus, VOI analyses can be used like a sensitivity analysis method. This is also the most prominent use of VOI analyses in the field of environmental health and risk assessment. The decisions that can be addressed are e.g. (i) whether (and which parts of) the model should contain explicit uncertainties, (ii) what the key input parameters or assumptions in the model are, and (iii) which parts of the model should be specified in more detail. All of these start from the question of whether or not model uncertainties have an effect on decision making.

VOI analyses in past risk assessments

The use of value of information analyses in medical and environmental applications has been extensively reviewed by Yokota and Thompson in two papers [2] [3]. The first review [2] covers issues such as (i) the use of VOI analyses in different fields, (ii) the use of different VOI analyses, and (iii) the motivations behind the analyses, while the second review [3] focused in more detail on environmental health applications and on methodological development and problems. The following summary of the use of VOI analyses is based on these two reviews.

The concept of VOI was defined in the 1960s. The first identified applications in the medical and environmental fields are from the 1970s, but only after 1985 has the use of VOI analyses spread more widely and grown rapidly. In most of the analyses, the number of uncertain input variables has been 1-4. EVPI and EVSI analyses have been the most common, while EVPXI and EVSXI analyses have been more exceptional. The reviewers noticed that VOI analyses have been applied in a number of different fields, from toxicology to water contamination studies.

The reviewers' view of the published analyses was that most of them were performed to show the usefulness of the analyses rather than to actually use the results in decision making. The review showed "a lack of cross-fertilization across topic areas and the tendency of articles to focus on demonstrating the usefulness of the VOI approach rather than applications to actual management decisions" [2]. This result may illustrate the complexity of decisions in the environmental and risk assessment field. The authors also concluded that, within the medical and environmental fields, different research groups perform VOI analyses separately without citing or learning from each other's work.

In the second review, the authors raised several analytical challenges in VOI analyses [3]. These included e.g. the difficulty of modelling the decisions, valuing the outcomes, and characterising uncertainties. Although the development of personal computers has increased the analytical possibilities, a number of analytical problems still exist.

Standard VOI approach with counter-factual world descriptions

Counter-factual world descriptions mean that we are looking at two or more different world descriptions that are equal in all other respects except for a decision that we are assessing. In the counter-factual world descriptions, different decision options are chosen. By comparing these worlds, it is possible to learn about the impacts of the decision. With perfect information, we could make the theoretically best decision by always choosing the right option. If we think about these worlds as Monte Carlo simulations, we run our model several times to create descriptions of possible worlds. Each iteration (or row in the result table about our objective) is a possible world. For each possible world (i.e., row), we create one or more counter-factual worlds. They are additional columns which differ from the first column only by the decision option. With perfect information, we can go through the optimising table row by row, and for each row pick the decision option (i.e., the column) that is best. The expected outcome of this procedure, minus the outcome we would get by optimising the expectation (net benefit under uncertainty), is the expected value of perfect information (EVPI). [2] [3] [4] [5]
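The procedure described above can be sketched as follows (hypothetical net costs, Python for brevity): each row is a possible world and each column a counter-factual version of it under one decision option.

```python
import numpy as np

rng = np.random.default_rng(0)
shared = rng.normal(size=100_000)  # uncertainties are identical across columns

# Columns differ only by the decision option (counter-factual worlds).
net_cost = np.column_stack([5.0 + shared,         # option 1
                            4.5 + 2.0 * shared])  # option 2

ncuu = net_cost.mean(axis=0).min()  # optimise the expectation (under uncertainty)
ncpi = net_cost.min(axis=1).mean()  # pick the best column row by row
evpi = ncuu - ncpi                  # expected value of perfect information
print(round(evpi, 2))
```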

Screening approach with decisions as random variables

In this case, we do not create counter-factual world descriptions, but only a large number of possible world descriptions. The decision that we are considering is treated like any other uncertain variable in the description, with a probability distribution describing the uncertainty about what will actually be decided. In this case, we are comparing world descriptions that contain a particular decision option with other world descriptions that contain another decision option. It is important to understand that we are not comparing two counter-factual world descriptions; we are comparing one group of possible world descriptions to another group of world descriptions.

The major benefit of the screening approach is that it is not necessary to define decision variables beforehand. Basically any variable can be taken to be a decision, as long as it is meaningful as a decision and the model has a number of possible worlds simulated with Monte Carlo or another method such as a Bayesian belief network (BBN). The idea is to conditionalise the decision variable to one decision option at a time and then compare these conditionalisations to find out which one gives the optimal outcome in the objective.
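The conditionalisation step can be sketched in a few lines (hypothetical model, Python for brevity): the decision is simulated like any other uncertain variable, and the groups of possible worlds sharing an option are compared.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
# The decision is treated as a random variable with its own distribution.
option = rng.integers(0, 2, size=n)  # 0 or 1, "decided" at random in each world
shock = rng.normal(size=n)           # other uncertainty in the model
net_cost = np.where(option == 0, 5.0 + shock, 4.5 + 2.0 * shock)

# Conditionalise on each option in turn and compare the groups of worlds.
expected = [net_cost[option == k].mean() for k in (0, 1)]
best = int(np.argmin(expected))  # the option with the lowest expected net cost
```

Note that the two groups are different sets of simulated worlds, not counter-factual pairs, which is why the approach is vulnerable to the confounding problem discussed below.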

In this approach, it is not possible to calculate EVPI in as straightforward a way as with counter-factual world descriptions. Therefore, with this approach, we are largely restricted to calculating the expected value of partial perfect (and imperfect) information, i.e. EVPPI and EVPII, respectively. Some sophisticated mathematical methods may be developed to calculate this, but it is beyond my competence. One approach sounds promising to me at the moment: probabilistic inversion, i.e. using bunches of probability functions instead of point-wise estimates.[6]

There is a major difference between the two approaches. Counter-factual world descriptions are actually utilising the Do operator described by Pearl [7], which looks at impacts of forced changes of a variable. In contrast, the latter case has the structure of an observational study, which looks at natural changes where several variables change at the same time. Therefore, it is subject to confounders, which are typical problems in epidemiology: a variable is associated with the effect, but not because it is its cause but because it correlates with the true cause.

Because of this confounding effect, the latter method for value-of-information analysis may result in false negatives: a decision seems to be obvious (i.e., the VOI is zero), but a more careful analysis of confounders would show that it is not. Therefore, a value-of-information analysis based on a Bayesian net should be repeated with an analysis of counter-factual world descriptions. In Uninet, counter-factual world descriptions can be created with analytical conditioning, but it does not work with functional nodes, and its applicability is therefore limited.

Conclusions

Value of information is a decision analysis method that has been used, and could be used, in a number of situations in the field of environmental health. The term covers a variety of different analyses with different scopes and requirements. The most difficult analytical challenges relate to the assessment of uncertainties in a model, valuing outcomes, and, especially, modelling different decisions. In the field of environmental health and risk assessment, identifying and modelling different decisions is probably the most challenging part of the analysis.

Keywords

Value of information, decision analysis, uncertainty, decision making, optimising

References

1. Value of information in Wikipedia
2. Yokota F. and Thompson K.M. (2004a). Value of information literature analysis: A review of applications in health risk management. Medical Decision Making, 24 (3), pp. 287-298.
3. Yokota F. and Thompson K.M. (2004b) Value of information analysis in environmental health risk management decisions: Past, present, and future. Risk Analysis, 24 (3), pp. 635-650.
4. Morgan M.G. and Henrion M. (1992). Uncertainty: A guide to dealing with uncertainty in quantitative risk and policy analyses. Cambridge University Press. 332 pp.
5. Cooke, R.M. (1991). Experts in uncertainty: Opinion and subjective probability in science. Oxford University Press, New York. 321 pp.
6. Jouni Tuomisto's notebook P42, dated 29.10.2009.
7. Judea Pearl: Causality: Models, Reasoning, and Inference. Cambridge University Press, 2000. ISBN 0521773628, ISBN 978-0521773621

Related files

<mfanonymousfilelist></mfanonymousfilelist>
