 
[[Category:Quality control]]
 
{{variable|moderator=Jouni}}
'''Quality evaluation criteria''' are a set of measures that together describe the [[quality of content]] of an information object.
 
 
==Scope==
 
What is a set of quality evaluation measures such that it fulfils the following criteria:
 
==Definition==
 
* [[Properties of a good assessment]]
 
Previous references:
* The P-Box approach.<ref>McGill Research blog on P-Box [http://www.professormcgill.com/blog/category/p-box/]</ref><ref>Scott Ferson, W. Troy Tucker: Sensitivity analysis using probability bounding. Reliability Engineering and System Safety 91 (2006) 1435–1442. [http://www.ramas.com/wttreprints/FersonTucker06.pdf]</ref>
* Vines and copulas.<ref>Delft University of Technology, 2nd Vine Copula Workshop, 16-17 December 2008 [http://dutiosc.twi.tudelft.nl/~risk/index.php?view=article&catid=4%3Aall-workshops&id=18%3Aworkshop-2008&option=com_content&Itemid=7]</ref>
 
The quality of the content of a variable can be divided into two main parts, based on the gold standard used for comparison: the goodness of the result estimate compared with the truth, and the goodness of the description of existing data compared with the data that actually exist.
 
In practice, the [[dependencies]] and [[result]] can be compared with the truth, while [[data]] and [[formula]] can be compared with the actually existing data. This is because result and dependencies typically contain few references, whereas data and formula contain many. However, this varies considerably from one variable to another and should not be treated as a strict rule.
  
'''Comparison with the truth'''
  
Of course, the truth is never actually known precisely, but people are still able to give their personal estimates ([[subjective probability|subjective probabilities]]) about the result of a variable. In addition, they are able to estimate which variables are causally related to the variable under consideration (i.e., its parent variables). To operationalise this, two concepts are defined.
  
;[[Result range]]: The result range ''R'' = [r<sub>l</sub>, r<sub>u</sub>] is a range of plausible values within which the true value of the variable is located. It is described as part of the [[result]] of a [[variable]]. What "plausible" exactly means is somewhat fuzzy, as it is based on the evaluation by the group of people who have produced the current version of the variable result.
;[[Coverage]]: Coverage is defined as two subjective probabilities given by a user: the probability that the truth is actually below the [[result range]], and the probability that the truth is actually above the [[result range]] (a minimal sketch is given below). If the variable is not quantitative but is, e.g., a discrete probability distribution with non-ordered values, coverage refers to the probability of values that are not included in the discrete distribution; in this case, coverage is described by a single probability estimate.
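
To make the two concepts concrete, the sketch below records one user's result range and coverage as a small data structure and derives the implied probability that the truth falls inside the range. This is only an illustration: the class and field names (CoverageEstimate, p_below, p_above) are not defined anywhere on this page and are chosen here for readability.

<pre>
from dataclasses import dataclass

@dataclass
class CoverageEstimate:
    """One user's result range [r_lower, r_upper] and its coverage (illustrative names)."""
    r_lower: float   # lower end of the result range
    r_upper: float   # upper end of the result range
    p_below: float   # subjective probability that the truth is below the range
    p_above: float   # subjective probability that the truth is above the range

    def p_inside(self):
        """Implied probability that the true value lies inside the result range."""
        return 1.0 - self.p_below - self.p_above

    def validate(self):
        if self.r_lower > self.r_upper:
            raise ValueError("result range must satisfy r_lower <= r_upper")
        if min(self.p_below, self.p_above) < 0 or self.p_below + self.p_above > 1:
            raise ValueError("coverage probabilities must be non-negative and sum to at most 1")

# Example: the truth is judged to lie in [2.0, 5.0], with a 5 % chance of being
# below that range and a 10 % chance of being above it.
estimate = CoverageEstimate(r_lower=2.0, r_upper=5.0, p_below=0.05, p_above=0.10)
estimate.validate()
print(estimate.p_inside())   # 0.85
</pre>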
  
Coverage can be estimated for the result, and this is what is usually meant by the word. However, coverage can also be estimated for upstream dependencies. In that case, it means the probability that the rank correlation between this variable and a parent is smaller (or larger) than the current estimate (described in [[dependencies]]). It is important to notice that if a dependency is not mentioned at all, this implies that the rank correlation between the variable and its parent is exactly 0.
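
On this page coverage is a subjective estimate, but for illustration the sketch below shows one way such a probability could be approximated from paired observations of a variable and its parent: the Spearman rank correlation is bootstrapped, and the share of resampled correlations falling below the stated estimate is reported. The function name and the bootstrap approach are assumptions made for this example, not something the page prescribes.

<pre>
import numpy as np
from scipy.stats import spearmanr

def prob_rank_corr_below(parent, child, stated_estimate, n_boot=2000, seed=1):
    """Bootstrap estimate of P(rank correlation < stated_estimate) from
    paired observations of a parent and a child variable (illustrative helper)."""
    rng = np.random.default_rng(seed)
    n = len(parent)
    corrs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)          # resample pairs with replacement
        rho, _ = spearmanr(parent[idx], child[idx])
        corrs.append(rho)
    return float(np.mean(np.array(corrs) < stated_estimate))

# Example with synthetic data where the child depends weakly on its parent.
rng = np.random.default_rng(0)
parent = rng.normal(size=200)
child = 0.3 * parent + rng.normal(size=200)
print(prob_rank_corr_below(parent, child, stated_estimate=0.3))
</pre>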
  
The individual coverage estimates can be aggregated into a probability distribution that is wide enough to capture the true value with a high subjective probability, as estimated by the group. This distribution is clearly wider than the group's aggregated best-estimate distribution. The usefulness of coverage is that, even with a draft assessment, coverage can be used in [[VOI]] analysis, and it is unlikely to produce a false negative (a distribution that is too narrow, falsely implying that no further research is needed).
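
As an illustration of this step, the sketch below shows one way a result range and its coverage could be expanded into a full probability distribution that could then feed a [[VOI]] analysis: a normal distribution is fitted so that its tail probabilities outside the range equal the stated coverage. The normal family and the fitting rule are assumptions made here purely for illustration; the page does not prescribe any particular distribution.

<pre>
from scipy.stats import norm

def distribution_from_coverage(r_lower, r_upper, p_below, p_above):
    """Fit a normal distribution with P(X < r_lower) = p_below and
    P(X > r_upper) = p_above (an illustrative choice of family)."""
    z_lower = norm.ppf(p_below)           # standard-normal quantile of the lower tail
    z_upper = norm.ppf(1.0 - p_above)     # standard-normal quantile of the upper tail
    sigma = (r_upper - r_lower) / (z_upper - z_lower)
    mu = r_lower - sigma * z_lower
    return norm(loc=mu, scale=sigma)

# Example: result range [2.0, 5.0] with 5 % below and 10 % above.
dist = distribution_from_coverage(2.0, 5.0, p_below=0.05, p_above=0.10)
print(dist.cdf(2.0), 1.0 - dist.cdf(5.0))   # approximately 0.05 and 0.10
</pre>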
  
Subjective coverages can be aggregated into what is called group coverage.
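
The aggregation rule for group coverage is not specified on this page. A minimal sketch, assuming a simple unweighted average of the individual subjective probabilities, is given below; other pooling rules could equally be used.

<pre>
def group_coverage(individual_coverages):
    """Aggregate individual (p_below, p_above) pairs into a group coverage
    by taking an unweighted mean (an illustrative assumption)."""
    n = len(individual_coverages)
    p_below = sum(p for p, _ in individual_coverages) / n
    p_above = sum(p for _, p in individual_coverages) / n
    return p_below, p_above

# Three users' subjective coverages for the same result range.
print(group_coverage([(0.05, 0.10), (0.10, 0.10), (0.02, 0.20)]))
# -> (0.0566..., 0.1333...)
</pre>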
  
 
== Result ==
 
===Coverage===

==See also==
 
* [http://en.opasnet.org/en-opwiki/index.php?title=Quality_evaluation_criteria&oldid=7317#Amount_of_data Previous idea about "Amount of data"]
* [http://en.opasnet.org/en-opwiki/index.php?title=Quality_evaluation_criteria&oldid=7317#Format_of_the_result Previous idea about a technical classification of quantitative results] (placeholder, guesstimate, result range, marginal distribution, joint distribution)
 
==References==
 
<references/>
