Calibration



Calibration and informativeness are measures of expertise. An expert is well calibrated if she can correctly assess the probability that her answers turn out to be correct. This can be evaluated by observation: if an expert says that she is 80 % sure about her answers, then out of ten answers given with similar certainty, on average two should turn out to be incorrect. If the actual proportion of correct answers is lower than the stated certainty, the expert is said to be overconfident; if it is higher, she is underconfident. Calibration is measured against the truth once it is revealed. Specifically, calibration is the p value of a statistical test of the null hypothesis that the expert is actually calibrated, i.e. neither overconfident nor underconfident.

Informativeness measures the spread of a probability distribution: the narrower the distribution, the more informative it is. Informativeness is a relative measure, always assessed against a reference distribution for the same quantity.

Calculations

For a series of Bernoulli probabilities, informativeness is calculated from a vector s of actual probabilities and a vector p of reference probabilities.
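A minimal Python sketch of this calculation, assuming that informativeness is the relative information (Kullback-Leibler divergence) of s with respect to p, as in Cooke's classical model of expert judgement; the function name and the NumPy implementation are illustrative, not the page's original code:

    import numpy as np

    def informativeness(s, p):
        # Relative information of the actual Bernoulli probabilities s
        # with respect to the reference probabilities p. A larger value
        # means a narrower, more informative assessment.
        s = np.asarray(s, dtype=float)
        p = np.asarray(p, dtype=float)
        # Suppress log(0) warnings; np.where applies the 0 * log 0 = 0 convention.
        with np.errstate(divide="ignore", invalid="ignore"):
            terms = (np.where(s > 0, s * np.log(s / p), 0.0)
                     + np.where(s < 1, (1 - s) * np.log((1 - s) / (1 - p)), 0.0))
        return float(np.sum(terms))

Under these assumptions, s equal to p gives an informativeness of zero, while informativeness([0.9, 0.9], [0.5, 0.5]) returns about 0.74: probabilities sharper than the reference are more informative.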

Calibration is calculated from the same vectors together with NQ, a vector of the number of trials behind each actual probability.
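A matching sketch for the calibration p value, assuming the usual likelihood-ratio form of the test: the statistic 2 · Σ NQ_i · I(s_i, p_i) is asymptotically chi-squared with one degree of freedom per probability level, and the p value is its upper tail probability. The degrees of freedom and the SciPy dependency are assumptions of this sketch:

    import numpy as np
    from scipy.stats import chi2

    def calibration(s, p, NQ):
        # P value for the null hypothesis that the expert is calibrated.
        # s  - observed proportion of correct answers at each stated level
        # p  - the stated probability levels themselves
        # NQ - number of answers given at each level
        s, p, NQ = (np.asarray(a, dtype=float) for a in (s, p, NQ))
        with np.errstate(divide="ignore", invalid="ignore"):
            terms = (np.where(s > 0, s * np.log(s / p), 0.0)
                     + np.where(s < 1, (1 - s) * np.log((1 - s) / (1 - p)), 0.0))
        statistic = 2.0 * np.sum(NQ * terms)  # asymptotically chi-squared
        return float(1.0 - chi2.cdf(statistic, df=len(s)))

For example, an expert who got 35 of 50 answers right at a stated 80 % certainty has s = [0.7], p = [0.8], NQ = [50]; under this sketch, calibration([0.7], [0.8], [50]) gives a p value of roughly 0.09, weak evidence of overconfidence.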

See also

Keywords

Expert elicitation, expert judgement, performance

References


Related files
