Evaluating assessment performance

From Testiwiki
Revision as of 13:02, 19 February 2009 by Mikko Pohjola (talk | contribs) (Result: slides)



Evaluating assessment performance is a lecture about the factors that constitute the performance, the goodness, of assessments, and about how they can be evaluated within a single general framework.

This page was converted from an encyclopedia article to a lecture as there are other descriptive pages already existing about the same topic. The old content was archived and can be found here.

Scope

Purpose: To summarize what assessments are about overall, which factors constitute the overall performance of an assessment, how these factors are interrelated, and how the performance, the goodness, of an assessment can be evaluated.

Intended audience: Researchers (especially at the doctoral student level) in any field of science (mainly natural scientists rather than social scientists).

Duration: 1 hour 15 minutes

Definition

To understand this lecture, it is recommended to also acquaint oneself with the following lectures:

Result

File:Evaluating assessment performance.ppt

  • Introduction through analogy: what makes a mobile phone good?
    • chain from production to use
    • goodness from whose perspective, producer or user? Can they be fit into one framework?
    • phone functionalities (quality of content)
    • user interface, appearance design, use context, packaging, logistics, marketing, sales (applicability)
    • mass production/customization: components, code, assembly (efficiency)
  • Assessments
    • serve two masters: truth (science) and practical need (societal decision making, policy)
    • must meet the needs of their use
    • must strive for truth
    • both requirements must be met, which is not easy, but possible
    • a business of creating understanding about reality
      • asking the right questions, providing good answers, and getting the (questions and) answers to where they are needed
      • getting the questions right (according to need) is primary; getting the answers right is conditional on the previous
  • Contemporary conventions of addressing performance:
    • quality assurance/control: process approach
    • uncertainty assessment: product approach
  • Performance in the context of assessments as science-based decision support: properties of good assessment
    • takes the production point of view
    • quality of content
      • informativeness, calibration, relevance
    • applicability
      • availability, usability, acceptability
    • efficiency
    • different properties have different points of reference and criteria
    • a means of managing design and execution or evaluating past work
    • not an orthogonal set: applicability conditional on quality of content, efficiency conditional on both quality of content and applicability
  • Methods of evaluation
    • informativeness and calibration - uncertainty analysis, discrepancy analysis between the result and another estimate (assumed to be a gold standard)
    • relevance - scope vs. need
    • usability - (assumed) intended user opinion, participant rating for (technical) quality of information objects
    • availability - observed access to assessment information by intended users
    • acceptability of premises - (assumed) acceptance of premises by intended users, or others interested, or affected
    • acceptability of process - scientific acceptability of definition, given scope → peer review
    • efficiency - estimation of spent effort (given outcome)
  • How much addressing of these properties is built into open assessment and Opasnet?
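As a minimal illustration of the first evaluation method listed above (informativeness and calibration via discrepancy analysis), the sketch below computes two simple metrics: calibration as the fraction of reference values falling inside an assessment's stated uncertainty intervals, and discrepancy as the mean absolute difference between the assessment's estimates and a reference estimate treated as a gold standard. The function names, metric definitions, and data are illustrative assumptions, not part of the lecture.

```python
import statistics

def calibration(intervals, reference_values):
    """Fraction of reference values that fall within the assessment's
    stated uncertainty intervals (a crude calibration check)."""
    hits = sum(lo <= ref <= hi
               for (lo, hi), ref in zip(intervals, reference_values))
    return hits / len(reference_values)

def discrepancy(estimates, reference_values):
    """Mean absolute difference between the assessment's point estimates
    and a reference estimate assumed to be a gold standard."""
    return statistics.mean(abs(e - r)
                           for e, r in zip(estimates, reference_values))

# Hypothetical data: three assessed quantities with uncertainty intervals,
# compared against an independent reference estimate.
intervals = [(1.0, 3.0), (4.0, 6.0), (0.5, 2.5)]
estimates = [2.0, 5.0, 1.5]
reference = [2.5, 7.0, 1.0]

print(calibration(intervals, reference))  # 2 of 3 references fall inside
print(discrepancy(estimates, reference))
```

In practice, such metrics only evaluate quality of content; the lecture's point is that applicability and efficiency require different, non-numeric points of reference, such as user opinion and observed access.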