Difference between revisions of "Variable"

m
(restructured)
Line 21: Line 21:
  
 
{|{{prettytable}}
 
{|{{prettytable}}
! Attribute
+
! [[Attribute]]
! Sub-attributes
+
! Sub-attribute
! Question to be answered
+
! Comments specific to the variable attributes
! Comments
 
 
|-----
 
|-----
 
| '''Name'''
 
| '''Name'''
 
|  
 
|  
| What is the name of the variable?
+
|  
| Two variables must not have identical names.
 
 
|-----
 
|-----
 
| '''Scope'''
 
| '''Scope'''
 
|  
 
|  
| What is the research question to which the variable answers?
 
 
| This includes a verbal definition of the spatial, temporal, and other limits (system boundaries) of the variable. The scope is defined according to the use purpose of the assessment(s) that the variable belongs to.
 
| This includes a verbal definition of the spatial, temporal, and other limits (system boundaries) of the variable. The scope is defined according to the use purpose of the assessment(s) that the variable belongs to.
 
|-----
 
|-----
| '''Definition'''
+
| rowspan="4" | '''Definition'''
|  
+
| Causality
* Causality
+
| Causality tells what we know about how upstream variables (i.e. causal parents) affect the variable. Causality lists the parents and expresses their functional relationships (the variable as a function of its parents) or probabilistic relationships (conditional probability of the variable given its parents). The expression of causality is '''independent''' of the data about the magnitude of the result of the variable.
* Data
+
|----
* Unit
+
| Data
* Formula
+
| Data tells what we know about the magnitude of the result of the variable. Data describes any non-causal information about the particular part of reality that is being described, such as direct measurements, measured data about an analogous situation (this requires some kind of error model), or expert judgment.
| How can you derive or calculate the answer?
+
|----
| The definition uses algebra or other explicit methods if possible.
+
| Unit
 +
| Unit describes the measurement units in which the result is presented. The units of interconnected variables need to be coherent with each other, given the functions describing the causal relations. The units of variables can be used to check the coherence of the causal network description. This is a so-called [[Plausibility test|unit test]].
 +
|----
 +
| Formula
 +
| Formula {{disclink|Discussion on formula attribute}} is an operationalisation of how to calculate or derive the result based on ''Causality'', ''Data'', and ''Unit'', making a synthesis of the three. Formula uses algebra, computer code, or other explicit methods if possible.
 
|-----
 
|-----
 
| '''Result'''
 
| '''Result'''
 
|  
 
|  
| What is the answer to the question defined in the scope?
+
| A result is an estimate about the particular part of reality that is being described. It is preferably a probability distribution (which can in a special case be a single number), but a result can also be non-numerical such as "very good".  
| If possible, a numerical expression or distribution.
 
 
|}
 
|}
  
Line 54: Line 54:
 
[[image:Variable definition.PNG]]
 
[[image:Variable definition.PNG]]
  
'''Name''' is the identifier of the variable, and it should already indicate which real-world entity the variable describes. Variable names should be descriptive, unambiguous, and not easily confused with other variables. An example of a good variable name is ''daily average of PM<sub>2.5</sub> concentration in Helsinki''.
+
'''Specific issues related to variable attributes'''
  
'''Scope''' defines the boundaries of the variable - what does it describe and what not? The boundaries can be e.g. spatial, temporal or abstract. In the above example variable, at least the geographical boundary restricts the coverage of the variable to Helsinki and the considered phenomena are restricted to PM<sub>2.5</sub> daily averages. There could also be some further boundary settings defined in the scope of the variable, which are not explicitly mentioned in the name of the variable.
+
In a general form, the formula can be described as
  
'''Definition''' describes how the result of the variable is derived. It consists of sub-attributes to describe the causal relations, data used to estimate the result, and the mathematical formula to calculate the result. Also alternative identified ways to derive the variable result can be described in the definition attribute as reference. The minimum requirement for defining the causality in all variables is to express the potential existence of a causal relation, i.e. that a change in an ''upstream'' variable possibly affects the variables ''downstream''.
+
result = formula(causal parameters, data parameters, unit),  
  
'''Definition has four sub-attributes''' that have particular purposes in the method:
+
:where formula is the function (expressed as computer code for a specified software) for calculating the result using the causal parameters (information from causally upstream variables) and the data parameters (information from observed data) as input.
  
;Causality: Causality tells what we know about how upstream values affect our variable. This sub-attribute lists the upstream variables (i.e. causal parents) of the variable. It expresses their functional relationships (this variable as a function of its parents) or probabilistic relationships (conditional probability of this variable given its parents). The expression of causality is '''independent''' of the data there exists about the magnitude of the result of this variable.
+
It should be noted that the result is the distribution itself, although it can be expressed as some kind of description of the distribution, such as mean and standard deviation. The result should be described in such a detailed way that the full distribution can be reproduced from the information presented under this attribute. A technically straightforward way to do this is to provide a large random sample from the distribution.
  
;Data: Data tells what we know about the magnitude of the result of this variable. This sub-attribute describes any non-causal information about the variable, such as measured data about the variable itself, measured data about an analogous situation (this requires some kind of error model), or expert judgments about the result.
+
The result may be a different number for different ''locations'', such as geographical positions, population subgroups, or other determinants. Then, the result is described as
 
 
;Unit: Unit describes, in what measurement units the result is presented. The units of interconnected variables need to be coherent with each other in a causal network description. The units of variables can be used to check the coherence of the causal network description by the ''unit test'' (see [[Plausibility test]]).
 
 
 
;Formula: Formula {{disclink|Discussion on formula attribute}} is the actual computer code or similar that calculates what is described under titles ''Causality'', ''Data'', and ''Unit'', making a synthesis of the three. In a general form, the formula can be described as
 
 
 
result = formula(parent parameters, data parameters, unit),
 
 
 
:where formula is the function (expressed as computer code for a specified software) for calculating the result using the parent parameters (information from causally upstream variables) and the data parameters (information from observed data) as input.
 
 
 
'''Result''' attribute is an answer to the question presented in the scope of the variable. A result is preferably a probability distribution (which can in a special case be a single number), but a result can also be non-numerical such as "very good". It should be noted that the result is the distribution itself, although it can be expressed as some kind of description of the distribution, such as mean and standard deviation. The result should be described in such a detailed way that the full distribution can be reproduced from the information presented under this attribute. A technically straightforward way to do this is to provide a large random sample from the distribution.
 
 
 
The result may be a different number for different ''locations'', such as geographical positions, population subgroups, or other determinants, Then, the result is described as
 
  
 
   R|x<sub>1</sub>,x<sub>2</sub>,...  
 
   R|x<sub>1</sub>,x<sub>2</sub>,...  
Line 85: Line 73:
  
  
'''Connection to the [[PSSP]] structure'''
 
  
The variable structure is closely connected to [[PSSP]], and the relationships can be described in the following way.
+
'''Technical issues in Mediawiki'''
  
{|{{prettytable}}
+
{{comment|#(number): |This should be moved. Where?|--[[User:Jouni|Jouni]] 20:00, 9 June 2008 (EEST)}}
! PSSP
 
! Variable structure
 
|-----
 
| Purpose
 
| The general purpose of a variable is to describe a particular piece of reality. Scope defines which piece of reality is to be described by this variable.
 
|-----
 
| Structure
 
| Definition describes the structure of the particular piece of reality that the variable describes.
 
|-----
 
| State
 
| Result is an expression of the state of the particular piece of reality that the variable describes.
 
|-----
 
| Performance
 
| Performance is an expression of the uncertainty of the variable, i.e. how well does the variable fulfill its purpose, i.e. describe the piece of reality defined in the scope. On the variable level, performance is evaluated separately for result (parameter uncertainty) and definition (model uncertainty). However, evaluating the performance of a scope of a variable can not be done on the variable level, but instead as relevance on the assessment level.
 
|}
 
 
 
 
 
'''Technical issues in Mediawiki'''
 
  
 
* Each variable is a page in the ''Variable'' namespace. The '''name''' of the variable is also the name of the page. However, draft variables may be parts of other pages.
 
* Each variable is a page in the ''Variable'' namespace. The '''name''' of the variable is also the name of the page. However, draft variables may be parts of other pages.

Revision as of 17:00, 9 June 2008

<section begin=glossary />A variable is a description of a particular piece of reality. It can describe a physical phenomenon or a value judgement. Decisions included in an assessment are also described as variables. Variables are continuously existing descriptions of reality, which develop over time as knowledge about them increases. Variables are therefore not tied to any single assessment, but can be included in several assessments. A variable is the basic building block for describing reality.<section end=glossary />


The research question about the structure of a variable
What is a structure for a variable such that it
  • is able to systematically handle all kinds of information about the particular piece of reality that the variable describes,
  • is able to systematically describe causal relationships between variables,
  • enables both quantitative and qualitative descriptions,
  • is suitable for any kind of variable, especially physical phenomena, decisions, and value judgements,
  • inherits its main structure from universal objects,
  • complies with the PSSP ontology,
  • can be operationalised in a computational model system,
  • results in variables that are independent of the assessment(s) they belong to, and
  • results in variables that pass the clairvoyant test?


{|{{prettytable}}
! Attribute
! Sub-attribute
! Comments specific to the variable attributes
|-----
| '''Name'''
| 
| 
|-----
| '''Scope'''
| 
| This includes a verbal definition of the spatial, temporal, and other limits (system boundaries) of the variable. The scope is defined according to the use purpose of the assessment(s) that the variable belongs to.
|-----
| rowspan="4" | '''Definition'''
| Causality
| Causality tells what we know about how upstream variables (i.e. causal parents) affect the variable. Causality lists the parents and expresses their functional relationships (the variable as a function of its parents) or probabilistic relationships (conditional probability of the variable given its parents). The expression of causality is '''independent''' of the data about the magnitude of the result of the variable.
|----
| Data
| Data tells what we know about the magnitude of the result of the variable. Data describes any non-causal information about the particular part of reality that is being described, such as direct measurements, measured data about an analogous situation (this requires some kind of error model), or expert judgment.
|----
| Unit
| Unit describes the measurement units in which the result is presented. The units of interconnected variables need to be coherent with each other, given the functions describing the causal relations. The units of variables can be used to check the coherence of the causal network description. This is a so-called unit test.
|----
| Formula
| Formula is an operationalisation of how to calculate or derive the result based on ''Causality'', ''Data'', and ''Unit'', making a synthesis of the three. Formula uses algebra, computer code, or other explicit methods if possible.
|-----
| '''Result'''
| 
| A result is an estimate about the particular part of reality that is being described. It is preferably a probability distribution (which can in a special case be a single number), but a result can also be non-numerical such as "very good".
|}
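
The unit test mentioned for the Unit sub-attribute can be illustrated with a small sketch. It is not taken from the method itself: representing a unit as a mapping from base unit to exponent, the function names `multiply_units` and `units_coherent`, and the example variables are all assumptions made for illustration, and the check only covers the simple case where a variable is the product of its parents.

```python
# Hypothetical sketch of a unit coherence check ("unit test").
# A unit is represented as a mapping from base unit to exponent,
# e.g. ug/m3 -> {"ug": 1, "m": -3}. This representation is an assumption.

def multiply_units(a, b):
    """Return the unit of the product of two quantities."""
    combined = dict(a)
    for base, exponent in b.items():
        combined[base] = combined.get(base, 0) + exponent
    # Drop base units whose exponents cancel out.
    return {base: exp for base, exp in combined.items() if exp != 0}


def units_coherent(declared_unit, parent_units):
    """Check that the product of the parents' units equals the declared unit."""
    derived = {}
    for unit in parent_units:
        derived = multiply_units(derived, unit)
    return derived == declared_unit


# Illustrative case: intake (ug/day) = concentration (ug/m3) * breathing rate (m3/day)
concentration = {"ug": 1, "m": -3}
breathing_rate = {"m": 3, "day": -1}
intake = {"ug": 1, "day": -1}

print(units_coherent(intake, [concentration, breathing_rate]))     # True
print(units_coherent({"ug": 1}, [concentration, breathing_rate]))  # False
```

A failed check would suggest that either the declared unit or the causal relation in the definition needs to be revisited.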


[[image:Variable definition.PNG]]

Specific issues related to variable attributes

In a general form, the formula can be described as

result = formula(causal parameters, data parameters, unit), 
where formula is the function (expressed as computer code for a specified software) for calculating the result using the causal parameters (information from causally upstream variables) and the data parameters (information from observed data) as input.
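
As a rough illustration of the general form above, the following Python sketch shows one way such a function could look. The parent variables (`emission`, `dilution`), the `calibration_factor` data parameter, and the arithmetic are hypothetical and only stand in for whatever the Causality and Data sub-attributes of a real variable specify.

```python
import random


def formula(causal_parameters, data_parameters, unit):
    """Combine parent samples and data into a result sample (hypothetical example).

    causal_parameters: dict mapping parent name -> list of samples
    data_parameters:   dict of values derived from observed data (assumed here)
    unit:              unit label attached to the result
    """
    emission = causal_parameters["emission"]
    dilution = causal_parameters["dilution"]
    factor = data_parameters["calibration_factor"]
    # Functional relationship: the variable as a function of its parents.
    values = [factor * e / d for e, d in zip(emission, dilution)]
    return {"values": values, "unit": unit}


rng = random.Random(1)  # fixed seed so the sketch is repeatable
parents = {
    "emission": [rng.uniform(8.0, 12.0) for _ in range(1000)],
    "dilution": [rng.uniform(1.0, 2.0) for _ in range(1000)],
}
data = {"calibration_factor": 0.9}

result = formula(parents, data, unit="ug/m3")
print(len(result["values"]), result["unit"])  # 1000 ug/m3
```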

It should be noted that the result is the distribution itself, although it can be expressed as some kind of description of the distribution, such as mean and standard deviation. The result should be described in such a detailed way that the full distribution can be reproduced from the information presented under this attribute. A technically straightforward way to do this is to provide a large random sample from the distribution.
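
The distinction between the result and its descriptions can be sketched as follows. The lognormal distribution and the sample size are assumptions chosen for illustration; the point is only that summaries are computed from the stored sample, not the other way round.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# The result itself: a large random sample from the (assumed) distribution.
result_sample = [rng.lognormvariate(0.0, 1.0) for _ in range(10000)]

# Descriptions derived from the result; they summarise it but do not replace it.
summary = {
    "mean": statistics.mean(result_sample),
    "sd": statistics.stdev(result_sample),
    "median": statistics.median(result_sample),
}

# Any other description, such as the 95th percentile, can be recomputed
# from the sample later without going back to the original model.
p95 = statistics.quantiles(result_sample, n=20)[-1]
print(summary, p95)
```

For a skewed distribution like this one, the mean and standard deviation alone would not be enough to reproduce the full distribution, which is why the sample is kept.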

The result may be a different number for different locations, such as geographical positions, population subgroups, or other determinants. Then, the result is described as

  R|x<sub>1</sub>,x<sub>2</sub>,...

where R is the result and x<sub>1</sub> and x<sub>2</sub> define the locations. A dimension is a property along which there are multiple locations, and the result of the variable may take different values when the location changes. Here, x<sub>1</sub> and x<sub>2</sub> are dimensions, and particular values of x<sub>1</sub> and x<sub>2</sub> are locations. A variable can have zero, one, or more dimensions. Even if a dimension is continuous, it is usually operationalised in practice as a list of discrete locations. Such a list is called an index, and each location is called a row of the index.
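
The terms dimension, index, and location can be made concrete with a short sketch. The dimension names, the locations other than Helsinki, and the numbers are hypothetical; the helper `make_result` is likewise an assumption and not part of any existing tool.

```python
from itertools import product

# Two dimensions, each operationalised as an index (a list of discrete locations).
indices = {
    "city": ["Helsinki", "Kuopio"],
    "age_group": ["child", "adult", "elderly"],
}


def make_result(indices, value_at):
    """Build R | x1, x2, ... as a mapping from a location tuple to a value."""
    names = list(indices)
    return {
        location: value_at(dict(zip(names, location)))
        for location in product(*(indices[name] for name in names))
    }


def example_value(location):
    """Hypothetical values, only to make the structure visible."""
    base = {"Helsinki": 10.0, "Kuopio": 7.0}[location["city"]]
    modifier = {"child": 1.2, "adult": 1.0, "elderly": 1.1}[location["age_group"]]
    return base * modifier


R = make_result(indices, example_value)
print(len(R))                  # 6 locations = 2 x 3
print(R[("Kuopio", "adult")])  # 7.0
```

With an empty `indices` mapping the same helper yields a single value keyed by the empty tuple, which corresponds to a variable with zero dimensions.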

Uncertainty about the true value of the variable is one dimension. The index of the uncertainty dimension is called the Sample index, and it contains a list of integers 1, 2, 3, ... Uncertainty is operationalised as a sequence of random samples from the probability distribution of the result. The i-th random sample is located in the i-th row of the Sample index.
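
A minimal sketch of the Sample index, treated as one more dimension of the result, could look like this. The normal distributions, their parameters, and the city index are assumptions for illustration only.

```python
import random

rng = random.Random(7)  # fixed seed so the sketch is repeatable
n_samples = 1000

# The Sample index: integers 1, 2, 3, ... identifying each random draw.
sample_index = list(range(1, n_samples + 1))

# An ordinary index for one further dimension (hypothetical locations).
city_index = ["Helsinki", "Kuopio"]
assumed_mean = {"Helsinki": 10.0, "Kuopio": 7.0}  # illustrative values only

# Result: one value per (city, sample row); the i-th draw sits in row i.
result = {
    (city, i): rng.gauss(assumed_mean[city], 2.0)
    for city in city_index
    for i in sample_index
}


def slice_samples(result, city):
    """Return the uncertainty sample for one location, ordered by the Sample index."""
    return [result[(city, i)] for i in sample_index]


helsinki = slice_samples(result, "Helsinki")
print(len(helsinki), sum(helsinki) / len(helsinki))
```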


Technical issues in Mediawiki

--#(number): : This should be moved. Where? --Jouni 20:00, 9 June 2008 (EEST)

  • Each variable is a page in the Variable namespace. The name of the variable is also the name of the page. However, draft variables may be parts of other pages.
  • All attributes except name are second-level (==) sub-titles on the page.
  • Description of the attribute content is added at the end of that content; discussions on the content are added to the Talk page, each under its own descriptive title.
  • References to external sources are added to the text with the <ref>Reference information</ref> tag. The references are placed at the end of the page under the subtitle References. However, References is not an attribute of the variable, even though it is technically similar.
  • In the formula, computer code for specific software may be used. The following are in use.
    • Analytica_id: Identifier of the respective node in an Analytica model. <anacode>Place your Analytica code here. Use a space in front of each line.</anacode>
    • <rcode>Place your R code here. Use a space in front of each line.</rcode>
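
The page conventions listed above can be sketched as a small helper that assembles the wikitext of a variable page. The function `render_variable_page` is hypothetical, and details the list does not specify (placing the R code inside the Definition section, putting the `rcode` tags on their own lines, and the example texts) are assumptions.

```python
def render_variable_page(name, scope, definition, result, r_code=None):
    """Return (page title, wikitext) for a variable; a hypothetical helper."""
    # All attributes except the name are second-level (==) sub-titles.
    lines = ["== Scope ==", scope, "", "== Definition ==", definition]
    if r_code is not None:
        # Each code line is prefixed with a space, as the guidance asks.
        lines.append("<rcode>")
        lines.extend(" " + line for line in r_code.splitlines())
        lines.append("</rcode>")
    # References come last; they are not an attribute of the variable.
    lines += ["", "== Result ==", result, "", "== References ==", "<references/>"]
    # The variable name is the page name in the Variable namespace.
    return "Variable:" + name, "\n".join(lines)


title, text = render_variable_page(
    name="Daily average of PM2.5 concentration in Helsinki",
    scope="Daily average PM2.5 concentration in Helsinki.",
    definition="Derived from emission and dilution.<ref>Reference information</ref>",
    result="A probability distribution, given as a random sample.",
    r_code="emission <- 10\nconcentration <- emission / 2",
)
print(title)
print(text)
```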

Event-substance

--#(number): : This paragraph should be deleted or removed. Where? --Jouni 00:40, 8 June 2008 (EEST)

Variables are objects of the event-medium composite type. They thus describe both the events that occur within the scope of the variable and the medium in which these events take place. In practice, the events can only be observed through changes in the state of the medium, and it is therefore reasonable to describe events and media as such composites rather than separately.

In open assessment, all the variables included in an assessment must be causally related, directly or indirectly, to the endpoints of the assessment, and the causal relations must be defined. The event-medium structure is the carrier of the cause-effect relations between the variables. An event occurring in a medium changes the state of that medium, which leads to another event that again changes the state of the medium, causing yet another event, and so on. In addition to variables, classes, as generalizations of properties possessed by variables, can also be causally related to each other.
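
The requirement that every variable be causally related to the endpoints can be checked mechanically once the causal parents are listed. The sketch below reads the requirement as "every variable lies upstream of at least one endpoint", which is one interpretation; the function name and the example network are hypothetical.

```python
from collections import deque


def unconnected_variables(parents, endpoints):
    """Return variables with no directed causal path to any endpoint.

    parents: dict mapping each variable to the list of its causal parents.
    """
    reached = set(endpoints)
    queue = deque(endpoints)
    # Walk upstream from the endpoints through the parent links.
    while queue:
        variable = queue.popleft()
        for parent in parents.get(variable, []):
            if parent not in reached:
                reached.add(parent)
                queue.append(parent)
    return sorted(set(parents) - reached)


parents = {
    "health effect": ["exposure"],
    "exposure": ["concentration", "time activity"],
    "concentration": ["emission"],
    "time activity": [],
    "emission": [],
    "noise level": [],  # not linked to the endpoint in this example
}
print(unconnected_variables(parents, ["health effect"]))  # ['noise level']
```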


See also