Difference between revisions of "Object-oriented programming in Opasnet"
Revision as of 16:32, 3 April 2012
This page is a method. The page identifier is Op_en5529. Moderator: Jouni.
Object-oriented programming is an approach where programs (or, in Opasnet, typically assessment models) have a modular structure in such a way that each part is considered as a separate object that has specific properties and interacts with other objects in standard ways.
Question
How should object-oriented programming be utilised in Opasnet in such a way that
- it has seamless connections to R-tools,
- it is easy to understand by non-expert users and contributors,
- it uses the variable structure and other information structures (e.g. universal object) used in open assessment, and
- it enables standards for typical processes in environmental health assessments (such as distribution modelling, life tables, decision optimising, etc.).
Answer
Structure of objects
Objects have two different implementations: a wiki page in Opasnet, and an S4 class object called oavariable (open assessment variable) in R-tools. The wiki page is the user-friendly interface for users, and oavariable is the versatile format for efficient, standardised modelling. The default direction for data is long (using the terminology in the merge function).
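The difference between the wide and long directions can be illustrated with a minimal sketch. The table, its column names (Age, Y2010, Y2011, Year, Result) and its values are hypothetical, and base R `reshape` is used here only because its `direction` argument uses the same wide/long terminology; the actual R-tools code may do this differently.

```r
# Hypothetical wide table: one index column (Age) and one result column per year.
wide <- data.frame(
  Age = c("0-17", "18-64"),
  Y2010 = c(1.2, 3.4),
  Y2011 = c(1.5, 3.1)
)

# Long direction: one row per combination of index locations,
# and a single result column.
long <- reshape(
  wide,
  direction = "long",
  varying = c("Y2010", "Y2011"),
  v.names = "Result",
  timevar = "Year",
  times = c(2010, 2011),
  idvar = "Age"
)
rownames(long) <- NULL
print(long)
```

In the long table, Age and Year both act as index columns and every row holds exactly one result.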
Attribute | What it contains | How implemented in the wiki | How implemented in R-tools as an S4 class object (oavariable) |
---|---|---|---|
question | A research question that defines the topic of the object | First main heading | Slot question = "character". Contains the question as text. |
answer | The current best answer to the question, shown as text, data table, or distribution. | Second main heading; contains a single data table. NOTE! The data table is actually under rationale/data, but often it is the same as the answer. The actual answer is precisely described by distribution and sample (see below). | Only sub-attributes are implemented. |
index | List of indices that are used to specify the answer. | Index columns in the data table | Slot index = "vector" (or factor?). A character vector with all indices used. The content is the same as index parameter in t2b tag. |
marginal | A Boolean vector with the same size as index. TRUE if an index is indexing a marginal distribution in sample, FALSE if joint distribution. The difference is that in a marginal distribution there are n iterations for each location of the index, while in joint distribution, there are altogether n iterations in such a way that the frequencies of locations match their probabilities. | Not implemented in wiki. | Slot marginal = "vector". Especially with indices with lots of locations, joint distribution needs much less memory. |
observation | An identifier of an individual when the answer consists of a group of individuals. | Obs column (usually implicit because the default is that each row is an observation) in the data table. | Obs column in data.frames data and sample. Not explicitly needed as a slot in S4 object. |
iteration | An identifier of a probabilistic run or iteration. Sometimes it is also called a possible world or realisation. | Iter column in the data table (data usually not shown probabilistically in wiki). | Iter column in the data.frames sample (and rarely in data). Not explicitly needed as a slot in S4 object. |
distribution | A joint probability distribution (with indices as dimensions) describing the answer mathematically. | Not shown | Slot distribution = "distribution?". A distribution created with e.g. dnorm(0,1). We don't know yet how to actually implement this and how the indices are included. |
sample | A random sample from the distribution (default is 10000 iterations). | Not shown | Slot sample = "data.frame". The data frame must contain columns Iter, one column for each index, and at least one result column. There may also be a column Obs. |
rationale | Any information that is needed to convince a critical reader that the answer is good. | Third main heading | Only sub-attributes are implemented. |
data | Observations, expert judgement, discussions, and other pieces of information. | Subheading under Rationale | Slot data = "data.frame". The data frame must contain at least one index column, an Obs column, and at least one result column. |
unit | The measurement unit(s) that are used in the answer to measure the topic. | Subheading under Rationale with plain text. Also mentioned in the data table with parameter unit. | Slot unit = "vector". The format used is kg m^2 /s^2 where a space implies a multiplication. Unit is a vector with length > 1, if different rows in data have different units. |
dependencies | List of upstream objects that are causally related to this object. | Subheading under Rationale with a list of links to upstream (and sometimes downstream) wiki pages. | Slot dependencies = "vector". A character vector where entries have the format "Op_fi:Vaativuusluokkien keskipalkat". If the wiki identifier is omitted, the default is op_en. |
formula | Computer code or an algorithm to derive the answer from rationale and objects listed in dependencies. The formula may assume a deterministic dependency (e.g. y <- k*x + b), a conditional probability structure (y ~ dnorm(x, sd)), or a rank correlation matrix. | Subheading under Rationale, often using <rcode> tags. | Slot formula = "list". There may be several competing algorithms. Each of them is described (as a function?) as one entry in the list. When implementing the formula, the algorithm that is implemented is randomly selected for each iteration with equal probability unless otherwise specified in formula.prob. |
formula.prob | A list of probabilities assigned to the competing algorithms in formula. The default is that each has an equal probability. | A detail in <rcode> code. | Slot formula.prob = "vector". Should have the same size as formula. |
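The slots in the table could be declared roughly as follows. This is only a sketch under assumptions: the slot types are narrowed from "vector" to character, logical and numeric, the distribution slot is left out because its implementation is still open, and the validity checks, the helper `pickFormula` and the example question and unit are hypothetical illustrations, not existing R-tools code.

```r
library(methods)

# Sketch of the oavariable class. Slot types are narrowed from "vector"
# (an assumption); the distribution slot is omitted because its
# implementation is still undecided.
setClass(
  "oavariable",
  representation(
    question = "character",
    index = "character",
    marginal = "logical",
    sample = "data.frame",
    data = "data.frame",
    unit = "character",
    dependencies = "character",
    formula = "list",
    formula.prob = "numeric"
  ),
  validity = function(object) {
    problems <- character(0)
    if (length(object@marginal) != length(object@index)) {
      problems <- c(problems, "marginal must have the same length as index")
    }
    if (length(object@formula.prob) > 0 &&
        length(object@formula.prob) != length(object@formula)) {
      problems <- c(problems, "formula.prob must have the same length as formula")
    }
    if (ncol(object@data) > 0 && !all(object@index %in% names(object@data))) {
      problems <- c(problems, "every index must be a column in data")
    }
    if (length(problems) == 0) TRUE else problems
  }
)

# Hypothetical helper: for each of n iterations, pick which of the
# competing formulas to use (equal probabilities if formula.prob is empty).
pickFormula <- function(object, n) {
  k <- length(object@formula)
  stopifnot(k > 0)
  prob <- if (length(object@formula.prob) == 0) rep(1 / k, k) else object@formula.prob
  sample.int(k, size = n, replace = TRUE, prob = prob)
}

# Example object with made-up content.
wage <- new(
  "oavariable",
  question = "What is the average salary by job class?",
  index = "Class",
  marginal = TRUE,
  data = data.frame(Obs = 1:2, Class = c("A", "B"), Result = c(2500, 3100)),
  unit = "EUR /month",
  dependencies = "Op_fi:Vaativuusluokkien keskipalkat"
)
```

Obs and Iter stay as columns in data and sample rather than becoming slots, as the table suggests. The validity function is one possible place to enforce the size constraints mentioned for marginal and formula.prob.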
Methods
R code should be developed so that critical functions have object-specific implementations. The user should see straightforward content, and all messy indexing etc. should happen behind the scenes.
These methods should be implemented for oavariable objects.
- show, print: show the data slot.
- plot: plot the sample, showing one (the first by default) marginal index with all locations and all other marginal indices with the first location only.
- tidy: applies to data: remove the id column; add Obs and Iter columns if they do not exist; change the direction from wide to long.
- createSample: create sample directly from data using interp.input.
- GetSample, GetData: extract sample and data from the object, respectively.
- Ops: applies to sample: merge two oavariables based on index columns, then perform the Ops operation to the result columns.
- standardUnits: Based on units, transform the result column of data to SI units using Unit transformations table; then update unit.
- demarginalize: turn one specified index from marginal to joint format.
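A few of the methods listed above can be sketched as follows. The class here is a trimmed-down stand-in for oavariable with only the slots the example needs, and the result column name Result, the use of Iter as a merge key, and the example objects are assumptions made for illustration.

```r
library(methods)

# Trimmed-down stand-in for oavariable: only the slots used below.
setClass(
  "oavariable",
  representation(index = "character", data = "data.frame", sample = "data.frame")
)

# show: display the data slot.
setMethod("show", "oavariable", function(object) print(object@data))

# GetSample: extract the sample from the object.
setGeneric("GetSample", function(object) standardGeneric("GetSample"))
setMethod("GetSample", "oavariable", function(object) object@sample)

# Ops: merge the two samples on Iter and the shared index columns,
# then apply the operator to the two result columns.
# Assumes each sample holds only Iter, index columns and Result.
setMethod("Ops", signature(e1 = "oavariable", e2 = "oavariable"), function(e1, e2) {
  keys <- c("Iter", intersect(e1@index, e2@index))
  merged <- merge(e1@sample, e2@sample, by = keys, suffixes = c(".x", ".y"))
  merged$Result <- callGeneric(merged$Result.x, merged$Result.y)
  merged$Result.x <- NULL
  merged$Result.y <- NULL
  new("oavariable", index = union(e1@index, e2@index), sample = merged)
})

# Made-up example: an exposure indexed by Area times an unindexed slope.
exposure <- new(
  "oavariable",
  index = "Area",
  sample = data.frame(
    Iter = c(1, 2, 1, 2),
    Area = c("N", "N", "S", "S"),
    Result = c(1, 2, 3, 4)
  )
)
slope <- new(
  "oavariable",
  index = character(0),
  sample = data.frame(Iter = c(1, 2), Result = c(10, 20))
)
risk <- exposure * slope
```

With this design the user writes `exposure * slope` and the matching of iterations and index locations happens inside the method. Indices that only one operand has are simply carried into the result.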
Rationale
See also
References