Difference between revisions of "Input.interp"

Revision as of 12:00, 28 December 2011

This page is a method. The page identifier is Op_en5364
Moderator:Jouni (see all)
This page is a stub. You may improve it into a full page, and then a rating bar will appear here.

input.interp is an R function that interprets model inputs given in a user-friendly format and converts them into an explicit and exact mathematical format. The purpose is to make it easy for a user to give input without needing to worry about technical modelling details.

Question

Which user input formats are important to support, and how should each of them be interpreted?

Answer

The basic feature is that if a text string can be converted to a meaningful numeric object, it will be. This function can be used when data is downloaded from Opasnet Base: if Result.Text contains this kind of numeric information, it is converted to numbers and fused with Result.
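
The basic conversion can be sketched as follows. This is only an illustration: the helper names text_to_number and fuse_result are hypothetical, and "fusing" is assumed here to mean filling missing values of Result with the numbers parsed from Result.Text.

```r
# Hypothetical helpers, not the actual input.interp code.

# Convert text to numbers where possible: drop spaces, read "," as a decimal point.
text_to_number <- function(x) {
  cleaned <- gsub(",", ".", gsub(" ", "", x), fixed = TRUE)
  suppressWarnings(as.numeric(cleaned))  # text that is not a number becomes NA
}

# Assumption: "fused with Result" means that missing Result values are filled
# with the numbers parsed from Result.Text.
fuse_result <- function(dat) {
  parsed <- text_to_number(dat$Result.Text)
  dat$Result <- ifelse(is.na(dat$Result), parsed, dat$Result)
  dat
}

dat <- data.frame(
  Result = c(1.5, NA, NA, NA),
  Result.Text = c(NA, "12 000", "-14,23", "high"),
  stringsAsFactors = FALSE
)
fused <- fuse_result(dat)
```

In this example the second and third rows get numeric results, while the row with "high" stays NA because the text cannot be read as a number.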

n is the number of iterations in the model. # is any numeric character in the text string.

Example	Regular expression	Interpretation	Output in R
12 000	# #	12000. Text is interpreted as a number if removing the spaces makes it one.	as.numeric(gsub(" ", "", Result.Text))
12,345	#,#	12.345. Commas are interpreted as decimal points.	as.numeric(gsub(",", ".", Result.Text)) # Note! Do not use comma as a thousands separator!
-14,23	-#	-14.23. Minus in the beginning of entry is interpreted as minus, not a sign for a range.
50 - 125	# - #	Uniform distribution between 50 and 125	data.frame(obs=1:n, result=runif(n,50,125))
-12 345 - -23,56		Uniform distribution between -12345 and -23.56.
1 - 50	# - #	Loguniform distribution between 1 and 50 (loguniformity is assumed if the ratio of the upper to the lower limit is ≥ 30)
3.1 ± 1.2 or 3.1 +- 1.2	# ± # or # +- #	Normal distribution with mean 3.1 and SD 1.2	data.frame(obs=1:n, result=rnorm(n,3.1,1.2))
2.4 (1.8 - 3.0)	# (# - #)	Normal distribution with mean 2.4 and 95 % confidence interval from 1.8 to 3.0	data.frame(obs=1:n, result=rnorm(n,2.4,(3.0-1.8)/2/1.96))
2.4 (2.0 - 3.2)	# (# - #)	Lognormal distribution with mean 2.4 and 95 % confidence interval from 2.0 to 3.2. Lognormality is assumed if the distance from the mean to the upper limit is at least 50 % greater than the distance from the mean to the lower limit.
24 - 35 (odds 5:1)	# - # (odds #:#)	The odds are five to one that the truth is between 24 and 35. How to calculate this, I don't know yet, but there must be a prior.	Comment: I am not sure whether this is actually needed. Who expresses uncertainties in this way? --Jouni 14:00, 28 December 2011 (EET)
2;4;7		Each entry (2, 4, and 7 in this case) is equally likely to occur. Entries can also be text.
* (in index, or explanatory, columns)		The result applies to all locations of this index.	With merge() function, this column is not used as a criterion when these rows are merged.
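
The rules in the table could be implemented roughly as below. This is a minimal sketch under stated assumptions, not the actual input.interp code: the function name interpret_entry is hypothetical, point values are simply repeated n times, and the lognormal case is parameterised by taking sdlog from the width of the interval on the log scale and choosing meanlog so that the mean equals the stated value (the table does not say how this should be done). The odds format and the * wildcard are not covered.

```r
# Hypothetical sketch: interpret_entry() is an illustrative name, not the
# actual input.interp implementation.
interpret_entry <- function(txt, n = 1000) {
  to_num <- function(s) {
    suppressWarnings(as.numeric(gsub(",", ".", gsub(" ", "", s), fixed = TRUE)))
  }
  out <- function(draws) data.frame(obs = seq_len(n), result = draws)

  # "2;4;7": each entry is equally likely; entries may also be text.
  if (grepl(";", txt, fixed = TRUE)) {
    entries <- trimws(strsplit(txt, ";", fixed = TRUE)[[1]])
    nums <- to_num(entries)
    if (!anyNA(nums)) entries <- nums
    return(out(entries[sample.int(length(entries), n, replace = TRUE)]))
  }

  # Normalise: drop spaces, use "." as decimal mark, write the plus-minus sign as "+-".
  x <- gsub(",", ".", gsub(" ", "", txt), fixed = TRUE)
  x <- gsub("\u00b1", "+-", x, fixed = TRUE)
  num <- "(-?[0-9]+\\.?[0-9]*)"
  # Return the captured numbers, or an empty vector if the pattern does not match.
  groups <- function(pattern) {
    m <- regmatches(x, regexec(paste0("^", pattern, "$"), x))[[1]]
    as.numeric(m[-1])
  }

  # Plain number (assumption: repeated n times so the output shape is uniform).
  if (length(g <- groups(num)) == 1) return(out(rep(g, n)))

  # "# - #": loguniform if the limits are positive and upper/lower >= 30, else uniform.
  if (length(g <- groups(paste0(num, "-", num))) == 2) {
    lo <- min(g); hi <- max(g)
    if (lo > 0 && hi / lo >= 30) return(out(exp(runif(n, log(lo), log(hi)))))
    return(out(runif(n, lo, hi)))
  }

  # "# +- #": normal distribution with the given mean and SD.
  if (length(g <- groups(paste0(num, "\\+-", num))) == 2) {
    return(out(rnorm(n, g[1], g[2])))
  }

  # "# (# - #)": mean with 95 % CI; lognormal if the upper distance is >= 1.5 x the lower.
  if (length(g <- groups(paste0(num, "\\(", num, "-", num, "\\)"))) == 3) {
    m <- g[1]; lo <- g[2]; hi <- g[3]
    if (lo > 0 && (hi - m) >= 1.5 * (m - lo)) {
      sdlog <- (log(hi) - log(lo)) / (2 * 1.96)  # assumed parameterisation
      return(out(rlnorm(n, meanlog = log(m) - sdlog^2 / 2, sdlog = sdlog)))
    }
    return(out(rnorm(n, m, (hi - lo) / 2 / 1.96)))
  }

  NULL  # not a recognised format
}
```

For example, interpret_entry("50 - 125", n = 10) gives ten uniform draws, and an entry in an unsupported format such as "24 - 35 (odds 5:1)" returns NULL so that the caller can decide what to do with it.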

How to actually make this happen in R?

1. Make a temporary result temp by removing all spaces from Result.Text. Columns: Indices, Result, Result.Text, temp (Indices contains all explanatory columns).
2. Replace all "," with ".".
3. Check if there are parentheses "()". If yes, assume that they contain a 95 % CI.
4. Check if there are ranges "#-#".
5. Divide the rows of the data.frame into two new data.frames with the same list of columns (Indices, Result).
   - If temp is a syntactically correct distribution, take the row to data.frame A and replace Result with temp.
   - Otherwise, take the row to data.frame B and replace Result with Result.Text if that is not NA.
6. Create a new data.frame with index Iter = 1:n.
7. Make a random sample from each probability distribution in data.frame A using Iter.
8. Merge the data.frame B with Iter.
9. Join data.frames A and B with rbind(). Columns: Iter, Index, Result.
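
The steps above can be sketched in R as follows. This is an illustration under stated assumptions rather than the actual implementation: the example data, the single explanatory column Index, and the helper draw are hypothetical; only plain ranges (uniform) and "#(#-#)" entries (normal) are recognised as distributions; and Result is kept numeric, so non-numeric text in data.frame B becomes NA.

```r
# Hypothetical sketch of the steps; example data and helper names are assumptions.
n <- 5
dat <- data.frame(
  Index = c("a", "b", "c"),
  Result = c(NA, 7, NA),
  Result.Text = c("50 - 125", NA, "2,4 (1,8 - 3,0)"),
  stringsAsFactors = FALSE
)

# Steps 1-2: remove spaces and use "." as the decimal mark.
dat$temp <- gsub(",", ".", gsub(" ", "", dat$Result.Text), fixed = TRUE)

# Steps 3-4: recognise "#(#-#)" (95 % CI) and "#-#" (range).
num <- "(-?[0-9]+\\.?[0-9]*)"
ci_pat <- paste0("^", num, "\\(", num, "-", num, "\\)$")
range_pat <- paste0("^", num, "-", num, "$")
is_dist <- !is.na(dat$temp) & (grepl(ci_pat, dat$temp) | grepl(range_pat, dat$temp))

# Step 5: split into A (distributions) and B (everything else).
A <- data.frame(Index = dat$Index[is_dist], Result = dat$temp[is_dist],
                stringsAsFactors = FALSE)
B <- dat[!is_dist, ]
# Assumption: text in B is converted to a number where possible, otherwise NA.
B$Result <- ifelse(is.na(B$Result.Text), B$Result,
                   suppressWarnings(as.numeric(B$temp)))

# Step 6: the iteration index.
iter <- data.frame(Iter = 1:n)

# Step 7: draw n values from each distribution in A.
draw <- function(temp, n) {
  g <- as.numeric(regmatches(temp, regexec(ci_pat, temp))[[1]][-1])
  if (length(g) == 3) return(rnorm(n, g[1], (g[3] - g[2]) / 2 / 1.96))
  g <- as.numeric(regmatches(temp, regexec(range_pat, temp))[[1]][-1])
  runif(n, min(g), max(g))
}
A_long <- do.call(rbind, lapply(seq_len(nrow(A)), function(i) {
  data.frame(Iter = iter$Iter, Index = A$Index[i], Result = draw(A$Result[i], n),
             stringsAsFactors = FALSE)
}))

# Step 8: with no shared columns, merge() returns every Iter for every row of B.
B_long <- merge(iter, B[, c("Index", "Result")])

# Step 9: join the two parts. Columns: Iter, Index, Result.
out <- rbind(A_long, B_long)
```

Because iter and B share no column names, merge() produces their Cartesian product, which is what repeats each deterministic row once per iteration.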

@@ Line 34: / Line 34: @@
 | 2.4 (2.0 - 3.2) || # (# - #) ||Lognormal distribution with mean 2.4 and 95 % confidence interval from 2.0 to 3.0. Lognormality is assumed if the difference from mean to upper limit is => 50 % greater than from mean to lower limit.||
 |----
-| 24 - 35 (odds 5:1) || # - # (odds #:#) || Interpretation: odds is five to one that the truth is between 24 and 35. How to calculate this, I don't know yet, but there must be a prior.||
+| 24 - 35 (odds 5:1) || # - # (odds #:#) || Odds is five to one that the truth is between 24 and 35. How to calculate this, I don't know yet, but there must be a prior.|| {{attack|# |I am not sure whether this is actually needed. Who expresses uncertainties in this way?|--[[User:Jouni|Jouni]] 14:00, 28 December 2011 (EET)}}
 |----
+| 2;4;7 || || Each entry (2, 4, and 7 in this case) are equally likely to occur. Entries can also be text.||
+|----
+| * (in index, or explanatory, columns) || || The result applies to all locations of this index.|| With merge() function, this column is not used as a criterion when these rows are merged.
 |}
+How to actually make this happen in R?
+# Make a temporary result ''temp'' by removing all spaces from Result.Text. Columns: ''Indices,Result.Result.Text,temp'' (Indices contains all explanatory columns.)
+# Replace all "," with "."
+# Check if there are parentheses "()". If yes, assume that they contain 95 % CI.
+# Check if there are ranges "#-#".
+# Divide the rows of the data.frame into two new data.frames with the same list of columns (''Indices,Result'').
+#* If temp is a syntactically correct distribution, take the row to data.frame A and replace ''Result'' with ''temp''.
+#* Otherwise, take the row to data.frame B and replace ''Result'' with ''Result.Text'' if that is not NA.
+# Create a new data.frame with index Iter = 1:n.
+# Make a random sample from each probability distribution in data.frame A using Iter.
+# Merge the data.frame B with Iter.
+# Join data.frames A and B with rbind(). Columns: ''Iter,Index,Result''.
