From Testiwiki


  • object: an ovariable
  • function_names: character vector of names of function objects in R; "Q0.025" and "Q0.975" are defined in the package
  • marginals: character vector of column names to be included in the final summary

function_names and marginals can be left undefined, in which case the summary by default describes the distribution, if one is available.


Summary defines how summaries of ovariables are shown.

What should the summary function do for ovariables?

  • Use parameters
    • x (ovariable),
    • rescol (optional, default is paste(x@name, "Result", sep = "")),
    • oper (a character vector of function names),
    • marginal (a vector of column names, or a logical vector with the same length as ncol(data)).
  • Define data by doing EvalOutput(x)@output.
  • Define result column to be used in tapply based on rescol.
  • Define conditional columns based on marginal. If marginal is not given, use all columns except a) the result columns and b) all columns whose names end in "Description", "Unit", or "Source". Note that if you have "jointed" some marginal columns into a joint one and still use them as marginals, the function runs but the result will be wrong.

--# : We can restrict selectable columns to known marginals (which are determined by CheckMarginals and altered by the jointing operations automatically). --Teemu R 15:31, 27 February 2013 (EET)

--# : The most common summary will probably be with respect to the distribution alone. To do this we tapply the relevant result over all known marginals except "Iteration". --Teemu R 15:31, 27 February 2013 (EET)
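
The default column selection described above can be sketched in base R. This is only an illustration: the toy data frame stands in for EvalOutput(x)@output, its column names are invented, and a single result column is assumed even though the text allows several.

```r
# Toy stand-in for EvalOutput(x)@output; all column names are illustrative.
data <- data.frame(
  Iteration = rep(1:2, times = 2),
  Age = rep(c("Adult", "Child"), each = 2),
  AgeDescription = "age group",
  exposureUnit = "mg/d",
  exposureResult = c(1, 3, 2, 6),
  stringsAsFactors = FALSE
)

rescol <- "exposureResult"

# Default marginals: everything except the result column and any column
# whose name ends in "Description", "Unit" or "Source".
is_meta <- grepl("(Description|Unit|Source)$", colnames(data))
marginal <- colnames(data)[!is_meta & colnames(data) != rescol]

# Summarising the distribution alone: collapse over "Iteration" by
# leaving it out of the grouping columns passed to tapply.
by_cols <- setdiff(marginal, "Iteration")
means <- tapply(data[[rescol]], data[by_cols], mean)
```

Here marginal contains "Iteration" and "Age", and means holds one value per level of Age. Restricting the choice to marginals known to CheckMarginals, as suggested in the comment, would replace the pattern match with a lookup that is not shown here.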

  • Within oper, a character string can be used directly as a function in tapply if it names a function that needs no parameters other than data$rescol. If the user wants, for example, quantile(data$rescol, 0.025), this is given in oper as "Q0.025", and the string is converted to a temporary function like this: assign(paste("temporary.function", i, sep = ""), function(x) {quantile(x, as.numeric(gsub("Q", "", oper[i])))}). Functions other than quantile that take parameters can be defined in a similar way.

# : Do summaries have to be described in table form outside of the code? If not, we can pass a list of functions as a parameter and define the most common ones beforehand, so that we have a Q0.025(x) function ready and just write summary(var, oper = list(Q0.025)). Of course, by default we could include the usual mean, sd, median, Q0.025, Q0.975. --Teemu R 15:31, 27 February 2013 (EET)
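
One way to turn names such as "Q0.025" into functions is a small factory that returns a closure. The sketch below is an assumption about how this could be done, not the package's code; make_oper is a hypothetical helper name. Unlike a function whose body reads oper[i] at call time, the closure computes its probability once when it is created, so it does not depend on the loop index later on.

```r
# Hypothetical helper: map a name such as "Q0.025" or "mean" to a
# one-argument function that can be handed to tapply.
make_oper <- function(name) {
  if (grepl("^Q[0-9.]+$", name)) {
    # Computed now, so each returned closure keeps its own probability.
    p <- as.numeric(sub("^Q", "", name))
    function(x) unname(quantile(x, probs = p))
  } else {
    match.fun(name)  # e.g. "mean", "sd", "median"
  }
}

oper <- c("mean", "Q0.025", "Q0.975")
funs <- setNames(lapply(oper, make_oper), oper)

# Each element is now an ordinary function of one argument.
x <- 0:40
lower <- funs[["Q0.025"]](x)
upper <- funs[["Q0.975"]](x)
```

The resulting named list can be used wherever a list of functions is expected, which is close to the summary(var, oper = list(Q0.025)) idea in the comment above.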

  • Do as.data.frame(as.table(tapply(rescol, marginal, oper_i))) for each item i in list oper, where oper_i is item i of oper. You will get a list of data.frames.
  • Remove empty rows from all data.frames with out[!is.na(out$Freq), ].
  • Merge these data.frames using marginal columns as conditions, and using all = TRUE so that empty cells will appear if one of the functions is not able to produce a result for a row but another is.
  • Create an ovariable with the name paste(x@name, "Summary", sep = ""), dependencies = x, function = summary(), output = data.frame just created.

# : Or is it just easier to use the data.frame as the output? --Jouni 11:38, 26 February 2013 (EET)

# : I think summary should be used for temporary purposes i.e. when we want to print a table of the data. OpasnetUtils/CollapseMarginal is the function used for optimization of an assessment by summarizing results at some nodes. So a data.frame is more suitable in my opinion. --Teemu R 15:31, 27 February 2013 (EET)
  • Return the ovariable.
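
The tapply, clean-up and merge steps can be sketched as follows in base R. The data frame, its column names and the helper summarise_one are invented for illustration. The sketch stops at the merged data.frame and does not wrap it in an ovariable, since that step depends on OpasnetUtils.

```r
# Toy data: the Adult/M group has a single observation, and Child/M is empty.
data <- data.frame(
  Iteration = c(1, 2, 3, 1, 2, 3, 1),
  Age = c("Adult", "Adult", "Adult", "Child", "Child", "Child", "Adult"),
  Sex = c("F", "F", "F", "F", "F", "F", "M"),
  exposureResult = c(1, 2, 3, 4, 5, 6, 7),
  stringsAsFactors = FALSE
)

rescol <- "exposureResult"
marginal <- c("Age", "Sex")
oper <- list(mean = mean, sd = sd)

# One data.frame per summary function (summarise_one is a hypothetical helper).
summarise_one <- function(name) {
  out <- as.data.frame(as.table(
    tapply(data[[rescol]], data[marginal], oper[[name]])
  ))
  out <- out[!is.na(out$Freq), ]                   # drop empty cells
  colnames(out)[colnames(out) == "Freq"] <- name   # label the value column
  out
}

parts <- lapply(names(oper), summarise_one)

# all = TRUE keeps rows where only some of the functions gave a result.
result <- Reduce(function(a, b) merge(a, b, by = marginal, all = TRUE), parts)
```

In this toy case sd cannot be computed for the single Adult/M observation, so that row survives only because of all = TRUE and has NA in the sd column, while the empty Child/M cell is dropped from both parts.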



See also