File:Variable correlator.ANA

Variable_correlator.ANA (file size: 14 KB, MIME type: text/rtf)

This Analytica module creates correlations between variables based on a rank correlation matrix.

Note! The same functionality is available in the R package mc2d through the function cornode.

This model reorders a group of probabilistic variable samples so that they mimic a desired correlation structure as closely as possible.

Input variables

Data: The variables to be correlated. They may have any distribution type (lognormal, beta, normal, etc.), and each variable may have a different distribution. They must all have the same sample size. All the variables must be in one array indexed by the Col index, one variable per row of Col. If you are explicitly providing the samples (rather than specifying distributions analytically), index this node by Run and set the sample size in the Uncertainty Setup menu to equal the number of samples you have. The Data array MUST have the indexes Run and Col, but it may have other dimensions as well. Each additional dimension is treated independently.

Correl: The desired correlation matrix, i.e. the correlation matrix of the distributions to be sampled from. Being a Spearman (rank) correlation matrix, it must be symmetric. Because the procedure takes its Cholesky decomposition, it must also be positive definite. It must be indexed by the Col and Row indexes.

Col, Row: Indexes for data variables and correlation matrix. These are NOT parameters in the function but global variables. They must be of the same size and content. It is recommended that they are lists of numbers 1..number of variables. The indexes are located beside the Correlate function. Note that if you have several groups of correlated variables, you must use the same Col and Row indexes even if the groups are not of the same size.
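The input requirements above can be illustrated outside Analytica. The following Python/NumPy sketch is not part of the module; it assumes a two-dimensional array whose rows play the role of Run and whose columns play the role of Col, and the function name check_inputs is hypothetical.

```python
import numpy as np

def check_inputs(data, correl, tol=1e-12):
    """Validate stand-ins for the Data and Correl inputs (hypothetical helper)."""
    data = np.asarray(data, dtype=float)
    correl = np.asarray(correl, dtype=float)
    n_samples, n_vars = data.shape            # rows ~ Run, columns ~ Col
    if correl.shape != (n_vars, n_vars):
        raise ValueError("Correl needs one row and one column per variable")
    if not np.allclose(correl, correl.T, atol=tol):
        raise ValueError("Correl must be symmetric")
    if not np.allclose(np.diag(correl), 1.0, atol=tol):
        raise ValueError("Correl must have ones on the diagonal")
    # np.linalg.cholesky raises LinAlgError unless the matrix is positive definite.
    np.linalg.cholesky(correl)
    return n_samples, n_vars

# Three variables with different marginal distributions but equal sample size.
rng = np.random.default_rng(0)
data = np.column_stack([
    rng.lognormal(0.0, 1.0, 1000),
    rng.beta(2.0, 5.0, 1000),
    rng.normal(10.0, 2.0, 1000),
])
correl = np.array([
    [1.0, 0.6, -0.3],
    [0.6, 1.0, 0.2],
    [-0.3, 0.2, 1.0],
])
print(check_inputs(data, correl))
```

The check on positive definiteness is done simply by attempting the decomposition, which is the same operation the procedure relies on.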


For more information on this method, see Iman, R.L., and Conover, W.J., "A distribution-free approach to inducing rank correlation among input variables", Commun. Statist.-Simula. Computa. (Marcel Dekker, Inc.), 11(3), 1982, 311-334.

The Scores local variable requires columns that contain identical sets of numbers in different orders (as produced by Median Latin Hypercube sampling). If other portions of your model require a different sampling technique, hard-wire this variable to an MLH sample of the desired sample size (or to a set of van der Waerden scores; see the Iman and Conover reference).
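As an illustration of what such a score matrix looks like, here is a minimal Python sketch using only the standard library. It assumes that a median Latin hypercube sample of a unit normal takes the quantile at the midpoint of each of n equiprobable intervals, and that van der Waerden scores are the normal quantiles at i/(n+1); the function name score_matrix and its parameters are hypothetical.

```python
import random
from statistics import NormalDist

def score_matrix(n_samples, n_vars, kind="median_lh", seed=0):
    """Return n_vars columns holding the same scores in independent random orders."""
    rng = random.Random(seed)
    inv = NormalDist().inv_cdf                # unit normal quantile function
    if kind == "median_lh":
        # Median of each of n_samples equiprobable intervals.
        base = [inv((i + 0.5) / n_samples) for i in range(n_samples)]
    elif kind == "van_der_waerden":
        # Normal quantiles at i / (n + 1), i = 1..n.
        base = [inv((i + 1) / (n_samples + 1)) for i in range(n_samples)]
    else:
        raise ValueError("kind must be 'median_lh' or 'van_der_waerden'")
    columns = []
    for _ in range(n_vars):
        col = base[:]                         # same numbers in every column
        rng.shuffle(col)                      # rearranged independently
        columns.append(col)
    return columns

cols = score_matrix(100, 3)
# Every column is a permutation of the same set of scores.
print(all(sorted(c) == sorted(cols[0]) for c in cols))
```

Because the scores are fixed and only their order is random, the result does not depend on whichever sampling method the rest of a model uses.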

The procedure

Var b: The Cholesky decomposition of the desired correlation matrix, reindexed so that the inner product can be taken.
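For readers unfamiliar with the decomposition, a small NumPy sketch follows. It is an illustration only, not the Analytica code; the names correl and P are chosen here for convenience.

```python
import numpy as np

# A 2 x 2 desired correlation matrix with correlation 0.8.
correl = np.array([[1.0, 0.8],
                   [0.8, 1.0]])

# Lower-triangular factor P such that P @ P.T reproduces correl.
P = np.linalg.cholesky(correl)

print(P)                                # [[1.0, 0.0], [0.8, 0.6]] up to rounding
print(np.allclose(P @ P.T, correl))     # True
```

Multiplying independent unit-variance columns by the transpose of P yields columns whose correlation matrix is correl, which is why this factor is the starting point of the procedure.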

Var a: In order to generate the joint normal, we start with a matrix of independent unit normal variables (the scores). It should contain the same numbers in each column, rearranged randomly. Median Latin Hypercube (MLH) sampling provides the appropriate sample. If you want to use another sampling technique in your model, you should hard-wire this node to contain an MLH sample of the appropriate sample size (or a set of van der Waerden scores; see the Iman and Conover reference).

The calculations adjust the transformation matrix to account for the fact that the correlation matrix of the scores is not the identity matrix; i.e., the score samples are slightly correlated. The intermediate steps are: the correlation matrix of the scores; the decomposition of the scores correlation matrix; the inversion of the decomposed scores correlation matrix; and the matrix (inner) product of the decomposed desired correlation matrix (Var b) and the inverted decomposition of the scores correlation matrix. The transpose of this product gives the final transformation matrix.

In the next step the scores are multiplied by this matrix to create a sample with precisely the desired correlation matrix. After adjusting for the correlation in the scores, we arrive at a set of score samples that have exactly the desired correlation. The transformation affects the marginal distributions of the unit normal scores slightly, so the transformed scores are not used directly. Instead, their ranks are used to reorder the samples of the desired distributions, which simulates the desired correlation while preserving the marginal distributions of the samples.
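The adjustment for the correlation in the scores can be sketched in Python/NumPy as follows. This is a reading of the steps described above, not a translation of the Analytica nodes; the function name adjusted_scores and the variable names P, T, Q and S are assumptions made for this illustration, and the scores are generated as normal quantiles at interval midpoints.

```python
import numpy as np
from statistics import NormalDist

def adjusted_scores(correl, n_samples, seed=0):
    """Transform shuffled normal scores so their correlation matrix equals correl."""
    rng = np.random.default_rng(seed)
    n_vars = correl.shape[0]
    inv = NormalDist().inv_cdf
    base = np.array([inv((i + 0.5) / n_samples) for i in range(n_samples)])
    # Same scores in every column, each in its own random order.
    scores = np.column_stack([rng.permutation(base) for _ in range(n_vars)])

    P = np.linalg.cholesky(correl)            # decomposition of the desired matrix
    T = np.corrcoef(scores, rowvar=False)     # correlation matrix of the scores
    Q = np.linalg.cholesky(T)                 # decomposition of the scores matrix
    S = P @ np.linalg.inv(Q)                  # product with the inverted decomposition
    return scores @ S.T                       # scores times the transposed product

correl = np.array([
    [1.0, 0.6, -0.3],
    [0.6, 1.0, 0.2],
    [-0.3, 0.2, 1.0],
])
adj = adjusted_scores(correl, 500)
# The sample correlation of the adjusted scores matches correl to rounding error.
print(np.allclose(np.corrcoef(adj, rowvar=False), correl, atol=1e-8))
```

The match is exact (up to floating point) because S T S' = P Q^-1 (Q Q') Q'^-1 P' = P P', which is the desired correlation matrix.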

Var c: The ranks of your samples to be correlated. These are used to produce the reordered sample matrix, which is the primary output of this model. It contains the samples reordered so that they mimic (as closely as possible) the desired correlation structure. Larger sample sizes give samples that match the desired correlation structure more closely.
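The reordering step itself is simple to illustrate. In the Python/NumPy sketch below, each column of the data is rearranged so that its ranks follow the ranks of the corresponding column of a reference matrix (here a tiny hand-written stand-in for the transformed scores). The function name reorder_by_ranks is hypothetical.

```python
import numpy as np

def reorder_by_ranks(data, reference):
    """Rearrange each column of data to have the same rank order as reference."""
    data = np.asarray(data, dtype=float)
    reference = np.asarray(reference, dtype=float)
    # Zero-based rank of every element within its own column.
    ranks = reference.argsort(axis=0).argsort(axis=0)
    # Place the k-th smallest data value wherever the reference has rank k.
    return np.take_along_axis(np.sort(data, axis=0), ranks, axis=0)

data = np.array([[30.0, 2.0],
                 [10.0, 3.0],
                 [20.0, 1.0]])
reference = np.array([[0.3, -1.0],
                      [-0.5, 0.2],
                      [1.2, 0.9]])
print(reorder_by_ranks(data, reference))
# [[20.  1.]
#  [10.  2.]
#  [30.  3.]]
```

Each output column contains exactly the same values as the corresponding input column, so the marginal distributions are unchanged; only the pairing of values across columns differs.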

This work is based on the 'Correlated distributions.ana' file that accompanies the Analytica program. It was further developed by Jouni Tuomisto and Marko Tainio, National Public Health Institute (KTL), Finland, in 2005.

File history

current: 07:18, 10 September 2008 (14 KB), Jouni. Version 11 Jul 2005. Original location [http://ytoswww/yhteiset/Huippuyksikko/Tutkimus/R20_Mallit/Functions/Correl_1.ANA].
07:16, 10 September 2008 (2 KB), Jouni. First version by Jouni Tuomisto, 4 Dec 2002. Original location: [http://ytoswww/yhteiset/Huippuyksikko/Tutkimus/R20_Mallit/TuomistoKorrelaatiofunktio2002.ANA]. This Analytica module creates correlations between variables based on rank correlation mat