Difference between revisions of "Goherr: Fish consumption study"

(Bayes model: tried to develop a fancier model but failed)
(Bayes model)
Line 223: Line 223:
  * Model run 3.3.2017. All variables assumed independent. [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=lwlSwXazIDHDyJJg]
  * Model run 3.3.2017. p has more dimensions. [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=ZmbNUuZeb7UOf8NP]
+ * Model run 25.3.2017. Several model versions: strange binomial+multivarnormal, binomial, fractalised multivarnormal [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=pKe0s2Ldm1mbIVuO]
  <rcode name="bayes" label="Initiate Bayes model (for developers only)" graphics=1>
Line 228: Line 229:
  library(OpasnetUtils)
+ library(ggplot2)
  library(reshape2)
  library(rjags)
Line 276: Line 278:
  qlen <-  c(4,2,2,2,6,2,2,7,7,7,5,2,7,7,7,5) # Number of options in each question
+
  agel
Line 282: Line 285:
  fisl
- ############ Version with both binomial and multivariate normal
+ #################################################
+ cat("Version with both binomial and multivariate normal.\n")
  mod <- textConnection("
Line 288: Line 292:
      for(i in 1:S) {
        for(j in 1:4) {
-         surv[i,j] ~ dbin(phi(z[j]), qlen[j])
+         surv[i,j] ~ dbin(p[j], qlen[j])
        }
+    surv.scaled[i,1:4] ~ dmnorm(mu0[], Omega[,])
      }
-     z[1:4] ~ dmnorm(mu[], Omega[,])
-     mu[1:4] ~ dmnorm(mu0[1:4], S2[1:4,1:4])
+     for(j in 1:4) {
+      p[j] ~ dbeta(1,1)
+     }
+ #    mu[1:4] ~ dmnorm(mu0[1:4], S2[1:4,1:4])
      Omega[1:4,1:4] ~ dwish(S3[1:4,1:4],4)
    }
Line 301: Line 308:
 
   data = list(
 
   data = list(
 
     surv = surv[eatherr,13:16],
 
     surv = surv[eatherr,13:16],
 +
    surv.scaled = scale(surv[eatherr,13:16]),
 
     qlen = qlen[13:16],
 
     qlen = qlen[13:16],
 
     S = length(eatherr),
 
     S = length(eatherr),
     mu0 = rep(0.5,4),
+
     mu0 = rep(0,4),
    S2 = diag(4)/100000,
+
#    S2 = diag(4)/100000,
 
     S3 = diag(4)/10000
 
     S3 = diag(4)/10000
 
   ),
 
   ),
Line 312: Line 320:
  update(jags, 100)
- samps.j <- jags.samples(jags, c('mu','Omega', 'z'), 1000)
+ samps.j <- jags.samples(jags, c('Omega', 'p'), 1000)
  j <- array(
-   samps.j$mu,
+   samps.j$p,
    dim = c(4,1000,4),
    dimnames = list(Question = 1:4, Iter = 1:1000, Seed = 1:4)
  )
  scatterplotMatrix(t(j[,,1]))
- samps.c <- coda.samples(jags, c('mu','Omega','z'), 1000)
+ j.melt <- melt(j)
+ ggplot(j.melt, aes(x = Iter, y = value, colour = as.character(Seed)))+
+   geom_line()+facet_wrap(~ Question)
+
+ j.Omega <- array(
+   samps.j$Omega,
+   dim = c(4,4,1000,4),
+   dimnames = list(Q1 = 1:4, Q2 = 1:4, Iter = 1:1000, Seed = 1:4)
+ )
+ scatterplotMatrix(t(j.Omega[1,,,1]))
+
+ j.Omega.melt <- melt(j.Omega)
+ ggplot(j.Omega.melt, aes(x = Iter, y = value, colour = as.character(Seed)))+
+   geom_line()+facet_grid(Q1 ~ Q2)
+
+ samps.c <- coda.samples(jags, c('Omega','p'), 1000)
  plot(samps.c)
- ##############################
- ############ Version with multivariate normal
+ ############################################################
+ cat("Version with multivariate normal.\n")
  mod <- textConnection("
Line 360: Line 382:
  plot(samps.c)
- ##############################
+ ###########################################################
+ cat("Version with multivariate normal and fractile conversion.\n")
+
+ # Assume linearity in the responses, but it is artificially binned into survey answers.
+ # Assume that answers are uncorrelated with a subgroup of respondents that have
+ # answered in the same way to two questions. So, for each answer column d, you can
+ #1 shuffle the vector d (keeping in mind the original order)
+ #2 sort it based on the answer
+ #3 renumber it by (1:length(d) - 0.5)/length(d) to get a fractile from 0..1.
+ #4 transform it to normal distribution by qnorm(d)
+ #5 reorder d to the original order
+ # Then, use multivariate normal distribution to estimate parameters.
+
+ fractalise <- function(d) {
+   f <- sample(seq_along(d)) # Create a random order for data
+   g <- order(d[f]) # First randomorder and then put in order by size
+   h <- qnorm((seq_along(d)-0.5)/length(d)) # 'z scores' of data points
+   return(h[order(g)][order(f)]) # Reorder fractiles to original order
+ }
+
+ surv.f <- surv[1:nrow(surv) %in% eatherr & !is.na(rowSums(surv[13:16])) , ]
+ surv.f[13:16] <- lapply(surv.f[13:16], fractalise)
+
+ mod <- textConnection("
+   model{
+     for(i in 1:S) {
+       surv.f[i,1:4] ~ dmnorm(mu[], Omega[,])
+     }
+     for(j in 1:4) {
+       mu[j] ~ dnorm(0, 3)
+     }
+     Omega[1:4,1:4] ~ dwish(S3[1:4,1:4],5)
+     surv.pred[1:4] ~ dmnorm(mu[1:4], Omega[1:4,1:4])
+     Opri[1:4,1:4] ~ dwish(S3[1:4,1:4],5)
+   }
+ ")
+
+ jags <- jags.model(
+   mod,
+   data = list(
+     surv.f = surv.f,
+     S = nrow(surv.f),
+ #    mu0 = rep(0.5,4), # hyperprior for mean
+ #    S2 = diag(4)/100000, # hyperprior for mean
+     S3 = diag(4)*5#/10000 # hyperprior for covariance table
+   ),
+   n.chains = 4,
+   n.adapt = 100
+ )
+
+ update(jags, 100)
+ samps.j <- jags.samples(jags, c('mu','Omega', 'surv.pred', 'Opri'), 1000)
+ j <- array(
+   samps.j$surv.pred,
+   dim = c(4,1000,4),
+   dimnames = list(Question = 1:4, Iter = 1:1000, Seed = 1:4)
+ )
+
+ oprint(cor(t(samps.j$surv.pred[,,1])))
+ oprint(cor(surv.f[13:16]))
+ scatterplotMatrix(t(j[,,1]))
+ scatterplotMatrix(surv.f[13:16])
+
+ j2 <- array(
+   samps.j$Opri,
+   dim = c(4,4,1000,4),
+   dimnames = list(Q1 = 1:4, Q2 = 1:4, Iter = 1:1000, Seed = 1:4)
+ )
+ j2.cov <- apply(j2[,,,1], MARGIN = 1:2, FUN = mean)
+ oprint(j2.cov)
+ scatterplotMatrix(t(j2[1,,,1]))
+
+ samps.c <- coda.samples(jags, c('mu','Omega', 'surv.pred'), 1000)
+ plot(samps.c)
+
- ########### MODEL WITH BINOMIAL DISTRIBUTION ASSUMPTION
+ ################################################################3
+ cat("MODEL WITH BINOMIAL DISTRIBUTION ASSUMPTION.\n")
  # Interesting fish eating questions
Line 544: Line 642:
  scatterplotMatrix(t(p.pred[1,1,3,,,1]))
- objects.store(ans.pred, p.pred, p2.pred)
- cat("Arrays ans.pred (answer), p.pred (probability to eat), p2.pred (p of answer) stored.\n")
+ #objects.store(ans.pred, p.pred, p2.pred)
+ #cat("Arrays ans.pred (answer), p.pred (probability to eat), p2.pred (p of answer) stored.\n")
  </rcode>
  

Revision as of 11:00, 25 March 2017


Question

How are Baltic herring and salmon used as human food in the Baltic Sea countries? Which determinants affect people's eating habits regarding these fish species?

Answer

Survey data will be analysed during winter 2016-2017 and results will be updated here.


Rationale

A survey of the eating habits of Baltic herring and salmon in Denmark, Estonia, Finland and Sweden was carried out in September 2016 by Taloustutkimus oy. The content of the questionnaire can be accessed in Google Drive. The actual data will be uploaded to Opasnet Base in October 2016.

The R code to analyse the survey data will be provided on this page later.

Data

Original data file: File:Goherr fish consumption.csv

Preprocessing

This code preprocesses the original questionnaire data from the .csv file above and stores it as a usable variable in Opasnet Base.

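The preprocessing code itself is collapsed on the wiki page, so the following is only a minimal sketch of the step it performs, assuming the answers are read from the original .csv file above. The local file name, separator and recoding shown here are hypothetical; objects.store() is the same OpasnetUtils helper that the page's own Bayes code uses for storing results, and it runs only within the Opasnet R environment.

library(OpasnetUtils)

# Hypothetical local copy of File:Goherr fish consumption.csv; the separator is assumed
surv <- read.csv("Goherr_fish_consumption.csv", sep = ";", stringsAsFactors = FALSE)

# Hypothetical cleanup: treat empty answers as missing values
surv[surv == ""] <- NA

# Store the preprocessed data frame in Opasnet Base so that later code blocks can reuse it
objects.store(surv)
cat("Survey data frame surv stored.\n")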

Bayes model

  • Model run 3.3.2017. All variables assumed independent. [1]
  • Model run 3.3.2017. p has more dimensions. [2]
  • Model run 25.3.2017. Several model versions: strange binomial+multivarnormal, binomial, fractalised multivarnormal [3]

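The full model code is collapsed above; as a reading aid, here is a condensed, self-contained sketch of the binomial version that appears in the diff at the top of this page. Simulated dummy answers stand in for the real survey columns surv[eatherr, 13:16], and the number of respondents S is hypothetical, so the snippet runs on its own.

# Each answer is treated as a binomial draw whose size is the number of options
# in that question and whose probability gets a flat Beta prior.
library(rjags)

qlen <- c(7, 7, 7, 5)                      # options in the four questions used here
S <- 30                                    # hypothetical number of respondents
surv <- matrix(rbinom(S * 4, size = rep(qlen, each = S), prob = 0.4),
               nrow = S)                   # dummy answers in place of the real survey

mod <- textConnection("
  model{
    for(i in 1:S) {
      for(j in 1:4) {
        surv[i,j] ~ dbin(p[j], qlen[j])    # answer ~ Binomial(options, p)
      }
    }
    for(j in 1:4) {
      p[j] ~ dbeta(1,1)                    # flat prior on each probability
    }
  }
")

jags <- jags.model(mod, data = list(surv = surv, qlen = qlen, S = S),
                   n.chains = 4, n.adapt = 100)
update(jags, 100)
samps.c <- coda.samples(jags, 'p', 1000)
summary(samps.c)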

Calculations

This code calculates how much Baltic herring and salmon are eaten (g/day), based on a Bayesian model built from the questionnaire data. A minimal sketch of this conversion follows the assumptions table below.


Assumptions

The following assumptions are used:

Assumptions for calculations

Obs | Variable   | Value | Explanation      | Result
1   | freq       | 6     | times per year   | 260 - 364
2   | freq       | 5     | times per year   | 104 - 208
3   | freq       | 4     | times per year   | 52
4   | freq       | 3     | times per year   | 12 - 36
5   | freq       | 2     | times per year   | 2 - 5
6   | freq       | 1     | times per year   | 0.5 - 0.9
7   | freq       | 0     | times per year   | 0
8   | amdish     | 0     | grams / serving  | 20 - 50
9   | amdish     | 1     | grams / serving  | 70 - 100
10  | amdish     | 2     | grams / serving  | 120 - 150
11  | amdish     | 3     | grams / serving  | 170 - 200
12  | amdish     | 4     | grams / serving  | 220 - 250
13  | amdish     | 5     | grams / serving  | 270 - 300
14  | amdish     | 6     | grams / serving  | 450 - 500
15  | ingridient |       | fraction         | 0.1 - 0.3
16  | amside     | 0     | grams / serving  | 20 - 50
17  | amside     | 1     | grams / serving  | 70 - 100
18  | amside     | 2     | grams / serving  | 120 - 150
19  | amside     | 3     | grams / serving  | 170 - 200
20  | amside     | 4     | grams / serving  | 220 - 250
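To illustrate how these assumptions turn survey answers into intake estimates, here is a minimal sketch that uses the midpoints of the ranges in the table above. The vector names (freq.peryear, amdish.g, ingredient, freq.ans, dish.ans) and the example answers are hypothetical, and applying the ingredient fraction to a dish serving is one interpretation of the table; the page's own Calculations code (collapsed above) is the authoritative version.

# Midpoints of the assumption ranges in the table above
freq.peryear <- c("0" = 0, "1" = 0.7, "2" = 3.5, "3" = 24,
                  "4" = 52, "5" = 156, "6" = 312)        # eating times per year
amdish.g <- c("0" = 35, "1" = 85, "2" = 135, "3" = 185,
              "4" = 235, "5" = 285, "6" = 475)           # grams of dish per serving
ingredient <- 0.2                                        # fraction of fish in a dish

# Hypothetical answers: eating frequency category and serving-size category
freq.ans <- c(4, 6, 2)
dish.ans <- c(1, 3, 0)

# g/day = servings per year * grams per serving * fish fraction / 365
g.per.day <- freq.peryear[as.character(freq.ans)] *
             amdish.g[as.character(dish.ans)] * ingredient / 365
round(g.per.day, 2)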

Questionnaire


Dependencies

The survey data will be used as input in the benefit-risk assessment of Baltic herring and salmon intake, which is part of the WP5 work in the Goherr project.

Formula

See also

Keywords

References


Related files

