Difference between revisions of "Goherr: Fish consumption study"

From Testiwiki
(Preprocessing: surv preprocessing added)
(Analyses)
Line 266: Line 266:
 
library(ggplot2)
 
library(ggplot2)
 
library(reshape2)
 
library(reshape2)
 +
library(car)
 +
library(vegan)
  
objects.latest("Op_en7749", "preprocess") # [[Goherr: Fish consumption study]]: survey, surcol
+
objects.latest("Op_en7749", "preprocess2") # [[Goherr: Fish consumption study]]: survey, surcol
  
 
temp <- sapply(survey, as.numeric) # Can be done for surv to get a smaller matrix
 
temp <- sapply(survey, as.numeric) # Can be done for surv to get a smaller matrix
Line 285: Line 287:
 
   theme(axis.text.x=element_text(angle=90,hjust=1,vjust=0.4))+
 
   theme(axis.text.x=element_text(angle=90,hjust=1,vjust=0.4))+
 
   scale_fill_gradient2(low = "#480610", mid = "#FFFFFF", high = "#06480F", midpoint = 0, space = "Lab", guide = "colourbar")
 
   scale_fill_gradient2(low = "#480610", mid = "#FFFFFF", high = "#06480F", midpoint = 0, space = "Lab", guide = "colourbar")
</rcode>
 
  
==== Bayes model ====
+
####################### Descriptive statistics
  
* Model run 3.3.2017. All variables assumed independent. [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=lwlSwXazIDHDyJJg]
+
oprint(cor(surv, use = "pairwise.complete.obs"))
* Model run 3.3.2017. p has more dimensions. [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=ZmbNUuZeb7UOf8NP]
+
# --> Baltic salmon and herring eating are correlated, so they should be estimated together
* Model run 25.3.2017. Several model versions: strange binomial+multivarnormal, binomial, fractalised multivarnormal [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=pKe0s2Ldm1mbIVuO]
 
* Model run 27.3.2017 [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=2hY9p2r8CTJi3Qwq]
 
* Other models except multivariate normal were [http://en.opasnet.org/en-opwiki/index.php?title=Goherr:_Fish_consumption_study&oldid=40185 archived] and removed from active code 29.3.2017.
 
* Model run 29.3.2017 with raw data graphs [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=BB8nePJb7hzSw6Ha]
 
* Model run 29.3.2017 with salmon and herring ovariables stored [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=2Hz4tYjrQLnUfIXw]
 
* Model run 13.4.2017 with first version of coordinate matrix and principal coordinate analysis [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=2k2dKhYPc2UkOCY5]
 
 
 
<rcode name="bayes" label="Initiate Bayes model (for developers only)" graphics=1>
 
# This is code Op_en7749/bayes on page [[Goherr: Fish consumption study]]
 
 
 
library(OpasnetUtils)
 
library(ggplot2)
 
library(reshape2)
 
library(rjags)
 
library(car)
 
library(vegan)
 
#library(gridExtra) # Error: package ‘gridExtra’ was built before R 3.0.0: please re-install it
 
 
 
# Fish intake in humans
 
# Data from data.frame survey from page [[Goherr: Fish consumption study]]
 
# Start with salmon questions 46:49 (amounts eaten)
 
# Some preprocessing should be moved away from this code.
 
# Model assumes that questionnaire data follows binomial distribution with parameter p and max value levels.
 
# p depends on country (4 groups), sex (2), age (2) that are known individually.
 
# We should test whether sex makes a difference or whether it can be omitted.
 
# Predicted p for each question, determined by the group, are produced from the bayes model.
 
# p is used to sample answers, which are then combined with estimates of amounts related to each answer.
 
# Total amount eaten of each fish per modelled individual is finally calculated.
 
 
 
objects.latest("Op_en7749", "preprocess2") # [[Goherr: Fish consumption study]]: survey, surcol
 
 
 
agel <- as.character(unique(survey$Ages))
 
countryl <- sort(as.character(unique(survey$Country)))
 
genderl <- sort(as.character(unique(survey$Gender)))
 
fisl <- c("Salmon", "Herring")
 
 
 
# Interesting fish eating questions
 
surv <- survey[c(1,3,158,16,29,30,31,46:49,86,95:98)]
 
colnames(surv)
 
#[1] "Country"                      "Gender"                     
 
#[3] "Ages"                          "Fish eating"                 
 
#[5] "How often eat fish"            "Salmon eating"               
 
#[7] "Baltic salmon"                "How often Baltic salmon"     
 
#[9] "How much Baltic salmon"        "How often side Baltic salmon"
 
#[11] "How much side Baltic salmon"  "Eat Baltic herring"         
 
#[13] "How often Baltic herring"      "How much Baltic herring"     
 
#[15] "How often side Baltic herring" "How much side Baltic herring"
 
test <- sapply(unique(surv[c(4,7,12)]), unique)#function(x) sum(is.na(x)))
 
  
 
oprint(table(surv[c(12,7,4)], useNA = "ifany"))
 
oprint(table(surv[c(12,7,4)], useNA = "ifany"))
 
oprint(table(surv[c(13,12)], useNA = "ifany"))
 
oprint(table(surv[c(13,12)], useNA = "ifany"))
# For estimating distributions, we should
 
#1 remove people with Fish eating = No (142)
 
#2 merge Eat Baltic herring = I don't know with No (How often BH = NA always)
 
#3 merge Baltic salmon = NA with No (because they usually have answered BH questions)
 
 
oprint(table(is.na(rowSums(sapply(surv[4:16], as.numeric)))))
 
# BUT: there are so many missing values, that we just model BH and BS separately now.
 
 
#surv$Ages <- match(surv$Ages, agel) # Not a factor, coerce to integer
 
surv <- as.data.frame(lapply(surv, FUN = function(x) as.integer(x))) # Coerce to integers
 
surv[is.na(surv[[12]]) | surv[[12]] == 3 , 12] <- 1 # Eat Baltic herring: I don't know --> No
 
surv[4:16] <- lapply(surv[4:16], FUN = function(x) x-1) # Scale to start from zero
 
 
# Row numbers for respondents that have eaten fish, Baltic salmon, and Baltic herring
 
eatfish <- surv[[4]] %in% 1
 
eatsalm <- !(surv[[7]] %in% 0 | is.na(rowSums(surv[7:11])))
 
eatherr <- surv[[12]] %in% 1 & !is.na(rowSums(surv[13:16]))
 
 
# Assume that the covariance matrix is constant across countries, genders, etc.,
 
# but the mean is specific to these groups and to the question.
 
 
#qlen <-  c(4,2,2,2,6,2,2,7,7,7,5,2,7,7,7,5) # Number of options in each question
 
# qlen not needed when dbinom is not used.
 
 
agel
 
countryl
 
genderl
 
fisl
 
 
####################### Descriptive statistics
 
 
oprint(cor(surv, use = "pairwise.complete.obs"))
 
# --> Baltic salmon and herring eating are correlated, so they should be estimated together
 
  
 
############################# Plot original data
 
############################# Plot original data
Line 434: Line 354:
 
)
 
)
  
############################################################
+
##################### CORRELATION MATRIX
 +
 
 +
temp <- sapply(survey, as.numeric) # Can be done for surv to get a smaller matrix
 +
 
 +
survey_correlations <- (cor(temp, method="spearman", use="pairwise.complete.obs"))
 +
 
 +
temp <- colnames(survey_correlations)
 +
 
 +
melted_correlations <- melt(survey_correlations)
 +
 
 +
melted_correlations$Var1 <- factor(melted_correlations$Var1, levels=temp)
 +
melted_correlations$Var2 <- factor(melted_correlations$Var2, levels=temp)
 +
melted_correlations$value <- ifelse(melted_correlations$value >= 0.99,NA,melted_correlations$value)
 +
 
 +
ggplot(melted_correlations, aes(x = Var1, y = Var2, fill = value, label= round(value, 2)))+
 +
  geom_raster()+
 +
  theme(axis.text.x=element_text(angle=90,hjust=1,vjust=0.4))+
 +
  scale_fill_gradient2(low = "#480610", mid = "#FFFFFF", high = "#06480F", midpoint = 0, space = "Lab", guide = "colourbar")
 +
 
 +
############################### PRINCIPAL COORDINATE ANALYSIS (PCoA)
 +
 
 +
# This part prepares the data.
 +
hypocols1 <- c(46:49,95:98)
 +
answ <- sapply(survey[hypocols1], FUN=as.numeric)
 +
answ <- as.matrix(answ[!is.na(rowSums(answ)),])
 +
 
 +
pcoa_caps <- capscale(t(answ) ~ 1, distance="euclidean") ##PCoA done
 +
 
 +
## Plot of all the hypotheses
 +
 
 +
traits <- as.factor(c(rep("bipedalism", 10), rep("brain", 10), rep("hairlessness", 8),
 +
                      rep("fat", 4), rep("larynx", 3), rep("speech", 7), rep("other", 9)))
 +
colstr <- c("palevioletred1","royalblue1","seagreen1","violet","khaki2","skyblue", "orange")
 +
trait.cols <- colstr[traits]
 +
 
 +
hypo_sizes <- (5 - colMeans(answ))
 +
leg_sizes <- c(4, 3, 2, 1, 0.01)
 +
 
 +
#pdf(file="pcoa_plot.pdf", height=6, width=7.5)
 +
plot(pcoa_caps, display = c("sp", "wa"), type="n")#, xlim=c(-6,4.5)) ## PCoA biplot, full scale
 +
points(pcoa_caps, display= c("sp"), col="gray40") # adding the people points
 +
points(pcoa_caps, display= c("wa"), pch=19)#, cex=hypo_sizes, col=trait.cols)
 +
text(pcoa_caps, display=c("wa"), srt=25, cex=0.5)
 +
#legend(x=-6, y=3.8, levels(traits), fill=colstr, bty="n", cex=1)
 +
#legend(-6, -2, legend=c("Very likely", "Moderately likely",
 +
#                        "No opinion", "Moderately unlikely", "Very unlikely"),
 +
#      pch=21, pt.cex = leg_sizes, bty="n", cex=1)
 +
#dev.off()
 +
</rcode>
 +
 
 +
==== Bayes model ====
 +
 
 +
* Model run 3.3.2017. All variables assumed independent. [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=lwlSwXazIDHDyJJg]
 +
* Model run 3.3.2017. p has more dimensions. [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=ZmbNUuZeb7UOf8NP]
 +
* Model run 25.3.2017. Several model versions: strange binomial+multivarnormal, binomial, fractalised multivarnormal [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=pKe0s2Ldm1mbIVuO]
 +
* Model run 27.3.2017 [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=2hY9p2r8CTJi3Qwq]
 +
* Other models except multivariate normal were [http://en.opasnet.org/en-opwiki/index.php?title=Goherr:_Fish_consumption_study&oldid=40185 archived] and removed from active code 29.3.2017.
 +
* Model run 29.3.2017 with raw data graphs [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=BB8nePJb7hzSw6Ha]
 +
* Model run 29.3.2017 with salmon and herring ovariables stored [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=2Hz4tYjrQLnUfIXw]
 +
* Model run 13.4.2017 with first version of coordinate matrix and principal coordinate analysis [http://en.opasnet.org/en-opwiki/index.php?title=Special:RTools&id=2k2dKhYPc2UkOCY5]
 +
 
 +
<rcode name="bayes" label="Initiate Bayes model (for developers only)" graphics=1>
 +
# This is code Op_en7749/bayes on page [[Goherr: Fish consumption study]]
 +
 
 +
library(OpasnetUtils)
 +
library(ggplot2)
 +
library(reshape2)
 +
library(rjags)
 +
library(car)
 +
library(vegan)
 +
library(MASS)
 +
#library(gridExtra) # Error: package ‘gridExtra’ was built before R 3.0.0: please re-install it
 +
 
 +
# Fish intake in humans
 +
# Data from data.frame survey from page [[Goherr: Fish consumption study]]
 +
 
 +
objects.latest("Op_en7749", "preprocess2") # [[Goherr: Fish consumption study]]: survey, surv, ...
 +
 
 
cat("Version with multivariate normal.\n")
 
cat("Version with multivariate normal.\n")
ggplot(data.frame(A=1:2, B=1:2), aes(A, y=B))+geom_point()+
 
  labs(title="Version with multivariate normal.")
 
  
 
mod <- textConnection("
 
mod <- textConnection("
Line 474: Line 469:
 
samps.j <- jags.samples(
 
samps.j <- jags.samples(
 
   jags,  
 
   jags,  
   c('muh','Omegah', 'ansh.pred', 'mus', 'Omegas', 'anss.pred'),  
+
   c('mus', 'Omegas', 'anss.pred','muh','Omegah','ansh.pred'),  
 
   1000
 
   1000
 
)
 
)
jh <- array(
+
js <- array(
   samps.j$ansh.pred,
+
   c(
   dim = c(4,1000,4),
+
    samps.j$mus[,,1],
   dimnames = list(Question = 1:4, Iter = 1:1000, Seed = 1:4)
+
    samps.j$Omegas[,1,,1],
 +
    samps.j$Omegas[,2,,1],
 +
    samps.j$Omegas[,3,,1],
 +
    samps.j$Omegas[,4,,1],
 +
    samps.j$anss.pred[,,1],
 +
    samps.j$muh[,,1],
 +
    samps.j$Omegah[,1,,1],
 +
    samps.j$Omegah[,2,,1],
 +
    samps.j$Omegah[,3,,1],
 +
    samps.j$Omegah[,4,,1],
 +
    samps.j$ansh.pred[,,1]
 +
  ),
 +
   dim = c(4,1000,6,2),
 +
   dimnames = list(
 +
    Question = 1:4,
 +
    Iter = 1:1000,
 +
    Parameter = c("mu","Omega1", "Omega2", "Omega3", "Omega4", "ans.pred"),
 +
    Fish = c("Salmon", "Herring")
 +
  )
 
)
 
)
scatterplotMatrix(t(jh[,,1]))
 
  
js <- array(
+
fish.param <- list(
   samps.j$anss.pred,
+
   mu = apply(js[,,1,], MARGIN = c(1,3), FUN = mean),
  dim = c(4,1000,4),
+
   Omega = lapply(
   dimnames = list(Question = 1:4, Iter = 1:1000, Seed = 1:4)
+
    1:2,
 +
    FUN = function(x) {
 +
      solve(apply(js[,,2:5,], MARGIN = c(1,3,4), FUN = mean)[,,x])
 +
    } # solve matrix: precision->covariance
 +
  )
 
)
 
)
scatterplotMatrix(t(js[,,1]))
 
  
samps.c <- coda.samples(
+
jsp <- Ovariable(
   jags,  
+
  "jsp",
  c('muh','Omegah', 'ansh.pred', 'mus', 'Omegas', 'anss.pred'),  
+
  dependencies = data.frame(Name = "fish.param"),
   1000
+
   formula = function(...) {
 +
    jsp <- lapply(
 +
      1:2,
 +
      FUN = function(x) {
 +
        mvrnorm(openv$N, fish.param$mu[,x], fish.param$Omega[[x]])
 +
      }
 +
    )
 +
   
 +
    jsp <- rbind(
 +
      cbind(
 +
        Fish = "Salmon",
 +
        Iter = 1:nrow(jsp[[1]]),
 +
        as.data.frame(jsp[[1]])
 +
      ),
 +
      cbind(
 +
        Fish = "Herring",
 +
        Iter = 1:nrow(jsp[[2]]),
 +
        as.data.frame(jsp[[2]])
 +
      )
 +
    )
 +
    jsp <- melt(jsp, id.vars = c("Iter", "Fish"), variable.name = "Question", value.name = "Result")
 +
    return(jsp)
 +
   }
 
)
 
)
plot(samps.c)
 
  
jhm <- melt(jh[,,1], value.name = "Result")
+
oftenS <- Ovariable(
jsm <- melt(js[,,1], value.name = "Result")
+
  "oftenS",
 +
  dependencies = data.frame(Name="jsp"),
 +
  formula = function(...) {
 +
    jsp[jsp$Fish == "Salmon" & jsp$Question == "1" , !colnames(jsp@output) %in% c("Fish", "Question")]
 +
  }
 +
)
  
oftenh  <- Ovariable("oftenh", data = jsm[jsm$Question == 1, 2:3])
+
muchS <- Ovariable(
muchh   <- Ovariable("muchh" , data = jsm[jsm$Question == 2, 2:3])
+
  "muchS",
ofsideh <- Ovariable("ofsideh", data = jsm[jsm$Question == 3, 2:3])
+
   dependencies = data.frame(Name="jsp"),
musideh <- Ovariable("musideh", data = jsm[jsm$Question == 4, 2:3])
+
  formula = function(...) {
oftens  <- Ovariable("oftens", data = jhm[jsm$Question == 1, 2:3])
+
    jsp[jsp$Fish == "Salmon" & jsp$Question == "2" , !colnames(jsp@output) %in% c("Fish", "Question")]
muchs  <- Ovariable("muchs" , data = jhm[jsm$Question == 2, 2:3])
+
  }
ofsides <- Ovariable("ofsides", data = jhm[jsm$Question == 3, 2:3])
+
)
musides <- Ovariable("musides", data = jhm[jsm$Question == 4, 2:3])
 
  
#objects.store(oftenh, muchh, ofsideh, musideh, oftens, muchs, ofsides, musides)
+
oftensideS <- Ovariable(
#cat("Ovariables oftenh, muchh, ofsideh, musideh, oftens, muchs, ofsides, musides stored.\n")
+
  "oftensideS",
 +
  dependencies = data.frame(Name="jsp"),
 +
  formula = function(...) {
 +
    jsp[jsp$Fish == "Salmon" & jsp$Question == "3" , !colnames(jsp@output) %in% c("Fish", "Question")]
 +
  }
 +
)
  
##################### CORRELATION MATRIX
+
muchsideS <- Ovariable(
 +
  "muchsideS",
 +
  dependencies = data.frame(Name="jsp"),
 +
  formula = function(...) {
 +
    jsp[jsp$Fish == "Salmon" & jsp$Question == "4" , !colnames(jsp@output) %in% c("Fish", "Question")]
 +
  }
 +
)
  
temp <- sapply(survey, as.numeric) # Can be done for surv to get a smaller matrix
+
oftenH <- Ovariable(
 +
  "oftenH",
 +
  dependencies = data.frame(Name="jsp"),
 +
  formula = function(...) {
 +
    jsp[jsp$Fish == "Herring" & jsp$Question == "1" , !colnames(jsp@output) %in% c("Fish", "Question")]
 +
  }
 +
)
  
survey_correlations <- (cor(temp, method="spearman", use="pairwise.complete.obs"))
+
muchH <- Ovariable(
 +
  "muchH",
 +
  dependencies = data.frame(Name="jsp"),
 +
  formula = function(...) {
 +
    jsp[jsp$Fish == "Herring" & jsp$Question == "2" , !colnames(jsp@output) %in% c("Fish", "Question")]
 +
  }
 +
)
  
temp <- colnames(survey_correlations)
+
oftensideH <- Ovariable(
 +
  "oftensideH",
 +
  dependencies = data.frame(Name="jsp"),
 +
  formula = function(...) {
 +
    jsp[jsp$Fish == "Herring" & jsp$Question == "3" , !colnames(jsp@output) %in% c("Fish", "Question")]
 +
  }
 +
)
  
melted_correlations <- melt(survey_correlations)
+
muchsideH <- Ovariable(
 +
  "muchsideH",
 +
  dependencies = data.frame(Name="jsp"),
 +
  formula = function(...) {
 +
    jsp[jsp$Fish == "Herring" & jsp$Question == "4" , !colnames(jsp@output) %in% c("Fish", "Question")]
 +
  }
 +
)
  
melted_correlations$Var1 <- factor(melted_correlations$Var1, levels=temp)
+
amount <- Ovariable(
melted_correlations$Var2 <- factor(melted_correlations$Var2, levels=temp)
+
  "amount",
melted_correlations$value <- ifelse(melted_correlations$value >= 0.99,NA,melted_correlations$value)
+
  dependencies = data.frame(Name = c(
 +
    "oftenS", "muchS",
 +
    "oftensideS", "muchsideS",
 +
    "oftenH", "muchH",
 +
    "oftensideH", "muchsideH"
 +
  ))
 +
  , formula = function(...) {
 +
    oftenS * muchS + oftensideS * muchsideS + oftenH * muchH + oftensideH * muchsideH
 +
  }
 +
)
 +
amount <- EvalOutput(amount)
 +
oprint(head(amount@output))
 +
scatterplotMatrix(jsp[[1]])
 +
scatterplotMatrix(t(js[,,6]))
  
ggplot(melted_correlations, aes(x = Var1, y = Var2, fill = value, label= round(value, 2)))+
+
head(t(js[,,6]))
  geom_raster()+
+
ggplot(melt(js), aes(x=value, colour=Var2))+geom_density()
  theme(axis.text.x=element_text(angle=90,hjust=1,vjust=0.4))+
+
ggplot(as.data.frame(js), aes(x = anss.pred, y = Sampled))+geom_point()+stat_ellipse()
   scale_fill_gradient2(low = "#480610", mid = "#FFFFFF", high = "#06480F", midpoint = 0, space = "Lab", guide = "colourbar")
+
coda.j <- coda.samples(
 +
   jags,
 +
  c('mus', 'Omegas', 'anss.pred', 'mu0'),  
 +
  1000
 +
)
  
############################### PRINCIPAL COORDINATE ANALYSIS (PCoA)
+
plot(coda.j)
 
+
jh <- array(
# This part prepares the data.
+
  samps.j$ansh.pred,
hypocols1 <- c(46:49,95:98)
+
  dim = c(4,1000,4),
answ <- sapply(survey[hypocols1], FUN=as.numeric)
+
  dimnames = list(Question = 1:4, Iter = 1:1000, Seed = 1:4)
answ <- as.matrix(answ[!is.na(rowSums(answ)),])
+
)
 
+
scatterplotMatrix(t(jh[,,1]))
pcoa_caps <- capscale(t(answ) ~ 1, distance="euclidean") ##PCoA done
 
 
 
## Plot of all the hypotheses
 
 
 
traits <- as.factor(c(rep("bipedalism", 10), rep("brain", 10), rep("hairlessness", 8),
 
                      rep("fat", 4), rep("larynx", 3), rep("speech", 7), rep("other", 9)))
 
colstr <- c("palevioletred1","royalblue1","seagreen1","violet","khaki2","skyblue", "orange")
 
trait.cols <- colstr[traits]
 
 
 
hypo_sizes <- (5 - colMeans(answ))
 
leg_sizes <- c(4, 3, 2, 1, 0.01)
 
 
 
#pdf(file="pcoa_plot.pdf", height=6, width=7.5)
 
plot(pcoa_caps, display = c("sp", "wa"), type="n")#, xlim=c(-6,4.5)) ## PCoA biplot, full scale
 
points(pcoa_caps, display= c("sp"), col="gray40") # adding the people points
 
points(pcoa_caps, display= c("wa"), pch=19)#, cex=hypo_sizes, col=trait.cols)
 
text(pcoa_caps, display=c("wa"), srt=25, cex=0.5)
 
#legend(x=-6, y=3.8, levels(traits), fill=colstr, bty="n", cex=1)
 
#legend(-6, -2, legend=c("Very likely", "Moderately likely",
 
#                        "No opinion", "Moderately unlikely", "Very unlikely"),
 
#      pch=21, pt.cex = leg_sizes, bty="n", cex=1)
 
#dev.off()
 
  
 +
#objects.store(oftenh, muchh, ofsideh, musideh, oftens, muchs, ofsides, musides)
 +
#cat("Ovariables oftenh, muchh, ofsideh, musideh, oftens, muchs, ofsides, musides stored.\n")
 
</rcode>
 
</rcode>
  

Revision as of 15:15, 20 April 2017


Question

How are Baltic herring and salmon used as human food in Baltic Sea countries? Which determinants affect people’s eating habits of these fish species?

Answer

Survey data will be analysed during winter 2016-2017 and results will be updated here.


Rationale

A survey of eating habits of Baltic herring and salmon in Denmark, Estonia, Finland and Sweden was done in September 2016 by Taloustutkimus Oy. The content of the questionnaire can be accessed in Google Drive. The actual data will be uploaded to Opasnet Base in October 2016.

The R-code to analyse the survey data will be provided on this page later on.

Data

Original datafile File:Goherr fish consumption.csv

Preprocessing

This code preprocesses the original questionnaire data from the above .csv file and stores it as a usable variable in Opasnet Base. The code stores a data.frame named survey.

  • Model run 13.4.2017 [1]
  • Model run 20.4.2017 [2] (contains surv and helping vectors)
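
The recoding steps of the preprocessing can be illustrated with a minimal sketch. It assumes a toy data.frame with made-up column names and answer levels (not the real survey columns) and follows the same pattern as the page code: coerce factors to integers, merge "I don't know" and missing herring answers with "No", and shift the scales to start from zero.

```r
# Toy stand-in for the survey data.frame; column names and levels are
# illustrative assumptions, not the real questionnaire columns.
toy <- data.frame(
  eat_fish = factor(c("Yes", "No", "Yes", NA), levels = c("No", "Yes")),
  eat_herring = factor(
    c("Yes", "I don't know", NA, "No"),
    levels = c("No", "Yes", "I don't know")
  )
)

# Coerce factors to their integer level codes (1, 2, 3, ...).
coded <- as.data.frame(lapply(toy, as.integer))

# Merge "I don't know" (code 3) and missing herring answers with "No" (code 1).
coded$eat_herring[is.na(coded$eat_herring) | coded$eat_herring == 3] <- 1L

# Shift the codes so that each scale starts from zero.
coded[] <- lapply(coded, function(x) x - 1L)

print(coded)
```

Missing values in other columns are left as NA, so later steps still have to decide which respondents to keep.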


Analyses

Correlation matrix of all questions in the survey (answers converted to numbers).

The model must contain predictors such as country, gender and age. Maybe we should first study which determinants are important. The model must also contain determinants that would increase or decrease fish consumption, conditional on the current consumption; how to do this is still open. A principal coordinates analysis with all questions could show how they behave.

Also look at the correlation table to see clusters.
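
A principal coordinates analysis of the questions could be sketched as follows. The page code uses capscale from the vegan package; this sketch swaps in base R's cmdscale (classical multidimensional scaling) to stay self-contained, and the answers are randomly generated toy data, not the survey.

```r
set.seed(1)
# Toy answers (made up): 30 respondents x 5 questions on a 0-6 ordinal scale.
answers <- matrix(
  sample(0:6, 150, replace = TRUE),
  nrow = 30,
  dimnames = list(NULL, paste0("Q", 1:5))
)

# Distances between questions (columns), hence the transpose.
d <- dist(t(answers), method = "euclidean")

# Classical multidimensional scaling, i.e. principal coordinates analysis.
pcoa <- cmdscale(d, k = 2, eig = TRUE)

# One row of coordinates per question.
print(round(pcoa$points, 2))
plot(pcoa$points, type = "n", xlab = "PCoA 1", ylab = "PCoA 2")
text(pcoa$points, labels = rownames(pcoa$points))
```

Questions that end up close together in the plot were answered in a similar way across respondents.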

Some obvious results:

  • If a respondent reports no fish eating, many subsequent answers are NA.
  • No vitamins correlates negatively with vitamin intake.
  • Unknown salmon correlates negatively with the types of salmon eaten.
  • Different age categories correlate with each other.

However, there are also meaningful negative correlations:

  • Country vs allergy
  • Country vs Norwegian salmon and Rainbow trout
  • Country vs not traditional
  • Country vs recommendation awareness
  • Allergy vs economic wellbeing
  • Baltic salmon use (4 questions) vs Don't like taste and Not used to
  • All questions between Easy to cook ... Traditional dish

Meaningful positive correlations:

  • All questions between Baltic salmon ... Rainbow trout
  • How often Baltic salmon/herring/side salmon/side herring
  • How much Baltic salmon/herring/side salmon/side herring
  • Better availability ... Recommendation
  • All questions between Economic wellbeing...Personal aims
  • Omega3, Vitamin D, and Other vitamins
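
Correlations like those listed above can be computed with a sketch along these lines. It uses Spearman rank correlation with pairwise deletion of missing answers, as the page code does, but on made-up toy data with hypothetical column names.

```r
set.seed(2)
# Toy ordinal answers (made up): q2 tracks q1, q3 is unrelated to both.
q1 <- sample(0:6, 200, replace = TRUE)
toy <- data.frame(
  q1 = q1,
  q2 = pmin(pmax(q1 + sample(-1:1, 200, replace = TRUE), 0), 6),
  q3 = sample(0:6, 200, replace = TRUE)
)
toy$q3[sample(200, 20)] <- NA # some missing answers

# Spearman rank correlation suits ordinal scales; pairwise deletion keeps
# every respondent who answered both questions of a pair.
cors <- cor(toy, method = "spearman", use = "pairwise.complete.obs")
print(round(cors, 2))
```

Pairwise deletion means each cell of the matrix may be based on a different set of respondents, which matters here because non-fish-eaters skip many questions.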

Study plan:

  • Determinants


Bayes model

  • Model run 3.3.2017. All variables assumed independent. [3]
  • Model run 3.3.2017. p has more dimensions. [4]
  • Model run 25.3.2017. Several model versions: strange binomial+multivarnormal, binomial, fractalised multivarnormal [5]
  • Model run 27.3.2017 [6]
  • Other models except multivariate normal were archived and removed from active code 29.3.2017.
  • Model run 29.3.2017 with raw data graphs [7]
  • Model run 29.3.2017 with salmon and herring ovariables stored [8]
  • Model run 13.4.2017 with first version of coordinate matrix and principal coordinate analysis [9]
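
The multivariate normal version can be illustrated with a small sketch of the sampling step. JAGS parameterises the multivariate normal by a precision matrix, so the precision has to be inverted before sampling with mvrnorm from the MASS package. The means and the precision matrix below are hypothetical values chosen for illustration, not model output.

```r
library(MASS) # mvrnorm

set.seed(3)
# Hypothetical posterior means for the four questions of one fish species.
mu <- c(often = 2.0, much = 1.5, oftenside = 1.0, muchside = 0.8)

# Hypothetical precision matrix (JAGS dmnorm takes precision, not covariance).
Omega <- diag(4) * 2
Omega[1, 2] <- Omega[2, 1] <- -0.5

# Invert the precision to get the covariance matrix that mvrnorm expects.
Sigma <- solve(Omega)

# Draw correlated answers for N simulated individuals.
N <- 1000
draws <- mvrnorm(N, mu = mu, Sigma = Sigma)
print(round(colMeans(draws), 2))
print(round(cov(draws), 2))
```

The draws are continuous, so they would still need to be mapped back to the discrete answer scale before being combined with the amount estimates.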


Calculations

This code calculates how much Baltic herring and salmon are eaten (g/day), based on a Bayesian model built from the questionnaire data.


Assumptions

The following assumptions are used:

Assumptions for calculations
Obs | Variable | value | Explanation | Result
1 | freq | 6 | times per year | 260 - 364
2 | freq | 5 | times per year | 104 - 208
3 | freq | 4 | times per year | 52
4 | freq | 3 | times per year | 12 - 36
5 | freq | 2 | times per year | 2 - 5
6 | freq | 1 | times per year | 0.5 - 0.9
7 | freq | 0 | times per year | 0
8 | amdish | 0 | grams / serving | 20 - 50
9 | amdish | 1 | grams / serving | 70 - 100
10 | amdish | 2 | grams / serving | 120 - 150
11 | amdish | 3 | grams / serving | 170 - 200
12 | amdish | 4 | grams / serving | 220 - 250
13 | amdish | 5 | grams / serving | 270 - 300
14 | amdish | 6 | grams / serving | 450 - 500
15 | ingridient | | fraction | 0.1 - 0.3
16 | amside | 0 | grams / serving | 20 - 50
17 | amside | 1 | grams / serving | 70 - 100
18 | amside | 2 | grams / serving | 120 - 150
19 | amside | 3 | grams / serving | 170 - 200
20 | amside | 4 | grams / serving | 220 - 250
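
How the assumptions turn answer codes into a daily intake can be sketched as below. The freq and amdish ranges are copied from the table above. Drawing uniformly within each range, and counting main dishes only, are assumptions of this sketch rather than the documented method, and the helper name draw and the respondent answers are hypothetical.

```r
set.seed(4)
# Ranges copied from the assumptions table: answer code -> (low, high).
freq <- data.frame( # times per year
  code = 0:6,
  low  = c(0, 0.5, 2, 12, 52, 104, 260),
  high = c(0, 0.9, 5, 36, 52, 208, 364)
)
amdish <- data.frame( # grams per serving, main dish
  code = 0:6,
  low  = c(20, 70, 120, 170, 220, 270, 450),
  high = c(50, 100, 150, 200, 250, 300, 500)
)

# Assumption of this sketch: values are uniform within each range.
draw <- function(tab, code) {
  i <- match(code, tab$code)
  runif(length(code), tab$low[i], tab$high[i])
}

# Hypothetical answers of three respondents.
often <- c(4, 2, 0) # frequency codes
much  <- c(2, 3, 1) # main-dish amount codes

# Servings per year times grams per serving, converted to grams per day.
g_per_day <- draw(freq, often) * draw(amdish, much) / 365
print(round(g_per_day, 1))
```

A respondent with frequency code 0 always gets an intake of zero, whatever amount code was given.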

Questionnaire


Dependencies

The survey data will be used as input in the benefit-risk assessment of Baltic herring and salmon intake, which is part of the WP5 work in the Goherr project.

Formula

See also

Keywords

References


Related files


Goherr: Fish consumption study. Opasnet. [10]. Accessed 17 May 2024.