Dear List,

Could anyone explain to me why the overall log-likelihood of an MclustDA
model is not the sum of the log-likelihoods of the models fitted to the
groups? See the numbers in this simple example:

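library(mclust)
attach(iris)

m <- MclustDA(Sepal.Length, class = Species)

# overall log-likelihood of the MclustDA fit
logLik(m)

# log-likelihoods of the models fitted to each species
m$models$setosa$loglik
m$models$versicolor$loglik
m$models$virginica$loglik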

I realized that the overall log-likelihood is calculated in a rather
tricky way: the likelihood of every observation is evaluated under each
class model (regardless of its a priori class), then a (weighted?)
average of these likelihoods is taken for each observation, and the
overall log-likelihood is the sum of the logarithms of these averages.
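
In symbols, as I understand it (the notation is mine: x_i is the i-th observation, f_k is the density fitted to class k, and pi_k is the class weight, which appears to be 1/3 for each species here):

```latex
\ell_{\text{overall}} = \sum_{i=1}^{n} \log \left( \sum_{k=1}^{K} \pi_k \, f_k(x_i) \right),
\qquad
\ell_{\text{sum of classes}} = \sum_{k=1}^{K} \; \sum_{i \,:\, c_i = k} \log f_k(x_i)
```

Here c_i denotes the a priori class of observation i, so the second expression scores each observation only under its own class model.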

This code illustrates the calculation:

# Equal weights of 1/3: each species has 50 of the 150 observations.
# Note: this assumes each class model has a single Gaussian component.
likelihood <- with(m$models$setosa,
  dnorm(Sepal.Length, mean = parameters$mean, sd = sqrt(parameters$variance$sigmasq))) / 3
likelihood <- likelihood + with(m$models$versicolor,
  dnorm(Sepal.Length, mean = parameters$mean, sd = sqrt(parameters$variance$sigmasq))) / 3
likelihood <- likelihood + with(m$models$virginica,
  dnorm(Sepal.Length, mean = parameters$mean, sd = sqrt(parameters$variance$sigmasq))) / 3

sum(log(likelihood))
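
To make the two quantities directly comparable without relying on mclust internals, here is a minimal base-R sketch. It assumes a single Gaussian per species with maximum-likelihood estimates, which is my simplification for illustration (MclustDA may select more components per class), and all variable names are my own.

```r
# Minimal base-R sketch (assumption: one Gaussian per species, ML estimates).
x   <- iris$Sepal.Length
cls <- iris$Species
lev <- levels(cls)

# ML estimates per class: mean, and variance with divisor n (not n - 1)
mu     <- tapply(x, cls, mean)
sigma2 <- tapply(x, cls, function(v) mean((v - mean(v))^2))
prior  <- as.numeric(table(cls)) / length(x)   # class proportions

# (a) Sum of the per-class log-likelihoods: each observation is scored
#     only under the model of its own class.
ll_classes <- sum(sapply(lev, function(k)
  sum(dnorm(x[cls == k], mu[k], sqrt(sigma2[k]), log = TRUE))))

# (b) Mixture log-likelihood: each observation is scored under every
#     class model, and the densities are averaged with the class weights.
dens <- sapply(lev, function(k) dnorm(x, mu[k], sqrt(sigma2[k])))  # n x 3 matrix
ll_mixture <- sum(log(dens %*% prior))

c(sum_of_classes = ll_classes, mixture = ll_mixture)
```

As far as I can tell, (a) conditions on the known class labels, whereas (b) treats the labels as unknown and evaluates the data under the weighted mixture of the class densities, which is the form of calculation described above.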

Why is this the correct way to calculate it? It would also be helpful if
you could recommend literature that answers my question.

Thanks!

Zoltan

_______________________________________________

R-sig-ecology mailing list

https://stat.ethz.ch/mailman/listinfo/r-sig-ecology