In a), we have

log(mu_a) = t(X) %*% beta, Y_i ~ N(mu_a, sigma^2)

i.e. we are modelling mu_a in terms of explanatory variables X and
parameters beta, and the link function operates on mu_a. mu_a is the
mean of Y_i, so it is estimated by mean(Y_i).

In b), we have

mu_b = t(X) %*% beta, log(Y_i) ~ N(mu_b, sigma^2)

Now, mean(Y_i) estimates mu_a, and mean(log(Y_i)) estimates mu_b, but
clearly log(mu_a) != mu_b because log(mean(x)) != mean(log(x)).
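To make the point concrete, here is a minimal sketch in R. The data are simulated and purely hypothetical (the seed, sample size, coefficients and noise level are all my assumptions), so treat it as an illustration rather than a recipe. It checks that the mean of the logs differs from the log of the mean, and fits both models so the coefficients can be put side by side.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
n <- 200
x <- runif(n, 0, 2)              # hypothetical explanatory variable
# Hypothetical positive response with multiplicative (lognormal) noise
y <- exp(0.5 + 0.8 * x + rnorm(n, sd = 0.5))

# The mean of the logs is not the log of the mean
mean_of_logs <- mean(log(y))
log_of_mean  <- log(mean(y))

# (a) Gaussian family with log link: the linear predictor is log(E[Y])
fit_a <- glm(y ~ x, family = gaussian(link = "log"))

# (b) Gaussian family with identity link on log(Y): the linear predictor is E[log(Y)]
fit_b <- glm(log(y) ~ x, family = gaussian(link = "identity"))

print(c(mean_of_logs = mean_of_logs, log_of_mean = log_of_mean))
print(rbind(a = coef(fit_a), b = coef(fit_b)))
```

Because log is concave, mean(log(y)) can never exceed log(mean(y)), and it is strictly smaller whenever the y values are not all equal.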

So they are different models entirely. Comparing these models is
slightly tricky, because taking log(Y_i) means that you need to use the
change of variable formula to make the likelihood in b) comparable to
the likelihood of a). You can't just compare AICs or the deviances, for
example.
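As a rough sketch of what the change of variable looks like in practice: if Z = log(Y), then the density of Y is f_Y(y) = f_Z(log(y)) / y, so the log-likelihood of b) on the scale of Y is the log-likelihood of log(Y) minus sum(log(y_i)). The R code below applies that correction to simulated, hypothetical data (the seed, coefficients and variable names are my assumptions, not anything from the original question).

```r
set.seed(2)                      # arbitrary seed, for reproducibility only
n <- 200
x <- runif(n, 0, 2)              # hypothetical explanatory variable
y <- exp(0.5 + 0.8 * x + rnorm(n, sd = 0.5))  # hypothetical positive response

fit_a <- glm(y ~ x, family = gaussian(link = "log"))  # model a)
fit_b <- lm(log(y) ~ x)                                # model b)

# logLik(fit_b) is the log-likelihood of log(y), not of y.
# Change of variable: subtract sum(log(y)) to move it to the y scale.
loglik_b_yscale <- as.numeric(logLik(fit_b)) - sum(log(y))

# Number of estimated parameters in b), including sigma
k_b <- attr(logLik(fit_b), "df")
aic_b_yscale <- -2 * loglik_b_yscale + 2 * k_b

aic_a <- AIC(fit_a)
print(c(aic_a = aic_a,
        aic_b_naive = AIC(fit_b),      # not comparable with aic_a
        aic_b_yscale = aic_b_yscale))  # comparable with aic_a
```

Only aic_a and aic_b_yscale refer to the same response, so those are the two numbers to compare; the naive AIC of b) differs from the corrected one by exactly 2 * sum(log(y)).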

Hope this helps,

Simon.

On Thu, 2008-04-24 at 13:38 +1200, Tomas Easdale wrote:

> Hi there,
>
> I am using GLMs. Could someone please explain what the difference is
> between (a) using a Gaussian family distribution with a LOG link
> function and (b) LOG transforming the response variable with a normal
> distribution (Gaussian family distribution with identity link function).
> The outputs differ and clearly one option or the other will result in
> better fits depending on the dataset (everything else equal) but I want
> to understand why this is so.
>
> Thanks in advance,
>
> Tomas Easdale
> Landcare Research, NZ

> _______________________________________________
> R-sig-ecology mailing list
> [hidden email]
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

--

Simon Blomberg, BSc (Hons), PhD, MAppStat.
Lecturer and Consultant Statistician
Faculty of Biological and Chemical Sciences
The University of Queensland
St. Lucia Queensland 4072
Australia
Room 320 Goddard Building (8)
T: +61 7 3365 2506
http://www.uq.edu.au/~uqsblomb
email: S.Blomberg1_at_uq.edu.au

Policies:
1. I will NOT analyse your data for you.
2. Your deadline is your problem.

The combination of some data and an aching desire for
an answer does not ensure that a reasonable answer can
be extracted from a given body of data. - John Tukey.
