Hi there,
I am using GLMs. Could someone please explain the difference between (a) using a Gaussian family with a log link function and (b) log-transforming the response variable and fitting a normal model (Gaussian family with an identity link function)? The outputs differ, and clearly one option or the other will give the better fit depending on the dataset (everything else being equal), but I want to understand why this is so.

Thanks in advance,

Tomás Easdale
Landcare Research, NZ

R-sig-ecology mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
In a), we have

log(mu_a) = t(X) %*% beta, Y_i ~ N(mu_a, sigma^2)

i.e. we are modelling mu_a in terms of explanatory variables X and parameters beta, and the link function operates on mu_a. mu_a is estimated by mean(Y_i).

In b), we have

mu_b = t(X) %*% beta, log(Y_i) ~ N(mu_b, sigma^2)

Now, mean(Y_i) estimates mu_a, and mean(log(Y_i)) estimates mu_b, but clearly log(mu_a) != mu_b, because mean(log(x)) != log(mean(x)). So they are different models entirely: a) models the log of the mean of Y, while b) models the mean of log(Y).

Comparing these models is slightly tricky, because taking log(Y_i) means that you need to use the change-of-variable formula to make the likelihood in b) comparable to the likelihood in a). You can't just compare the AICs or the deviances, for example.

Hope this helps,

Simon.

On Thu, 2008-04-24 at 13:38 +1200, Tomas Easdale wrote:
> Hi there,
>
> I am using glms. Could someone please explain what's the difference
> between (a) using a gaussian family distribution with a LOG link
> function and (b) LOG transforming the response variable with a normal
> distribution (Gaussian family distribution with identity link function).
> The outputs differ and clearly one option or the other will result in
> better fits depending on the dataset (everything else equal) but I want
> to understand why is this so.

Simon Blomberg, BSc (Hons), PhD, MAppStat.
Lecturer and Consultant Statistician
Faculty of Biological and Chemical Sciences
The University of Queensland
St. Lucia Queensland 4072 Australia
Room 320 Goddard Building (8)
T: +61 7 3365 2506
http://www.uq.edu.au/~uqsblomb
email: S.Blomberg1_at_uq.edu.au

Policies:
1. I will NOT analyse your data for you.
2. Your deadline is your problem.
The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. - John Tukey.
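As a rough numerical illustration of the two points in the reply above, the intercept-only case can be worked out by hand, because the ML estimate of mu_a is then simply mean(Y) and the ML estimate of mu_b is mean(log(Y)). The sketch below is in Python using only the standard library. The data in y are made up, and the helper gaussian_loglik is a hypothetical name introduced here for illustration. It shows that the two fitted intercepts differ, and that the log-likelihood from b) has to be shifted by the log-Jacobian term, -sum(log(y)), before it is on the same scale as the one from a).

```python
import math

# Hypothetical positive responses, made up purely for illustration.
y = [1.2, 0.8, 2.5, 3.1, 0.9, 1.7, 4.2, 2.0]
n = len(y)
log_y = [math.log(v) for v in y]


def gaussian_loglik(values, mean):
    """Gaussian log-likelihood at `mean`, with sigma^2 set to its ML estimate."""
    m = len(values)
    sigma2 = sum((v - mean) ** 2 for v in values) / m
    return -0.5 * m * (math.log(2 * math.pi * sigma2) + 1)


# (a) Gaussian family, log link, intercept only:
#     log(mu_a) = beta_a,  Y ~ N(mu_a, sigma^2)
mu_a = sum(y) / n          # ML estimate of mu_a is mean(Y)
beta_a = math.log(mu_a)    # intercept on the link (log) scale

# (b) Log-transformed response, identity link, intercept only:
#     mu_b = beta_b,  log(Y) ~ N(mu_b, sigma^2)
mu_b = sum(log_y) / n      # ML estimate of mu_b is mean(log(Y))
beta_b = mu_b

# log(mean(y)) and mean(log(y)) are different quantities.
print("intercept (a):", beta_a)
print("intercept (b):", beta_b)

# Log-likelihoods. For (b), the density of Y equals the density of log(Y)
# times |d log(y) / dy| = 1 / y, so the log-likelihood on the scale of Y
# is the log-scale value minus sum(log(y)).
ll_a = gaussian_loglik(y, mu_a)
ll_b_log_scale = gaussian_loglik(log_y, mu_b)
ll_b_y_scale = ll_b_log_scale - sum(log_y)

print("logLik (a), scale of Y:          ", ll_a)
print("logLik (b), scale of log(Y):     ", ll_b_log_scale)
print("logLik (b), adjusted to scale of Y:", ll_b_y_scale)
```

Only ll_a and ll_b_y_scale refer to the same data, so only those two should be compared. Both models here estimate the same number of parameters, so the equivalent adjustment to an AIC from b) would be to add 2 * sum(log(y)).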