Correction: negative binomial mixed model using glmmadmb

classic Classic list List threaded Threaded
2 messages Options
Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Correction: negative binomial mixed model using glmmadmb

Heather Major
A small correction to my original post: the arrival fixed effect is a number (a count) and ranges between 0 - 52.


************************
Hello, I am new to R and working to understand the programming language and how the different tests work. I've jumped a bit in the deep-end, as I moved to R because SPSS couldn't handle the model I wanted to run and I don't have access of SAS (which was, until recently my go-to for stats).

I have done my best to work through a number of examples, but that hasn't helped me figure out how to proceed with my analysis.

I am using the glmmADMB package to analyze count data of arrivals at a seabird colony.

Data:
Fixed effects:
Arrivals: # of individuals arriving at the colony site in one-hour long intervals
TAS: Time After Sunset (factor with four categories: 3,4,5, &6)
MA: Moon Absence (ratio variable of the proportion of moon absent during the night, ranging from 0 (full moon present) to 1 (no moon present)).
CC: Cloud Cover (ratio variable of proportion of sky covered by clouds, 0 = no clouds 1 = complete overcast sky.
WS: wind speed (ratio variable of wind speed in meters per second)
WH: wave height (ratio variable of wave height in meters)
Random effects:
JDOY: Julian Day of Year (factor: includes 50 days)

Model:
glmm2<-glmmadmb(Arrivals~ (1|JDOY)+TAS+MA+CC+CWS+CWH+TAS*MA+TAS*CC+TAS*CWS+TAS*CWH+TAS*MA*CC+MA*CC, data = murrelet, family="nbinom")

n=188

This model runs fine (i.e., no errors). I have also run the same model as a poisson, it also runs well, but the mean and variance are not equal (hence the negative binomial distribution). I would like to use AIC to draw inference from my data and have seven other candidate models (the one shown above is the global model). To do this, I need to extract an estimate of c-hat for the global model to include in my calculation of QAICc for model selection. This is where I get stuck.

I have tried a number of formulas presented in different pdfs to calculate c-hat, including using the code:
dfun <- function(object) {
  MM<-Anova(object)
  df_residuals<-max(MM$Df)
  res<-residuals(object)
  res[is.na(res)==T]<-0
  return(sum(1 * res^2)/df_residuals)
}

This code does give me a value for c-hat, but it is VERY high (45.23), suggesting something is structurally wrong with my model. I have also tried running the model with 'zeroInflation = TRUE', the c-hat is higher (47.28).

I've tried a number of other methods, but the above code is the only one that has resulted in anything other than an error message or a NULL result.

What I don't understand is what the above code is doing and because of that I am unsure of what my next step should be.

My specific questions are:

1.     Is this an appropriate method to calculate c-hat, and if so, what is the code doing (i.e., how is it calculating c-hat, is it chi-square deviance divided by residual df or something else)?

2.     If this code is correct and doing what I hope it is doing (see previous question), what is my next step? How do I figure out what the structural issue is with my data?

3.     Am I missing something (probably very simple, but very important) that is causing these issues?


Thanks for any help/advice,

Heather

        [[alternative HTML version deleted]]

_______________________________________________
R-sig-ecology mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Correction: negative binomial mixed model using glmmadmb

Ben Bolker-4
Heather Major <heather.major@...> writes:

>
> A small correction to my original post: the arrival
> fixed effect is a number (a count) and ranges between 0 - 52.
>

> Hello, I am new to R and working to understand the programming
> language and how the different tests work.  I've jumped a bit in the
> deep-end, as I moved to R because SPSS couldn't handle the model I
> wanted to run and I don't have access of SAS (which was, until
> recently my go-to for stats).
 
> I have done my best to work through a number of examples, but that
> hasn't helped me figure out how to proceed with my analysis.
 
> I am using the glmmADMB package to analyze count data of arrivals at
>  a seabird colony.
 
> Data:
> Fixed effects:
> Arrivals: # of individuals arriving at the colony site in one-hour
  long intervals
> TAS: Time After Sunset (factor with four categories: 3,4,5, &6)
> MA: Moon Absence (ratio variable of the proportion of moon absent
   during the night, ranging from 0 (full
> moon present) to 1 (no moon present)).
> CC: Cloud Cover (ratio variable of proportion of sky covered by clouds,
   0 = no clouds 1 = complete overcast sky.
> WS: wind speed (ratio variable of wind speed in meters per second)
> WH: wave height (ratio variable of wave height in meters)
> Random effects:
> JDOY: Julian Day of Year (factor: includes 50 days)
>
> Model:
> glmm2<-glmmadmb(Arrivals~
 (1|JDOY)+TAS+MA+CC+CWS+CWH+TAS*MA+TAS*CC+TAS*CWS+TAS*CWH+TAS*MA*CC+MA*CC,
    data = murrelet, family="nbinom")

 You don't need all those *: A*B is equivalent to A+B+A:B (in R
: means 'interaction' (* in SAS), * means 'main effects plus all
interactions'; I _think_

 (1|JDOY)+TAS*MA*CC+TAS*(CWS+CWH)

is equivalent.

>
> n=188
>

You might be pushing these data too hard; what is the number of parameters
(length(fixef(fitted_model)) or
ncol(model.matrix(~TAS*MA*CC+TAS*(CWS+CWH),data=your_data) ...)
You need 10-20 data points per parameter ...

> This model runs fine (i.e., no errors). I have also run the same
> model as a poisson, it also runs well, but the mean and variance are
> not equal (hence the negative binomial distribution). I would like
> to use AIC to draw inference from my data and have seven other
> candidate models (the one shown above is the global model). To do
> this, I need to extract an estimate of c-hat for the global model to
> include in my calculation of QAICc for model selection. This is
> where I get stuck.

  You don't need the Q part of QAICc; quasi-AIC(c)s are only
needed to correct for overdispersion when you're using a response
distribution (e.g. Poisson) that fixes the dispersion.

For future reference, I think that in general *something* like

sum(residuals(model)^2))/(nrow(data)-length(fixef(model))-
   (number of variance parameters)

should give you c-hat ...

_______________________________________________
R-sig-ecology mailing list
[hidden email]
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
Loading...