Hello again. This is a query which follows up on the very useful answer
provided by Drew Tyre who suggested creating new variables and adding the random term you'll see in the model below. Quick reminder: the analysis aims at understanding how the presence of "bodyguard" ants regulates leaf damage caused by lepidopteran herbivores on a focal plant species. For this, I set up fourteen pairs of plants in a nature reserve and then assigned each plant in each pair to one of two treatments, either "with ants" or "without ants", by applying a physical barrier at the base of the plant. In the following nine weeks I measured four response variables once a week on a subsample of ten randomly chosen leaves of each plant. I thus have nine repeated measures on each plant. >From analyses done over the weekend I know excluding ants from the plants (variable "treat" in the dataset) has a positive effect on the number of butterfly eggs and larvae found on plants. Now I want to know if this results in a reduced damage to the leafs (variable "dam" in the data set), which is measured as the percentage of the area eaten by herbivores. My response variables are: ants: number of ants (this was measured just to check the physical barrier had worked OK) eggs: number of butterfly eggs larve: number of butterfly larvae dam: percent leaf damage (percentage eaten by larvae) My explanatory variables are: treat: treatment (two levels: "con" and "sin" mean with and without ants, respectively) pair: plant pair date #I provide a workable example below require(lme4); require(tidyverse); require(lubridate) #read data from Google drive id <- "0Bzd8I1jr8z_iU0h6R0hxaElseDA" # google file ID gaston <- read.table(sprintf(" <a href="https://docs.google.com/uc?id=%s&export=download">https://docs.google.com/uc?id=%s&export=download", id), head=T) #create variables following suggestion by Drew Tyre gaston <- mutate(gaston, plant = paste0(pair,treat), date = dmy(date), day = as.numeric((date - min(date)))) #After a few hours of scavenging through books, articles, blogs and R-lists I ended up with the following model which turns the % damage into a proportion and then logit transforms it so I can use lmer. M1 <- lmer(logit(dam/100) ~ treat + day + (1|pair/plant), data=gaston) summary(M1) #Am I headed in the right direction? Any advice would be greatly appreciated. Best, Mariano *Dr. Mariano Devoto* Profesor Adjunto - Cátedra de Botánica General, Facultad de Agronomía de la UBA Investigador Adjunto del CONICET Av. San Martín 4453 - C1417DSE - C. A. de Buenos Aires - Argentina +5411 4524-8069 http://www.agro.uba.ar/users/mdevoto/ [[alternative HTML version deleted]] _______________________________________________ R-sig-ecology mailing list [hidden email] https://stat.ethz.ch/mailman/listinfo/r-sig-ecology |
Dear Mariano,
The logit transformation will fail in case the damage is 0% or 100%. The correct distribution for ratios between 0 and 1 is a beta distribution. When 0 or 1 is present you'll need a zero and/or one-inflated beta distribution. This is currently non available in lme4. It looks like you measure the damage in steps of 5%. So you use the binomial distribution as a workaround. That is: assume that each leave has 20 'trials' (20 * 5% = 100%), each 5% damage is one successful trial. glmer(cbind(dam / 5, 20 - dam / 5) ~ treat + day + (1|pair/plant), data=gaston, family = binomial) Best regards, ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance Kliniekstraat 25 1070 Anderlecht Belgium To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey 2017-03-06 21:17 GMT+01:00 Mariano Devoto <[hidden email]>: > Hello again. This is a query which follows up on the very useful answer > provided by Drew Tyre who suggested creating new variables and adding the > random term you'll see in the model below. > Quick reminder: the analysis aims at understanding how the presence of > "bodyguard" ants regulates leaf damage caused by lepidopteran herbivores on > a focal plant species. > For this, I set up fourteen pairs of plants in a nature reserve and then > assigned each plant in each pair to one of two treatments, either "with > ants" or "without ants", by applying a physical barrier at the base of the > plant. > In the following nine weeks I measured four response variables once a week > on a subsample of ten randomly chosen leaves of each plant. I thus have > nine repeated measures on each plant. > > >From analyses done over the weekend I know excluding ants from the plants > (variable "treat" in the dataset) has a positive effect on the number of > butterfly eggs and larvae found on plants. > Now I want to know if this results in a reduced damage to the leafs > (variable "dam" in the data set), which is measured as the percentage of > the area eaten by herbivores. > > My response variables are: > ants: number of ants (this was measured just to check the physical barrier > had worked OK) > eggs: number of butterfly eggs > larve: number of butterfly larvae > dam: percent leaf damage (percentage eaten by larvae) > > My explanatory variables are: > treat: treatment (two levels: "con" and "sin" mean with and without ants, > respectively) > pair: plant pair > date > > #I provide a workable example below > require(lme4); require(tidyverse); require(lubridate) > > #read data from Google drive > id <- "0Bzd8I1jr8z_iU0h6R0hxaElseDA" # google file ID > gaston <- read.table(sprintf(" > <a href="https://docs.google.com/uc?id=%s&export=download">https://docs.google.com/uc?id=%s&export=download", id), head=T) > > #create variables following suggestion by Drew Tyre > gaston <- mutate(gaston, > plant = paste0(pair,treat), > date = dmy(date), > day = as.numeric((date - min(date)))) > > #After a few hours of scavenging through books, articles, blogs and R-lists > I ended up with the following model which turns the % damage into a > proportion and then logit transforms it so I can use lmer. > > M1 <- lmer(logit(dam/100) ~ treat + day + (1|pair/plant), data=gaston) > summary(M1) > > #Am I headed in the right direction? Any advice would be greatly > appreciated. > > Best, > > Mariano > > *Dr. Mariano Devoto* > > Profesor Adjunto - Cátedra de Botánica General, Facultad de Agronomía de la > UBA > Investigador Adjunto del CONICET > > Av. San Martín 4453 - C1417DSE - C. A. de Buenos Aires - Argentina > +5411 4524-8069 > http://www.agro.uba.ar/users/mdevoto/ > > [[alternative HTML version deleted]] > > _______________________________________________ > R-sig-ecology mailing list > [hidden email] > https://stat.ethz.ch/mailman/listinfo/r-sig-ecology [[alternative HTML version deleted]] _______________________________________________ R-sig-ecology mailing list [hidden email] https://stat.ethz.ch/mailman/listinfo/r-sig-ecology |
Hi Mariano,
You could use a log ratio between undamaged and damage proportions [1] and obtain a response variable not constrained between 0 and 1. If so you will have to replace zeros by a detection limit. [2] used 65% of the detection limit as a basis for replacement, but there are many options available for imputing zeros [3]. To make things simpler here, I replace zeros by 1% and 100% by 99% (no 100% damaged though). Then I compute a log ratio between undamaged and damaged. A positive log ratio means the damage is less than 50%, and conversely with 0 meaning 50% damaged. dam_imp = gaston$dam dam_imp[gaston$dam == 0] = 1 dam_imp[gaston$dam == 100] = 99 gaston$dam_lr = log((100-dam_imp) / dam_imp) hist(gaston$dam) hist(gaston$dam_lr) There are many 0% and many 95%. M1 <- lmer(dam_lr ~ treat + day + (1|pair/plant), data=gaston) summary(M1) hist(residuals(M1)) The distribution of residuals seems not so bad. [1] http://www.sediment.uni-goettingen.de/staff/tolosana/extra/CoDaNutshell.pdf [2] http://dx.doi.org/10.1016/j.gexplo.2013.09.003 [3] https://cran.r-project.org/web/packages/zCompositions/index.html -- Serge-Étienne Parent, ing., Ph.D. Professeur adjoint Département des sols et de génie agroalimentaire Faculté des sciences de l'agriculture et de l'alimentation Pavillon Paul-Comtois 2425, rue de l'Agriculture Local 2301 Québec (Québec) G1V 0A6 Canada +1 418-656-2131 poste 3798 Le Tue 7 Mar 2017 à 4:56, Thierry Onkelinx <[hidden email]> a écrit : > Dear Mariano, > > The logit transformation will fail in case the damage is 0% or 100%. > The > correct distribution for ratios between 0 and 1 is a beta > distribution. > When 0 or 1 is present you'll need a zero and/or one-inflated beta > distribution. This is currently non available in lme4. > It looks like you measure the damage in steps of 5%. So you use the > binomial distribution as a workaround. That is: assume that each > leave has > 20 'trials' (20 * 5% = 100%), each 5% damage is one successful trial. > > glmer(cbind(dam / 5, 20 - dam / 5) ~ treat + day + (1|pair/plant), > data=gaston, family = binomial) > > Best regards, > > ir. Thierry Onkelinx > Instituut voor natuur- en bosonderzoek / Research Institute for > Nature and > Forest > team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance > Kliniekstraat 25 > 1070 Anderlecht > Belgium > > To call in the statistician after the experiment is done may be no > more > than asking him to perform a post-mortem examination: he may be able > to say > what the experiment died of. ~ Sir Ronald Aylmer Fisher > The plural of anecdote is not data. ~ Roger Brinner > The combination of some data and an aching desire for an answer does > not > ensure that a reasonable answer can be extracted from a given body of > data. > ~ John Tukey > > 2017-03-06 21:17 GMT+01:00 Mariano Devoto <[hidden email]>: > >> Hello again. This is a query which follows up on the very useful >> answer >> provided by Drew Tyre who suggested creating new variables and >> adding the >> random term you'll see in the model below. >> Quick reminder: the analysis aims at understanding how the presence >> of >> "bodyguard" ants regulates leaf damage caused by lepidopteran >> herbivores on >> a focal plant species. >> For this, I set up fourteen pairs of plants in a nature reserve and >> then >> assigned each plant in each pair to one of two treatments, either >> "with >> ants" or "without ants", by applying a physical barrier at the base >> of the >> plant. >> In the following nine weeks I measured four response variables once >> a week >> on a subsample of ten randomly chosen leaves of each plant. I thus >> have >> nine repeated measures on each plant. >> >> >From analyses done over the weekend I know excluding ants from the >> plants >> (variable "treat" in the dataset) has a positive effect on the >> number of >> butterfly eggs and larvae found on plants. >> Now I want to know if this results in a reduced damage to the leafs >> (variable "dam" in the data set), which is measured as the >> percentage of >> the area eaten by herbivores. >> >> My response variables are: >> ants: number of ants (this was measured just to check the physical >> barrier >> had worked OK) >> eggs: number of butterfly eggs >> larve: number of butterfly larvae >> dam: percent leaf damage (percentage eaten by larvae) >> >> My explanatory variables are: >> treat: treatment (two levels: "con" and "sin" mean with and without >> ants, >> respectively) >> pair: plant pair >> date >> >> #I provide a workable example below >> require(lme4); require(tidyverse); require(lubridate) >> >> #read data from Google drive >> id <- "0Bzd8I1jr8z_iU0h6R0hxaElseDA" # google file ID >> gaston <- read.table(sprintf(" >> <a href="https://docs.google.com/uc?id=%s&export=download">https://docs.google.com/uc?id=%s&export=download", id), head=T) >> >> #create variables following suggestion by Drew Tyre >> gaston <- mutate(gaston, >> plant = paste0(pair,treat), >> date = dmy(date), >> day = as.numeric((date - min(date)))) >> >> #After a few hours of scavenging through books, articles, blogs and >> R-lists >> I ended up with the following model which turns the % damage into a >> proportion and then logit transforms it so I can use lmer. >> >> M1 <- lmer(logit(dam/100) ~ treat + day + (1|pair/plant), >> data=gaston) >> summary(M1) >> >> #Am I headed in the right direction? Any advice would be greatly >> appreciated. >> >> Best, >> >> Mariano >> >> *Dr. Mariano Devoto* >> >> Profesor Adjunto - Cátedra de Botánica General, Facultad de >> Agronomía de la >> UBA >> Investigador Adjunto del CONICET >> >> Av. San Martín 4453 - C1417DSE - C. A. de Buenos Aires - Argentina >> +5411 4524-8069 >> http://www.agro.uba.ar/users/mdevoto/ >> >> [[alternative HTML version deleted]] >> >> _______________________________________________ >> R-sig-ecology mailing list >> [hidden email] >> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology > > [[alternative HTML version deleted]] > > _______________________________________________ > R-sig-ecology mailing list > [hidden email] > https://stat.ethz.ch/mailman/listinfo/r-sig-ecology [[alternative HTML version deleted]] _______________________________________________ R-sig-ecology mailing list [hidden email] https://stat.ethz.ch/mailman/listinfo/r-sig-ecology |
Thank you, Thierry and Serge-Étienne, for your kind replies. They are two
clever ways of dealing with the analysis. I tried both alternatives in a model that includes an interaction term day*treatment and the solution suggested by Thierry pics up a significant interaction, whereas Serge-Étienne's doesn't. An interaction makes sense considering that damage by herbivores other than larvae seemed to increase towards the end of the experiment, probably blurring differences between treatments. Thanks again for your help! Best, Mariano *Dr. Mariano Devoto* Profesor Adjunto - Cátedra de Botánica General, Facultad de Agronomía de la UBA Investigador Adjunto del CONICET Av. San Martín 4453 - C1417DSE - C. A. de Buenos Aires - Argentina +5411 4524-8069 http://www.agro.uba.ar/users/mdevoto/ On 7 March 2017 at 12:14, Serge-Étienne Parent <[hidden email]> wrote: > Hi Mariano, > > You could use a log ratio between undamaged and damage proportions [1] and > obtain a response variable not constrained between 0 and 1. If so you will > have to replace zeros by a detection limit. [2] used 65% of the detection > limit as a basis for replacement, but there are many options available for > imputing zeros [3]. To make things simpler here, I replace zeros by 1% and > 100% by 99% (no 100% damaged though). Then I compute a log ratio between > undamaged and damaged. A positive log ratio means the damage is less than > 50%, and conversely with 0 meaning 50% damaged. > > dam_imp = gaston$dam > dam_imp[gaston$dam == 0] = 1 > dam_imp[gaston$dam == 100] = 99 > gaston$dam_lr = log((100-dam_imp) / dam_imp) > hist(gaston$dam) > hist(gaston$dam_lr) > > There are many 0% and many 95%. > > M1 <- lmer(dam_lr ~ treat + day + (1|pair/plant), data=gaston) > summary(M1) > hist(residuals(M1)) > > The distribution of residuals seems not so bad. > > [1] http://www.sediment.uni-goettingen.de/staff/tolosana/ > extra/CoDaNutshell.pdf > [2] http://dx.doi.org/10.1016/j.gexplo.2013.09.003 > [3] https://cran.r-project.org/web/packages/zCompositions/index.html > > -- > Serge-Étienne Parent, ing., Ph.D. > Professeur adjoint > Département des sols et de génie agroalimentaire > Faculté des sciences de l'agriculture et de l'alimentation > Pavillon Paul-Comtois > 2425, rue de l'Agriculture > Local 2301 > Québec (Québec) G1V 0A6 > Canada > +1 418-656-2131 poste 3798 > > > > Le Tue 7 Mar 2017 à 4:56, Thierry Onkelinx <[hidden email]> a > écrit : > > Dear Mariano, The logit transformation will fail in case the damage is 0% > or 100%. The correct distribution for ratios between 0 and 1 is a beta > distribution. When 0 or 1 is present you'll need a zero and/or one-inflated > beta distribution. This is currently non available in lme4. It looks like > you measure the damage in steps of 5%. So you use the binomial distribution > as a workaround. That is: assume that each leave has 20 'trials' (20 * 5% = > 100%), each 5% damage is one successful trial. glmer(cbind(dam / 5, 20 - > dam / 5) ~ treat + day + (1|pair/plant), data=gaston, family = binomial) > Best regards, ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / > Research Institute for Nature and Forest team Biometrie & Kwaliteitszorg / > team Biometrics & Quality Assurance Kliniekstraat 25 1070 Anderlecht > Belgium To call in the statistician after the experiment is done may be no > more than asking him to perform a post-mortem examination: he may be able > to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural > of anecdote is not data. ~ Roger Brinner The combination of some data and > an aching desire for an answer does not ensure that a reasonable answer can > be extracted from a given body of data. ~ John Tukey 2017-03-06 21:17 > GMT+01:00 Mariano Devoto <[hidden email]>: > > Hello again. This is a query which follows up on the very useful answer > provided by Drew Tyre who suggested creating new variables and adding the > random term you'll see in the model below. Quick reminder: the analysis > aims at understanding how the presence of "bodyguard" ants regulates leaf > damage caused by lepidopteran herbivores on a focal plant species. For > this, I set up fourteen pairs of plants in a nature reserve and then > assigned each plant in each pair to one of two treatments, either "with > ants" or "without ants", by applying a physical barrier at the base of the > plant. In the following nine weeks I measured four response variables once > a week on a subsample of ten randomly chosen leaves of each plant. I thus > have nine repeated measures on each plant. >From analyses done over the > weekend I know excluding ants from the plants (variable "treat" in the > dataset) has a positive effect on the number of butterfly eggs and larvae > found on plants. Now I want to know if this results in a reduced damage to > the leafs (variable "dam" in the data set), which is measured as the > percentage of the area eaten by herbivores. My response variables are: > ants: number of ants (this was measured just to check the physical barrier > had worked OK) eggs: number of butterfly eggs larve: number of butterfly > larvae dam: percent leaf damage (percentage eaten by larvae) My explanatory > variables are: treat: treatment (two levels: "con" and "sin" mean with and > without ants, respectively) pair: plant pair date #I provide a workable > example below require(lme4); require(tidyverse); require(lubridate) #read > data from Google drive id <- "0Bzd8I1jr8z_iU0h6R0hxaElseDA" # google file > ID gaston <- read.table(sprintf(" https://docs.google.com/uc?id= > %s&export=download", id), head=T) #create variables following suggestion > by Drew Tyre gaston <- mutate(gaston, plant = paste0(pair,treat), date = > dmy(date), day = as.numeric((date - min(date)))) #After a few hours of > scavenging through books, articles, blogs and R-lists I ended up with the > following model which turns the % damage into a proportion and then logit > transforms it so I can use lmer. M1 <- lmer(logit(dam/100) ~ treat + day + > (1|pair/plant), data=gaston) summary(M1) #Am I headed in the right > direction? Any advice would be greatly appreciated. Best, Mariano *Dr. > Mariano Devoto* Profesor Adjunto - Cátedra de Botánica General, Facultad de > Agronomía de la UBA Investigador Adjunto del CONICET Av. San Martín 4453 - > C1417DSE - C. A. de Buenos Aires - Argentina +5411 4524-8069 > http://www.agro.uba.ar/users/mdevoto/ [[alternative HTML version > deleted]] _______________________________________________ R-sig-ecology > mailing list [hidden email] https://stat.ethz.ch/mailman/ > listinfo/r-sig-ecology > > [[alternative HTML version deleted]] _______________________________________________ > R-sig-ecology mailing list [hidden email] > https://stat.ethz.ch/mailman/listinfo/r-sig-ecology > > > > [[alternative HTML version deleted]] _______________________________________________ R-sig-ecology mailing list [hidden email] https://stat.ethz.ch/mailman/listinfo/r-sig-ecology |
Free forum by Nabble | Edit this page |