Java Mailing List Archive

http://www.r-help.com/

Home » Home (12/2007) » R Help for Statistical Computing »

Re: [R] Conservative "ANOVA tables" in lmer

Spencer Graves

2006-09-09

Replies:

   Peter's example and Doug's "different test" reply sent me
Scheffé's discussion of the balanced and replicated mixed-effect 2-away
layout. As I note below, the obvious F test for the fixed effect does
not appear to be likelihood ratio for anything.

Douglas Bates wrote:
> On 9/7/06, Douglas Bates <bates@(protected):
>  
>> On 07 Sep 2006 17:20:29 +0200, Peter Dalgaard <p.dalgaard@(protected):
>>  
>>> Martin Maechler <maechler@(protected):
>>>
>>>    
>>>>>>>>> "DB" == Douglas Bates <bates@(protected)>
>>>>>>>>>   on Thu, 7 Sep 2006 07:59:58 -0500 writes:
>>>>>>>>>            
>>>>   DB> Thanks for your summary, Hank.
>>>>   DB> On 9/7/06, Martin Henry H. Stevens <hstevens@(protected):
>>>>   >> Dear lmer-ers,
>>>>   >> My thanks for all of you who are sharing your trials and tribulations
>>>>   >> publicly.
>>>>
>>>>   >> I was hoping to elicit some feedback on my thoughts on denominator
>>>>   >> degrees of freedom for F ratios in mixed models. These thoughts and
>>>>   >> practices result from my reading of previous postings by Doug Bates
>>>>   >> and others.
>>>>
>>>>   >> - I start by assuming that the appropriate denominator degrees lies
>>>>   >> between n - p and and n - q, where n=number of observations, p=number
>>>>   >> of fixed effects (rank of model matrix X), and q=rank of Z:X.
>>>>
>>>>   DB> I agree with this but the opinion is by no means universal. Initially
>>>>   DB> I misread the statement because I usually write the number of columns
>>>>   DB> of Z as q.
>>>>
>>>>   DB> It is not easy to assess rank of Z:X numerically. In many cases one
>>>>   DB> can reason what it should be from the form of the model but a general
>>>>   DB> procedure to assess the rank of a matrix, especially a sparse matrix,
>>>>   DB> is difficult.
>>>>
>>>>   DB> An alternative which can be easily calculated is n - t where t is the
>>>>   DB> trace of the 'hat matrix'. The function 'hatTrace' applied to a
>>>>   DB> fitted lmer model evaluates this trace (conditional on the estimates
>>>>   DB> of the relative variances of the random effects).
>>>>
>>>>   >> - I then conclude that good estimates of P values on the F ratios lie
>>>>   >>  between 1 - pf(F.ratio, numDF, n-p) and 1 - pf(F.ratio, numDF, n-q).
>>>>   >>  -- I further surmise that the latter of these (1 - pf(F.ratio, numDF,
>>>>   >>  n-q)) is the more conservative estimate.
>>>>
>>>> This assumes that the true distribution (under H0) of that "F ratio"
>>>> *is* F_{n1,n2} for some (possibly non-integer) n1 and n2.
>>>> But AFAIU, this is only approximately true at best, and AFAIU,
>>>> the quality of this approximation has only been investigated
>>>> empirically for some situations.
>>>> Hence, even your conservative estimate of the P value could be
>>>> wrong (I mean "wrong on the wrong side" instead of just
>>>> "conservatively wrong"). Consequently, such a P-value is only
>>>> ``approximately conservative'' ...
>>>> I agree howevert that in some situations, it might be a very
>>>> useful "descriptive statistic" about the fitted model.
>>>>      
>>> I'm very wary of ANY attempt at guesswork in these matters.
>>>
>>> I may be understanding the post wrongly, but consider this case: Y_ij
>>> = mu + z_i + eps_ij, i = 1..3, j=1..100
>>>
>>> I get rank(X)=1, rank(X:Z)=3, n=300
>>>
>>> It is well known that the test for mu=0 in this case is obtained by
>>> reducing data to group means, xbar_i, and then do a one-sample t test,
>>> the square of which is F(1, 2), but it seems to be suggested that
>>> F(1, 297) is a conservative test???!
>>>    
>> It's a different test, isn't it? Your test is based upon the between
>> group sum of squares with 2 df. I am proposing to use the within
>> group sum of squares or its generalization.
>>  
>
> On closer examination I see that you are indeed correct. I have heard
> that "well-known" result many times and finally sat down to prove it
> to myself. For a balanced design the standard error of the intercept
> using the REML estimates is the same as the standard error of the mean
> calculated from the group means.
>
>  
>> data(Rail, package = 'nlme')
>> library(lme4)
>> summary(fm1 <- lmer(travel ~ 1 + (1|Rail), Rail))
>>  
> Linear mixed-effects model fit by REML
> Formula: travel ~ 1 + (1 | Rail)
>   Data: Rail
>   AIC  BIC logLik MLdeviance REMLdeviance
> 126.2 128.0 -61.09    128.6     122.2
> Random effects:
> Groups  Name     Variance Std.Dev.
> Rail   (Intercept) 615.286 24.8050
> Residual         16.167  4.0208
> number of obs: 18, groups: Rail, 6
>
> Fixed effects:
>         Estimate Std. Error t value
> (Intercept)   66.50    10.17  6.538
>  
>> mns <- with(Rail, tapply(travel, Rail, mean)) # group means
>> sd(mns)/sqrt(length(mns)) # standard error matches that from lmer
>>  
> [1] 10.17104
>  
>> t.test(mns)
>>  
>
>  One Sample t-test
>
> data: mns
> t = 6.5382, df = 5, p-value = 0.001253
> alternative hypothesis: true mean is not equal to 0
> 95 percent confidence interval:
> 40.35452 92.64548
> sample estimates:
> mean of x
>    66.5
>
>  
>> ctab <- summary(fm1)@(protected)
>> ctab[,1] + c(-1,1) * qt(0.975, 15) * ctab[,2] # 95% conf. int.
>>  
> [1] 44.82139 88.17861
>  
>> ## interval using df = # of obs - rank of [Z:X] is too narrow
>>  
>
> So my proposal of using either the trace of the hat matrix or the rank
> of the combined model matrices as the degrees of freedom for the model
> is not conservative.
>
> However, look at the following
>
>  
>> set.seed(123454321) # for reproducibility
>> sm1 <- mcmcsamp(fm1, 50000)
>> library(coda)
>> HPDinterval(sm1)
>>  
>              lower    upper
> (Intercept)   40.470663 92.608514
> log(sigma^2)   2.060179  3.716326
> log(Rail.(In))  5.371858  8.056897
> deviance     128.567329 137.487455
> attr(,"Probability")
> [1] 0.95
>
> The HPD interval calculated from a MCMC sample reproduce the interval
> from the group means almost exactly. This makes sense in that the
> MCMC sample takes into account the variation in the estimates of the
> variance components, just as defining intervals based on the Student's
> t does.
>
> So for this case where the distribution of the estimate of the mean
> has a known distribution the correct degrees of freedom and the MCMC
> sample produce similar answers.
>
> This gives me more confidence in the results from the MCMC sample in
> general cases.
>
> The problem I have with trying to work out what the degrees of freedom
> "should be" is that the rules seem rather arbitrary. For example, the
> "between-within" rule used in SAS PROC Mixed is popular (many accept
> it as the "correct" answer) but it assumes that the degrees of freedom
> associated with a random effect grouped by a factor with k levels is
> always k - 1. This value is used even when there is a random
> intercept and a random slope for each group. In fact you could have
> an arbitrary number of random effects for each level of the grouping
> factor and it would still apparently only cost you k - 1 degrees of
> freedom. That doesn't make sense to me.
>
> Anyway, I thank you for pointing out the errors of my ways Peter.
>  
   For the traditional, balanced, replicated, 2-way mixed-effects
analysis, Scheffé (1959, Table 8.1.1, p. 269) gives the expected mean
squares for a two-way layout with "I" levels of a fixed effect A, "J"
levels of a random effect B, and "K" replicates, as follows:

EMS(A: fixed) = var(e) + K*var(A:B) + J*K*MeanSquareA
EMS(B: random) = var(e) + I*K*var(B)
EMS(A:B; random)=var(e)+K*var(A:B)
EMSE = var(e).

   In this case, the "obvious" test for A is MS(A: fixed) / MS(A:B,
random), because this gives us a standard F statistic to test
MeanSquareA = 0. However, it doesn't make sense to me to test A without
simultaneously assuming var(A:B) = 0. The same argument applies to
Peter's "simpler" case discussed above: With "Y_ij = mu + z_i +
eps_ij", it only rarely makes sense to test mu=0 while assuming var(z)
!= 0. In the balanced 2-way, mixed-effects analysis, the Neyman-Pearson
thing to do, I would think, would be to test simultaneously MeanSquareA
= 0 with var(A:B) = 0. In lmer, I might write this as follows:

   anova(lmer(y~A+(A|B)), lmer(y~1+(1|B)).

   However, this does NOT match the standard analysis associated with
this design, does it? To check this, I considered problem 8.1 in
Scheffé (p. 289), which compares 3 different nozzles (fixed effect)
tested by 5 different operators (random effect). The data are as follows:

y <- c(6,6,-15,  26,12,5,   11,4,4,  21,14,7, 25,18,25,
    13,6,13,   4,4,11,  17,10,17,  -5,2,-5, 15,8,1,
  10,10,-11, -35,0,-14, 11,-10,-17, 12,-2,-16, -4,10,24)

Nozzle <- data.frame(Nozzle=rep(LETTERS[1:3], e=15),
   Operator=rep(letters[1:5], e=3), flowRate=y)

   The traditional analysis can be obtained from
anova(lm(flowRate~Nozzle*Operator, ...)), but comparing MeanSq.Nozzle to
MeanSq.Nozzle:Operator rather than MeanSquareResidual, as follows:

> fitAB0 <- lm(flowRate~Nozzle*Operator, data=Nozzle)
> (aov.AB0 <- anova(fitAB0))
Analysis of Variance Table

Response: flowRate
          Df Sum Sq Mean Sq F value  Pr(>F)
Nozzle       2 1426.98 713.49 7.0456 0.003101 **
Operator      4 798.80 199.70 1.9720 0.124304
Nozzle:Operator 8 1821.47 227.68 2.2484 0.051640 .
Residuals     30 3038.00 101.27            
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

   Scheffé must have computed the following:

> (F.A.Scheffe <- aov.AB0[1, "Mean Sq"]/aov.AB0[3, "Mean Sq"])
[1] 3.133690
> pf(F.A.Scheffe, 1, 2, lower.tail=FALSE)
[1] 0.2187083

   However, I think I prefer the likelihood ratio answer to this,
because I think it is better to have an approximate solution to the
exact problem than an exact solution to the approximate problem. [I got
this from someone else like Tukey, but I don't have a citation.]) I can
get this likelihood ratio answer from either lme or lmer.

   When I tried to fit this model with 'mle'; it didn't want to
converge:

library(nlme)
fitAB. <- lme(flowRate~Nozzle, random=~Nozzle|Operator,
        data=Nozzle, method="ML")
Error in lme.formula(flowRate ~ Nozzle, random = ~Nozzle | Operator,
data = Nozzle, :
  nlminb problem, convergence error code = 1; message = iteration
limit reached without convergence (9)

   After several false starts, I got the following to work:

fitAB. <- lme(flowRate~Nozzle, random=~Nozzle|Operator,
        data=Nozzle, method="ML",
        control=lmeControl(opt="optim"))

> anova(fitAB., fitB.)
    Model df    AIC    BIC   logLik  Test L.Ratio p-value
fitAB.   1 10 361.9022 379.9688 -170.9511              
fitB.    2 3 361.3637 366.7837 -177.6819 1 vs 2 13.46153 0.0616


   I got essentially the same answer from lmer (without the
convergence problem, but quitting R in between:

> fitAB <- lmer(flowRate~Nozzle+(Nozzle|Operator),
+          data=Nozzle, method="ML")
> fitB <- lmer(flowRate~1+(1|Operator), data=Nozzle,
+         method="ML")
> anova(fitAB, fitB)
Data: Nozzle
Models:
fitB: flowRate ~ 1 + (1 | Operator)
fitAB: flowRate ~ Nozzle + (Nozzle | Operator)
   Df   AIC   BIC logLik Chisq Chi Df Pr(>Chisq)
fitB  2 359.36 362.98 -177.68                
fitAB 9 359.88 376.14 -170.94 13.479    7   0.06126 .
 
   Comments?
   Spencer Graves
p.s. For the lme fit:
> sessionInfo()
Version 2.3.1 Patched (2006-08-13 r38872)
i386-pc-mingw32

attached base packages:
[1] "methods"  "stats"   "graphics" "grDevices" "utils"   "datasets"
[7] "base"  

other attached packages:
  nlme
"3.1-75"
> ______________________________________________
> R-help@(protected)
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

______________________________________________
R-help@(protected)
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
©2008 r-help.com - Jax Systems, LLC, U.S.A.