Re: [R] Conservative "ANOVA tables" in lmer

Douglas Bates

2006-09-07

On 9/7/06, Martin Maechler <maechler@(protected)> wrote:
> >>>>> "DB" == Douglas Bates <bates@(protected)>
> >>>>>   on Thu, 7 Sep 2006 07:59:58 -0500 writes:
>
>   DB> Thanks for your summary, Hank.
>   DB> On 9/7/06, Martin Henry H. Stevens <hstevens@(protected)> wrote:
>   >> Dear lmer-ers,
>   >> My thanks to all of you who are sharing your trials and tribulations
>   >> publicly.
>
>   >> I was hoping to elicit some feedback on my thoughts on denominator
>   >> degrees of freedom for F ratios in mixed models. These thoughts and
>   >> practices result from my reading of previous postings by Doug Bates
>   >> and others.
>
>   >> - I start by assuming that the appropriate denominator degrees of
>   >> freedom lie between n - p and n - q, where n=number of observations,
>   >> p=number of fixed effects (rank of model matrix X), and q=rank of Z:X.
>
>   DB> I agree with this but the opinion is by no means universal. Initially
>   DB> I misread the statement because I usually write the number of columns
>   DB> of Z as q.
>
>   DB> It is not easy to assess rank of Z:X numerically. In many cases one
>   DB> can reason what it should be from the form of the model but a general
>   DB> procedure to assess the rank of a matrix, especially a sparse matrix,
>   DB> is difficult.
>
>   DB> An alternative which can be easily calculated is n - t where t is the
>   DB> trace of the 'hat matrix'. The function 'hatTrace' applied to a
>   DB> fitted lmer model evaluates this trace (conditional on the estimates
>   DB> of the relative variances of the random effects).
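
A minimal sketch of the quantities discussed above, computed by hand in base R rather than through 'hatTrace'. The toy design, the relative standard deviation 'theta', and all variable names are assumptions made for illustration; a real analysis would take the variance ratio from the fitted model. For a single random intercept, the fitted values come from a penalized least-squares problem, so the hat matrix can be written down directly and its trace compared with p and with rank(Z:X).

```r
# Toy design (all values made up): 12 observations in 4 groups of 3.
n <- 12
g <- gl(4, 3)                       # grouping factor with 4 levels
x <- rep(c(-1, 0, 1), times = 4)    # within-group covariate
X <- cbind(1, x)                    # fixed-effects model matrix
Z <- model.matrix(~ 0 + g)          # random-intercept indicator matrix

# ASSUMED relative standard deviation sd(random effect) / sd(residual).
# In practice this would be an estimate; the trace is conditional on it.
theta <- 1.5

# Penalized least squares: penalty 1/theta^2 on the Z columns only.
A <- cbind(X, Z)
P <- diag(c(rep(0, ncol(X)), rep(1 / theta^2, ncol(Z))))
H <- A %*% solve(crossprod(A) + P, t(A))   # hat matrix given theta

t_hat <- sum(diag(H))   # trace of the hat matrix
p     <- qr(X)$rank     # rank of X
rZX   <- qr(A)$rank     # numerical rank of [Z:X] (dense QR, default tolerance)

c(p = p, trace = t_hat, rank_ZX = rZX)
```

In this small balanced case the trace falls strictly between p and rank(Z:X), so n - t lies between n - rank(Z:X) and n - p. The dense 'qr' call is only practical for small matrices; it does not address the sparse case mentioned above.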
>
>   >> - I then conclude that good estimates of P values on the F ratios lie
>   >>  between 1 - pf(F.ratio, numDF, n-p) and 1 - pf(F.ratio, numDF, n-q).
>   >>  -- I further surmise that the latter of these (1 - pf(F.ratio, numDF,
>   >>  n-q)) is the more conservative estimate.
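
The bracketing described in the two points above can be sketched as follows. Every number here (the F ratio, numDF, n, p and q) is made up for illustration, and the variable names are hypothetical; q is the rank of Z:X as in the quoted text.

```r
# All values are invented for illustration.
F.ratio <- 4.2   # observed F ratio for the term of interest
numDF   <- 2     # numerator degrees of freedom
n <- 60          # number of observations
p <- 3           # rank of X
q <- 15          # rank of [Z:X]

# lower.tail = FALSE gives the same value as 1 - pf(...), with better
# accuracy when the P value is very small.
p.liberal      <- pf(F.ratio, numDF, n - p, lower.tail = FALSE)
p.conservative <- pf(F.ratio, numDF, n - q, lower.tail = FALSE)

c(df_n_minus_p = p.liberal, df_n_minus_q = p.conservative)
```

With these inputs the P value based on n - q denominator degrees of freedom is the larger of the two. Both values still rest on the assumption that the statistic follows an F distribution at all.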
>
> This assumes that the true distribution (under H0) of that "F ratio"
> *is* F_{n1,n2} for some (possibly non-integer) n1 and n2.
> But AFAIU, this is only approximately true at best, and AFAIU,
> the quality of this approximation has only been investigated
> empirically for some situations.
> Hence, even your conservative estimate of the P value could be
> wrong (I mean "wrong on the wrong side" instead of just
> "conservatively wrong"). Consequently, such a P-value is only
> ``approximately conservative'' ...
> I agree, however, that in some situations, it might be a very
> useful "descriptive statistic" about the fitted model.

Thank you for pointing that out, Martin. I agree. As I mentioned, a
value of the denominator degrees of freedom based on the trace of the
hat matrix is conditional on the estimates of the relative variances
of the random effects. I think an argument could still be made that
rank(Z:X) is an upper bound on the dimension of the model space
and hence that n - rank(Z:X) is a lower bound on the dimension of the
space in which the residuals lie. One possible approach would be
to use the squared length of the projection of the data vector onto
the orthogonal complement of Z:X as the "sum of squares" and n -
rank(Z:X) as the degrees of freedom and base tests on that. Under the
assumptions on the model I think an F ratio calculated using that
actually would have an F distribution.
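
A rough sketch of the projection idea, using simulated data and base R only. The design, the simulated response, and the variable names are all assumptions. The message specifies only the denominator; the choice of numerator here (the sum of squares for x after adjusting for the columns of Z) is one possible reading, not necessarily what was intended.

```r
set.seed(1)
n <- 12
g <- gl(4, 3)                       # 4 groups of 3 observations
x <- rep(c(-1, 0, 1), times = 4)    # within-group covariate
X <- cbind(1, x)
Z <- model.matrix(~ 0 + g)

# Simulated response: random group intercepts plus noise; true slope is 0.
y <- 2 + rnorm(4)[g] + rnorm(n)

# Denominator: squared length of the projection of y onto the orthogonal
# complement of the column span of [Z:X], with n - rank(Z:X) df.
QR  <- qr(cbind(Z, X))
rss <- sum(qr.resid(QR, y)^2)
ddf <- n - QR$rank

# ASSUMED numerator: one-df sum of squares for x adjusted for Z.
x.adj   <- qr.resid(qr(Z), x)
ss.x    <- sum(x.adj * y)^2 / sum(x.adj^2)
F.ratio <- ss.x / (rss / ddf)
p.value <- pf(F.ratio, 1, ddf, lower.tail = FALSE)

c(F = F.ratio, ddf = ddf, p = p.value)
```

Because x is adjusted for Z, the random effects drop out of both numerator and denominator in this sketch, which is why the ratio matches the F statistic for x from lm(y ~ g + x) with the grouping factor treated as fixed.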

>
> Martin
>
>   >> When I use these criteria and compare my "ANOVA" table to the results
>   >> of analysis of Helmert contrasts using MCMC sample with highest
>   >> posterior density intervals, I find that my conclusions (e.g. factor
>   >> A, with three levels, has a "significant effect" on the response
>   >> variable) are qualitatively the same.
>
>   >> Comments?
>
>   DB> I would be happy to re-institute p-values for fixed effects in the
>   DB> summary and anova methods for lmer objects using a denominator degrees
>   DB> of freedom based on the trace of the hat matrix or the rank of Z:X if
>   DB> others will volunteer to respond to the "these answers are obviously
>   DB> wrong because they don't agree with <whatever> and the idiot who wrote
>   DB> this software should be thrashed to within an inch of his life"
>   DB> messages. I don't have the patience.
>
>   DB> ______________________________________________
>   DB> R-help@(protected)
>   DB> https://stat.ethz.ch/mailman/listinfo/r-help
>   DB> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>   DB> and provide commented, minimal, self-contained, reproducible code.
>
