Manually calculating random effect predictions for a linear mixed model



I am trying to calculate the random effect predictions from a linear mixed model by hand, using the notation provided by Wood in Generalized Additive Models: An Introduction with R (pg 294 / pg 307 of pdf), and I am getting confused about what each parameter represents.

Below is a summary from Wood.

Define a linear mixed model

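$$Y = X\beta + Zb + \epsilon$$

where $b \sim N(0, \psi)$ and $\epsilon \sim N(0, \sigma^2)$.

If $b$ and $y$ are random variables with joint normal distribution

$$\begin{bmatrix} b \\ y \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ X\beta \end{bmatrix}, \begin{bmatrix} \psi & \Sigma_{by} \\ \Sigma_{yb} & \Sigma_\theta \sigma^2 \end{bmatrix} \right)$$

The RE predictions are calculated by

$$E[b \mid y] = \Sigma_{by} \Sigma_{yy}^{-1} (y - X\beta) = \Sigma_{by} \Sigma_\theta^{-1} (y - X\beta) / \sigma^2 = \psi Z^T \Sigma_\theta^{-1} (y - X\beta) / \sigma^2$$

where $\Sigma_\theta = Z \psi Z^T / \sigma^2 + I_n$.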
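
To make the notation concrete, here is a minimal base-R sketch on simulated random-intercept data. Everything in it is an illustrative assumption rather than something from Wood or lme4: the simulated values, the treatment of beta, psi and sigma^2 as known, and the names n_groups, n_per, sig2, b_hat and b_hat_check. It evaluates the first and third forms of E[b|y] and checks that they agree.

```r
# Simulated random-intercept data; all numbers are illustrative assumptions.
set.seed(1)
n_groups <- 5
n_per    <- 4
n        <- n_groups * n_per
g        <- factor(rep(seq_len(n_groups), each = n_per))

X    <- cbind(1, rnorm(n))        # fixed-effects design: intercept + one covariate
Z    <- model.matrix(~ 0 + g)     # n x n_groups indicator matrix for the random intercepts
beta <- c(2, 0.5)                 # treated as known in this sketch
psi  <- 4                         # Var(b), a scalar for a random intercept
sig2 <- 1                         # Var(epsilon)

b <- rnorm(n_groups, sd = sqrt(psi))
y <- as.vector(X %*% beta + Z %*% b + rnorm(n, sd = sqrt(sig2)))

r <- y - as.vector(X %*% beta)    # y - X beta: no random effects involved

# Third form: psi Z^T Sigma_theta^{-1} (y - X beta) / sigma^2
Sigma_theta <- Z %*% (psi * diag(n_groups)) %*% t(Z) / sig2 + diag(n)
b_hat <- psi * t(Z) %*% solve(Sigma_theta, r) / sig2

# First form: Sigma_by Sigma_yy^{-1} (y - X beta)
Sigma_by    <- psi * t(Z)
Sigma_yy    <- Sigma_theta * sig2
b_hat_check <- Sigma_by %*% solve(Sigma_yy, r)

all.equal(as.vector(b_hat), as.vector(b_hat_check))
```

In this balanced random-intercept case each prediction reduces to the group mean of r shrunk by the factor psi * n_per / (psi * n_per + sig2), which gives a quick sanity check on the matrix version.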

Using a random intercept model example from the lme4 R package, I get the output

library(lme4)
m = lmer(angle ~ temp + (1 | replicate), data=cake)
summary(m)

% Linear mixed model fit by REML ['lmerMod']
% Formula: angle ~ temp + (1 | replicate)
%    Data: cake
% 
% REML criterion at convergence: 1671.7
% 
% Scaled residuals: 
%      Min       1Q   Median       3Q      Max 
% -2.83605 -0.56741 -0.02306  0.54519  2.95841 
% 
% Random effects:
%  Groups    Name        Variance Std.Dev.
%  replicate (Intercept) 39.19    6.260   
%  Residual              23.51    4.849   
% Number of obs: 270, groups:  replicate, 15
% 
% Fixed effects:
%             Estimate Std. Error t value
% (Intercept)  0.51587    3.82650   0.135
% temp         0.15803    0.01728   9.146
% 
% Correlation of Fixed Effects:
%      (Intr)
% temp -0.903

So from this, I think ψ = 23.51, (y − Xβ) can be estimated from cake$angle - predict(m, re.form=NA), and sigma from the square of the population-level residuals.

th = 23.51
zt = getME(m, "Zt") 
res = cake$angle - predict(m, re.form=NA)
sig = sum(res^2) / (length(res)-1)

Multiplying these together gives

th * zt %*% res / sig
         [,1]
1  103.524878
2   94.532914
3   33.934892
4    8.131864
---

which is not correct when compared to

> ranef(m)
$replicate
   (Intercept)
1   14.2365633
2   13.0000038
3    4.6666680
4    1.1182799
---

Why?

Answers:



Two problems (I confess it took me like 40 minutes to spot the second one):

  1. You must not compute $\sigma^2$ from the squared residuals. It is estimated by REML as 23.51, and there is no guarantee that the BLUPs will have the same variance.

    sig <- 23.51

    And 23.51 is not $\psi$, which is estimated as 39.19:

    psi <- 39.19

  2. The residuals are not obtained with cake$angle - predict(m, re.form=NA) but with residuals(m).

Putting it together:

> psi/sig * zt %*% residuals(m)
15 x 1 Matrix of class "dgeMatrix"
         [,1]
1  14.2388572
2  13.0020985
3   4.6674200
4   1.1184601
5   0.2581062
6  -3.2908537
7  -4.6351567
8  -4.5813846
9  -4.6351567
10 -3.1833095
11 -2.1616392
12 -1.1399689
13 -0.2258429
14 -4.0974355
15 -5.3341942

which is similar to ranef(m).
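
The remaining small gap to ranef(m) appears to come from using the rounded variances printed by summary(m). A minimal sketch, assuming lme4 is installed, that pulls the unrounded estimates from the fit and compares two routes to the predictions: (a) the shortcut with conditional residuals used above, and (b) the formula from the question with the marginal residuals y − Xβ and the inverse of Σθ. The names vc, sig2, Sigma_theta, b_cond and b_marg are illustrative, not part of lme4.

```r
library(lme4)

# Same model as in the question
m <- lmer(angle ~ temp + (1 | replicate), data = cake)

vc   <- as.data.frame(VarCorr(m))
psi  <- vc$vcov[vc$grp == "replicate"]   # random-intercept variance (printed as 39.19)
sig2 <- sigma(m)^2                       # residual variance (printed as 23.51)
Zt   <- getME(m, "Zt")
Z    <- getME(m, "Z")

# (a) Shortcut with conditional residuals: psi / sigma^2 * Z^T eps_hat
b_cond <- as.vector(as.matrix(psi / sig2 * (Zt %*% residuals(m))))

# (b) Formula from the question with marginal residuals y - X beta_hat
r_marg      <- cake$angle - predict(m, re.form = NA)
Sigma_theta <- as.matrix(Z %*% Zt) * psi / sig2 + diag(nrow(cake))
b_marg      <- as.vector(as.matrix(psi / sig2 * (Zt %*% solve(Sigma_theta, r_marg))))

# Side-by-side comparison with lme4's own predictions
cbind(ranef = ranef(m)$replicate[, 1], b_cond, b_marg)
```

Both computed columns should agree with ranef(m) up to numerical tolerance, since (a) and (b) are algebraically the same quantity.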

I really don't get what predict computes.


PS. To answer your last remark, the point is that we use the "residuals" $\hat\epsilon$ as a way to obtain the vector $PY$, where $P = V^{-1} - V^{-1} X (X^T V^{-1} X)^{-1} X^T V^{-1}$. This matrix is computed during the REML algorithm. It is related to the BLUPs of the random terms by

$$\hat\epsilon = \sigma^2 P Y$$

and

$$\hat b = \psi Z^T P Y.$$

Thus $\hat b = (\psi / \sigma^2) Z^T \hat\epsilon$.
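
As a sketch of why this shortcut agrees with the formula in the question, write r for the marginal residual and take V to be the marginal covariance of Y. That reading of V is an assumption, since the answer does not define it, and the derivation also assumes the fixed effects are the generalized least squares estimate at the fitted variance components.

```latex
\begin{align*}
V &= Z \psi Z^T + \sigma^2 I_n = \sigma^2 \Sigma_\theta,
  \qquad r = y - X\hat\beta, \\
\hat\beta &= (X^T V^{-1} X)^{-1} X^T V^{-1} y
  \;\Longrightarrow\; P y = V^{-1} (y - X\hat\beta) = V^{-1} r, \\
\hat b &= \psi Z^T P y = \psi Z^T V^{-1} r
  = \psi Z^T \Sigma_\theta^{-1} r / \sigma^2, \\
\hat\epsilon &= r - Z \hat b
  = (V - Z \psi Z^T) V^{-1} r = \sigma^2 V^{-1} r = \sigma^2 P y, \\
\frac{\psi}{\sigma^2} Z^T \hat\epsilon &= \psi Z^T V^{-1} r = \hat b .
\end{align*}
```

So the conditional residuals without the inverse of $\Sigma_\theta$, and the marginal residuals with it, lead to the same predictions.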


Thanks Elvis. I am struggling a wee bit to align the values you have used with the equations above; however, it seems there are many ways to skin a cat. The residuals surprise me a bit: I thought the term was meant to be y − Xβ (fixed effects only), whereas residuals(m) is calculated using the random effects as well (see the difference between plot(residuals(m), cake$angle-predict(m, re.form=NULL)) and plot(residuals(m), cake$angle-predict(m, re.form=NA))).
user2957945

A way using the fixed effect, and the third version of the E[b|y] above: z = getME(m, "Z") ; big_sig = solve(((z * psi) %*% zt ) / sig + diag(270)) ; psi/sig * zt %*% big_sig %*% (cake$angle-predict(m, re.form=NA)). Thanks for pointing out the correct items.
user2957945

One final Q if I may: can I get either of $\Sigma_{by}$ or $\Sigma_{yy}$ directly from the output?
user2957945

Isn't $\Sigma_{yb}$ equal to $\psi Z$?
Elvis

Elvis, I had another wee think on this (I know I'm slow). I think using the residuals like this isn't really sensible, as it uses the predicted values (and so the residuals) at the RE level. That puts the RE predictions on both sides of your equation: E[b|y] is used to compute the residuals, even though it is the very term we are trying to predict.
user2957945
Licensed under cc by-sa 3.0 with attribution required.