更新: 2012-09-24 21:50:27

生態学のデータ解析 - 対数線形モデル

カウントデータをあつかう対数線形モデル
- ……について説明する予定ですが，モデルについてはまだ書いていません
- 自分自身のメモのため，「logistic 回帰と Poisson 回帰は ある意味では同じ」という例だけをとりあえず示しています
- 参照: GLM 参照

[もくじ]

logistic 回帰と Poisson 回帰

logistic 回帰と Poisson 回帰

logistic 回帰と Poisson 回帰 (対数線形モデル) は パラメーター推定に関しては同じ結果になるという例を示してみましょう

とりあえず架空例データを作る

こんなデータを作ります
データの意味としては:
- y.0: 「応答しなかった」個数
- y.1: 「応答した」個数
- x: 処理か何か，ともかく説明変数のつもり (ただしまったく無意味な数量)
- total: y.0 と y.1 の和

d <- data.frame(
	y.0 = c(2, 3, 5, 4, 1, 2),
	y.1 = c(4, 2, 9, 4, 0, 4),
	x   = seq(0.1, 0.6, length = 6)
)
d$total <- d$y.0 + d$y.1
print(d)

  y.0 y.1   x total
1   2   4 0.1     6
2   3   2 0.2     5
3   5   9 0.3    14
4   4   4 0.4     8
5   1   0 0.5     1
6   2   4 0.6     6

さらに， reshape() わざ (参照: reshape(data.frame)) を使って，「タテに長い data.frame」に変換します
これは対数線形モデルで使う準備です

d.long <- reshape(d, direction = "long", varying = c("y.0", "y.1"))
cn <- colnames(d.long)
colnames(d.long)[grep("time", cn)] <- "response" # y.0 と y.1 の区別をつける
d.long$response <- factor(d.long$response) # factor 化 (この例では必要ないけど)
print(d.long) # ようするに d を並びかえて長くしただけ

      x total response y id
1.0 0.1     6        0 2  1
2.0 0.2     5        0 3  2
3.0 0.3    14        0 5  3
4.0 0.4     8        0 4  4
5.0 0.5     1        0 1  5
6.0 0.6     6        0 2  6
1.1 0.1     6        1 4  1
2.1 0.2     5        1 2  2
3.1 0.3    14        1 9  3
4.1 0.4     8        1 4  4
5.1 0.5     1        1 0  5
6.1 0.6     6        1 4  6

まずは一番単純なモデルから

「切片だけ」モデルの logistic 回帰をやってみます

fit.b1 <- glm(
	formula = cbind(y.1, y.0) ~ 1,
	data = d, family = binomial
)
print(summary(fit.b1))

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)    0.302      0.320    0.95     0.34

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 3.2024  on 5  degrees of freedom
Residual deviance: 3.2024  on 5  degrees of freedom
AIC: 17.41

Poisson 回帰 (glm(.., family = poisson)) で以下のようにすると，上と同じ推定結果が得られます

fit.p1 <- glm(
	y ~ response, offset = log(total),
	data = d.long, family = poisson
)
print(summary(fit.p1))

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   -0.856      0.243   -3.53  0.00042
response1      0.302      0.320    0.95  0.34462

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 4.1058  on 11  degrees of freedom
Residual deviance: 3.2024  on 10  degrees of freedom
AIC: 40.63

この response1 が logistic 回帰の (Intercept) に該当します
では，こちらの (Intercept) -0.855 は何なのかというと， response1 の推定値である 0.302 を使って ''-log(1 + exp(0.302)) = -0.855'' と計算した量になります (つまりこの係数は response1 に完全に依存しています)

説明変数を追加した場合

今度は説明変数 x を追加した場合について推定計算します
- この説明変数 x はてきとーに与えた数値なので，応答変数の説明にはまったく役にたちません

fit.b2 <- glm(
	formula = cbind(y.1, y.0) ~ x,
	data = d, family = binomial
)
print(summary(fit.b2))

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)    0.350      0.762    0.46     0.65
x             -0.145      2.108   -0.07     0.95

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 3.2024  on 5  degrees of freedom
Residual deviance: 3.1977  on 4  degrees of freedom
AIC: 19.41

Poisson 回帰 (glm(.., family = poisson)) で以下のようにすると，上と同じ推定結果が得られます

fit.p2 <- glm(
	y ~ response * x, offset = log(total),
	data = d.long, family = poisson
)
print(summary(fit.p2))

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   -0.883      0.579   -1.52     0.13
response1      0.350      0.762    0.46     0.65
x              0.083      1.596    0.05     0.96
response1:x   -0.144      2.108   -0.07     0.95

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 4.1058  on 11  degrees of freedom
Residual deviance: 3.1977  on  8  degrees of freedom
AIC: 44.63

Number of Fisher Scoring iterations: 4

これが何の役にたつのか?

単なる二項 logistic 回帰を対数線形モデルで置きかえても御利益はありません
しかし三項以上の多項 logistic 回帰の場合:
- glm(...,family=binomial で計算できると便利，といった御利益があるかも
  - あるいは glmmML 紹介なんかをうまく使えば……?
- もちろん，R にはいろいろな多項 logistic 回帰の関数が準備されています
  - 例: ニューラルネット
対数線形モデルはパラメーター推定には使えますが，予測や乱数生成には使えません
- 乱数生成には，対数線形モデルで得られたパラメーターで，二項または多項分布を使えばよいでしょう

このページは以下から参照されています。
統計学授業 2010 FrontPage GLM 参照

生態学のデータ解析 - 対数線形モデル

logistic 回帰と Poisson 回帰

とりあえず架空例データを作る

まずは一番単純なモデルから

説明変数を追加した場合

これが何の役にたつのか?

内容

最新の30件

2019-07-16

2019-07-14

2019-07-03

2019-07-01

2019-06-04

2019-05-29

2019-04-24

2019-03-19

2019-01-25

2018-12-12

2018-12-05

2018-11-28

2018-11-20

2018-10-10

2018-09-20

2018-08-24

2018-08-15

2018-07-14

2018-06-04

2018-04-10

2018-03-19

2018-03-01

2018-02-28

2018-02-22

2018-02-15

2018-01-17

2018-01-16