The following does not answer the question exactly, but it may give you some ideas. It is something I did recently to assess the fit of several regression models using one to four independent variables (the dependent variable was in the first column of the data frame df1).
# create the combinations of the 4 independent variables
library(foreach)
xcomb <- foreach(i=1:4, .combine=c) %do% {combn(names(df1)[-1], i, simplify=FALSE) }
# create formulas
formlist <- lapply(xcomb, function(l) formula(paste(names(df1)[1], paste(l, collapse="+"), sep="~")))
The contents of as.character(formlist) were:
[1] "price ~ sqft" "price ~ age"
[3] "price ~ feats" "price ~ tax"
[5] "price ~ sqft + age" "price ~ sqft + feats"
[7] "price ~ sqft + tax" "price ~ age + feats"
[9] "price ~ age + tax" "price ~ feats + tax"
[11] "price ~ sqft + age + feats" "price ~ sqft + age + tax"
[13] "price ~ sqft + feats + tax" "price ~ age + feats + tax"
[15] "price ~ sqft + age + feats + tax"
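The same list of formulas can also be built without foreach. Here is a minimal base R sketch. Since df1 itself is not shown, it assumes a stand-in data frame taken from the built-in mtcars, with the response in the first column. The names response and predictors are only illustrative.

```r
# Stand-in for df1 (assumption): response first, then four predictors
df1 <- mtcars[, c("mpg", "wt", "hp", "qsec", "drat")]

response   <- names(df1)[1]
predictors <- names(df1)[-1]

# All non-empty subsets of the predictors, from size 1 up to size 4
xcomb <- unlist(
  lapply(seq_along(predictors),
         function(k) combn(predictors, k, simplify = FALSE)),
  recursive = FALSE
)

# One formula per subset, e.g. mpg ~ wt + hp
formlist <- lapply(xcomb, reformulate, response = response)

length(formlist)  # 2^4 - 1 = 15 candidate models
```

reformulate() avoids pasting the formula together by hand, and the count of 15 matches the listing above for four predictors.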
Then I collected some useful indices:
# R squared
models.r.sq <- sapply(formlist, function(i) summary(lm(i, data=df1))$r.squared)
# adjusted R squared
models.adj.r.sq <- sapply(formlist, function(i) summary(lm(i, data=df1))$adj.r.squared)
# MSEp: residual mean square of each candidate model
models.MSEp <- sapply(formlist, function(i) anova(lm(i, data=df1))['Mean Sq']['Residuals',])
# Full model MSE (the last formula contains all four predictors)
MSE <- anova(lm(formlist[[length(formlist)]], data=df1))['Mean Sq']['Residuals',]
# Mallows' Cp; returns p (number of parameters, intercept included) and Cp
models.Cp <- sapply(formlist, function(i) {
  fit <- lm(i, data=df1)
  SSEp <- anova(fit)['Sum Sq']['Residuals',]
  mod.mat <- model.matrix(fit)
  n <- dim(mod.mat)[1]
  p <- dim(mod.mat)[2]
  c(p, SSEp / MSE - (n - 2*p))
})
df.model.eval <- data.frame(model=as.character(formlist), p=models.Cp[1,],
  r.sq=models.r.sq, adj.r.sq=models.adj.r.sq, MSEp=models.MSEp, Cp=models.Cp[2,])
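The code above refits each model several times. A possible variant fits every candidate once and reads all indices from the stored fit. This is only a sketch: it again assumes mtcars as a stand-in for df1, and eval_one is a hypothetical helper name.

```r
# Stand-in for df1 (assumption): response first, then four predictors
df1 <- mtcars[, c("mpg", "wt", "hp", "qsec", "drat")]
predictors <- names(df1)[-1]
xcomb <- unlist(lapply(seq_along(predictors),
                       function(k) combn(predictors, k, simplify = FALSE)),
                recursive = FALSE)
formlist <- lapply(xcomb, reformulate, response = names(df1)[1])

# Fit every candidate once, passing the data explicitly
fits <- lapply(formlist, lm, data = df1)

# Residual mean square of the full model (the last formula in the list)
full <- fits[[length(fits)]]
MSE  <- deviance(full) / df.residual(full)

# Hypothetical helper: collect the indices for one fitted model
eval_one <- function(fit) {
  n    <- nobs(fit)
  p    <- length(coef(fit))          # parameters, intercept included
  SSEp <- deviance(fit)              # residual sum of squares
  c(p        = p,
    r.sq     = summary(fit)$r.squared,
    adj.r.sq = summary(fit)$adj.r.squared,
    MSEp     = SSEp / df.residual(fit),
    Cp       = SSEp / MSE - (n - 2 * p))
}

df.model.eval <- data.frame(
  model = sapply(formlist, function(f) paste(deparse(f), collapse = " ")),
  t(sapply(fits, eval_one))
)
```

By construction the full model always gets Cp equal to its own p, which is a quick sanity check on the calculation.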
The final data frame:
                      model p       r.sq   adj.r.sq      MSEp         Cp
 1               price~sqft 2 0.71390776 0.71139818  42044.46  49.260620
 2                price~age 2 0.02847477 0.01352823 162541.84 292.462049
 3              price~feats 2 0.17858447 0.17137907 120716.21 351.004441
 4                price~tax 2 0.76641940 0.76417343  35035.94  20.591913
 5           price~sqft+age 3 0.80348960 0.79734865  33391.05  10.899307
 6         price~sqft+feats 3 0.72245824 0.71754599  41148.82  46.441002
 7           price~sqft+tax 3 0.79837622 0.79446120  30536.19   5.819766
 8          price~age+feats 3 0.16146638 0.13526220 142483.62 245.803026
 9            price~age+tax 3 0.77886989 0.77173666  37884.71  20.026075
10          price~feats+tax 3 0.76941242 0.76493500  34922.80  21.021060
11     price~sqft+age+feats 4 0.80454221 0.79523470  33739.36  12.514175
12       price~sqft+age+tax 4 0.82977846 0.82140691  29640.97   3.832692
13     price~sqft+feats+tax 4 0.80068220 0.79481991  30482.90   6.609502
14      price~age+feats+tax 4 0.79186713 0.78163109  36242.54  17.381201
15 price~sqft+age+feats+tax 5 0.83210849 0.82091573  29722.50   5.000000
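A common way to read such a table is to look for models with a small Cp that is close to, or below, p. The sketch below applies that rule of thumb to a few rows copied from the table above. The column name Cp.le.p is just an illustrative choice, and the rule is a heuristic, not a formal test.

```r
# A few rows copied from the table above
tab <- data.frame(
  model = c("price~tax", "price~sqft+tax", "price~sqft+age+tax",
            "price~sqft+feats+tax", "price~sqft+age+feats+tax"),
  p  = c(2, 3, 4, 4, 5),
  Cp = c(20.591913, 5.819766, 3.832692, 6.609502, 5.000000)
)

# Sort by Cp and flag candidates whose Cp does not exceed p
tab <- tab[order(tab$Cp), ]
tab$Cp.le.p <- tab$Cp <= tab$p
tab
```

Among these rows, price~sqft+age+tax has the smallest Cp and is the only reduced model with Cp below its p.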
Finally, a Cp plot (using the wle library)
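A similar plot can be sketched with base graphics instead of wle. This is a different technique from the wle plot, not a reproduction of it, and it again assumes mtcars as a stand-in for df1. Each model is drawn at (p, Cp), with the line Cp = p as a reference.

```r
# Stand-in for df1 (assumption): response first, then four predictors
df1 <- mtcars[, c("mpg", "wt", "hp", "qsec", "drat")]
predictors <- names(df1)[-1]
xcomb <- unlist(lapply(seq_along(predictors),
                       function(k) combn(predictors, k, simplify = FALSE)),
                recursive = FALSE)
fits <- lapply(xcomb, function(x) lm(reformulate(x, names(df1)[1]), data = df1))

# Mallows' Cp relative to the full model (last fit in the list)
full <- fits[[length(fits)]]
MSE  <- deviance(full) / df.residual(full)
p    <- sapply(fits, function(f) length(coef(f)))
Cp   <- sapply(fits, deviance) / MSE - (sapply(fits, nobs) - 2 * p)

# Scatter of Cp against p, with the reference line Cp = p
plot(p, Cp, xlab = "p (number of parameters)", ylab = "Mallows' Cp")
abline(0, 1, lty = 2)
text(p, Cp, labels = sapply(xcomb, paste, collapse = "+"), pos = 4, cex = 0.6)
```

Points near or below the dashed line with low Cp are the candidates worth a closer look.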