- R version 2.15.2 (2012-10-26) -- "Trick or Treat"
- Copyright (C) 2012 The R Foundation for Statistical Computing
- ISBN 3-900051-07-0
- Platform: x86_64-w64-mingw32/x64 (64-bit)
- R is free software and comes with ABSOLUTELY NO WARRANTY.
- You are welcome to redistribute it under certain conditions.
- Type 'license()' or 'licence()' for distribution details.
- Natural language support but running in an English locale
- R is a collaborative project with many contributors.
- Type 'contributors()' for more information and
- 'citation()' on how to cite R or R packages in publications.
- Type 'demo()' for some demos, 'help()' for on-line help, or
- 'help.start()' for an HTML browser interface to help.
- Type 'q()' to quit R.
- This is a session spawned by NppToR.
- [Previously saved workspace restored]
- >
- >
- > require(earth) # for etitanic data
- Loading required package: earth
- Loading required package: leaps
- Loading required package: plotmo
- Loading required package: plotrix
- > data(etitanic)
- > mydata <- etitanic
- >
- > require(gbm)
- Loading required package: gbm
- Loading required package: survival
- Loading required package: splines
- Loading required package: lattice
- Loaded gbm 1.6.3.2
- > gbm1 <- gbm(survived ~ .,
- + data=mydata,
- + var.monotone=c(0,0,0,0,0),
- + distribution="bernoulli",
- + n.trees=2, # number of trees
- + shrinkage=0.005, interaction.depth=3, bag.fraction = 1, train.fraction = 1,
- + n.minobsinnode = 10, cv.folds = 5, keep.data=TRUE, verbose=FALSE)
- >
- >
- > # Two full trees
- > pretty.gbm.tree(gbm1, 1)
- SplitVar SplitCodePred LeftNode RightNode MissingNode ErrorReduction Weight Prediction
- 0 1 0.000000e+00 1 5 9 73.139438 1046 -2.010714e-17
- 1 2 9.500000e+00 2 3 4 6.512112 658 -4.202694e-03
- 2 -1 3.584234e-03 -1 -1 -1 0.000000 43 3.584234e-03
- 3 -1 -4.747146e-03 -1 -1 -1 0.000000 615 -4.747146e-03
- 4 -1 -4.202694e-03 -1 -1 -1 0.000000 658 -4.202694e-03
- 5 0 1.000000e+00 6 7 8 19.437432 388 7.127249e-03
- 6 -1 1.354899e-03 -1 -1 -1 0.000000 152 1.354899e-03
- 7 -1 1.084503e-02 -1 -1 -1 0.000000 236 1.084503e-02
- 8 -1 7.127249e-03 -1 -1 -1 0.000000 388 7.127249e-03
- 9 -1 -2.010714e-17 -1 -1 -1 0.000000 1046 -2.010714e-17
- > pretty.gbm.tree(gbm1, 2)
- SplitVar SplitCodePred LeftNode RightNode MissingNode ErrorReduction Weight Prediction
- 0 1 2.000000e+00 1 5 9 72.409580 1046 -7.371885e-06
- 1 2 9.500000e+00 2 3 4 6.447161 658 -4.185715e-03
- 2 -1 3.563973e-03 -1 -1 -1 0.000000 43 3.563973e-03
- 3 -1 -4.727564e-03 -1 -1 -1 0.000000 615 -4.727564e-03
- 4 -1 -4.185715e-03 -1 -1 -1 0.000000 658 -4.185715e-03
- 5 0 3.000000e+00 6 7 8 19.243329 388 7.078582e-03
- 6 -1 1.347789e-03 -1 -1 -1 0.000000 152 1.347789e-03
- 7 -1 1.076960e-02 -1 -1 -1 0.000000 236 1.076960e-02
- 8 -1 7.078582e-03 -1 -1 -1 0.000000 388 7.078582e-03
- 9 -1 -7.371885e-06 -1 -1 -1 0.000000 1046 -7.371885e-06
- >
- > # Which variable is used for the first splitting rule (row 0)?
- > pretty.gbm.tree(gbm1, 1)[0,]
- [1] SplitVar SplitCodePred LeftNode RightNode MissingNode ErrorReduction Weight Prediction
- <0 rows> (or 0-length row.names)
- > pretty.gbm.tree(gbm1, 2)[0,]
- [1] SplitVar SplitCodePred LeftNode RightNode MissingNode ErrorReduction Weight Prediction
- <0 rows> (or 0-length row.names)
- >
- > # SplitVar is 1 in both trees, but it is zero based, so add 1.
- > # So the splitting variable is 'sex'.
- > attr(gbm1$Terms,'term.labels')[1+1]
- [1] "sex"
- >
- > # Sex has two levels.
- > (var.levels <- gbm1$var.levels[[2]])
- [1] "female" "male"
- >
- > # The first rule of the first tree splits on SplitCodePred=0.
- > # This must be zero based, so it must be female.
- > var.levels[0+1]
- [1] "female"
- >
- > # The first rule of the second tree splits on SplitCodePred=2.
- > # Using the same "zero based" logic, the index is now out of bounds and returns NA.
- > var.levels[2+1]
- [1] NA
- >
- > # The splitting rules do not make sense, so try working backwards.
- > # The first row of the data is female.
- > mydata[1,'sex']
- [1] female
- Levels: female male
- >
- > # Predict the outcome of the first row.
- > predict(gbm1, newdata=mydata[1,], n.trees=1, single=T)
- [1] 0.01084503
- > predict(gbm1, newdata=mydata[1,], n.trees=2, single=T)
- [1] 0.0107696
- >
- > # These predictions correspond to these terminal nodes.
- > pretty.gbm.tree(gbm1, 1)[8,]
- SplitVar SplitCodePred LeftNode RightNode MissingNode ErrorReduction Weight Prediction
- 7 -1 0.01084503 -1 -1 -1 0 236 0.01084503
- > pretty.gbm.tree(gbm1, 2)[8,]
- SplitVar SplitCodePred LeftNode RightNode MissingNode ErrorReduction Weight Prediction
- 7 -1 0.0107696 -1 -1 -1 0 236 0.0107696
- >
- > # In both trees the terminal nodes trace back to row 0, which has the
- > # same SplitVar (1) but a different SplitCodePred (0 vs. 2). I am confused.
- >
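The mismatch above most likely comes from reading SplitCodePred as a factor level. As far as the gbm documentation describes it, for a categorical split variable SplitCodePred is a zero-based index into the model's c.splits list, and each c.splits entry holds one value per factor level: -1 sends that level left, +1 sends it right. The sketch below assumes that reading. decode_categorical_split is a hypothetical helper, not part of gbm, and the example_c_splits values are illustrative rather than taken from this session; in a real session one would pass gbm1$c.splits and gbm1$var.levels[[2]] instead.

```r
# Hypothetical helper (not part of gbm): decode one categorical split.
# c_splits   : list of direction vectors, e.g. gbm1$c.splits
# split_code : SplitCodePred from pretty.gbm.tree() (zero-based index)
# var_levels : factor levels of the split variable, e.g. gbm1$var.levels[[2]]
decode_categorical_split <- function(c_splits, split_code, var_levels) {
  directions <- c_splits[[split_code + 1]]  # shift zero-based index to R's one-based
  stopifnot(length(directions) == length(var_levels))
  list(left  = var_levels[directions == -1],
       right = var_levels[directions == 1])
}

# Illustrative values only (assumed, not read from the fitted model).
example_c_splits <- list(c(1, -1), c(-1, 1, 1), c(1, -1), c(-1, 1, 1))
sex_levels <- c("female", "male")

# Root split of tree 1 (SplitCodePred = 0) and tree 2 (SplitCodePred = 2).
tree1_root <- decode_categorical_split(example_c_splits, 0, sex_levels)
tree2_root <- decode_categorical_split(example_c_splits, 2, sex_levels)
print(tree1_root)
print(tree2_root)
```

Under this reading, SplitCodePred values of 0 and 2 at the root are simply two different entries of c.splits, so both trees can encode the same female/male split even though the codes differ.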