Catnip MDP: optimal Bayesian solver

Jun 10th, 2018
Costs:

- [I Love Catnip! 2 oz Liquid Catnip Spray, by Pet MasterMind](http://www.amazon.com/Catnip-Pet-MasterMind-Premium-Natural/dp/B016R4EBKW/): \$8.96
- [Honeysuckle Spray for Cats (2 oz)](https://www.amazon.com/dp/B000788TP8/): \$7.99
- [Valerian Root Extract](https://www.amazon.com/dp/B002H2VYNG/): \$11.86 (another possibility: [Nature's Answer super-concentrated 1 fluid oz](https://www.vitacost.com/natures-answer-valerian-root-alcohol-free-1-fl-oz), \$7)
- [Arata Silvervine 12x0.5g, Cat's Favorite, Best Value](https://www.amazon.com/dp/B008PNNMHW/): \$10.95 + \$6.82 (S&H from Japan) = \$17.77, or \$2.96/g

Load Bol et al 2017, for predicting a cat's response to one drug based on its responses to the others:

~~~{.R}
bol2017 <- read.csv("https://www.gwern.net/docs/catnip/2017-bol-cats.csv")
bol2017[,5:8] <- (bol2017[,5:8] > 0) # treat '5'/'10' (weak/strong response) as binary
bol2017Responses <- subset(bol2017, select=c("Catnip", "Valerian", "Silver.vine", "Tatarian.honeysuckle"))
colnames(bol2017Responses) <- c("Catnip", "Valerian", "Silvervine", "Honeysuckle") # rename for consistency
library(skimr)
skim(bol2017Responses)
# ... variable    missing complete   n mean                    count
#     Catnip            1       99 100 0.68 TRU: 67, FAL: 32, NA: 1
#     Honeysuckle       3       97 100 0.53 TRU: 51, FAL: 46, NA: 3
#     Silvervine        0      100 100 0.79 TRU: 79, FAL: 21, NA: 0
#     Valerian          4       96 100 0.47 FAL: 51, TRU: 45, NA: 4
~~~

A greedy one-step policy using the simple marginal probabilities, minimizing each drug's expected immediate loss Cost × (1 − P), chooses the test sequence catnip/valerian/silvervine/honeysuckle:

Drug        Cost (\$)    P    Loss
----------- --------- ---- -------
Catnip           8.96 0.67  2.9568
Valerian         7.00 0.47  3.7100
Silvervine      17.77 0.79  3.7317
Honeysuckle      7.99 0.53  3.7553

~~~{.R}
## per-drug purchase costs; the 'Loss' column holds the cash cost of buying each drug
## (using the cheaper $7 valerian option):
costs <- read.csv(stdin(), header=TRUE, colClasses=c("character", "numeric"))
Drug,Loss
Catnip,8.96
Valerian,7.00
Silvervine,17.77
Honeysuckle,7.99
~~~

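The greedy ranking above can be reproduced with a short computation: each drug's expected immediate loss is Cost × (1 − P), and the greedy policy simply sorts ascending by it. A quick Python check, with the costs and marginal probabilities copied from the table above:

```python
# Greedy one-step ordering: sort drugs by expected immediate loss, Cost * (1 - P).
# Costs and marginal response probabilities are those tabulated above.
drugs = {
    "Catnip":      (8.96,  0.67),
    "Valerian":    (7.00,  0.47),
    "Silvervine":  (17.77, 0.79),
    "Honeysuckle": (7.99,  0.53),
}

greedy = sorted(drugs, key=lambda d: drugs[d][0] * (1 - drugs[d][1]))
for d in greedy:
    cost, p = drugs[d]
    print(f"{d:12s} {cost:6.2f} {p:.2f} {cost * (1 - p):.4f}")
# ordering: Catnip, Valerian, Silvervine, Honeysuckle
```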
Define a catnip-testing MDP: we are testing 4 cat drugs with intercorrelated responses, we want to find 1 drug which works, and we want to minimize the money spent (since we must pay for each drug tried, at the costs defined above).
What is the optimal test sequence, minimizing expected loss before finding 1 working drug?
We can solve it by backwards induction on the decision tree, using Bayesian logistic regression to predict the posterior probability of each remaining drug working, conditional on the previously-tried drugs having failed:

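The recursion is easier to see in a stripped-down version of the problem: if we (incorrectly) treated the 4 responses as independent with fixed marginal probabilities, the expected loss of a fixed test sequence would be c₁ + (1−p₁)·(c₂ + (1−p₂)·(…)), and with only 4 drugs we can brute-force all 24 orderings. A hypothetical Python sketch of this independence-assuming simplification (not the real solver, which re-estimates each probability after every failure):

```python
from itertools import permutations

# Simplified MDP: treat the 4 responses as independent with fixed marginal
# probabilities, so a fixed sequence's expected loss is
#   c1 + (1-p1)*(c2 + (1-p2)*(...)).
drugs = {"Catnip": (8.96, 0.67), "Valerian": (7.00, 0.47),
         "Silvervine": (17.77, 0.79), "Honeysuckle": (7.99, 0.53)}

def expected_loss(sequence):
    loss = 0.0
    p_reach = 1.0  # probability that every earlier drug failed
    for drug in sequence:
        cost, p = drugs[drug]
        loss += p_reach * cost  # pay for this test only if we reach it
        p_reach *= (1 - p)
    return loss

best = min(permutations(drugs), key=expected_loss)
print(best, round(expected_loss(best), 2))
```

Under independence the best order is catnip/valerian/honeysuckle/silvervine, at an expected ~\$14.13; the full solver below, which exploits the correlations, instead arrives at catnip/honeysuckle/silvervine/valerian, so the correlation structure matters.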
~~~{.R}
library(rstanarm)
library(memoise)
library(boot) # for inv.logit

## Posterior probability that drug 'action' works, conditional on the failures
## observed so far, estimated by Bayesian logistic regression on the Bol et al 2017 data:
predictBayes <- memoize(function (action, observations, df) {
    print(paste("Observations in predictBayes:", observations))
    formula <- as.formula(paste(action, paste(c(names(observations), "1"), collapse=" + "), sep=" ~ "))
    b <- stan_glm(formula, family=binomial(), algorithm="optimizing", data=df)
    print(b)
    if (length(observations)==0) { mean(inv.logit(predict(b))) } else
        { inv.logit(predict(b, newdata=as.data.frame(observations))[1]) } })

## Expected loss of the optimal policy, given a history of observed failures,
## solved by backwards induction over the untried drugs:
f <- memoize(function(observations) {
    allActions <- colnames(bol2017Responses)
    actions <- allActions[!allActions %in% names(observations)] # untried drugs
    if (length(actions) == 0) { return(0) } else {

        actionCosts <- costs[costs$Drug %in% actions,]$Loss

        probabilities <- numeric(length(actions))
        for (i in 1:length(actions)) {
            probabilities[i] <- predictBayes(actions[i], observations, bol2017Responses)
        }

        lossActions <- numeric(length(actions))
        for (i in 1:length(actions)) {
            ## we pay the cost of testing each drug; if it works, we stop (no further
            ## loss); if it fails, we add a FALSE observation and look for an optimal
            ## path through the remaining drugs by calling 'f' recursively:
            obs <- list(a=FALSE); names(obs) <- actions[i]
            obs <- c(observations, obs)
            newHistory <- obs[order(names(obs))] # sort history by name so equivalent histories share one memoised call
            lossActions[i] <- actionCosts[i] + probabilities[i]*0 + ((1-probabilities[i]) * f(newHistory))
        }
        print(data.frame(Actions=actions, P=probabilities, Loss=lossActions))
        print(paste("Optimal action: ", actions[which.min(lossActions)], "; given observations: ", names(observations), observations))
        return(min(lossActions)) }
    })
# f(list(Catnip=FALSE, Honeysuckle=FALSE, Silvervine=FALSE))
f(list())

## No data, optimal choice is catnip:
#       Actions            P        Loss
# 1      Catnip 0.6810641502 15.79007271
# 2    Valerian 0.4731509696 16.92706363
# 3  Silvervine 0.7879218896 20.92153367
# 4 Honeysuckle 0.5276423201 16.51408849

## With catnip resistance, optimal choice is honeysuckle:
#
#       Actions            P        Loss
# 1    Valerian 0.2005122547 23.27730700
# 2  Silvervine 0.7202520977 21.49101933
# 3 Honeysuckle 0.3287608075 21.41519278

## With catnip/honeysuckle resistance, optimal choice is silvervine, and if silvervine doesn't work, only valerian is left:
#
#      Actions            P        Loss
# 1   Valerian 0.1695397115 21.75727933
# 2 Silvervine 0.6813415396 20.00060922

## Final optimal policy: catnip, honeysuckle, silvervine, valerian; aside from catnip, almost the reverse of the greedy policy.
~~~
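One detail worth noting in the solver: each new history is sorted before the recursive call, so that (as the code comment says) the memoised `f` avoids unnecessarily different but equivalent calls, keying on the *set* of failed drugs rather than the order they were tried in. A quick Python count of the cache sizes for n = 4 drugs (hypothetical counting code, not part of the solver):

```python
from itertools import permutations

drugs = ["Catnip", "Valerian", "Silvervine", "Honeysuckle"]

# Without canonicalization, every ordered prefix of every permutation is a
# distinct memo key:
ordered = {perm[:k] for perm in permutations(drugs) for k in range(len(drugs) + 1)}
# Sorting the history collapses equivalent orderings into one key per subset:
unordered = {tuple(sorted(p)) for p in ordered}

print(len(ordered), len(unordered))  # 65 vs 16
```

Since each memoised call to `f` triggers several `stan_glm` fits inside `predictBayes`, collapsing 65 distinct histories down to 16 is a worthwhile saving.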