Untitled

a guest Jun 25th, 2019 55 Never
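# Make data with a biased train2 set
set.seed(1)  # arbitrary seed, added so the results are reproducible
d <- data.frame(x = runif(300, 0, 5))
d$y <- d$x + rnorm(300)
# Assign rows to train1, train2, test in rotation (100 rows each)
d$g <- rep(c('train1', 'train2', 'test'), length.out = nrow(d))
# Shift the train2 responses up by 5 to introduce a constant bias
d$y[d$g == 'train2'] <- d$y[d$g == 'train2'] + 5

plot(d$x, d$y, col = factor(d$g))
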
# Fit one model per training set
m1 <- lm(y ~ x, data = subset(d, g == 'train1'))
m2 <- lm(y ~ x, data = subset(d, g == 'train2'))

# Make predictions on the held-out test set
p1 <- predict(m1, newdata = subset(d, g == 'test'))
p2 <- predict(m2, newdata = subset(d, g == 'test'))

# Is it clear that m2 is biased? Not from correlation: a constant offset
# does not change it, so both models score about the same.
cor(p1, d$y[d$g == 'test'])
cor(p2, d$y[d$g == 'test'])
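
# Sketch (an addition, not from the original paste): Pearson correlation is
# unchanged when a constant is added to the predictions. Shifting p2 by 5
# (any other constant would do) should leave the correlation the same.
all.equal(cor(p2, d$y[d$g == 'test']), cor(p2 - 5, d$y[d$g == 'test']))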

# Is it clear that m2 is biased? Mean absolute error does show it: p2 is
# off by roughly the 5 units that were added to train2.
mean(abs(p1 - d$y[d$g == 'test']))
mean(abs(p2 - d$y[d$g == 'test']))
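
# A possible extra check (an addition, not from the original paste): the mean
# signed error keeps the direction of the miss, so a constant offset shows up
# as a value far from 0. `mean_signed_error` is a helper name made up here.
mean_signed_error <- function(pred, obs) mean(pred - obs)
mean_signed_error(p1, d$y[d$g == 'test'])  # should be close to 0
mean_signed_error(p2, d$y[d$g == 'test'])  # should be close to 5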