- Q1)
- # H = -(9/12) log2(9/12) - (3/12) log2(3/12)
- # = 0.8113
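The arithmetic in Q1 can be checked with a few lines of Python. This is only a sketch: the helper name `entropy` and the idea of passing raw class counts (9 and 3 out of 12) are assumptions made for illustration, not part of the original answer.

```python
import math

def entropy(counts):
    """Shannon entropy, in bits, of a list of class counts (hypothetical helper)."""
    total = sum(counts)
    # Zero counts are skipped: p * log2(p) tends to 0 as p tends to 0.
    return sum(-(c / total) * math.log2(c / total) for c in counts if c > 0)

# 9 of 12 samples in one class, 3 of 12 in the other, as in Q1.
print(round(entropy([9, 3]), 4))  # 0.8113
```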
- Q2)
- Cloudy, because:
- 1) visually, it matches the Rain outcome on the most datapoints (10 of 12)
- 2) visually, the "yes" branch for Cloudy is pure: all 3 cloudy days have Rain = yes
- 3) mathematically, it gives the lowest weighted entropy after the split, and therefore the highest information gain:
- weighted entropy of Rain after splitting on each attribute x, H(Rain | x):
- Temperature (split 25-27|28-29): 0.9080
- Temperature (split 25-26|27-29): 0.8755
- UV Index: 0.8333
- Humidity: 0.8333
- Cloudy: 0.5732 *
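Because the entropy of Rain before the split is the same for every candidate, the split with the lowest weighted entropy is also the one with the highest information gain. A minimal sketch of that selection, using only the values listed above (the dictionary name `split_entropy` is made up for illustration):

```python
# Weighted entropies copied from the list above; keys are just labels.
split_entropy = {
    "Temperature (25-27|28-29)": 0.9080,
    "Temperature (25-26|27-29)": 0.8755,
    "UV Index": 0.8333,
    "Humidity": 0.8333,
    "Cloudy": 0.5732,
}

# Lowest weighted entropy corresponds to highest information gain.
best = min(split_entropy, key=split_entropy.get)
print(best)  # Cloudy
```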
- Q3)
-                  Rain
-               Yes  No   Total  P(Cloudy)  E       P*E
- Cloudy  Yes   3    0    3      0.25       0       0
-         No    2    7    9      0.75       0.7642  0.5732
- Total         5    7    12
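- Weighted entropy H(Rain | Cloudy) = 0.25 * 0 + 0.75 * 0.7642 = 0.5732
- # Information Gain = H(Rain) - H(Rain | Cloudy) = 0.9799 - 0.5732 = 0.4067, where H(Rain) = 0.9799 for 5 yes / 7 no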
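The Q3 calculation can be reproduced from the contingency table alone. The sketch below assumes the counts shown in the table (Cloudy = yes: 3 rain / 0 no rain; Cloudy = no: 2 rain / 7 no rain); the names `entropy` and `table` are hypothetical and chosen just for this example.

```python
import math

def entropy(counts):
    """Shannon entropy, in bits, of a list of class counts (hypothetical helper)."""
    total = sum(counts)
    return sum(-(c / total) * math.log2(c / total) for c in counts if c > 0)

# Cloudy value -> (Rain = yes count, Rain = no count), taken from the table.
table = {"yes": (3, 0), "no": (2, 7)}

n = sum(sum(row) for row in table.values())               # 12 samples
rain_totals = [sum(col) for col in zip(*table.values())]  # [5, 7]

h_rain = entropy(rain_totals)
# Weight each branch's entropy by the fraction of samples in that branch.
h_rain_given_cloudy = sum(sum(row) / n * entropy(row) for row in table.values())
gain = h_rain - h_rain_given_cloudy

print(round(h_rain, 4), round(h_rain_given_cloudy, 4), round(gain, 4))
# 0.9799 0.5732 0.4067
```

The gain is measured against the entropy of Rain itself (5 yes, 7 no), which is why it uses 0.9799 rather than the 0.8113 from Q1.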