cdef struct SplitRecord:
    # Data to track the sample-splitting process.
    # This structure also stores the best split found so far.
    SIZE_t feature            # Which feature to split on.
    SIZE_t start
    SIZE_t end
    SIZE_t pos                # Split samples array at the given position,
                              # i.e. count of samples below threshold for feature.
                              # pos is >= end if the node is a leaf.
    double impurity
    double threshold          # Threshold to split at.
    double proxy_improvement  # Proxy for impurity improvement, used to speed up
                              # computation.
    double improvement        # Impurity improvement given parent node.

    # Use these to compare the current split stats with the best so far.
    SIZE_t best_feature
    SIZE_t best_pos
    double best_threshold
    double best_proxy_improvement
    # Updated only once, at the end, to save some computation.
    double best_improvement

    # Stats for the left partition.
    SIZE_t n_left
    double weighted_n_left

    # Stats for the right partition.
    SIZE_t n_right
    double weighted_n_right
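
The comments above say the current split stats are compared with the best so far, but the paste does not show that comparison. The following is a minimal sketch of how it might look, written in plain Python (which Cython also accepts) so it runs without compiling. The dataclass mirrors only the fields needed for the comparison; `update_best`, `demo`, and the candidate values are hypothetical names and data made up for illustration, not code from the original project.

```python
from dataclasses import dataclass
import math


@dataclass
class SplitRecord:
    # Subset of the struct's fields, mirrored as Python attributes (assumption:
    # only the fields needed for the best-so-far comparison are kept here).
    start: int = 0
    end: int = 0
    feature: int = 0
    pos: int = 0
    threshold: float = 0.0
    proxy_improvement: float = -math.inf
    best_feature: int = 0
    best_pos: int = 0
    best_threshold: float = 0.0
    best_proxy_improvement: float = -math.inf


def update_best(rec: SplitRecord) -> bool:
    """Hypothetical helper: copy the current split into the best_* fields
    when its proxy improvement beats the best seen so far."""
    if rec.proxy_improvement > rec.best_proxy_improvement:
        rec.best_feature = rec.feature
        rec.best_pos = rec.pos
        rec.best_threshold = rec.threshold
        rec.best_proxy_improvement = rec.proxy_improvement
        return True
    return False


def demo() -> SplitRecord:
    rec = SplitRecord(start=0, end=10)
    # Until a split is found, best_pos >= end marks the node as a leaf,
    # following the comment on `pos` in the struct.
    rec.best_pos = rec.end

    # Made-up candidates: (feature, pos, threshold, proxy_improvement).
    candidates = [(0, 3, 0.5, 1.2), (1, 6, 2.0, 3.4), (2, 4, -1.0, 2.1)]
    for feature, pos, threshold, proxy in candidates:
        rec.feature = feature
        rec.pos = pos
        rec.threshold = threshold
        rec.proxy_improvement = proxy
        update_best(rec)
    return rec


if __name__ == "__main__":
    best = demo()
    print(best.best_feature, best.best_pos, best.best_threshold)
```

In this sketch only the cheap proxy value is compared inside the loop. The exact `best_improvement` is left out on purpose, since the struct's comment says it is filled in only at the end.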