
They were one of the first widely adopted ensemble methods (built on bagged decision trees), and, like other bagging/boosting methods, they show fairly good empirical performance on a wide range of problems.
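A minimal sketch of the bagging idea in scikit-learn (the dataset and numbers here are made up for illustration; a real random forest also randomizes the features considered at each split, which is what max_features="sqrt" approximates below):

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=1000, random_state=0)
    rng = np.random.default_rng(0)

    # Bagging: train each tree on a bootstrap resample of the rows.
    trees = []
    for i in range(50):
        idx = rng.integers(0, len(X), size=len(X))  # sample rows with replacement
        tree = DecisionTreeClassifier(max_features="sqrt", random_state=i)
        trees.append(tree.fit(X[idx], y[idx]))

    # Predict by majority vote across the ensemble (binary labels 0/1).
    votes = np.stack([t.predict(X) for t in trees])
    y_hat = (votes.mean(axis=0) > 0.5).astype(int)
    print("training accuracy:", (y_hat == y).mean())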



Are they easy to interpret? Is it ok if I have a lot of columns in my data?


You won't find much understanding in the grown trees: by construction, each one sees a different angle of the data, and each split considers a different random subset of the available features.

You can, however, calculate importance scores for the features used. Breiman's original paper gives a good algorithm for doing this (in short: for each tree, permute the data along a feature in its out-of-bag sample and see how much worse the tree does).
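Roughly like this, as a sketch (toy data; Breiman's actual algorithm permutes within each tree's out-of-bag sample, while this permutes on a single shared held-out set for brevity):

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=2000, n_features=10,
                               n_informative=3, random_state=0)
    X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

    forest = RandomForestClassifier(n_estimators=100, random_state=0)
    forest.fit(X_tr, y_tr)
    baseline = forest.score(X_val, y_val)  # accuracy before permuting

    rng = np.random.default_rng(0)
    for j in range(X_val.shape[1]):
        X_perm = X_val.copy()
        # Permute one column to break its relationship with the labels.
        X_perm[:, j] = rng.permutation(X_perm[:, j])
        drop = baseline - forest.score(X_perm, y_val)
        print(f"feature {j}: accuracy drop = {drop:.4f}")

The bigger the accuracy drop when a feature is scrambled, the more the model was relying on it.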




It depends on the number of trees and their depth. Even a 20-tree forest could, with some effort, be written out on paper as a set of rules, and it helps if most of the trees are only depth 3 or so (see the sketch below).

But then, some models are pretty much impossible to understand or interpret.
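To make the rules-on-paper point concrete, here's a quick sketch with scikit-learn (iris and the hyperparameters are just for illustration): fit 20 depth-3 trees and dump each one as nested if/else rules.

    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.tree import export_text

    iris = load_iris()
    # A deliberately small forest: 20 shallow trees.
    forest = RandomForestClassifier(n_estimators=20, max_depth=3,
                                    random_state=0)
    forest.fit(iris.data, iris.target)

    # Print each tree as human-readable threshold rules.
    for i, tree in enumerate(forest.estimators_):
        print(f"--- tree {i} ---")
        print(export_text(tree, feature_names=list(iris.feature_names)))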


They handle high-dimensional spaces fairly well, which is one of their strengths. They are not all that interpretable, though.



