They were among the first widely used ensemble methods (based on bagged decision trees), and like other bagging/boosting methods they show fairly good empirical performance on a wide range of problems.
You won't gain much understanding by inspecting the grown trees: by design, each tree sees a different bootstrap sample of the data, and each node considers a different random subset of the available features.
You can, however, calculate importance scores for the features used. Breiman's original paper gives a good algorithm for doing this (in short: for each tree, permute an out-of-bag sample along one feature and see how much worse the predictions get).
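To make that concrete, here's a minimal sketch of the permutation-importance idea using scikit-learn trees (the dataset, tree count, and hyperparameters are arbitrary choices for illustration, not from any particular paper's setup):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
# shuffle=False keeps the 3 informative features in the first columns
X, y = make_classification(n_samples=500, n_features=8, n_informative=3,
                           shuffle=False, random_state=0)

n_trees = 50
n, d = X.shape
importances = np.zeros(d)

for _ in range(n_trees):
    # bootstrap sample for training; the rest is the out-of-bag (OOB) set
    boot = rng.integers(0, n, n)
    oob = np.setdiff1d(np.arange(n), boot)
    tree = DecisionTreeClassifier(max_features="sqrt", random_state=0)
    tree.fit(X[boot], y[boot])
    base_acc = tree.score(X[oob], y[oob])
    for j in range(d):
        # permute feature j on the OOB sample and measure the accuracy drop
        X_perm = X[oob].copy()
        X_perm[:, j] = rng.permutation(X_perm[:, j])
        importances[j] += base_acc - tree.score(X_perm, y[oob])

importances /= n_trees
print(importances.round(3))
```

Features the trees actually rely on show a large accuracy drop when scrambled; pure-noise features show a drop near zero. (`sklearn.inspection.permutation_importance` offers a similar, more polished version computed on a held-out set rather than per-tree OOB samples.)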
Depends on the number of trees and their depth. Even with 20 trees it could, with some effort, be written out on paper as rules, for example. And it helps if most of the trees are depth 3 or so.
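If you do want to write a small, shallow forest out as rules, scikit-learn can dump each tree as if/else text (dataset and forest size here are arbitrary, just to show the idea):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import export_text

X, y = load_iris(return_X_y=True)

# a small forest of shallow trees: 20 trees, each at most depth 3
forest = RandomForestClassifier(n_estimators=20, max_depth=3,
                                random_state=0).fit(X, y)

names = ["sepal_len", "sepal_wid", "petal_len", "petal_wid"]
for i, tree in enumerate(forest.estimators_[:2]):  # first two trees as a sample
    print(f"--- tree {i} ---")
    print(export_text(tree, feature_names=names))
```

Twenty depth-3 trees is about 20 short rule sets, which is tedious but genuinely readable on paper; at hundreds of deep trees that stops being true.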
But then, some models are pretty much impossible to understand or interpret.