I think the combination of a) more processed datasets than google/Facebook can sometimes have, b) smaller cpu/gpu resources that google/Facebook have, and c) the incentive to squeeze every last bit of log loss out, all encourages you to ensable all your independently trained models under a gradient tree.