Why (if so) was this not picked up for further research? I know that OATML did quite a lot of work on this front as well, and it seems the direction is still being worked on. Want to get your two cents on this approach.
BNNs certainly have their uses, but I think people in general found that it's a better use of compute to fit a larger model on more data than to try to squeeze more juice out of a given small dataset + model. Usually there is more data available; it's just somewhat tangentially related to the task at hand. LLMs are the ultimate example of how training on tons of tangentially-related data can ultimately be worthwhile for almost any task.
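For context on the compute tradeoff being discussed: here is a minimal sketch of one common BNN construction, a mean-field Gaussian variational linear layer trained by maximizing the ELBO. This is an illustrative example only (the class name, initialization values, and prior choice are my assumptions, not any specific OATML method):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BayesianLinear(nn.Module):
    """Linear layer with a diagonal Gaussian posterior over weights,
    sampled via the reparameterization trick (a minimal sketch)."""

    def __init__(self, in_features, out_features, prior_std=1.0):
        super().__init__()
        self.prior_std = prior_std
        # Variational parameters: a mean and a softplus-parameterized
        # standard deviation for every weight and bias.
        self.w_mu = nn.Parameter(0.1 * torch.randn(out_features, in_features))
        self.w_rho = nn.Parameter(torch.full((out_features, in_features), -5.0))
        self.b_mu = nn.Parameter(torch.zeros(out_features))
        self.b_rho = nn.Parameter(torch.full((out_features,), -5.0))

    def forward(self, x):
        w_std = F.softplus(self.w_rho)
        b_std = F.softplus(self.b_rho)
        # Reparameterization trick: w = mu + std * eps, eps ~ N(0, I).
        w = self.w_mu + w_std * torch.randn_like(w_std)
        b = self.b_mu + b_std * torch.randn_like(b_std)
        return F.linear(x, w, b)

    def kl(self):
        # Closed-form KL(q || p) between the diagonal Gaussian posterior
        # and an isotropic N(0, prior_std^2) prior.
        def kl_term(mu, std):
            return (torch.log(self.prior_std / std)
                    + (std ** 2 + mu ** 2) / (2 * self.prior_std ** 2)
                    - 0.5).sum()
        return (kl_term(self.w_mu, F.softplus(self.w_rho))
                + kl_term(self.b_mu, F.softplus(self.b_rho)))

# Training minimizes NLL + KL / N (the negative ELBO), and prediction
# averages several stochastic forward passes:
layer = BayesianLinear(16, 1)
x, y = torch.randn(32, 16), torch.randn(32, 1)
loss = F.mse_loss(layer(x), y) + layer.kl() / 1000  # assume N=1000 training points
preds = torch.stack([layer(x) for _ in range(10)]).mean(0)
```

The last line is where the compute argument bites: every prediction needs multiple stochastic forward passes, and every weight carries two parameters instead of one, so for a fixed budget you could instead have trained a roughly 2x larger deterministic model.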