We did something pretty similar in a nearby field of materials science, using support vector machines rather than linear regression and focusing on a less specific target (link below).
I'm not with the lab any more, but last I heard they were soliciting data from nearby, related subfields (our lab was doing mostly vanadium templated tellurides, but there are a lot of other things to explore in the space of inorganic–organic hybrids, and there are labs working on them). The real problem is data sharing: the most active collaborators were willing to give us their lab notebooks as long as we paid to digitize them, but it was very hard to convince other labs to digitize their own notebooks. TBH I think collecting the data is the hard part of a lot of this.
http://www.nature.com/nature/journal/v533/n7601/abs/nature17...