Sure, but there are grounded metrics there (the theorem is proved, not proved) that allow feedback. Same for games, almost the same for domains with cheap, approximate evaluators like protein folding (finding the structure is difficult, verifying it quite well is cheap).
For discovery and reasoning??? Not too sure.