If it's to be really 100% automated, I don't think there's much of a solution besides recreating the exact layout, using the very same font, and then superimposing the "OCR then re-rendered" text on the original scan to see if they're close enough. This means finding the various fonts, sizes, and styles (italic, bold, etc.).
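
To make the "close enough" check concrete, here is a minimal sketch of the comparison step only. It assumes the scan and the re-rendered page have already been binarized (1 = ink, 0 = background), aligned, and scaled to the same pixel grid, which is the hard part and is not shown. The names `ink_overlap` and `close_enough`, and the 0.9 threshold, are made up for illustration.

```python
from typing import List

Bitmap = List[List[int]]  # 1 = ink pixel, 0 = background (assumed pre-binarized)


def ink_overlap(scan: Bitmap, rendered: Bitmap) -> float:
    """Intersection-over-union of ink pixels in two same-sized bitmaps."""
    same_shape = len(scan) == len(rendered) and all(
        len(a) == len(b) for a, b in zip(scan, rendered)
    )
    if not same_shape:
        raise ValueError("bitmaps must have identical dimensions")

    both = 0    # pixels that are ink in both images
    either = 0  # pixels that are ink in at least one image
    for row_scan, row_rendered in zip(scan, rendered):
        for s, r in zip(row_scan, row_rendered):
            if s and r:
                both += 1
            if s or r:
                either += 1

    # Two blank pages are trivially identical.
    return 1.0 if either == 0 else both / either


def close_enough(scan: Bitmap, rendered: Bitmap, threshold: float = 0.9) -> bool:
    """Hypothetical acceptance test; the threshold is an arbitrary assumption."""
    return ink_overlap(scan, rendered) >= threshold


if __name__ == "__main__":
    scan = [[0, 1, 1, 0],
            [0, 1, 1, 0]]
    shifted = [[1, 1, 0, 0],   # same glyph, rendered one pixel too far left
               [1, 1, 0, 0]]
    print(ink_overlap(scan, scan))
    print(ink_overlap(scan, shifted))
    print(close_enough(scan, shifted))
```

A one-pixel shift already drops the score a lot in this toy case, which is why the font, size, and layout have to match so closely for the superimposition idea to work at all.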

But we'll get there eventually with AIs. We'll be able to tell one: "Find me the exact font, styles, etc., and re-render it using InDesign (or LaTeX or whatever you fancy), then compare with the source and see what you got wrong. Rinse and repeat."

We'll eventually have the ability to do just that.