Not OP, but based on their writeup it sounds like you do need to provide at least a target schema, i.e. the data you need or expect to extract from the unstructured input.
I assume that if the validation step doesn't find all of those data points, the record gets routed to an error state for further review.
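
To make that guess concrete, here's a rough sketch of what I have in mind. This is not OP's actual implementation. `TARGET_SCHEMA`, `ValidationResult`, `validate`, and the invoice field names are all made up for illustration, and I'm assuming the extraction step hands back a plain dict.

```python
from dataclasses import dataclass, field

# Hypothetical target schema: field name -> expected Python type.
TARGET_SCHEMA = {"invoice_number": str, "total": float, "due_date": str}


@dataclass
class ValidationResult:
    status: str  # "ok" or "needs_review"
    problems: list = field(default_factory=list)


def validate(extracted: dict, schema: dict = TARGET_SCHEMA) -> ValidationResult:
    """Check extracted data against the schema and pick a routing status."""
    problems = []
    for name, expected_type in schema.items():
        value = extracted.get(name)
        if value is None:
            problems.append(f"missing: {name}")
        elif not isinstance(value, expected_type):
            problems.append(f"wrong type: {name}")
    # Anything with problems goes to the (assumed) error/review state.
    return ValidationResult("needs_review" if problems else "ok", problems)
```

So a record with every field present comes back as `ok`, and one that only has `invoice_number` comes back as `needs_review` with the missing fields listed, which is what a human reviewer would then look at.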