For use in retrieval/RAG, an emerging paradigm is to not parse the PDF at all.
Instead, a multimodal foundation model converts visual representations ("screenshots") of the PDF pages directly into searchable vector representations.
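To make the "screenshots to searchable vectors" idea concrete, here is a minimal sketch of the scoring side only. It assumes the model emits one vector per image patch of a page and one vector per query token, and ranks pages with a late-interaction (MaxSim) score. That is my reading of how this family of models is typically searched, not something spelled out above. The vectors are hand-written toy numbers rather than real model output, and the names `maxsim_score` and `rank_pages` are made up for illustration.

```python
from math import sqrt


def normalize(v):
    """Scale a vector to unit length so dot products act like cosine similarity."""
    n = sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs, page_vecs):
    """For each query vector take its best-matching page vector, then sum those maxima."""
    return sum(max(dot(q, p) for p in page_vecs) for q in query_vecs)


def rank_pages(query_vecs, pages):
    """Return page ids ordered from best to worst MaxSim score."""
    scored = [(maxsim_score(query_vecs, vecs), page_id) for page_id, vecs in pages.items()]
    return [page_id for _, page_id in sorted(scored, reverse=True)]


# Toy stand-ins for patch embeddings of two page screenshots (assumed, not model output).
pages = {
    "page_1": [normalize(v) for v in ([1.0, 0.0, 0.0], [0.9, 0.1, 0.0])],
    "page_2": [normalize(v) for v in ([0.0, 1.0, 0.0], [0.0, 0.8, 0.6])],
}

# Toy stand-ins for the token embeddings of a text query.
query = [normalize(v) for v in ([0.0, 1.0, 0.1], [0.0, 0.5, 0.5])]

ranking = rank_pages(query, pages)
print(ranking)
```

In a real setup the toy vectors would come from the vision language model, and a search engine would do the ranking over many pages instead of a Python loop.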
Claude.ai handles tables very well, at least in my tests. It could easily convert a table from a financial document into a markdown table, among other things.
Paper: ColPali: Efficient Document Retrieval with Vision Language Models - https://arxiv.org/abs/2407.01449
Vespa.ai blog post: https://blog.vespa.ai/retrieval-with-vision-language-models-... (my day job)