A wide variety of PDFs (in both length and content) that can contain many different kinds of tables, all real-estate related with a lot of financial content. And I need to be able to run this on local models / software (no parsing as a service, no OpenAI, etc.).
Sorry, this will be very hard to do. You can't really segment the page images based on lines, as the table layouts probably vary. And with the floor plans and things... this data is very, very challenging.
I would suggest your best bet is waiting 2 years for the next version of LLaVA to come out, which may be able to interpret documents like these very accurately on device. Progress with LLaVA has been fast recently, but for now it's still a bit too inaccurate.
Here's just one example: https://www.totalflood.com/samples/residential.pdf (I struggle to get accurate data out of the Sales Comp section; basically all approaches mix up the properties).