In many use cases, like flagging documents for compliance issues or processing customer emails, it's challenging to manage this at the vendor level because end customers want the ability to apply business logic and run different analyses.
For data ingestion and mapping, I agree that in an ideal world, we would all have first-party API integrations. However, many industries still rely on PDFs and CSV files to transfer data.
Perhaps I'm misunderstanding the product offering here, but isn't this just throwing PDFs (which also have unparsable content like formulas, symbols, and large tables, even with OCR) at an LLM with structured outputs and running SQL queries?
Isn't it obvious that this is a problem the LLM providers will eventually solve themselves, including the ability to flag and apply business logic on top of the structured outputs?
I'm not sure if this is well known, but LLM providers are under huge pressure to turn a profit and will not hesitate to copy downstream wrappers out of existence rather than acquire them outright.
It's like selling grip tape for shovel handles and expecting the shovel makers not to release new shovels with it built in in the near future.
The shovel makers don't even need to do any market research or product development, and the buyers have no incentive to seek out or pay a dedicated third party for what their vendors will release for free or at lower cost, if that makes sense.
This is a valid question. A similar one is why subscription/recurring billing software exists when payment gateways could solve that problem themselves.
The elephant in the room is the complexity further down the funnel, which needs very specific focus and solutions.
A few that we experience as we’re building Trellis out:
1. Managing end-to-end workflows: integrating with data sources, automatically triggering new runs when new data comes in, and keeping track of the different business logic involved (e.g. I want to classify the type of each email and, based on that, apply different extraction logic)
2. Most out-of-the-box solutions only get you 95% of the way there. Customers want the ability to pass in their own data to improve performance and to specify their own ontology.
3. Building a good UI and API so that both technical and non-technical users can use the product.
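
To make point 1 concrete, here is a rough sketch of the "classify the email, then apply type-specific extraction" idea. This is not how Trellis implements it: the email types, keyword rules, regexes, and function names are all hypothetical, and a real system would presumably use a model rather than keyword matching for the classification step.

```python
import re
from typing import Callable, Dict


def classify_email(body: str) -> str:
    """Toy keyword classifier (assumption: stands in for a real model)."""
    text = body.lower()
    if "invoice" in text:
        return "invoice"
    if "refund" in text:
        return "refund_request"
    return "other"


def extract_invoice(body: str) -> dict:
    """Pull a hypothetical invoice number and dollar amount out of the text."""
    number = re.search(r"invoice\s*#?\s*(\d+)", body, re.IGNORECASE)
    amount = re.search(r"\$\s*([\d,]+(?:\.\d{2})?)", body)
    return {
        "invoice_number": number.group(1) if number else None,
        "amount": amount.group(1) if amount else None,
    }


def extract_refund(body: str) -> dict:
    """Pull a hypothetical order number out of a refund request."""
    order = re.search(r"order\s*#?\s*(\d+)", body, re.IGNORECASE)
    return {"order_number": order.group(1) if order else None}


def extract_other(body: str) -> dict:
    """No type-specific fields for unrecognized emails."""
    return {}


# Business logic lives in this mapping: one extractor per email type.
EXTRACTORS: Dict[str, Callable[[str], dict]] = {
    "invoice": extract_invoice,
    "refund_request": extract_refund,
    "other": extract_other,
}


def process_email(body: str) -> dict:
    """Classify first, then dispatch to the matching extraction logic."""
    kind = classify_email(body)
    return {"type": kind, "fields": EXTRACTORS[kind](body)}
```

Calling `process_email` on each new message as it arrives covers the "trigger a run on new data" part in miniature. The hard part described above is keeping this mapping manageable as the number of sources, types, and customer-specific rules grows.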