I'm a bit sad about the approach this and other engines take: they reimplement each part (PDF parser, etc.) in a way that makes it pretty much useless outside their specific engine.

If instead we had a PDF() class that did what RAGFlow is doing (dealing with all the different trade-offs of the different Python PDF engines such as pdfplumber), then we could easily adapt and improve it, and it would be useful for other projects as well.
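
To make the idea concrete, here is a rough sketch of what such a class could look like. It is not RAGFlow's code: the `PDF` class, the `backends` registry, and the fallback order are all hypothetical, and the only real library calls are `pdfplumber.open(...)` with `page.extract_text()` and `pypdf.PdfReader(...)` with `page.extract_text()`. A real version would also need to cover tables, layout, and OCR, which this one ignores.

```python
from typing import Callable, Dict, List

# A backend is any callable that takes a file path and returns one string per page.
Backend = Callable[[str], List[str]]


def _pdfplumber_backend(path: str) -> List[str]:
    # Imported lazily so the wrapper still loads when pdfplumber is not installed.
    import pdfplumber

    with pdfplumber.open(path) as pdf:
        return [page.extract_text() or "" for page in pdf.pages]


def _pypdf_backend(path: str) -> List[str]:
    # Imported lazily for the same reason as above.
    from pypdf import PdfReader

    return [page.extract_text() or "" for page in PdfReader(path).pages]


class PDF:
    """Hypothetical engine-agnostic wrapper (an illustration, not RAGFlow's API)."""

    # Registry of engine name -> backend; other projects could add their own.
    backends: Dict[str, Backend] = {
        "pdfplumber": _pdfplumber_backend,
        "pypdf": _pypdf_backend,
    }

    def __init__(self, path: str, engines=("pdfplumber", "pypdf")):
        self.path = path
        self.engines = list(engines)

    def pages(self) -> List[str]:
        """Try each engine in order and return the first successful result."""
        errors = []
        for name in self.engines:
            try:
                return self.backends[name](self.path)
            except Exception as exc:  # missing library, unknown engine, parse error
                errors.append(f"{name}: {exc!r}")
        raise RuntimeError("all PDF engines failed: " + "; ".join(errors))

    def text(self) -> str:
        """Join all page texts with newlines."""
        return "\n".join(self.pages())
```

With something like this, a project that prefers a different engine would only register one more function in `PDF.backends` and pass its name in `engines`, instead of forking the whole parser.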




It is open-source though. Just rip it off and make that PDF() class.


Love this counterpoint: "OSS means I can get everybody else to do work for me for free" => "OSS allows you to do the work yourself and share, buddy." PR or it didn't happen.


Each project has its own detailed requirements and scenarios, and we cannot demand that every project use the same library to implement similar functions.



