I've had great luck just base64'ing images and asking Qwen 2.5 VL to both parse ... | Hacker News

Hacker News new | past | comments | ask | show | jobs | submit

login

		cpursley 9 days ago \| parent \| context \| favorite \| on: DeepRAG: Thinking to retrieval step by step for la... I've had great luck just base64'ing images and asking Qwen 2.5 VL to both parse it to markdown and generate a title, description and list of keywords (seems to work well on tables and charts). My plan is to split PDFs into pngs first then run those against Qwen async, then put them into a vector database (haven't gotten around to that quite yet).

metadat 9 days ago [–]

How does the base64 output become useful / usable information to an LLM?

cpursley 7 days ago | [–]

No idea but Qwen 2.5 VL seems to understand it all quite well.

Join us for AI Startup School this June 16-17 in San Francisco!
Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact