I don't think chunking is the optimal approach either. I can imagine a future where embeddings have variable length, and comparing two variable-length embeddings would require a similarity metric more complex than a single cosine similarity. Vector databases will need to adapt to that reality.
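To make that concrete, here is one possible shape such a metric could take: a late-interaction score in the spirit of ColBERT's MaxSim, where each document is a variable-length matrix of vectors rather than a single vector, and each row on one side is matched to its best counterpart on the other. This is purely a sketch of the idea (the averaging variant below is my own simplification, not any database's native API):

```python
import numpy as np

def maxsim(a: np.ndarray, b: np.ndarray) -> float:
    """Similarity between two variable-length embeddings.

    `a` has shape (n, d) and `b` has shape (m, d); n and m may differ.
    Each row of `a` is matched to its most similar row of `b`, and the
    per-row maxima are averaged (ColBERT-style MaxSim sums them instead).
    """
    # Normalize rows so that dot products are cosine similarities.
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    sims = a @ b.T  # (n, m) matrix of pairwise cosine similarities
    return float(sims.max(axis=1).mean())
```

Note the score is not symmetric in its arguments, and the pairwise matrix costs O(n·m) per comparison instead of O(1) dot products, which is exactly the kind of adjustment vector databases would have to absorb.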