I literally have DistilBERT models that can do this exact task in ~14ms on an NVIDIA A6000. I don’t know the precise energy cost per inference, but it’s really fucking low.
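For a sense of what that looks like, here's a minimal timing sketch with Hugging Face transformers - not my actual pipeline; the checkpoint, batch size, and input text are all placeholders:

```python
# Minimal sketch: time a DistilBERT sequence classifier on GPU.
# The checkpoint, batch size, and inputs are illustrative assumptions.
import time
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "distilbert-base-uncased-finetuned-sst-2-english"  # stand-in public checkpoint
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name).to("cuda").eval()

batch = tok(["an example document to classify"] * 32,
            padding=True, truncation=True, return_tensors="pt").to("cuda")

with torch.inference_mode():
    model(**batch)                        # warm-up pass
    torch.cuda.synchronize()
    t0 = time.perf_counter()
    preds = model(**batch).logits.argmax(dim=-1)
    torch.cuda.synchronize()
    print(f"{(time.perf_counter() - t0) * 1e3:.1f} ms for a batch of 32")
    print(preds.tolist())                 # predicted class ids
```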
I use LLMs to help build training data since they're great at zero-shot, but once the training corpus is built, a small, well-trained model will smoke an LLM in classification accuracy and is way faster - which means you get scale at a low carbon cost.
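Roughly, that second stage looks like the sketch below (Hugging Face transformers + datasets), assuming the LLM's zero-shot labels have already been dumped to a CSV. File names, label count, and hyperparameters are placeholders:

```python
# Fine-tune a small classifier on LLM-labeled data.
# llm_labeled.csv / holdout.csv have columns: text,label (integer class ids).
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

ds = load_dataset("csv", data_files={"train": "llm_labeled.csv",
                                     "eval": "holdout.csv"})
tok = AutoTokenizer.from_pretrained("distilbert-base-uncased")
ds = ds.map(lambda b: tok(b["text"], truncation=True), batched=True)

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2)   # set num_labels for your task

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="clf",
                           num_train_epochs=3,
                           per_device_train_batch_size=32),
    train_dataset=ds["train"],
    eval_dataset=ds["eval"],
    tokenizer=tok,                             # enables dynamic padding per batch
)
trainer.train()
print(trainer.evaluate())                      # eval loss; add compute_metrics for accuracy
```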
In my personal opinion there is a moral imperative to use the most efficient model possible at every step in a system design. LLMs are one type of architecture, and while they do a lot well, you can use a variety of energy-efficient techniques to do discrete tasks much better.
Thanks for providing a concrete model to work with. Compared to GPT-3.5, the number you're looking for is ~0.04%. I pointed out the napkin math because 0.00000001% was so obviously wrong, even at a glance, that it was hurting your claim.
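For anyone following along, here's the back-of-the-envelope version that lands in that ballpark. Both inputs are rough assumptions, not measurements: roughly 300 W of board power on the A6000 for the ~14ms call, and the widely circulated ~3 Wh estimate per GPT-3.5 query:

```python
# Napkin math: energy per DistilBERT classification vs. one GPT-3.5 query.
small_model_joules = 300 * 0.014     # ~300 W for ~14 ms ≈ 4.2 J
gpt35_joules = 3.0 * 3600            # ~3 Wh per query ≈ 10,800 J (rough estimate)
print(f"{100 * small_model_joules / gpt35_joules:.3f}%")   # ≈ 0.039%, i.e. ~0.04%
```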
And, yes, purpose-built models definitely have their place even with the advent of LLMs. I'm happy to see more people working on that sort of thing.