Some context would be helpful. I've been working on a production architecture that scales an NLP model across a K8s cluster of high-compute nodes and needs to run thousands of inference calls per request. Both converting to C++ and using Go (already the majority of the API) have been considered.