I managed to run inference locally by installing the requirements and running app.py from the demo: https://huggingface.co/spaces/replit/replit-code-v1-3b-demo/...
It is very fast on my RTX 3070; VRAM usage goes to ~6.3 GB during inference.
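For anyone wanting to reproduce this, the steps above can be sketched roughly as follows. This is an assumption-laden sketch, not the official instructions: it assumes git-lfs is installed, the Space repo can be cloned with git, and app.py launches the demo server (Spaces typically use Gradio).

```shell
# Sketch of the reproduction steps described above.
# Assumes: git-lfs installed, CUDA-capable GPU, Python environment ready.
git clone https://huggingface.co/spaces/replit/replit-code-v1-3b-demo
cd replit-code-v1-3b-demo
pip install -r requirements.txt
python app.py  # starts the demo locally (downloads model weights on first run)
```

The first run will download the ~3B-parameter model weights, so expect some startup time before inference is available.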