Hacker News
Fine-Tuning VLMs for Data Extraction (nanonets.com)
6 points by OceanBreez 79 days ago | 2 comments



Is it possible to host multiple fine-tuned VLMs on a single machine, i.e., multiple models sharing the GPU(s) for inference?


Yeah. If you have a large enough GPU, you can use vanilla PyTorch to load as many models as required. Docker is a good option if you want isolated services. Triton, TensorRT, TorchServe, and Ray are also worth checking out, especially when you want to load multiple adapters on the same LLM/VLM backbone (see the sketch below). Is there anything specific you are looking to serve?
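To illustrate the multi-adapter pattern: a minimal sketch using Hugging Face PEFT, assuming each fine-tune was trained as a LoRA adapter on a shared backbone. The model ID and adapter paths here are hypothetical placeholders; a VLM backbone would use its own model class, but the adapter-switching calls are the same.

```python
# Minimal sketch: one backbone held in GPU memory, multiple LoRA
# adapters registered on it and switched per request.
# BASE_ID and the adapter paths are hypothetical placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_ID = "your-org/your-backbone"  # hypothetical checkpoint

base = AutoModelForCausalLM.from_pretrained(
    BASE_ID, torch_dtype=torch.float16, device_map="cuda"
)
tok = AutoTokenizer.from_pretrained(BASE_ID)

# Attach the first fine-tuned adapter, then register a second one
# on the same backbone without duplicating the base weights.
model = PeftModel.from_pretrained(base, "adapters/invoices", adapter_name="invoices")
model.load_adapter("adapters/receipts", adapter_name="receipts")

def extract(prompt: str, adapter: str) -> str:
    """Route a request to one fine-tune by activating its adapter."""
    model.set_adapter(adapter)
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        out = model.generate(**inputs, max_new_tokens=64)
    return tok.decode(out[0], skip_special_tokens=True)

print(extract("Extract the invoice total: ...", adapter="invoices"))
```

Serving frameworks expose the same idea natively; vLLM, for example, can serve multiple LoRA adapters over one backbone with per-request adapter selection, which avoids keeping N full model copies resident.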



