We are Rohit & Ayush. We created Portkey in March this year to tackle some of the challenges we had seen while building apps on GPT-3, 3.5, and 4, bringing DevOps principles to the scene to address them.
We believe a solid, performant, and reliable gateway lays the foundation for the next level of LLM apps. It reduces excessive reliance on any one company and puts the focus back on building, instead of spending time on the nitty-gritty of different providers and making them work together.
Features:
Blazing fast (9.9x faster) with a tiny footprint (~45kb installed)
Load balance across multiple models, providers, and keys
Fallbacks make sure your app stays resilient (see the config sketch after this list)
Automatic retries with exponential backoff come by default
Plug-in middleware as needed
Battle-tested over 100B tokens and millions of requests
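To make the fallback and load-balancing ideas concrete, here's a rough sketch of pointing a chat-completion call at a locally running gateway with a fallback config. The port, header name, and config fields here are assumptions for illustration; check the repo docs for the actual schema.

```typescript
// Hypothetical sketch: call a locally running gateway with a fallback config.
// The port, header name, and config fields are illustrative assumptions.
const config = {
  strategy: { mode: "fallback" }, // try targets in order until one succeeds
  targets: [
    { provider: "openai", api_key: process.env.OPENAI_API_KEY },
    { provider: "anthropic", api_key: process.env.ANTHROPIC_API_KEY },
  ],
};

const response = await fetch("http://localhost:8787/v1/chat/completions", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "x-portkey-config": JSON.stringify(config), // assumed header name
  },
  body: JSON.stringify({
    model: "gpt-4",
    messages: [{ role: "user", content: "Hello!" }],
  }),
});

console.log(await response.json());
```

A load-balancing config would presumably look similar, with weights per target instead of a strict fallback order.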
For the folks serious about gateways, separation of concerns, and TypeScript development: I'd love to hear your thoughts, and we're hungry for feedback!
Reach out to us at hello@portkey.ai or explore the project: https://github.com/portkey-ai/gateway
One plugin or feature that I would like to see in an AI gateway: *cache* per unique request. If I send the same request (system, messages, temperature, etc.), I'd have the option to pull it from a cache (if it was already populated) and skip the LLM generation. This is much faster and cheaper, especially during development and testing.
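For what it's worth, here's a minimal sketch of how that could work, independent of any particular gateway: hash the normalized request (model, messages, temperature, etc.) into a cache key and skip the LLM call on a hit. The function names and in-memory Map are my own illustration, not Portkey's API.

```typescript
import { createHash } from "node:crypto";

// Minimal sketch of per-request caching (illustrative, not Portkey's API).
// Identical requests hash to the same key, so repeated calls skip the LLM.
const cache = new Map<string, unknown>();

// Stringify with sorted object keys so logically identical requests
// produce the same cache key regardless of property order.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) return `[${value.map(stableStringify).join(",")}]`;
  if (value !== null && typeof value === "object") {
    return `{${Object.keys(value)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${stableStringify((value as Record<string, unknown>)[k])}`)
      .join(",")}}`;
  }
  return JSON.stringify(value);
}

async function cachedCompletion(
  request: { model: string; messages: unknown[]; temperature?: number },
  callLLM: (req: object) => Promise<unknown>,
): Promise<unknown> {
  const key = createHash("sha256").update(stableStringify(request)).digest("hex");

  const hit = cache.get(key);
  if (hit !== undefined) return hit; // cache hit: no LLM call, no cost

  const result = await callLLM(request); // cache miss: generate and store
  cache.set(key, result);
  return result;
}
```

In a real gateway you'd likely want a shared store with TTLs rather than a per-process Map, and you'd probably only cache deterministic requests (e.g. temperature 0).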