Same for me. The API itself is also very unstable.
Sometimes the same prompt finishes within a minute, sometimes our client times out after 10 minutes, and sometimes the API returns a 502 Bad Gateway after 5-10 minutes.
The very same request then runs fine within a few minutes if we retry it after a delay of 5 minutes.
The results also vary a lot, even with a temperature of 0.1.
Requests that need responses of more than ~2k tokens almost always fail, so the 8k cannot be used.
I am trying to use the API to classify tickets, which I thought the model would be a good choice for.
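
Since the same request usually goes through after waiting a while, a retry wrapper with a growing delay might at least paper over the 502s and timeouts. This is only a rough sketch: `call_with_retry`, `TransientError`, and the delay values are made-up names and numbers for illustration, not part of any client library. `send` stands for whatever function performs the actual API call and raises `TransientError` on a 502 or a client-side timeout.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for retryable failures (e.g. a 502 or a timeout)."""


def call_with_retry(send, max_attempts=4, base_delay=30.0, sleep=time.sleep):
    """Call send() and retry on TransientError with exponential backoff.

    The wait doubles after each failed attempt: base_delay, 2*base_delay, ...
    `sleep` is injectable so the waiting can be skipped in tests.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return send()
        except TransientError:
            if attempt == max_attempts:
                # Out of attempts: let the caller see the last failure.
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

It does nothing about the varying results or the long responses failing, it only covers the "works again a few minutes later" case.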