| | New Gemini 1.5 Pro (0801) top on LMSys leaderboard (twitter.com/lmsysorg) |
|
9 points by zopper 5 months ago | past
|
| | Gemma 2B: Scores better than GPT 3.5 Turbo (twitter.com/lmsysorg) |
|
8 points by FergusArgyll 5 months ago | past
|
| | Chatbot Arena Leaderboard: Gemini 1.5 Flash, Pro and Advanced Results (twitter.com/lmsysorg) |
|
57 points by tosh 7 months ago | past | 38 comments
|
| | GPT-2 Chatbots Top the Arena with +50 Elo, Strongest Model Ever (twitter.com/lmsysorg) |
|
2 points by georgehill 7 months ago | past
|
| | Gemini 1.5 Pro is now #2 on the leaderboard (twitter.com/lmsysorg) |
|
1 point by kmisiunas 8 months ago | past
|
| | Gemini 1.5 moves to #2 on the lmsys arena leaderboard (twitter.com/lmsysorg) |
|
3 points by petulla 8 months ago | past
|
| | Llama-3 now top-5 on the Arena leaderboard (twitter.com/lmsysorg) |
|
4 points by tosh 8 months ago | past
|
| | Lmsys Arena results are out: Claude 3 Opus behind turbo, ahead of classic GPT4 (twitter.com/lmsysorg) |
|
3 points by vitorgrs 10 months ago | past | 1 comment
|
| | Google's Bard shows big leap on LLM performance leaderboard (twitter.com/lmsysorg) |
|
132 points by mkmk 11 months ago | past | 91 comments
|
| | Mistral Medium reaches Claude-level performance on Chatbot Arena (twitter.com/lmsysorg) |
|
4 points by reissbaker 12 months ago | past | 2 comments
|
| | Gemini Pro, Mixtral (Mistral-Small) vs. GPT3.5 in LLM Arena (twitter.com/lmsysorg) |
|
2 points by Palmik on Dec 16, 2023 | past
|
| | Vicuna v1.5 series, featuring 4K and 16K context, based on Llama 2 (twitter.com/lmsysorg) |
|
168 points by tosh on Aug 3, 2023 | past | 43 comments
|
| | Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena (twitter.com/lmsysorg) |
|
20 points by weichiang on June 16, 2023 | past
|
| | Google PaLM 2 ranked 6th on the LLM benchmark in the wild (twitter.com/lmsysorg) |
|
1 point by weichiang on May 25, 2023 | past
|
| | Chatbot Arena: a crowd-sourced LLM leaderboard (twitter.com/lmsysorg) |
|
1 point by weichiang on May 12, 2023 | past | 1 comment
|
| | Chatbot Arena Leaderboard: OpenAI GPT-4 and Anthropic Claude Take the Lead (twitter.com/lmsysorg) |
|
2 points by MMMercy2 on May 10, 2023 | past
|
| | Fastchat-T5: 4x smaller but more powerful than Dolly-v2, commercial use ready (twitter.com/lmsysorg) |
|
7 points by zhisbug on April 28, 2023 | past | 1 comment
|
| | Vicuna releases its secret of finding available A100s on the cloud to train it (twitter.com/lmsysorg) |
|
4 points by zhwu on April 13, 2023 | past | 2 comments
|
| | State-of-the-Art Chatbot, Vicuna-7B, now runs on MacBook with GPU acceleration (twitter.com/lmsysorg) |
|
126 points by weichiang on April 6, 2023 | past | 84 comments
|
| | State-of-the-art open-source chatbot, Vicuna-13B, just released model weights (twitter.com/lmsysorg) |
|
271 points by weichiang on April 3, 2023 | past | 139 comments
|