
Indeed. But if GPT-4 is actually 1.76T as rumored, an open-weight 400B is quite the achievement even if it's only just competitive.



The rumor is that it's a mixture-of-experts (MoE) model, which can't be compared directly on parameter count like this, because most weights go unused on any given inference pass (only a few experts are routed per token). So it's plausible that a dense 400B is roughly the same "strength" as a 1.8T MoE.
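The back-of-the-envelope arithmetic here can be sketched as follows. Everything is rumored or illustrative: the 1.76T / 8-expert / 2-routed configuration is an unconfirmed rumor, and the helper function is just a rough model, not how any real MoE is actually partitioned:

```python
# Rough comparison of dense vs. mixture-of-experts (MoE) parameter usage.
# All numbers below are rumors/illustrative, not confirmed figures.

def moe_active_params(total_params, n_experts, experts_per_token, shared_frac=0.0):
    """Crude estimate of parameters touched per forward pass in an MoE model.

    shared_frac: fraction of total params shared by every token (attention,
    embeddings, etc.); the remainder is assumed split evenly across experts.
    """
    shared = total_params * shared_frac
    per_expert = (total_params - shared) / n_experts
    return shared + experts_per_token * per_expert

# Rumored GPT-4 shape (unconfirmed): ~1.76T total, 8 experts, 2 routed per token.
active = moe_active_params(1.76e12, n_experts=8, experts_per_token=2)
print(f"~{active / 1e9:.0f}B active params per token")  # ~440B

# A dense 400B model uses all 400B on every pass, so its per-token parameter
# count lands in the same ballpark as the rumored MoE's *active* parameters,
# which is why the two can be comparable in "strength" despite the 4x gap
# in total parameter count.
```

This ignores shared parameters (setting `shared_frac` above zero shrinks the gap further), but it shows why total parameter count alone overstates an MoE model's per-token compute.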



