
Do we know it is weighted differently? How are they composing the messages into a token stream embedding? How are they manipulating this vector in preprocessing or the first layer(s)?

Does this depend on the vendor and model?



