Hacker News

I’m not easily finding GPT-2 use cases. Any query guidance?



The GPT family of models shines above 100B parameters. Almost nobody uses GPT-2 today; it's too weak.

If you want a <1B-parameter model, you'd use BERT, which is bidirectional, or T5, which is easier to fine-tune on other tasks.
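To make the "bidirectional" distinction concrete, here's a minimal sketch (using NumPy; the function names are illustrative, not from any library) of the attention masks that separate the two architectures: a GPT-style causal mask lets each token see only earlier positions, while a BERT-style mask lets every token see the whole sequence.

```python
import numpy as np

def causal_mask(n):
    # GPT-style: token i may attend only to positions 0..i (lower triangle).
    return np.tril(np.ones((n, n), dtype=bool))

def bidirectional_mask(n):
    # BERT-style: every token attends to every position.
    return np.ones((n, n), dtype=bool)

# For a 4-token sequence, position 0 cannot see position 3 under the
# causal mask, but can under the bidirectional one.
print(causal_mask(4).astype(int))
print(bidirectional_mask(4).astype(int))
```

This one-directional view is why plain GPT-2 is awkward for classification-style tasks where BERT excels: a causal model's representation of an early token never incorporates the words that follow it.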


Something that immediately comes to mind is text summarization, though by now you'll be used to better results from GPT-3 or more recent models.
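A minimal sketch of how that looks in practice, assuming the Hugging Face transformers library: the GPT-2 paper's zero-shot summarization trick is to append "TL;DR:" to the article and let the model continue. The example article text below is made up for illustration.

```python
# Zero-shot summarization with GPT-2 via the "TL;DR:" prompt,
# assuming `transformers` (and a backend like PyTorch) is installed.
from transformers import pipeline, set_seed

set_seed(0)  # make sampling reproducible
generator = pipeline("text-generation", model="gpt2")

article = (
    "GPT-2 is a causal language model released by OpenAI in 2019. "
    "It was trained on WebText and comes in sizes from 124M to 1.5B parameters."
)
# Appending "TL;DR:" nudges the model to continue with a summary.
result = generator(article + "\nTL;DR:", max_new_tokens=40,
                   num_return_sequences=1)
print(result[0]["generated_text"])
```

Don't expect much from the 124M checkpoint; the trick works noticeably better at 1.5B, and modern instruction-tuned models beat both.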



