solomatov on Feb 17, 2019 | on: Microsoft’s New MT-DNN Outperforms Google BERT

>OpenAI GPT adapted idea of fine-tuning of language model for specific NLP task, which has been introduced in ELMo model.

The idea of transfer learning of deep representations for NLP tasks existed earlier, but nobody managed to make it work before ELMo.

If we are being pedantic, we can include the whole word2vec line of work too. It's a shallow form of transfer learning.

riku_iki on Feb 17, 2019

Yeah, but in the case of ELMo it was fine-tuning (training the pretrained language model and the task model together), not just transfer learning.

phowon on Feb 17, 2019

With ELMo, the pretrained weights are frozen. Only the scalars for ELMo layers are tuned (as well as the additional top-level model, of course).
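
To make that concrete, here is a minimal sketch of the layer-mixing step, assuming the formulation from the ELMo paper: a softmax over one learned scalar per layer, times a learned scale gamma. The function names, the toy activations, and the initial values are all hypothetical; in a real setup the layer outputs would come from the frozen biLM, and only `scalar_logits` and `gamma` (plus the task model) would receive gradient updates.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def scalar_mix(layer_outputs, scalar_logits, gamma):
    """Collapse per-layer vectors for one token into a single vector.

    layer_outputs: one vector per biLM layer (treated as frozen inputs).
    scalar_logits: one trainable scalar per layer, normalized by softmax.
    gamma:         one trainable scale applied to the mixed vector.
    """
    if len(layer_outputs) != len(scalar_logits):
        raise ValueError("need exactly one scalar per layer")
    weights = softmax(scalar_logits)
    dim = len(layer_outputs[0])
    return [
        gamma * sum(w * layer[i] for w, layer in zip(weights, layer_outputs))
        for i in range(dim)
    ]


# Hypothetical frozen activations for one token: 3 layers, 2 dimensions each.
frozen_layers = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# The only ELMo-side trainable parameters in this sketch.
scalar_logits = [0.0, 0.0, 0.0]  # equal logits give a uniform average
gamma = 1.0

mixed = scalar_mix(frozen_layers, scalar_logits, gamma)
print(mixed)
```

With equal logits the mix is a plain average of the layers; training shifts the weights toward whichever layers help the downstream task, while the language model that produced `frozen_layers` is left untouched.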
