riku_iki on Feb 17, 2019 | on: Microsoft’s New MT-DNN Outperforms Google BERT
Yeah, but in the case of ELMo it was fine-tuning (training the pretrained language model and the task model together), not just transfer learning.
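(For concreteness, a toy PyTorch sketch of the distinction being drawn here; the LSTM/Linear modules are illustrative stand-ins, not any real biLM or task model:)

    import torch
    import torch.nn as nn

    pretrained_lm = nn.LSTM(300, 512, num_layers=2)  # stand-in for a pretrained biLM
    task_head = nn.Linear(512, 5)                    # stand-in for the task model

    # Fine-tuning: gradients flow into the pretrained weights too,
    # so the LM and the task model are trained together.
    finetune_opt = torch.optim.Adam(
        list(pretrained_lm.parameters()) + list(task_head.parameters()),
        lr=1e-5)

    # Feature-based transfer: the pretrained weights stay frozen and
    # only the task model is updated.
    for p in pretrained_lm.parameters():
        p.requires_grad = False
    transfer_opt = torch.optim.Adam(task_head.parameters(), lr=1e-3)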
phowon on Feb 17, 2019
With ELMo, the pretrained weights are frozen. Only the per-layer scalar weights used to mix the ELMo layers are tuned (along with the additional top-level task model, of course).
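(A minimal sketch of that recipe, assuming PyTorch; ScalarMix here is an illustrative module written for this comment, not AllenNLP's actual API:)

    import torch
    import torch.nn as nn

    class ScalarMix(nn.Module):
        """Collapses L frozen layer outputs into one vector per token.
        Only w (per-layer scalars) and gamma are trainable: softmax(w)
        weights the layers and gamma rescales the weighted sum, as in
        the ELMo paper."""
        def __init__(self, num_layers):
            super().__init__()
            self.w = nn.Parameter(torch.zeros(num_layers))
            self.gamma = nn.Parameter(torch.ones(1))

        def forward(self, layers):
            # layers: (num_layers, batch, seq_len, dim), produced by the
            # biLM under torch.no_grad() so its weights never update
            s = torch.softmax(self.w, dim=0)
            return self.gamma * (s.view(-1, 1, 1, 1) * layers).sum(dim=0)

So only w, gamma, and the downstream task model receive gradients; the biLM itself never does.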