You should check out the VOLT paper, I think it would work well. It's a new tech... | Hacker News

rajansaini on Dec 6, 2021 | on: Fakelish – Fake English word generator

You should check out the VOLT paper; I think it would work well here. It's a recent technique for splitting a vocabulary into subwords while minimizing entropy. Those subwords could then be mixed and matched, maybe by a neural model, to get more natural-looking fake words.
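
To make the idea concrete, here is a rough toy sketch in Python. It is not VOLT itself (no optimal transport here), and the function names, the hand-picked vocabulary, and the entropy measure are all my own assumptions for illustration. It splits words into subword pieces, scores a vocabulary by the entropy of its piece distribution, and glues random pieces together into a fake word.

```python
import math
import random
from collections import Counter

def segment(word, vocab):
    """Greedy longest-match split; falls back to single characters."""
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in vocab or j == i + 1:
                pieces.append(word[i:j])
                i = j
                break
    return pieces

def entropy_per_char(words, vocab):
    """Shannon entropy (bits) of the piece distribution, divided by the
    average piece length. A toy score, assumed here for illustration."""
    counts = Counter(p for w in words for p in segment(w, vocab))
    total = sum(counts.values())
    h = -sum(c / total * math.log2(c / total) for c in counts.values())
    avg_len = sum(len(p) * c for p, c in counts.items()) / total
    return h / avg_len

def fake_word(words, vocab, n_pieces=3, rng=random):
    """Mix and match: sample pieces by corpus frequency and join them."""
    pieces = [p for w in words for p in segment(w, vocab)]
    return "".join(rng.choice(pieces) for _ in range(n_pieces))

words = ["walking", "talking", "walked", "talked"]
vocab = {"walk", "talk", "ing", "ed"}  # hand-picked, hypothetical vocabulary
print(segment("walking", vocab))
print(entropy_per_char(words, vocab))
print(fake_word(words, vocab, rng=random.Random(0)))
```

A real version would learn the vocabulary instead of hard-coding it, and a neural model could replace the uniform random sampling in `fake_word`.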

lioeters on Dec 6, 2021

Thank you for the reference. To save others a search, I believe this is the paper:

Vocabulary Learning via Optimal Transport for Neural Machine Translation - https://arxiv.org/abs/2012.15671

https://jingjing-nlp.github.io/volt-blog/

https://github.com/Jingjing-NLP/VOLT
