Hacker News

Very interesting approach, and intuitively it makes sense to treat language less as a sequence of words over time and more as a collection of words/tokens with meaning in their relative ordering.

Now I'm wondering what would happen if a model like this were applied to different kinds of text generation, like chat bots. Maybe we could build actually useful bots if they can attend to the entire conversation so far, plus additional metadata. Think customer service bots with access to customer data that learn to interpret questions, associate them with the customer's account information through the attention model, and generate useful responses.
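The "attention over the entire conversation" idea comes down to scaled dot-product attention: every query vector scores every token in the context, so nothing is forgotten the way it can be in a recurrent model. A minimal numpy sketch (the function names and toy data are mine, purely illustrative):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Scaled dot-product attention: each query attends over all keys.
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # (n_queries, n_keys) similarity scores
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V                  # weighted sum of value vectors

# Toy example: 4 token embeddings standing in for a whole conversation,
# plus extra "metadata" tokens could simply be appended to this matrix.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))  # hypothetical conversation embeddings
query = rng.normal(size=(1, 8))   # hypothetical query vector
out = attention(query, tokens, tokens)
print(out.shape)  # (1, 8)
```

Note that the context here is just a matrix of vectors, which is why account data or other metadata could in principle be attended to the same way as the conversation text.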




A "collection of words in a relative ordering" is "a sequence of words".


They mentioned tensor2tensor; how is this related to the other repo, seq2seq (https://github.com/google/seq2seq)? Which one is more general?


No doubt a holy grail for chat bots, but I'll believe it when I see it.


"Even without evidence, everyone should believe it will solve the problem, but I won't believe it will solve the problem until there's evidence."

Is that what you just said? :)


The harder problem, as usual, is getting enough high-quality training data for a particular problem domain.

Maybe as a POC we can try building a bot that generates relevant HN comments given the post and parent comments. Maybe I'm such a bot, how could I possibly know?





