
This is very interesting! How far does the 'missing generalization' impact go? I.e., your Nalgene grammar file (nice reference, didn't know about Nalgene) is used to generate both flat input strings and nested desired outputs, which you then use to train your network (if I understand correctly). This file seems to contain quite a lot of hand-written varieties like "please/plz/plox/...". After training, I assume the network is also capable of handling inputs not seen in the training data, like someone writing "please?!11"? If not, I don't really understand why you'd train a network in the first place: you've already put in the effort to create a grammar, so you might as well use that grammar to do the actual conversion to a tree, no?
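
For concreteness, here's roughly how I picture the generation step (a toy sketch with made-up rules and names, not the actual Nalgene format):

    import random

    # Toy grammar: each non-terminal (%name) maps to a list of possible
    # expansions; an expansion mixes terminals (plain strings) and
    # non-terminals. Made-up stand-in for the real Nalgene file.
    GRAMMAR = {
        "%start": [["%polite", "turn", "%state", "the", "%device"]],
        "%polite": [["please"], ["plz"], ["plox"], []],
        "%state": [["on"], ["off"]],
        "%device": [["lights"], ["heater"]],
    }

    def sample(symbol="%start"):
        """Expand a symbol, returning (flat_tokens, nested_tree)."""
        expansion = random.choice(GRAMMAR[symbol])
        tokens, children = [], []
        for part in expansion:
            if part.startswith("%"):
                sub_tokens, sub_tree = sample(part)
                tokens += sub_tokens
                children.append(sub_tree)
            else:
                tokens.append(part)
                children.append(part)
        return tokens, (symbol, children)

    flat, tree = sample()
    print(" ".join(flat))  # e.g. "plz turn off the lights"
    print(tree)            # nested tree used as the training target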

Totally not trying to be negative here, just trying to understand your workflow a bit better.




The words are turned into GloVe word vectors on the input side, so the network can handle a decent amount of variation in spelling and synonym use. Having synonyms defined in the Nalgene file helps the network accept vectors in a general region rather than at an exact point in space. It also encourages the network to learn the grammar of the input rather than rely on specific words, so it can handle words it hasn't seen before (good for names and places).
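
To illustrate the "general region" point, informal spellings sit close to their canonical forms in pretrained embedding space, so the encoder receives nearly the same input either way. A minimal sketch (the file path and example words are assumptions; the GloVe Twitter vectors are one set whose vocabulary covers slang like "plz"):

    import numpy as np

    def load_glove(path="glove.twitter.27B.100d.txt"):
        # Standard whitespace-separated GloVe format: word then floats.
        vecs = {}
        with open(path, encoding="utf-8") as f:
            for line in f:
                word, *values = line.split()
                vecs[word] = np.array(values, dtype=np.float32)
        return vecs

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    vecs = load_glove()
    # Assuming both tokens are in the vocabulary: the informal variant
    # should score much closer to "please" than an unrelated word does.
    print(cosine(vecs["please"], vecs["plz"]))    # relatively high
    print(cosine(vecs["please"], vecs["table"]))  # much lower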


Gotcha! Thanks for the explanation!
