LSTM: How to Train Neural Networks to Write Like Lovecraft (datastuff.tech)
10 points by strikingloo on June 24, 2019 | 4 comments



Hey guys, I'm the writer. As you can see from the post, I'm still very much learning.

What I'd like most from this site is for more experienced people to help me out with some of my questions.

Here they come:

- Can you use Batch Normalization (the one from tf.keras) on an LSTM layer, or will it break the model? (See the tf.keras sketch after this list for the kind of stack I mean.)

- How do you deal with extremely infrequent words if you do a word-based LSTM (with a one-hot encoding of each word in the corpus)? Do you remove them? Replace them? Cluster them? (See the <unk> replacement sketch after this list for the "replace" option I have in mind.)

- Do you think there's any other architecture that would've had better results (while still not taking too long to train)?
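
To make the Batch Normalization question concrete, this is the kind of stack I mean. It's a minimal tf.keras sketch with made-up sizes (seq_len, vocab_size and the unit counts are placeholders), and I honestly don't know whether the normalization helps or hurts here:

    import tensorflow as tf
    from tensorflow.keras import layers, models

    seq_len, vocab_size = 60, 5000  # placeholder sizes, just for the sketch

    model = models.Sequential([
        # First LSTM returns the full sequence so a second LSTM can be stacked on top.
        layers.LSTM(128, return_sequences=True, input_shape=(seq_len, vocab_size)),
        # BatchNormalization here only normalizes the layer's outputs,
        # not the recurrent state inside the LSTM cell.
        layers.BatchNormalization(),
        layers.LSTM(128),
        layers.BatchNormalization(),
        layers.Dense(vocab_size, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy")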
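
And for the infrequent-words question, the "replace them" option I picture would look roughly like this (an untested sketch; the min_count threshold and the <unk> token are just placeholders):

    from collections import Counter

    def replace_rare_words(tokens, min_count=5, unk_token="<unk>"):
        """Map words seen fewer than min_count times to a shared <unk> token,
        so the one-hot vocabulary only keeps frequent words plus <unk>."""
        counts = Counter(tokens)
        return [w if counts[w] >= min_count else unk_token for w in tokens]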


> Do you think there's any other architecture that would've had better results (while still not taking too long to train)?

Yes! A 5th- or 6th-order word-level Markov chain with smoothing.
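
Something along these lines; a rough sketch of the chain itself, leaving out the smoothing/backoff you'd want for unseen contexts:

    import random
    from collections import defaultdict

    def build_chain(tokens, order=5):
        """Map every run of `order` consecutive words to the words that followed it."""
        chain = defaultdict(list)
        for i in range(len(tokens) - order):
            chain[tuple(tokens[i:i + order])].append(tokens[i + order])
        return chain

    def generate(chain, seed, length=100):
        """Extend `seed` (a list of `order` words) by sampling observed successors."""
        out = list(seed)
        order = len(seed)
        for _ in range(length):
            successors = chain.get(tuple(out[-order:]))
            if not successors:  # unseen context; proper smoothing/backoff would handle this
                break
            out.append(random.choice(successors))
        return " ".join(out)

    # Usage: chain = build_chain(corpus_words, order=5)
    #        text  = generate(chain, seed=corpus_words[:5])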


Thanks! I hadn't thought of Markov chains since I was focusing on RNNs/LSTMs, but I think I'll try that for the next article and see how it goes.


> Do you think there's any other architecture that would've had better results (while still not taking too long to train)?

You could easily fine-tune a pretrained GPT-2 model on your Lovecraft dataset: https://github.com/minimaxir/gpt-2-simple
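
Roughly following the library's README (the "lovecraft.txt" path is a placeholder for your corpus file, and the model name may differ by library version):

    import gpt_2_simple as gpt2

    # Download the smallest pretrained GPT-2 checkpoint.
    gpt2.download_gpt2(model_name="124M")

    sess = gpt2.start_tf_sess()

    # Finetune on a plain-text file of the corpus.
    gpt2.finetune(sess, "lovecraft.txt", model_name="124M", steps=1000)

    # Sample from the finetuned model.
    gpt2.generate(sess)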

