LSTM: How to Train Neural Networks to Write Like Lovecraft (datastuff.tech)
10 points by strikingloo on June 24, 2019 | 4 comments



Hey guys, I'm the writer. As you can see from the post, I'm still very much learning.

What I'd like most from this site is for more experienced people to help me out with some of my questions.

Here they come:

- Can you use Batch Normalization (the one from tf.keras) on an LSTM layer, or will it break the model? (See the tf.keras sketch after this list for the kind of stack I mean.)

- How do you deal with extremely infrequent words if you do a word-based LSTM (with a one-hot encoding of each word in the corpus)? Do you remove them? Replace them? Cluster them? (See the <unk> replacement sketch after this list for the "replace" option I have in mind.)

- Do you think there's any other architecture that would've had better results (while still not taking too long to train)?
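
To make the Batch Normalization question concrete, this is the kind of stack I mean. It's a minimal tf.keras sketch with made-up sizes (seq_len, vocab_size and the unit counts are placeholders), and I honestly don't know whether the normalization helps or hurts here:

    import tensorflow as tf
    from tensorflow.keras import layers, models

    seq_len, vocab_size = 60, 5000  # placeholder sizes, just for the sketch

    model = models.Sequential([
        # First LSTM returns the full sequence so a second LSTM can be stacked on top.
        layers.LSTM(128, return_sequences=True, input_shape=(seq_len, vocab_size)),
        # BatchNormalization here only normalizes the layer's outputs,
        # not the recurrent state inside the LSTM cell.
        layers.BatchNormalization(),
        layers.LSTM(128),
        layers.BatchNormalization(),
        layers.Dense(vocab_size, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy")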
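
And for the infrequent-words question, the "replace them" option I picture would look roughly like this (an untested sketch; the min_count threshold and the <unk> token are just placeholders):

    from collections import Counter

    def replace_rare_words(tokens, min_count=5, unk_token="<unk>"):
        """Map words seen fewer than min_count times to a shared <unk> token,
        so the one-hot vocabulary only keeps frequent words plus <unk>."""
        counts = Counter(tokens)
        return [w if counts[w] >= min_count else unk_token for w in tokens]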


> Do you think there's any other architecture that would've had better results (while still not taking too long to train)?

Yes! A 5th- or 6th-order word-level Markov chain with smoothing.
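
Something along these lines; a rough sketch of the chain itself, leaving out the smoothing/backoff you'd want for unseen contexts:

    import random
    from collections import defaultdict

    def build_chain(tokens, order=5):
        """Map every run of `order` consecutive words to the words that followed it."""
        chain = defaultdict(list)
        for i in range(len(tokens) - order):
            chain[tuple(tokens[i:i + order])].append(tokens[i + order])
        return chain

    def generate(chain, seed, length=100):
        """Extend `seed` (a list of `order` words) by sampling observed successors."""
        out = list(seed)
        order = len(seed)
        for _ in range(length):
            successors = chain.get(tuple(out[-order:]))
            if not successors:  # unseen context; proper smoothing/backoff would handle this
                break
            out.append(random.choice(successors))
        return " ".join(out)

    # Usage: chain = build_chain(corpus_words, order=5)
    #        text  = generate(chain, seed=corpus_words[:5])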


Thanks! I hadn't thought of Markov chains since I was focusing on RNNs/LSTMs, but I think I'll try that for the next article and see how it goes.


> Do you think there's any other architecture that would've had better results (while still not taking too long to train)?

You could easily fine-tune a pretrained GPT-2 model on your Lovecraft dataset: https://github.com/minimaxir/gpt-2-simple
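
Roughly following the library's README (the "lovecraft.txt" path is a placeholder for your corpus file, and the model name may differ by library version):

    import gpt_2_simple as gpt2

    # Download the smallest pretrained GPT-2 checkpoint.
    gpt2.download_gpt2(model_name="124M")

    sess = gpt2.start_tf_sess()

    # Finetune on a plain-text file of the corpus.
    gpt2.finetune(sess, "lovecraft.txt", model_name="124M", steps=1000)

    # Sample from the finetuned model.
    gpt2.generate(sess)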

