Hey guys, I'm the author of the post. As you can see, I'm still very much learning.
What I want most from this site is for more experienced people to help me out with a few questions.
Here they are:
- Can you use Batch Normalization (the one from tf.keras) on an LSTM layer, or will it break the model? (There's a sketch of what I mean after this list.)
- How do you deal with extremely infrequent words in a word-based LSTM (with a one-hot encoding of each word in the corpus)? Do you remove them? Replace them? Cluster them? (There's a sketch of the "replace them" option after this list too.)
- Do you think any other architecture would have gotten better results while still not taking too long to train?
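
To make the first question concrete, here's a minimal sketch of the kind of model I mean. The vocabulary size and layer widths are made-up placeholders, not my actual setup; I just want to know whether putting `BatchNormalization` between the LSTM layers like this is sane:

```python
import tensorflow as tf

# Placeholder sizes, not my real values.
vocab_size, embed_dim = 10000, 128

model = tf.keras.Sequential([
    tf.keras.Input(shape=(None,)),                    # variable-length token sequences
    tf.keras.layers.Embedding(vocab_size, embed_dim),
    tf.keras.layers.LSTM(256, return_sequences=True),
    # This is the part I'm unsure about: does normalizing across the batch
    # here mess with the recurrent dynamics of the LSTMs around it?
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.LSTM(256),
    tf.keras.layers.Dense(vocab_size, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()
```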
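
And for the second question, this is roughly what I imagined the "replace them" option looks like: map every word below some count threshold to a single `<UNK>` token before one-hot encoding. The tiny corpus and the `min_count` value are just placeholders for illustration:

```python
from collections import Counter

# Placeholder corpus and threshold, just to show the idea.
corpus = ["the cat sat", "the dog sat", "a axolotl slept"]
min_count = 2

# Count word frequencies and keep only words that appear often enough.
counts = Counter(word for line in corpus for word in line.split())
vocab = {w for w, c in counts.items() if c >= min_count}

def replace_rare(line):
    """Replace every word not in the kept vocabulary with <UNK>."""
    return " ".join(w if w in vocab else "<UNK>" for w in line.split())

cleaned = [replace_rare(line) for line in corpus]
print(cleaned)  # rare words like "axolotl" become <UNK>
```

Is that the usual approach, or do people do something smarter with the rare words?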