It's not as easy as what you just described, especially on sequential data.
Sure, people already reuse embeddings from models trained on other tasks (think word2vec, the last layer of Inception, etc.), but this rarely performs as well as what this article shows.