Definitely depends on your math background (knowledge of analysis and linear algebra seems to be particularly helpful).
Witten's Data Mining is a very good beginner's book (has very little math, but lots of good explanations and discussion of real life issues).
Bishop's book is excellent, but it's easy to get lost if you don't have the mathematical background.
Duda, Hart & Stork's Pattern Classification is also very well organized and has one of the best first chapters of any machine learning book. But it too requires a mathematical background to be fully appreciated.
Hastie & Tibshirani's book is written by people with a statistics background, and is very mathematical. I haven't progressed beyond Chapter 2, and I'm working on improving my math skills before I get back into it.
--
For NLP, a very good intro is the NLTK book.
Jurafsky & Martin's book covers more NLP topics, but Manning and Schütze cover the statistical portions in more depth. I think you should just read both :D.