I've heard from multiple people teaching Haskell that it's much easier to grasp for people with no prior programming experience than for those with one or more languages under their belt. It makes sense on some level - core Haskell is about taking plain data and transforming it into something else. Much easier to explain than public static enterprise beans. I think that learning Java/Python/JavaScript/C#/PHP as a first language burdens you with too much baggage, while a lot of what Haskell teaches can be successfully applied regardless of what language you end up working with.
It's truly unfortunate that Haskell got the reputation of being mainly about monoids in the category of endofunctors, and zygohistomorphic prepromorphisms. The core principles are simple, solid, and lend themselves well to writing production code.
Uh, what? Learning Haskell covers a huge surface area, and many of the things you learn are applicable to tons of other languages. How does that make it a poor teaching language?
I'm interpreting "teaching language" to mean the first language students learn. How can one properly explain the need for monadic IO without first explaining how regular printf and the like work? Also, Haskell is quite scary to the beginner (although I personally find it much simpler and more consistent than Java and C++).
If it's (say) the third language they learn, then Haskell is absolutely perfect for the exact reasons you have stated. Personally, I actually learnt Haskell by accident (read SPJ's book about lazy functional language implementation, realized SPJ started Haskell, then learnt more from there).
That's only if you're used to imperative languages. Of course, even in functional land, a slimmed-down language like Scheme is probably better for teaching.