...a single Python process using gevent and pymongo can copy a large MongoDB collection in half the time that mongodump (written in C++) takes, even when the MongoDB client and server are on the same machine.
I would guess this is for the reason posted in the article: the Python code is multithreaded (sort of, gevent) but the C++ code is only single threaded.
Sure, but that seems to suggest a meta-reason: the standard utility included with the package isn't asynchronous [sort of orthogonal to threadedness] because it would be a PITA to write an asynchronous utility in C++.
...a single Python process using gevent and pymongo can copy a large MongoDB collection in half the time that mongodump (written in C++) takes, even when the MongoDB client and server are on the same machine.