If someone would pay me... I'd find the time. The code I released to do that analysis is a simple version. I actually have a better version that's much faster and more accurate:
1. It eliminates a lot of false positives (such as detecting that the white sky has been copied when it's simply white)
2. It automates the parameter tweaking that otherwise has to be done manually.
3. It's multithreaded.
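
To make point 1 concrete, here is a minimal sketch of one plausible way to avoid the white-sky false positive. It is not the released code or the better version; the function names, the block size, and the variance threshold are all assumptions made for illustration. The idea is to skip blocks with almost no texture before looking for identical blocks, since a flat white block will trivially match every other flat white block.

```python
from collections import defaultdict
from statistics import pvariance


def blocks(image, size):
    """Yield (row, col, pixels) for every size x size block of a 2D grayscale image."""
    height, width = len(image), len(image[0])
    for r in range(height - size + 1):
        for c in range(width - size + 1):
            pixels = tuple(
                image[r + dr][c + dc] for dr in range(size) for dc in range(size)
            )
            yield r, c, pixels


def find_duplicate_blocks(image, size=4, min_variance=1.0):
    """Return groups of block positions with identical pixels, ignoring flat blocks.

    min_variance is a hypothetical threshold: blocks whose pixel variance is
    below it (e.g. uniform white sky) are never reported as copies.
    """
    seen = defaultdict(list)
    for r, c, pixels in blocks(image, size):
        if pvariance(pixels) < min_variance:
            continue  # flat region: a match here says nothing about copying
        seen[pixels].append((r, c))
    return [positions for positions in seen.values() if len(positions) > 1]
```

Calling `find_duplicate_blocks(image, min_variance=0.0)` disables the filter, which shows the false positives it removes. Exact matching only works on uncompressed data; a real detector would need a tolerance for JPEG artifacts.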
I guess I should be cold calling Reuters etc. to see who wants to license the fast version.