
Shouldn't they be running each scan through Mechanical Turk multiple times to catch errors? I guess that would double or triple their costs.



This is the dirty secret of Mechanical Turk: much of the documentation says you're supposed to run each task through multiple times, but people don't, because at that point it becomes expensive enough that it's no longer an attractive option.

I work at a company that has considered it many times for various things, but between the Mechanical Turk API/tech being pretty terrible and the results being expensive and low quality, we always end up doing one of two things: getting a temp in for a day or two to sit in front of Excel and tidy up the data, or, for bigger processes, outsourcing to a data processing company in Bangladesh, where we have dedicated people on our account who sit in a shared Slack channel and whom we can train.


How expensive has Mechanical Turk gotten over the years, though? I thought running tasks multiple times to reduce human error, plus random spot checking, was standard practice as well.

What is the current cost per HIT compared with previous years?


The last time I looked at this, a few years ago, it was ~$0.10 per HIT, with 2-3 runs needed, and that was for very simple data processing. We have quite complex data processing requirements, with multiple interdependent fields and a UI, which would have increased the processing time, so I'd have guessed $1 total per item processed, plus extensive integration time.

Our outsourcing gives us far better communication and the ability to train the staff doing the processing over time, give them feedback on their performance, and help them get better. I don't know the figures, but I suspect it's a similar price with far better accuracy. We do have enough consistent work for this to make sense, though; if our demand were spikier, it might not.


My experience with MTurk is that 3 isn't enough runs if you need the data to be correct and can't afford to pay someone (who ISN'T from MTurk) to validate every entry.

We regularly ran into these two situations:

- All three workers got different answers

- Two of the three workers agreed on the wrong answer

I think five or more runs may be necessary for data transcription on MTurk.
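
To make that concrete, here's a rough sketch (my own illustration, nothing MTurk gives you) of majority-vote aggregation over N runs. It flags exactly those two failure modes, and shows why 2-of-3 agreement is a much weaker bar than 3-of-5:

    from collections import Counter

    def aggregate(answers, min_votes=3):
        """answers: one item's answers across N runs, e.g. 5 workers.
        Returns (answer, status); anything other than 'ok' should go
        to manual review."""
        counts = Counter(answers)
        top, top_n = counts.most_common(1)[0]
        if top_n >= min_votes:
            return top, "ok"
        if top_n == 1:
            return None, "no agreement"   # all workers differed
        return top, "weak majority"       # could still be the wrong answer

    # With 3 runs, accepting 2-of-3 agreement includes two workers
    # agreeing on the same wrong answer. With 5 runs and min_votes=3,
    # that failure needs three independent matching mistakes.
    print(aggregate(["A", "B", "C"]))            # (None, 'no agreement')
    print(aggregate(["A", "A", "B", "A", "C"]))  # ('A', 'ok')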


You should consider using qualifications / simplifying the requests.

The error rate I get for data entry tasks is around a 0.5%-1% discrepancy between double entries. If you use each worker's prior reliability to tie-break between who's right, it drops to a <0.1% error rate.
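
A minimal sketch of what I mean by the tie-break, with made-up names (illustrative, not our actual pipeline):

    def resolve(entry_a, entry_b, reliability_a, reliability_b):
        """Double entry: two workers transcribe the same field.
        reliability_* is each worker's historical accuracy (0..1),
        e.g. their past agreement rate with verified answers."""
        if entry_a == entry_b:
            return entry_a  # the ~99%+ case in practice
        # Discrepancy: trust the historically more reliable worker.
        return entry_a if reliability_a >= reliability_b else entry_b

    print(resolve("42 Elm St", "42 Elm St", 0.98, 0.91))   # agreement
    print(resolve("42 Elm St", "42 Helm St", 0.98, 0.91))  # tie-break -> A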


Does the MTurk API let you identify, rank, and exclude workers? By identifying, I mean getting some common key across all of a given worker's submissions, etc.
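
i.e. something like this, if I'm reading the boto3 MTurk client docs right (the HIT, qualification, and worker IDs below are made up):

    import boto3

    # Needs real AWS credentials and IDs to actually run.
    mturk = boto3.client("mturk", region_name="us-east-1")

    # Identify: every assignment carries a stable WorkerId, so all of a
    # worker's submissions share that key.
    resp = mturk.list_assignments_for_hit(
        HITId="YOUR_HIT_ID", AssignmentStatuses=["Submitted", "Approved"])
    for a in resp["Assignments"]:
        print(a["WorkerId"], a["AssignmentId"])

    # Rank: attach a score to a worker via a custom qualification type,
    # then require a minimum score in future HITs' qualification requirements.
    mturk.associate_qualification_with_worker(
        QualificationTypeId="YOUR_QUAL_TYPE_ID", WorkerId="SOME_WORKER_ID",
        IntegerValue=87, SendNotification=False)

    # Exclude: block a worker from all of your future HITs.
    mturk.create_worker_block(WorkerId="SOME_WORKER_ID",
                              Reason="High error rate on double entry")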


I mean, this is an issue in any annotation exercise. Most annotation work heads south due to a failure to design a discrete and exhaustive workflow/classification scheme up front.


yep, and then using this type of analysis to determine who is good and who is not: https://en.wikipedia.org/wiki/Inter-rater_reliability
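
The simplest version of that analysis, as an illustrative sketch of my own: score each worker by how often they agree with the per-item majority answer, then rank or exclude on that score.

    from collections import Counter, defaultdict

    def worker_agreement(labels):
        """labels: {item_id: [(worker_id, answer), ...]} for items
        answered by several workers. Returns each worker's rate of
        agreement with the per-item majority answer -- a crude
        inter-rater reliability score for ranking/excluding workers."""
        agree = defaultdict(int)
        total = defaultdict(int)
        for answers in labels.values():
            majority = Counter(a for _, a in answers).most_common(1)[0][0]
            for worker, answer in answers:
                total[worker] += 1
                agree[worker] += (answer == majority)
        return {w: agree[w] / total[w] for w in total}

    print(worker_agreement({
        "item1": [("w1", "A"), ("w2", "A"), ("w3", "B")],
        "item2": [("w1", "C"), ("w2", "C"), ("w3", "C")],
    }))  # {'w1': 1.0, 'w2': 1.0, 'w3': 0.5}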



