I decided that I'd drive my data pipeline in Python (it seemed another good thing for me to learn, and after all, sabbaticals really are about learning stuff). With my student, Elaine Angelino, as my Python tutor and some web surfing, I had a nice collection of tools (I'd like to thank Mitch Garnaat for boto and Jen Harvey for turkpipe).
For those of you unfamiliar with Mechanical Turk, there are two kinds of users: Requesters (those of us who have stuff we want done) and Workers (people who want to do stuff). I would primarily be a Requester and would be relying on Workers to classify my web pages. The units of work that Workers do are called HITS, Human Intelligence Tasks. Requesters indicate how much they are willing to pay for each HIT and what kinds of qualifications they want their Workers to have. Workers select HITS for which they are qualified.
My Mechanical Turk dabbling began with a few small data sets that I'd labeled myself. After having my IRB tell me that my use of the Turkers did not constitute research on people (which I knew, but I had to ask anyway), I nervously sent off my first job. I was amazed. At a whopping five cents per HIT, my 300 HITS were completed in about 10-15 minutes. And the accuracy was pretty good. I submitted each page three times and compared the best 2 of 3 classifications by the Turkers to my own hand labeling. Our agreement was roughly 80% and the points of disagreement were pretty consistent.
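For the curious, the best-2-of-3 comparison can be sketched in a few lines of Python. This is only an illustration, not my actual pipeline code: the function names, page ids, and labels ("news", "blog", "shop") are all made up for the example, and I'm assuming a page with no majority label simply counts as a disagreement.

```python
from collections import Counter


def majority_label(labels):
    """Return the label given by a strict majority of Workers, or None if there is none."""
    label, count = Counter(labels).most_common(1)[0]
    return label if count > len(labels) / 2 else None


def agreement_rate(worker_labels, hand_labels):
    """Fraction of pages whose majority label matches the hand label."""
    matches = sum(
        1
        for page, hand in hand_labels.items()
        if majority_label(worker_labels[page]) == hand
    )
    return matches / len(hand_labels)


# Hypothetical data: three Worker classifications per page, plus my own label.
worker_labels = {
    "page1": ["news", "news", "blog"],
    "page2": ["blog", "blog", "blog"],
    "page3": ["shop", "news", "shop"],
    "page4": ["news", "blog", "shop"],  # no majority, counted as a disagreement
    "page5": ["blog", "news", "news"],
}
hand_labels = {
    "page1": "news",
    "page2": "blog",
    "page3": "news",
    "page4": "shop",
    "page5": "news",
}

rate = agreement_rate(worker_labels, hand_labels)
print(f"Agreement with hand labels: {rate:.0%}")
```

Requiring a strict majority means a three-way split yields no label at all rather than an arbitrary pick, which seemed like the safer choice for a sketch like this.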
I submitted my second batch of HITS and not only did I get immediate turnaround, but it turned out that one of my pages wasn't rendering correctly (it was clobbering the Amazon Mechanical Turk header, so the Workers could not accept the HIT and therefore couldn't work on it). All three Turkers to whom it had been assigned sent me a note. Each note was polite and explained what was happening. Had they not told me, all I would have known is that some of my HITS hadn't been completed, and perhaps I would have been smart enough to log in as a Worker and check them out (but perhaps not). I was really impressed -- these people who were doing some tasks for a nickel a shot all took the time to tell me there was a problem. I was truly grateful (and told them so). Some even replied to my thank-you to let me know they'd be happy to test out other HITS.
I'm still tweaking my HITS a bit, but I am overwhelmingly happy with my Turkers. In less time than it took me to write this blog post, they had already processed the HITS that didn't work before (there were about 15 of them). I may just have to figure out other research tasks for which Turkers can be helpful.
I've been meaning to try Mechanical Turk ever since I saw this paper:
UMass CS TR 2011-44: AutoMan: A Platform for Integrating Human-Based and Digital Computation, with Dan Barowy and Andrew McGregor. http://www.cs.umass.edu/~emery/pubs/AutoMan-UMass-CS-TR2011-44.pdf
Cheers,
Muli
Try it! It's available for download at www.automan-lang.org. Version 0.2 coming soon!