A better way to interact with AMT from the command line

Amazon Mechanical Turk (AMT) is a useful thing. Interacting with it can be a giant pain.

My “standard” approach is to use web2py running on top of Google AppEngine to serve ExternalQuestion HITs. This gives me quite a bit of control over the Turker’s experience and lets me collect useful data, like when they click on each button in the webapp. Although my current project doesn’t use this click history, a related, later project will. It also lets me do fun things like use a bit of JavaScript to figure out the distribution of screen widths across workers so that I can optimize their viewing experience.

[Figure: width histogram]

Anything under 700 pixels is fair game.

But that’s not what this post is about. I’d like to eventually use boto to build the control of AMT directly into the webapp itself. For now, I’m using the command line tools (CLT) that Amazon provides. The CLT is an ugly hack that exposes the Java API through a bunch of shell scripts. I have written my own wrapper around their scripts that enforces a certain amount of sanity. You can get the scripts over on GitHub.

There isn’t really any documentation beyond the scripts themselves. The idea is that you create an amt-script directory where the new-and-improved scripts live. Under that directory, you create several exp# directories that hold the info you need for each experiment. Even-numbered ones are for production runs; odd-numbered ones are for sandbox runs. Once you get a sandbox run working, you run buildGoLiveExp.sh, which copies the staged experiment to a new go-live experiment. It’s a bit of a hack at the moment, but it works for me. I like it because it gives me an audit trail for each thing I run on AMT. Feel free to use the scripts yourself. (Or use them as inspiration for something better that you can write yourself!)
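To make the numbering convention concrete, here is a minimal Python sketch. It is not part of the scripts on GitHub, and it does not reproduce what buildGoLiveExp.sh does internally. The function names `list_experiments` and `next_number` are hypothetical, and the rule for picking the next free number (the smallest number of the right parity above every existing one) is an assumption made purely for illustration.

```python
import re
from pathlib import Path

# Matches directory names such as exp1, exp2, exp17.
EXP_DIR = re.compile(r"^exp(\d+)$")


def list_experiments(root):
    """Map each exp# directory number under root to 'production' or 'sandbox'."""
    experiments = {}
    for entry in Path(root).iterdir():
        match = EXP_DIR.match(entry.name)
        if entry.is_dir() and match:
            number = int(match.group(1))
            # Convention from the post: even = production, odd = sandbox.
            experiments[number] = "production" if number % 2 == 0 else "sandbox"
    return dict(sorted(experiments.items()))


def next_number(experiments, kind):
    """Pick the next unused number for a run of the given kind.

    Assumed policy (not taken from the real scripts): go one past the
    highest existing number, then bump by one more if the parity is wrong.
    """
    wanted_parity = 0 if kind == "production" else 1
    candidate = max(experiments, default=0) + 1
    if candidate % 2 != wanted_parity:
        candidate += 1
    return candidate
```

With exp1, exp2, and exp3 already present, `next_number(list_experiments("amt-script"), "production")` would suggest exp4 for the go-live copy, which keeps the even/odd audit trail intact.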
