Hello everyone,

My team has a big series of HITs we plan to post on Amazon Mechanical Turk. Before we do, we hope to get some feedback from potential turkers on this forum about how we plan to go about it, the clarity of the instructions, and so on.

We are creating a corpus of post-1990 American English with many kinds of annotation added to it. Linguistics students have added some annotations, automatic processing has added others, and so on. The corpus is used by researchers all over the world who study natural language processing, linguistics, and other topics. We want to do a big experiment where turkers add word meanings: given a word, a sentence from the corpus where the word is used, and a list of meanings, the turker selects the meaning that fits. We're almost ready to go, but we hope to get feedback on the process we propose and on our instructions before we launch.

First, the process. The name of the HITs will be "Do you know what this word means? (For American English Word Mavens)". We will first do a trial run of a few words, 100 sentences each, where the title will indicate it is a trial run, as follows: "Do you know what this word means? (Trial Run; For American English Word Mavens)". The trial run is to make sure we can process the HITs quickly and accurately, and that everything is working, before we launch the big series of HITs. Does this seem like a good idea?

Second, the instructions describe the task, a bonus schedule we came up with to encourage the same turkers to do many HITs, and briefly what the data will be used for. Each HIT has only 10 sample sentences to judge, but in total for each word we have 1000 sentences (100 HITs). By making the HITs small, turkers can stop doing HITs whenever they want to and we'll still be able to get a lot of data. By offering bonuses, we want to reward turkers who do a lot of HITs of the same word, because more data from the same person is good for the corpus. Does this seem like a sensible strategy?

Third, we will have a delay of no more than a week after the HITs expire while we do quality checking and compute bonuses, which is explained in the instructions. The quality checking is so we can refuse work from obvious spammers, and the bonus computation is so that we can properly reward people who do a lot of HITs. Does that seem reasonable?

Below are the instructions. We welcome comments, questions, feedback, etc.

masc-word-sense team

-------------------------------------------------------------------------------

Title: Do you know what this word means? (Trial Hits; for American English Word Mavens)

For the 10 sentences in this HIT, select the best meaning of the word in boldface. Each sentence is followed by the same list of meanings to choose from. There are *100 HITs* for this *same word, same list of meanings*.

The data collected from these HITs will be used for research on how word meaning varies with context, and will become part of an open resource for linguistic research. There are a total of 45 words.

Because we want you to do as many HITs per word as possible, we will give bonuses for larger quantities of HITs we approve:
1) if you complete 10 HITs for a word, you get a 0.01/HIT bonus for HITs #10-#39;
2) if you complete 40 HITs for a word, you get a 0.02/HIT bonus for HITs #40-#100.

WARNING: Before we approve any HITs for a given word, we will do quality checking; we will not approve poor-quality HITs. We apologize in advance that there will be a delay in payment while we quality check and while we compute the bonuses. We will take no longer than a week from the time that we let the HITs expire, or from the time that all HITs for all words have been completed.
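One possible reading of the bonus schedule in the instructions, as a minimal sketch rather than the team's actual bonus script. It assumes the two tiers add up (HITs #10-#39 earn the lower rate, HITs #40-#100 earn the higher rate), that HIT #10 itself earns the bonus, and that every counted HIT was approved. The function name `bonus_hundredths` is hypothetical, and amounts are kept as whole hundredths (0.01 = 1) to avoid rounding issues.

```python
def bonus_hundredths(approved_hits: int) -> int:
    """Bonus for one worker on one word, in hundredths (0.01 -> 1).

    Assumed reading of the schedule (not confirmed by the team):
      HITs #10-#39  earn 0.01 each
      HITs #40-#100 earn 0.02 each
    """
    if not 0 <= approved_hits <= 100:
        raise ValueError("each word has at most 100 HITs")
    # Number of approved HITs that fall in the range #10..#39.
    tier1 = max(0, min(approved_hits, 39) - 9)
    # Number of approved HITs that fall in the range #40..#100.
    tier2 = max(0, approved_hits - 39)
    return tier1 * 1 + tier2 * 2


if __name__ == "__main__":
    for n in (9, 10, 39, 40, 100):
        print(n, "approved HITs ->", bonus_hundredths(n) / 100)
```

Under this reading, a worker who completes all 100 HITs for a word gets 30 HITs at the lower rate and 61 at the higher rate, on top of the base payment per HIT.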
Hi masc-words, welcome to this forum, and thanks for sharing your HITs. As I am from India, I suppose I cannot do anything in this HIT.
Sure, here's a sample sentence with the word "quiet" in boldface, and the six meanings for "quiet" below. One HIT would have ten sentences like this, and the task is to select one meaning for each sentence, then submit the HIT.

Sentence: On a quiet news day, the Washington Post leads with a scoop about Boeing's concealment of fuel tank problems similar to the ones that may have caused the crash of TWA Flight 800.

Meanings:
1) characterized by an absence or near absence of agitation or activity
2) free of noise or uproar; or making little if any sound
3) not showy or obtrusive
4) in a softened tone
5) (of a body of water) free from disturbance by heavy waves
6) of the sun characterized by a low level of surface phenomena like sunspots
I'd be interested in HITs like this. I'm an avid reader and enjoy associating words with different situations or instances. The sentence and the given choices seem clear and in a nice format. If you want the same people to work on the HITs, maybe there's a way you can set up a qualification: a "practice" run, then a qualification given to the workers you pick, then the subsequent HITs made available only to that selected set. I have only been a part of mTurk for a week now, but it seems you will have lots of willing participants. Good luck.
Hit like this i work good on. I make money time on this please post more now. I have over 9000 hit done with only 500 rejection. thake
Yeah, I agree with him. You can go over to another forum, one inhabited by highly literate but equally uneducated Western trolls. Please decide if you want to deal with humans or trolls.
Thanks for the advice. I found the forum by googling for good tips for using AMT -- do you have any suggestions of other forums?
Of course, you are much worse than either species. That is very evident. Trolls couldn't care any less, and a 'normal' human wouldn't indulge in racial profiling. Gimme a break, like I want to interact with you. I wouldn't dare touch you with a barge pole.
Ignoring the random chatter, I find the experiment interesting and something I'd enjoy participating in. Will it be available for participants outside the United States? Your strategies and plans seem well devised. If you want to address a smaller audience, though, I would suggest you post a series of test runs and select the best workers for a private qualification.
Thanks for the suggestion. I'm not sure we can implement it given the number of workers we would need to recruit to get the entire set of data done. It's a lot of data. masc-words
If my math is correct, you should have 4500 HITs. Considering the job is pretty easy, I'd say 10 active people can take care of that in less than 12 hours, though I'd be impressed if they're still there after 6.
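For anyone who wants to check that arithmetic, here is a rough sketch. It takes the figures quoted in the thread (45 words, 100 HITs per word, 10 sentences per HIT, 10 workers, 12 hours) and assumes each HIT is completed by exactly one worker, which the instructions do not state. The function name `workload` is hypothetical.

```python
def workload(words: int, hits_per_word: int, sentences_per_hit: int,
             workers: int, hours: float) -> dict:
    """Back-of-the-envelope workload figures.

    Assumes each HIT is done once, by a single worker, and that
    the work is spread evenly over the workers and the hours.
    """
    total_hits = words * hits_per_word
    hits_per_worker = total_hits / workers
    hits_per_worker_hour = hits_per_worker / hours
    return {
        "total_hits": total_hits,
        "total_sentences": total_hits * sentences_per_hit,
        "hits_per_worker": hits_per_worker,
        "hits_per_worker_hour": hits_per_worker_hour,
        # Time budget a worker would have for one HIT at that pace.
        "seconds_per_hit": 3600 / hits_per_worker_hour,
    }


if __name__ == "__main__":
    for key, value in workload(45, 100, 10, workers=10, hours=12).items():
        print(key, "=", value)
```

With those inputs, each of the 10 workers would need to finish 450 HITs, which works out to about a minute and a half per 10-sentence HIT sustained for the full 12 hours.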
I would be interested in completing these hits. I agree with acustic on the series of test runs. If you have people who don't understand English very well completing this type of task, it can affect your data. And I don't think that you would have too much trouble finding people to complete your hits. They tend to go by quickly.