Abstract

We investigate the use of Amazon Mechanical Turk to create translations from English to Haitian Creole, with the aim of producing a bilingual corpus for statistical machine translation. In different experiments we offer varying amounts of money for the same translation tasks. The results so far show no clear correlation between payment and translation quality. Almost all translations overlap substantially with the output of online translation tools, indicating that workers often did not translate the sentences themselves.

1 Introduction

Our group is currently developing an English↔Haitian Creole translation system for use in the earthquake-stricken region of Haiti. One current task is the rapid production of a bilingual English↔Haitian Creole corpus of medical dialogue, to be used for training a statistical machine translation system. Some native Haitian Creole speakers have volunteered to help with the translations, and we also intend to use professional translators to support the effort. Amazon Mechanical Turk (AMT) is an interesting alternative in this setting, as it would be cheaper than hiring professional translators. This is particularly relevant for an English↔Haitian Creole system, whose commercial potential is likely limited.

A major concern in using AMT for NLP tasks, and for translation in particular, is the quality of the resulting data and the availability of workers with knowledge of Haitian Creole. The experiments presented in this article address these concerns and evaluate the translations produced via Amazon Mechanical Turk against those of professionals and unpaid volunteers. We investigate the overall quality of the translations produced and compare the translations made to different ...

[... middle of the paper ...]

... and it seems to be reasonably well represented. It will be necessary to confirm the experiments with further translations in order to obtain a larger test set.
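The overlap with online translation tools reported above can be quantified with a simple n-gram overlap score between a worker's submission and the tool's output. The sketch below is our own illustration, not the paper's detection method; the function name and the choice of 4-grams are assumptions.

```python
from collections import Counter


def ngram_overlap(candidate: str, reference: str, n: int = 4) -> float:
    """Fraction of the candidate's n-grams that also occur in the
    reference (here: the output of an online translation tool).
    Values near 1.0 suggest the candidate was copied."""
    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    # Clipped count: each reference n-gram can only be matched as often
    # as it actually appears in the reference.
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total
```

A submission identical to the tool's output scores 1.0; an independent translation of the same sentence typically scores much lower, so a threshold on this score can flag suspect workers.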
A professional translation will be used as the gold standard, providing a more reliable reference for automatic evaluation. It would also be interesting to run similar experiments on other language pairs, both more and less common, for further comparison.
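Automatic evaluation against a professional reference is typically done with an n-gram precision metric such as BLEU. As an illustration only, a smoothed sentence-level variant can be sketched as follows; this is not necessarily the exact scoring setup used in the experiments.

```python
import math
from collections import Counter


def sentence_bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Smoothed sentence-level BLEU: geometric mean of modified n-gram
    precisions (n = 1..max_n) times a brevity penalty. Add-one smoothing
    keeps the score nonzero when higher-order n-grams fail to match."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
        ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
        matched = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
        total = max(sum(cand_ngrams.values()), 1)
        log_precisions.append(math.log((matched + 1) / (total + 1)))  # add-one smoothing
    # Brevity penalty: punish candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)
```

Averaging such scores over a held-out test set gives a rough quality estimate per worker group, though with a single reference and short sentences the scores should be read with caution.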