Thursday, May 31, 2012

Getting Captcha to do useful work



Getting Captcha to digitize books: the solution is simple and elegant. Show two words, one that is already known and one that still needs to be digitized. Users do not know which is which. If the user gets the control (the known word) right, we can trust their answer for the second word with high probability.
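Here is a rough sketch of how the control-word check might look in Python. This is only my guess at the logic, not the actual implementation: the function name `digitize`, the thresholds `min_votes` and `min_share`, and the idea of collecting several answers per unknown word before accepting one are all assumptions made for illustration.

```python
from collections import Counter


def digitize(responses, control_answer, min_votes=3, min_share=0.75):
    """Return the agreed reading of the unknown word, or None if undecided.

    responses: list of (typed_control, typed_unknown) pairs, one per user.
    control_answer: the known-correct reading of the control word.
    min_votes, min_share: hypothetical thresholds, chosen only for this sketch.
    """
    expected = control_answer.strip().lower()
    votes = Counter()
    for typed_control, typed_unknown in responses:
        # Only count the unknown-word answer if the user got the control right.
        if typed_control.strip().lower() == expected:
            votes[typed_unknown.strip().lower()] += 1

    total = sum(votes.values())
    if total < min_votes:
        return None  # not enough trusted answers yet

    word, count = votes.most_common(1)[0]
    # Accept the top reading only if trusted users largely agree on it.
    return word if count / total >= min_share else None
```

For example, if three users type the control word correctly and all read the unknown word as "upon", the sketch accepts "upon"; an answer from a user who misses the control is simply ignored.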

This is done by many sites, and they have digitized about 2.5 million books per year this way. See the TED talk by Luis von Ahn for details.

They are also trying to use language learners to translate the web: learners are shown sample sentences to translate, and their translations are combined to create the final version.

This, in my opinion, is a truly creative solution.

Maybe we can use the same idea to tag images. For example, show a picture and ask how many elephants are in it. It is not easy for a program to answer such arbitrary questions about pictures. Now play the same trick as above, pairing a control picture whose answer is known with one whose answer is not. (It needs some more thinking to get right, though.)
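To make the image idea a bit more concrete, here is a small Python sketch of the same control trick applied to counting questions. Everything in it is hypothetical: the name `tag_count`, the `min_votes` threshold, and the simple majority rule are assumptions I made up to illustrate the idea, not a worked-out design.

```python
from collections import Counter


def tag_count(answers, control_count, min_votes=3):
    """Return the agreed count for the unknown picture, or None if undecided.

    answers: list of (control_answer, unknown_answer) pairs, one per user.
    control_count: the known number of elephants in the control picture.
    min_votes: hypothetical minimum number of trusted answers required.
    """
    # Keep only answers from users who counted the control picture correctly.
    trusted = [unknown for control, unknown in answers if control == control_count]
    if len(trusted) < min_votes:
        return None

    count, votes = Counter(trusted).most_common(1)[0]
    # Require a strict majority of trusted users to agree on the count.
    return count if votes * 2 > len(trusted) else None
```

The accepted count can then be stored as a tag for the picture, and the picture itself can later serve as a new control.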
