Thursday, July 2, 2009

Knowledge is fun!

Corbett and I were watching Nova's "Science Now" show the other night and learned something that was exciting to me. This is along the lines of the comments from the previous post, as people love to obsess over the word verifications that one gets on the internet all the time.

These "captchas," as they're called, are something we've known about for some time. They were invented in 2000 by someone who was then a student at Carnegie Mellon and is now a professor there. That alone is not that interesting, but then a seemingly unrelated problem came up. As you also probably know, many people are trying to digitize everything, especially old texts. The glory of this, of course, is to have everything both preserved and accessible to everyone. The problem, though, is that many old texts have non-standard fonts or are smudged, and as such they cannot be read properly by a computer. So when a computer scans in one of these old texts, it cannot make out many of the words, and a lot is lost.

Of course, a human could enter the words by hand, but that is a big waste of time. So instead, many times when you get two words for these word verifications, one is a word the computer already knows, which is actually used to test if you're a human signing up for something, and the other is a word the computer cannot decipher. The assumption is that if you get the known word right, then you probably got the other one right as well, and your answer is then used to help determine the unknown word. While I don't like citing it, according to Wikipedia, "This provides about the equivalent of 160 books per day, or 12,000 manhours per day of free labor (as of September 2008)."
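For the programmers reading, here is a rough sketch of how I imagine the two-word trick working. This is only my guess at the logic, not the real system's code: every name in it (submit, resolved_word, the "scan-17" id) is made up, and the idea that a word counts as solved once three people agree on it is purely my assumption.

```python
from collections import Counter, defaultdict

# Hypothetical sketch; none of these names come from the real service.
# Maps an unknown word's id to a tally of what humans typed for it.
votes = defaultdict(Counter)


def normalize(text):
    """Ignore case and stray spaces when comparing answers."""
    return text.strip().lower()


def submit(control_answer, control_guess, unknown_id, unknown_guess):
    """Return True if the user passes the human test.

    Only the control word (the one the computer already knows) decides
    pass or fail. The guess for the unknown word is just recorded as a vote.
    """
    if normalize(control_guess) != normalize(control_answer):
        return False  # missed the known word, so don't trust the other guess
    votes[unknown_id][normalize(unknown_guess)] += 1
    return True


def resolved_word(unknown_id, min_votes=3):
    """Return the agreed reading once enough people agree, else None.

    min_votes=3 is an assumed threshold, chosen only for illustration.
    """
    tally = votes.get(unknown_id)
    if not tally:
        return None
    word, count = tally.most_common(1)[0]
    return word if count >= min_votes else None
```

So if three different people each type the known word correctly and all read the smudged word as "upon", then resolved_word("scan-17") would hand back "upon", and anyone who flubs the known word is simply left out of the tally.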

I just think this is fantastic! So many things can get done this way, because of the well-planned multitasking involved here. This is my new favorite piece of knowledge.
