Friday, November 20, 2009

Deaf engineer at Google creates automatic captioning technology for YouTube videos

From The New York Times:

MOUNTAIN VIEW, Calif. — In the first major step toward making millions of videos on YouTube accessible to deaf and hearing-impaired people, Google unveiled new technologies on Nov. 19 that will automatically bring text captions to many videos on the site.

The technology will also open YouTube videos to a wider foreign market and make them more searchable, which will make it easier for Google to profit from them.

While the technology can insert captions only for English-language speech, Google is giving users the choice of using its automatic translation system to read the captions in 51 languages. That could broaden the appeal of YouTube videos to millions of other people who do not speak English but could use the captioning technology to read subtitles in their native language.

The speech recognition technology that Google uses to turn speech into text is not new; Google currently uses it to transcribe voice mail messages for users of its Google Voice service. But Ken Harrenstien, a deaf engineer who helped develop the automatic captioning system, said the technology had never been applied on such a large scale.

“This is something that I have dreamt of for many years,” Mr. Harrenstien said, speaking through an interpreter. “To see it happen is amazing.”

YouTube already has several hundred thousand videos that have closed captions, which typically come from broadcast networks that include them in their programs. Some other online video sites like Hulu and AOL also have some professionally created videos with closed captioning.

But Mr. Harrenstien said a vast majority of clips on YouTube did not have captions and the new Google technology would generate them automatically. YouTube is initially applying the captioning technology only to a few channels, most of them specializing in educational content. They include channels from universities like Stanford, Yale, Duke, Columbia and the Massachusetts Institute of Technology; from PBS and National Geographic; and from Google itself, whose corporate videos will be captioned. The company plans to gradually expand the number of channels that work with the automatic captioning technology.

“Because the tools are not perfect, we want to make sure that we get feedback from the video owners and the viewers before we roll it out for the whole world,” Mr. Harrenstien said. “Sometimes the auto-captions are good. Sometimes they are not great, but they are better than nothing if you are hearing-impaired or don’t know the language.”

Google also introduced a related service that gives anyone who uploads a video to YouTube the option of also uploading a text file of the words spoken in the video. Google will turn the text file into captions, automatically matching the text to the spoken words in the video.

The technology, which Google calls “auto-timing,” will make it easy for anyone to add captions to their videos. It will be available to YouTube users worldwide, and Google said it would be particularly useful for videographers who shoot from a script, since they already have a file of the text spoken in the video.

In addition to helping people who are deaf or do not speak English, the captions will make it easier for anyone to search text inside videos and find specific snippets within a video.

Google announced the new features on Thursday at an event in Washington. The company said they would be available by the end of the week.