When IBM’s Watson Learned Too Much About Natural Language


IBM’s Watson can crush opponents on “Jeopardy” and even help improve cancer care, but the supercomputer still has some difficulties with language, including comprehending slang.

To boost Watson’s aptitude with everyday lingo, its software engineers began teaching it the Urban Dictionary, that massive online repository of current slang. There was just one small problem: Watson couldn’t differentiate “clean” terms from profanity. It’s one thing when a supercomputer uses “OMG,” but quite another when it starts cursing like a Quentin Tarantino character (hopefully not in the middle of a “Jeopardy” episode, although that could prove memorable for everyone involved).

Fortune offers a glimpse into how it all played out:

“Ultimately, [IBM research scientist Eric Brown's] 35-person team developed a filter to keep Watson from swearing and scraped the Urban Dictionary from its memory. But the trial proves just how thorny it will be to get artificial intelligence to communicate naturally. Brown is now training Watson as a diagnostic tool for hospitals. No knowledge of OMG required.”

Watson also adopted some “bad habits” from Wikipedia, which contains a number of risqué definitions.

Hundreds of IBM staffers are involved in the Watson project. It takes at least several days to train the system for new tasks, whether “Jeopardy” or medical research. Watson’s “mind” depends on a combination of geospatial, statistical, and temporal reasoning. For those interested in a little more detail, Dr. David Ferrucci, IBM Fellow and Principal Investigator for Watson, once gave a speech at the Future Health Technology Summit that breaks down how Watson reasons and learns from its data.

Despite its sophistication, Watson sometimes fails, as in the one “Jeopardy” episode where it confused the location of Chicago’s airports and responded with “What is Toronto?”

Nonetheless, IBM is openly considering how best to use Watson in some high-pressure industries, where its massive datasets and ability to process natural-language queries could make it an invaluable resource for researchers, doctors, and other workers. For example, Watson could end up deployed in patient evaluation, applying its considerable brainpower to determining the exact cause of a medical complaint.

But thanks to Watson’s run-in with the Urban Dictionary, it probably won’t be delivering that diagnosis with an “LOL.”

 

Image: IBM

Comments

  1. Fred Bosick says:

    People curse all the time. Watson was only doing what it was taught, though a better curated source is needed. I find Urban Dictionary useful and hilarious but it has spurious entries which I think are spoofs entered by bored teens.
