The Problem With Today’s Chatbots

CNET recently published an interview with Bruce Wilcox, the creator of the open-source chatbot platform ChatScript. The interview, by Daniel Terdiman, sheds some light on what Wilcox believes are the strengths and weaknesses of today’s chatbot technologies.

Before getting into the interview content, Terdiman describes the notoriety behind Talking Angela, a sassy chatbot mobile app that Bruce created with his wife Sue Wilcox. Rumors are rampant that the app could be a front for pedophiles. Curious smartphone users have been downloading the app in droves on either their iPhones or Android devices, sending Talking Angela to the top of the app store charts.

Wilcox insists that there’s absolutely no validity to the rumor. But perhaps the Wilcoxes’ strategy for making a successful chatbot contributed at least in some part to the urban legend. Bruce Wilcox strives to create believable personalities in his chatbots, and Talking Angela, who appears as a cat, asks those who talk to her for personal information such as name, age, and apparently the names of friends as well. She also strikes the familiar, chatty pose of a teenager. All of these tactics help her appear more real and cover up her conversational superficiality.

This story got me thinking about how chatbots work and how they compare to personal assistants that are gaining so much traction in the mobile marketplace.

In the interview, Wilcox talks about ChatScript and his approach to creating chatbots. He points to past successes, including the fact that his chatbots are the only ones to have made it into the final rounds of the Loebner Prize competition for the past four years. You can check out the script of a 15-minute conversation that occurred between a judge and the Angela bot during the 2012 Chatbot Battles. Wilcox describes the conversation as close to great, presumably meaning that his chatbot responded to the judge’s questions and comments in a believable and even human-like way. It’s certainly an impressive performance for a chatbot, but if you read through the script, I think you’ll see that Angela’s responses are often far from convincing.

The disappointing fact is that chatbot technology, when compared to today’s conversational search algorithms, just isn’t very good. The fundamental structure of chat scripts requires that the bot creator anticipate almost everything the other person will say. This is a huge limitation, so it’s interesting to examine how successful bot masters work around it.
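To make that limitation concrete, here is a minimal sketch in Python of a bot built on anticipated patterns. It is not ChatScript syntax and not Wilcox's code; the `RULES` table, the `respond` function, and the canned replies are all invented for illustration.

```python
import re

# Hypothetical rules: each pairs a pattern the bot author anticipated
# with a canned reply. Captured words can be echoed back via {0}.
RULES = [
    (re.compile(r"\bwhat is your name\b", re.IGNORECASE), "My name is Angela."),
    (re.compile(r"\bdo you like (\w+)", re.IGNORECASE), "I love {0}!"),
]


def respond(user_input):
    """Return the reply for the first matching rule, or None if no rule matches."""
    for pattern, template in RULES:
        match = pattern.search(user_input)
        if match:
            return template.format(*match.groups())
    return None  # the author never anticipated this input
```

Under these assumptions, `respond("Do you like horses?")` finds a rule, while a question about a rock and a frozen pond returns `None`. Every input the author did not foresee needs either a new rule or some kind of fallback.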

Wilcox says that you can trip up a chatbot by asking questions that rely on physical inference. An example would be a question like: “If I drop a rock into a pond, but the pond is frozen, what will happen?” That’s obviously going to be a tough question to anticipate and to answer appropriately.

I think there are other, even easier ways to trip up a chatbot. One way is to ask it for further details about something it just said. If the chatbot says “I like to play checkers,” ask it “what do you like about it?” Every chatbot I’ve conversed with gets stumped by this recursive questioning and will answer with something completely unrelated to checkers or why it likes to play that game. The chatbot doesn’t know what “it” in your question refers to. It doesn’t maintain conversational context from one sentence to the next. This makes every chatbot seem like a complete airhead.
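As a rough sketch of what carrying context forward could look like, the toy Python class below remembers the topic of the bot's own last statement and uses it to resolve "it" in a follow-up question. The `ContextBot` class, its methods, and the stored reason are hypothetical; no real chatbot platform is being described here.

```python
class ContextBot:
    """Toy bot that remembers the topic of its own last statement."""

    def __init__(self):
        self.last_topic = None
        # Hypothetical knowledge: reasons the bot can give for liking a topic.
        self.reasons = {"checkers": "I like planning a few moves ahead."}

    def say_like(self, topic):
        """Make a statement and record its topic as conversational context."""
        self.last_topic = topic
        return f"I like to play {topic}."

    def respond(self, user_input):
        """Answer a follow-up by resolving 'it' to the remembered topic."""
        text = user_input.lower()
        if "what do you like about it" in text:
            if self.last_topic is None:
                return "About what?"
            return self.reasons.get(
                self.last_topic, f"I just enjoy {self.last_topic}."
            )
        return None  # anything else is outside this sketch
```

After `say_like("checkers")`, the follow-up question gets an answer about checkers instead of something unrelated. Even this single remembered slot is the piece of state that the bots described above appear to lack.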

For unanticipated questions or comments, the bot master needs to throw in offhand comments that don’t come across as completely off the wall. As mentioned earlier, Wilcox’s strategy is to create a character with a personality and a back story. Angela is a flighty teenager with definite tastes in music and fashion. She can chatter on about the specifics of pop stars and other icons of modern culture. Teenagers are self-absorbed by nature, so it’s not completely unexpected for them to ignore questions or go off topic. If you ask something Angela can’t match with an existing response pattern, she might respond with “I’m a little monster (claw claw)” (example taken from the Chatbot Battles script mentioned above). Since you’re conditioned to think you’re talking to a teenager, this off-topic response might just be convincing enough to keep the conversation going. If the exact same response comes up again, though, you’re likely to see through the thin veneer and switch to another activity.
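One simple way a bot master might reduce obvious repeats is to cycle through a shuffled pool of in-character fallback lines, using each once before reusing any. The Python sketch below illustrates that idea only; the `FallbackPicker` class is hypothetical, and nothing in the interview says Angela works this way.

```python
import random


class FallbackPicker:
    """Hand out fallback lines, using each one once before reusing any."""

    def __init__(self, lines, seed=None):
        self._lines = list(lines)
        self._rng = random.Random(seed)  # seed is optional, for repeatable runs
        self._unused = []

    def next_line(self):
        """Return the next fallback line from the current shuffled cycle."""
        if not self._unused:
            # Start a new cycle: copy the full pool and shuffle it.
            self._unused = self._lines[:]
            self._rng.shuffle(self._unused)
        return self._unused.pop()


# The first line is quoted from the transcript; the other two are invented.
picker = FallbackPicker([
    "I'm a little monster (claw claw)",
    "Ugh, can we talk about music instead?",
    "Sorry, I was checking my phone.",
])
```

Within a cycle no line repeats, although the last line of one cycle can still match the first line of the next. A larger pool delays the moment the veneer wears thin but does not remove it.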

Who would you rather talk to: a search algorithm that doesn’t pretend to have a personality but that can understand and appropriately respond to pretty much anything you can think to ask, or a make-believe teenage airhead? That’s the challenge we face as we try to create meaningful conversational assistants.
