New Android Robots Unveiled at Tokyo Museum

Just on the heels of the announcement of Pepper, the talking robot offered by Softbank and Aldebaran Robotics, Tokyo’s National Museum of Emerging Science and Innovation unveiled two humanoid robots that are “staffing” the museum. One of the robots, called Kodomoroid, appears to just read from scripts. The second robot, known as Otonaroid, seems to be more of a conversational robot that can engage in spontaneous dialog. The robots were designed by Hiroshi Ishiguro of Osaka University.

There’s a video of the news conference where the two humanoid robots were unveiled. It’s difficult to judge from the video just how capable a conversationalist Otonaroid might be. Based on Ishiguro’s other work with androids, it seems that his focus is more on the appearance and movements of robots than on their conversational abilities.

Otonaroid and Kodomoroid are reminiscent of the virtual human twins Ada and Grace–named after Ada Lovelace and Grace Hopper–that answer visitor questions at the Museum of Science in Boston. I wrote about the twins in an earlier post that described the work of USC’s Institute for Creative Technologies (ICT). The ICT has been creating virtual humans for well over a decade and developing a framework of technologies to support all of the capabilities that a virtual human needs to be convincing, including conversational speech.

If I had to choose who to interact with, the Otonaroid and Kodomoroid robots or Ada and Grace, I’d probably pick Ada and Grace. Talking to a virtual representation of a human seems less creepy than interacting with a doll-like robot that’s supposed to very closely mimic human appearance and behavior. Ada and Grace are able to talk to visitors and answer questions, but I’m not sure whether their conversational abilities far surpass those of Otonaroid. We’ll have to await more evidence before making a judgement.

What will win out in the future: virtual humans or physical androids? I suppose there will be a role for both types of artificially intelligent companions and assistants. But both will certainly need conversational abilities if they’re to have enduring success in the marketplace.

Chatbots as Storytellers

When we think of chatbots, aka chatter bots, we normally imagine computer programs that can mimic simple human conversation. Chatbots can generally respond to a narrow range of questions about themselves. If they’re connected to the Internet, they might also be able to look up answers on trivia topics or tell you the local time or weather conditions. Chatbots might even be able to take input from you and turn it back around to create the illusion that they’re listening and empathetic. This illusion is quickly dispelled, though, with an errant response or with the repetition of the exact same answer that the chatbot gave previously.
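To show how thin that illusion can be, here’s a minimal Python sketch of the “turn it back around” trick. It’s an illustration only, not the code behind any particular chatbot: the rules, the pronoun table, and the fallback line are all made up for this example.

```python
import re

# Hypothetical pronoun swaps used to mirror the user's words back at them.
REFLECTIONS = {"i": "you", "my": "your", "am": "are", "me": "you"}

# Hypothetical pattern/response rules; a real chatbot script would have thousands.
RULES = [
    (re.compile(r"\bi feel (.+)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bwhat is your name", re.IGNORECASE), "My name is Bot."),
]

# The canned line used whenever no rule matches.
FALLBACK = "Tell me more."


def reflect(fragment: str) -> str:
    """Swap first-person words for second-person ones, word by word."""
    return " ".join(REFLECTIONS.get(word.lower(), word) for word in fragment.split())


def respond(text: str) -> str:
    """Return the first matching rule's response, or the fallback line."""
    cleaned = text.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(cleaned)
        if match:
            return template.format(*(reflect(group) for group in match.groups()))
    return FALLBACK


if __name__ == "__main__":
    print(respond("I feel ignored by my boss."))
    print(respond("What is the capital of France?"))
    print(respond("Do you like jazz?"))
```

The first input gets an apparently attentive reply because it happens to fit a rule. The other two fall through to the same canned line, which is exactly the kind of repetition that gives the game away.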

But what if chatbots had the ability to tell stories? In fact, what if chatbots could make up stories on the fly in a way that mimicked human creativity and invention?

Back in 2010, a group of students at the University of Twente in the Netherlands created a computer system they labeled The Virtual Storyteller. One of the key contributors to the project, Ivo Swartjes, wrote his doctoral dissertation on The Virtual Storyteller project. The software is based on a concept called emergent narrative. As the name suggests, an emergent narrative is one where the story evolves organically with no predetermined outcome. Character agents interact with each other within the framework of a storyworld. They take actions based on their beliefs and goals, and the result is a spontaneous narrative that hopefully holds the interest of the listener.

Based on Swartjes’ description of the system, the team anticipated a broad range of storytelling needs and challenges and proposed interesting solutions to many of them. For example, a technique called late commitment allows characters to fill in details of the plot once the story is already underway. This spontaneity makes the story more dynamic.

Emergent narrative plays a big role in the world of gaming. Games are compelling because they immerse the player in a deeply textured fantasy world. They also offer the player choices to keep the world from appearing pre-scripted. As affordable conversational robots, such as NAO and the newly announced Pepper, arrive on the scene, it seems there’s an opportunity for emergent storytelling technologies. A robot that can make up interesting stories at bedtime, or anytime you’re bored, would be a great companion. Perhaps we’ll see storytelling architectures, such as the techniques used in The Virtual Storyteller, applied to chatbots and conversational robots in the near future.

New Conversational Robot Unveiled by Softbank and Aldebaran

Softbank Mobile, the Japanese wireless carrier, and Aldebaran Robotics, a robotics company headquartered in France, announced a partnership to build and distribute an intelligent robot called Pepper. The announcement was made with a lot of fanfare at a Softbank event in Japan. Bruno Maisonnier, Aldebaran’s Founder & CEO, describes Pepper as an emotional robot. Like Aldebaran’s smaller robot NAO, Pepper is equipped with sensors and software that enable it to detect human emotions through both visual and voice cues. Pepper itself is designed to appear friendly and non-threatening and to evoke emotions of happiness and ease in those who interact with it. I wrote about NAO in an earlier post.

You can watch a webcast of the entire press event where Pepper is presented to a live audience. Most of the webcast is in Japanese, and the dubbed-over translation is a bit awkward to listen to. If you fast-forward to about the 36-minute mark, you can watch Maisonnier’s introduction of Pepper in English (not dubbed over). It’s difficult to tell from the demo how conversational the current version of Pepper really is. According to Yuri Kageyama of the Associated Press, who covered the live demo, Pepper looks good but displayed serious limitations. The robot’s voice recognition system appears to need some improvements, and its conversational abilities seem fairly rudimentary.

Maisonnier talks about an ecosystem and set of APIs for Pepper that will allow developers to create third-party apps for the robot. He specifically mentions a physical “atelier” where developers can get together in person and collaborate on coding projects. The Aldebaran website currently supports a store where NAO owners can download apps to augment the skills of their robots. There’s also an Aldebaran developer community that you can register for and take part in. It would be great if there were a way for chatbot scripters, or others with an aptitude for creating conversational stories, to package dialog or story content and make it available to run on the NAO and Pepper platforms.

Will Pepper acquire truly compelling conversational skills? That remains to be seen. But unless it can carry on a consistent conversation with us, or entertain us with engaging stories, it seems unlikely that Pepper will become the breakthrough technology that Softbank and Aldebaran claim it to be.


Does Automated Content Technology Hold Promise for Chatbots and Intelligent Assistants?

I recently wrote about a presentation I saw by Erik Brynjolfsson, Director of the MIT Center for Digital Business. In the presentation, Erik mentioned Narrative Science as an example of a technology company that uses artificial intelligence to automate activities that were recently thought to be the exclusive domain of humans. In the case of Narrative Science, the company’s technology is able to generate articles, stories, and other content automatically based on an analysis of various datasets. To put it another way, a computer algorithm takes raw, unformulated data as input and turns it into paragraphs of natural language that read as if they had been written by humans. Narrative Science’s “robot writer” is called Quill. The underlying algorithms were developed at Northwestern University and the first story was generated based entirely on the batting statistics from a baseball game.

Automated Insights is a competitor of Narrative Science with a product called Wordsmith. They too analyze sports scores and statistics to generate readable stories about sporting events. But both companies create content well beyond sports news. Their products are used to create a wide array of content, including financial reports, product descriptions, and marketing content. It appears to be a growing and profitable market.

On the Automated Insights website, there’s a brief overview of how the Wordsmith technology works. The five steps in generating stories from data are described as:

  • Retrieving data from various sources
  • Analyzing the data to classify it and identify trends and context
  • Identifying insights, comparing them against other data, and making them actionable
  • Structuring a narrative around the insights to tell a story
  • Publishing the content via a cloud-based infrastructure
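
To make those steps a little more concrete, here’s a toy sketch in Python that walks one tiny dataset through the same stages. To be clear, this is not how Wordsmith or Quill actually work; the game data, function names, margin thresholds, and sentence templates are all invented for illustration, and the “publish” step just joins the stories into one block of text.

```python
from dataclasses import dataclass


@dataclass
class GameResult:
    home: str
    away: str
    home_runs: int
    away_runs: int


def retrieve_games() -> list[GameResult]:
    """Step 1: retrieve data (hard-coded, made-up scores stand in for a feed)."""
    return [
        GameResult("Cubs", "Cardinals", 7, 2),
        GameResult("Mets", "Braves", 3, 4),
    ]


def analyze(game: GameResult) -> dict:
    """Step 2: classify the raw numbers (this sketch assumes no tied games)."""
    home_won = game.home_runs > game.away_runs
    return {
        "winner": game.home if home_won else game.away,
        "loser": game.away if home_won else game.home,
        "high": max(game.home_runs, game.away_runs),
        "low": min(game.home_runs, game.away_runs),
        "margin": abs(game.home_runs - game.away_runs),
    }


def find_insight(facts: dict) -> str:
    """Step 3: pick the angle worth writing about (thresholds are arbitrary)."""
    if facts["margin"] >= 4:
        return "blowout"
    if facts["margin"] == 1:
        return "close"
    return "solid"


# Step 4: one hypothetical sentence template per insight.
TEMPLATES = {
    "blowout": "The {winner} routed the {loser} {high}-{low}.",
    "close": "The {winner} edged the {loser} {high}-{low} in a one-run game.",
    "solid": "The {winner} beat the {loser} {high}-{low}.",
}


def narrate(facts: dict) -> str:
    """Step 4: structure a sentence around the chosen insight."""
    return TEMPLATES[find_insight(facts)].format(**facts)


def generate_stories(games: list[GameResult]) -> list[str]:
    return [narrate(analyze(game)) for game in games]


def publish(stories: list[str]) -> str:
    """Step 5: stand-in for publishing; just joins the stories."""
    return "\n".join(stories)


if __name__ == "__main__":
    print(publish(generate_stories(retrieve_games())))
```

Even in a toy like this, the interesting part is the insight step: the same score data produces a different sentence depending on what the program decides is the story.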

After writing earlier this week about the shortcomings of chatbots that are based on pattern matching templates, a question arises. Could the same or similar algorithms that generate automated stories be used to support more compelling virtual agent conversations? Conversations are, after all, at least partly about storytelling. If you’re chatting with an intelligent assistant and ask it about sports, wouldn’t it be good if the assistant could talk to you in a conversational style about the latest games played by your favorite teams? If it can do this with sports, it should be able to create dialog about many other topics that are data driven, such as current events, company news, stock updates, weather, and so forth. It’s just a thought, but it might be interesting to further explore how automated content technologies might be leveraged to improve today’s artificial conversational agents.

The Recent Hype About the Turing Test — And Why It May Not Matter

The blogosphere has been full of the news that a Russian chatbot convinced 33% of judges at the recent Royal Society-sponsored Turing test that it was human. There’s apparently nothing unusual or advanced about the technology behind Eugene Goostman, the chatbot that’s scripted to have the personality of a 13-year-old Ukrainian with poor English skills. It’s a pattern-matching chatbot with a typical backstory (teenager, flippant, weak English) that makes its confusing and off-topic answers seem more forgivable.

Since all the media hype about “the computer that passed the Turing test,” there’s been a bit of a backlash from more astute observers of the AI world.

Doug Aamoth published a short piece on Time online that recounts his brief conversation with the Goostman chatbot. The conversation has all the markings of a shaky exchange with a pattern-matching chatbot. There’s very little in the conversation to make Doug believe he’s talking with a human. As I wrote in my post The Problem With Today’s Chatbots, pattern-matching dialog programs are just very, very limited in their ability to mimic real human conversation. To be convincing, a program has to be able to react to completely unanticipated questions and comments. Simply diverting the conversation to another topic, or coming up with some generic, hollow comment, isn’t how conversation works.

An even more scathing unveiling of the Goostman chatbot’s weakness was posted by Scott Aaronson, a theoretical computer scientist. Aaronson applies the technique of asking fairly straightforward factual questions to quickly unmask Goostman as a chatbot. His conversation also encapsulates a compelling debate about the inherent flaw of chatbots, as opposed to other potentially promising virtual agent and cognitive computing technologies.

If Eugene Goostman’s performance tells us anything, it’s probably that humans can be fooled by chatbots, especially when they want to be. But we already knew that. There are all kinds of bots out in the world and they fool people all the time. Maybe what we need is a training course on “How to Know You’re Talking to a Chatbot.” But that might just spoil people’s fun.

Dean Burnett wrote the cleverest response to the recent hype about the 13-year-old chatbot wonder. Check it out for a good laugh. And for a literary take on chatbots and the Turing test, I highly recommend Scott Hutchins’ novel A Working Theory of Love.

Erik Brynjolfsson on Digital Economy Pros and Cons

I had the opportunity to hear Erik Brynjolfsson speak at a business conference this past week. Brynjolfsson is the Director of the MIT Center for Digital Business. His books include Race Against the Machine and the more recent The Second Machine Age, both of which he co-authored with fellow MIT professor Andrew McAfee.

Brynjolfsson (let’s call him Prof. B from now on, and hope that he won’t take offense) used his allotted thirty minutes to present the basic thesis of his recent book. The world is becoming increasingly digitized. Machine learning technologies have advanced rapidly in the past several years and will continue to improve exponentially in the next decade. I’ve written in this blog about many examples of how voice recognition, natural language processing, and search technologies have matured to provide consumers and businesses with a new level of virtual agent capabilities.

Prof. B showed in his presentation that machine intelligence is growing increasingly adept in three key areas:

  • Interacting with the world
  • Language
  • Problem Solving

He cited three products and companies that are leveraging advances in machine language capability:

  • Siri – voice recognition
  • Lionbridge – translation
  • Narrative Science – authoring news stories

Prof. B also talked about IBM Watson. What was of most interest to Prof. B was not so much that Watson resoundingly defeated the best human Jeopardy! champions, but that Watson started out with relatively poor performance and in just a few years, by perfecting its underlying algorithms, was able to improve its abilities at an astonishing rate. In fact, IBM Watson far surpasses humans in its ability to learn and improve its performance.

The increasing intelligence and overall prowess of machines is a double-edged sword, according to Prof. B. We now have access to more computing power in our smartphones than astronauts had in their capsules when landing on the moon. We’ll benefit from the ability of machines to detect diseases early and to recommend the best treatment options. We’ll be able to stay connected to our loved ones, no matter where we are, and enjoy the many benefits of a massively connected world.

On the other hand, the increasing digitization of society and commerce means that income may decline for many, while profits increase exponentially for a select few.

Prof. B showed data indicating that U.S. workers remain as productive as ever, but that their incomes are decreasing. Low-skilled workers, in particular, are not sharing in the increasing economic pie. In fact, many lower-skilled, routine tasks are being automated, and machines may soon displace low-skilled workers altogether.

The big winners in a fully digitized economy are those individuals or companies who create digital goods (such as software, music, and film) that are easy to distribute broadly across the globe. Prof. B uses the example of TurboTax software. Once the software is made, it’s easy to sell and distribute it to millions of people. A tax preparer can’t possibly prepare millions of tax returns, so the preparer loses out on lots of business. Eventually, the tax preparer may be completely displaced by the software program. The people who own the software company get really, really rich, while the unfortunate tax preparer goes hungry. This observation mirrors Jaron Lanier’s description of what he calls Siren Servers in his book Who Owns the Future?

Prof. B was available for questions after giving his talk. Someone asked him if he’d seen the movie Her, and he said that he had. He’d found that some of it seemed to ring true, in terms of how we might interact with smart machines in the future. He thought it was certainly plausible that humans could become emotionally attached to their intelligent machines. He’s seen research and evidence that supports the hypothesis that humans tend to anthropomorphize machines that they can talk to and that appear to be listening to them (I’m paraphrasing).

The talk was certainly interesting and leads to lots of further thinking about how virtual agents and personal assistants will impact our lives, both positively and perhaps negatively, going forward. You can download a PDF of Prof. Brynjolfsson’s talk if you’re interested in seeing the slides.