Mark Hachman wrote an informative piece last week on Nuance's Wintermute technology. Nuance is bringing to bear powerful tools it already owns (state-of-the-art speech recognition, natural language processing, and the virtual agent technologies behind Dragon Mobile Assistant and Nina) to create a new kind of virtual agent. In fact, as Hachman points out, Nuance's goal is to create a persona rather than an 'agent.'
Is such a goal attainable? As we saw in a recent post about Spike Jonze’s movie “Her,” the idea of smart, appealing software agents that can interact on a personal level with humans is entering the mainstream. In “Her,” the smooth-talking Samantha is a disembodied artificial intelligence, but she’s so skilled at learning about her owner and playing to his interests that he’s quickly smitten.
Nuance doesn't necessarily want you to fall in love with Wintermute, which is apparently a code name for a product that's not yet available, but it wants you to trust Wintermute to understand you and to remember the context of your conversations with it. The really cool feature in demos of the Wintermute technology is the smart persona's ability to infer what you're asking for, even when you change the platforms or channels that you use to interact with it. In one demo, for example, a Nuance rep asks Wintermute for information on the score of a basketball game. Later, when the rep turns on a TV set, he simply asks Wintermute to put on the game. Wintermute accesses information stored in the cloud about previous interactions to infer that the rep wants to watch the rest of the same game he was asking about earlier, and it sets the channel appropriately.
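To make the idea concrete, here's a minimal sketch of how cross-channel context inference like the demo's could work: every device writes to a shared cloud-side memory, and a vague request on one device is resolved against what the user said earlier on another. All class and method names here are invented for illustration; Wintermute's actual architecture hasn't been made public.

```python
# Hypothetical sketch of cross-channel context inference.
# Nothing here reflects Nuance's real implementation.

import time

class ContextStore:
    """Cloud-side memory of a user's recent interactions, shared by all devices."""

    def __init__(self):
        self._events = {}  # user_id -> list of (timestamp, topic, detail)

    def record(self, user_id, topic, detail):
        self._events.setdefault(user_id, []).append((time.time(), topic, detail))

    def latest(self, user_id, topic):
        """Return the most recent detail the user mentioned for a topic, if any."""
        for ts, t, detail in reversed(self._events.get(user_id, [])):
            if t == topic:
                return detail
        return None

store = ContextStore()

# On the phone: "What's the score of the Celtics game?"
store.record("rep", "game", "Celtics vs. Lakers")

# Later, on the TV: "Put on the game." The vague request is resolved
# against the shared context rather than asking the user to repeat it.
def put_on_the_game(user_id):
    game = store.latest(user_id, "game")
    return f"Tuning to {game}" if game else "Which game did you mean?"

print(put_on_the_game("rep"))  # Tuning to Celtics vs. Lakers
```

The key design point is that the context lives in the cloud rather than on any one device, which is what lets the TV answer a question the phone heard.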
If Nuance’s technology pans out, we’ll one day have helpful digital assistants that can remember bits of conversations across time and devices, and that can assist us in so many ways, it’s hard to imagine them all. It still remains to be seen how many of us will become hopelessly enamored with our disembodied smart assistants!
Scott Kirsner of the Boston Globe published an interesting article this week on "Forging new frontiers in voice recognition software." The gist of the article is that speech recognition technologies, and the technical experts behind them, are hot. Speech recognition systems have evolved rapidly over the past decade, and powerhouse tech giants like Apple, Amazon, Microsoft, and Nuance are vying for talent and supremacy in the race to master voice recognition and to integrate it into the products of the future.
Kirsner quotes Dan Miller of Opus Research as saying that the killer app is a personal digital assistant that can not only understand what we're saying but also provide us with information based on its knowledge of us, our preferences, and our current situation and location. The example of a question that an intelligent virtual assistant should be able to respond to appropriately is something complex like "What's the best place for dinner near my next meeting?" To give a great answer to that question, the assistant would have to know and access a range of inputs: the time and place of your next meeting, the type of food and dining experiences you enjoy, the location, menus, and hours of operation of restaurants close to the meeting venue, etc.
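The inputs listed above can be combined in a straightforward way once they're available. Here's a toy sketch of how an assistant might rank restaurants against a meeting's location, the user's tastes, and opening hours; the data, scoring weights, and distance formula are all invented for illustration, not any real assistant's logic.

```python
# Hypothetical ranking of dinner options near the user's next meeting.

from datetime import datetime

next_meeting = {"place": (42.351, -71.047), "time": datetime(2013, 11, 5, 18, 0)}
user_prefs = {"cuisines": {"thai", "italian"}}

restaurants = [
    {"name": "Basil Thai", "cuisine": "thai", "loc": (42.352, -71.049),
     "rating": 4.5, "open_until": 22},
    {"name": "Burger Barn", "cuisine": "american", "loc": (42.350, -71.046),
     "rating": 4.8, "open_until": 21},
    {"name": "Trattoria Roma", "cuisine": "italian", "loc": (42.360, -71.060),
     "rating": 4.2, "open_until": 23},
]

def distance(a, b):
    # Crude flat-plane distance in degrees; good enough to rank nearby options.
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

def score(r):
    # Reward liked cuisines and high ratings; penalize distance from the meeting.
    liked = 1.0 if r["cuisine"] in user_prefs["cuisines"] else 0.0
    return liked + r["rating"] / 5.0 - 100 * distance(r["loc"], next_meeting["place"])

# Keep only restaurants still open an hour after the meeting starts.
candidates = [r for r in restaurants
              if r["open_until"] >= next_meeting["time"].hour + 1]
best = max(candidates, key=score)
print(best["name"])  # Basil Thai
```

The hard part in practice is not this ranking step but reliably extracting the inputs: parsing the calendar, learning the preferences, and resolving "my next meeting" from speech.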
Creating the winning products of the future requires more than just integrating great voice recognition software. It also means building intelligent applications that use speech-based inputs to do amazing things. Kirsner says that competition for speech recognition experts is fierce and that salaries keep rising. So if you’re a student wondering what field to concentrate on, the areas of voice recognition, natural language processing, and virtual agent technologies may be worth consideration. They certainly have a bright future!
Springer has published “Intelligent Virtual Agents,” the proceedings from the 13th International Conference, IVA 2013, Edinburgh, UK, August 29-31, 2013. The softcover version of the book has a hefty price tag of $95.00. The Springer website indicates that an eBook version will follow and hopefully the cost of the digital book will be more affordable. Articles can also be purchased individually.
You can browse abstracts of all the articles on the Springer website. These are scholarly papers written primarily by academic researchers and/or graduate students. They represent the cutting edge in virtual agent and embodied conversational agent technology and cover a wide range of topics. Some of the papers seem to be available online for free if you search for them by title and author.
The topic areas in the IVA 2013 proceedings publication include Cognitive Models, Applications, Dialogue/Language/Speech, Non-verbal Behavior, Social/Cultural Models and Agents, and Tools and Techniques.
One of the papers I'm looking at in depth is by a group from the University of Southern California's Institute for Creative Technologies. They've been developing what they call a Virtual Human Toolkit that they're making available. I plan to do a post on the toolkit and its capabilities soon. I'll try to write posts on other interesting research areas from the proceedings as well. The book looks like a great resource for those of us interested in virtual agent technologies and virtual humans.
There's a new film coming out that's bound to stir up discussion again on personal digital assistants and the future of the human-computer interface. When Apple introduced Siri a few years back, there was a huge amount of buzz about virtual agent technologies and, well, talking apps. A new film by Spike Jonze titled Her takes the Siri concept further by spinning the tale of a lonely man, Theodore, who falls in love with his personal digital assistant. Based on the Her trailer, the story revolves around an intelligent "operating system" called Samantha, a software program configured especially for Theodore based on his responses to various personal questions (Are you social or antisocial? How would you describe your relationship with your mother?).
Samantha is apparently so perfectly suited to Theodore’s personality and preferences that he quickly becomes strongly attached to ‘her.’ As someone who hasn’t been very lucky in building lasting relationships with other humans, Theodore is giddy to have a loyal companion to interact with. In fact, Samantha seems totally focused on him, his feelings, hopes, and desires. Watching the trailer, you almost get the sense that Samantha is a high tech version of Weizenbaum’s psychotherapist ELIZA chatbot.
What will the next 5 to 20 years bring us in terms of virtual agents that we interact with on a personal level? Will we all have our own, custom-fit personal digital assistant like Samantha? Will our relationships with these virtual agents augment, or replace, our relationships with other people? All this remains to be seen. Spike Jonze’s Her is an intriguing glimpse into at least one potential future. A recent article on Salon.com speculates that Her, as evidenced by the trailer, contains more sap than substance. We’ll just have to watch the movie and find out for ourselves. If nothing else, the existence of the film shows that virtual agent technologies are firmly rooted in the mainstream. What happens next is still speculation, but not for long.
Today VocalWare published a short piece on a new digital receptionist by the UK company MoneyPenny. Presumably the company name is a reference to the dry-witted and ever capable assistant of James Bond legend. MoneyPenny’s digital receptionist product is called Penelope and it’s not really a virtual agent. It’s more of a clever automated call routing service.
Reading through the description of how Penelope works, it sounds a lot like RingCentral or other call routing services. What caught my eye was the fact that Penelope is equipped with voice recognition that lets callers ask for specific people by name. So if you’re a small company with several employees, a caller could ask Penelope to connect him or her to “Melissa” and the digital assistant would route the call to Melissa, assuming that she’s available.
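The routing logic described above is simple once the speech recognizer has produced a name. Here's a toy sketch of name-based call routing in the spirit of the Penelope description; the directory, function names, and responses are all invented, and the speech recognition step is stubbed out as plain text input.

```python
# Hypothetical name-based call routing; not MoneyPenny's actual system.

directory = {
    "melissa": {"extension": 101, "available": True},
    "james": {"extension": 102, "available": False},
}

def route_call(spoken_name):
    """Route a caller to the named employee, or fall back gracefully."""
    employee = directory.get(spoken_name.strip().lower())
    if employee is None:
        return "Sorry, I couldn't find that person. Transferring to reception."
    if not employee["available"]:
        return f"{spoken_name.title()} is unavailable. Would you like to leave a message?"
    return f"Connecting you to extension {employee['extension']}."

print(route_call("Melissa"))  # Connecting you to extension 101.
print(route_call("James"))    # James is unavailable. Would you like to leave a message?
```

The interesting engineering lives upstream of this lookup: recognizing an arbitrary name over a noisy phone line, and handling near-misses ("Melisa," "Melissa R. or Melissa T.?") without frustrating the caller.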
This sort of technology seems to me to be where intelligent virtual agents are headed. If voice recognition, natural language processing, and question answering systems continue to develop at the current rapid pace, we might soon have Interactive Voice Response (IVR) systems that can truly think and act. What if you just found out your flight had been canceled and you could call the airline and talk to a digital representative that was able to rebook you on the next flight?
How far away are we from that scenario becoming a reality? We have virtual agents that can understand questions and context and can carry on basic conversations. We have software that can perform transactions. It doesn't seem a stretch to think that all of these capabilities, and more to come, will be rolled into capable digital assistants sometime within this decade. What that means for the humans who are currently paid to perform such functions remains to be seen. We can hope that new, more meaningful and nuanced jobs open up for humans as the virtual agents become the first level of support for all incoming calls.
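The rebooking scenario is really just understanding plus a transaction. As a toy sketch, assuming a flight database and a booking step that are entirely invented for illustration, the transactional half might look like this:

```python
# Hypothetical rebooking logic for a canceled-flight IVR scenario.

flights = [
    {"number": "UA210", "route": ("BOS", "SFO"), "departs": "14:05", "seats": 0},
    {"number": "UA214", "route": ("BOS", "SFO"), "departs": "17:30", "seats": 3},
    {"number": "UA300", "route": ("BOS", "ORD"), "departs": "15:00", "seats": 5},
]

def rebook(canceled_route):
    """Find the next flight on the same route with an open seat and book it."""
    for flight in sorted(flights, key=lambda f: f["departs"]):
        if flight["route"] == canceled_route and flight["seats"] > 0:
            flight["seats"] -= 1  # the transaction step
            return f"You're rebooked on {flight['number']} departing {flight['departs']}."
    return "No seats available today; shall I check tomorrow?"

print(rebook(("BOS", "SFO")))  # You're rebooked on UA214 departing 17:30.
```

The conversational half, turning "my flight was canceled, what can you do?" into that one function call with the right route, is where today's systems still struggle.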
Today Forbes published an article about the launch of Japan’s Kirobo conversational robot into space. We wrote about Kirobo in a previous post. Now, after a successful launch, the little guy is set to reach the International Space Station (ISS) on August 10th.
Just last week we took a look at SuperToy Teddy, a teddy bear that can be transformed into a true conversational partner by means of a smartphone and a chatbot app. The makers of Kirobo, who include Toyota and the University of Tokyo, have taken the concept even further. They've created what appears to be a fully functioning, foot-high humanoid robot that can walk and talk and turn to look at the person it's chatting with.
The team behind Kirobo seems to be interested in exploring the potential psychological benefits of robot companionship for humans. Astronauts on their way to Mars, or those cooped up in the ISS for a year, are an extreme example of humans living in isolation. But loneliness is a part of modern human existence, even for city dwellers surrounded by thousands of other people. Kirobo offers a way to study the effects that a conversational robot can have on people in isolation. Perhaps we'll have an opportunity to watch a broadcast of Japanese astronaut Koichi Wakata, who is set to arrive at the ISS later this year, speaking live with Kirobo from inside the station. And here's a question: can Kirobo go on a spacewalk without putting on a suit?
Whatever the team learns from Kirobo’s trip to space could be used to improve how the robot functions and talks. Maybe one day soon commercial models will be available to offer companionship to folks here on earth. It’s always good to have someone to talk to. Hopefully they’ll come out with a model that can converse in English as well as Japanese. Take a look at the Kirobo unveiling video if you want to see the talking robot in action. You have to admit he’s pretty darned cute!