O’Reilly’s Solid Conference 2015 to Include Speech and Intelligent Assistant Topics

O’Reilly’s Solid Conference 2015, covering the topics of Hardware, Software & the Internet of Things, is shaping up to be a very interesting event. Solid 2015, aka Solidcon, takes place in San Francisco June 23–25, 2015.

The event includes tutorials, presentations, keynotes, and an exhibitor section. Two of the presentations announced in the schedule caught my attention.

Tim Lynch of Nuance will be speaking about “Designing for Dialogue: How Speech Unifies the Internet of Things.” Lynch’s presentation touches on a number of compelling topics related to speech technology, including how we can use speech to create common experiences across different connected devices, and considerations for designing effective speech interfaces.

Mark Stephen Meadows is speaking about “The Design of Personalities and Natural Language UX.” Meadows is the founder of Geppetto Avatars, which creates 3D animated intelligent assistants for a variety of use cases, particularly in the healthcare space. Meadows’ discussion touches on topics such as how power dynamics and psychology affect the success of embodied virtual agents that act as healthcare advisors.

The Solidcon website indicates that the best pricing for the conference ends on April 2, so you still have some time to make up your mind about attending. If you’re a student or independent innovator, you only have until midnight on February 27th (that’s tomorrow!) to apply to become a Solid Fellow and have all of your expenses for the event covered.

Conversational Toys – The Latest Trend in Speech Technology

Conversational toys seem to be all the rage. Two new entrants to the market are Mattel’s Hello Barbie and Elemental Path’s new CogniToy dinosaur. Before delving into the specifics of each toy, here’s a list of the primary features that both toys seem to share (a rough sketch of how these pieces might fit together follows the list):

  • Speech recognition and natural language processing capability
  • Connection to the cloud
  • Ability to store basic information from previous conversations
  • Ability to offer personalized responses
  • Software that can evolve over time (get updates from the cloud server)
  • Activation of the toy’s listening mode by pressing a button
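Neither Mattel nor Elemental Path has published technical details, but the feature list above suggests a familiar pipeline: button-gated capture, cloud-side recognition and language processing, a per-child memory, and over-the-air updates. Here’s a minimal Python sketch of how those pieces might fit together; every name in it (ConversationalToy, CLOUD_URL, and so on) is a hypothetical placeholder, not an actual API from either company.

```python
import json
import urllib.request

# Hypothetical cloud endpoint -- neither vendor's real API is public.
CLOUD_URL = "https://example.com/toy/converse"

class ConversationalToy:
    """Rough sketch of the shared feature set: push-to-talk capture,
    cloud-based recognition, simple memory, personalized replies."""

    def __init__(self, child_id):
        self.child_id = child_id  # ties stored facts to one speaker

    def on_button_press(self):
        """Listening mode activates only while the button is pressed."""
        audio = self.record_audio()  # placeholder capture step
        reply = self.ask_cloud(audio)
        self.speak(reply)

    def ask_cloud(self, audio_bytes):
        """Send audio plus a speaker id; the server does ASR + NLP,
        consults stored facts, and returns a personalized reply."""
        request = urllib.request.Request(
            CLOUD_URL,
            data=json.dumps({"child_id": self.child_id,
                             "audio": audio_bytes.hex()}).encode(),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request) as response:
            return json.load(response)["reply_text"]

    def record_audio(self):
        return b""  # stand-in for the microphone capture

    def speak(self, text):
        print(f"Toy says: {text}")  # stand-in for TTS / recorded audio
```

The design point worth noticing is that the toy itself stays simple: the button gates the microphone, and all of the intelligence (and the stored conversation history) lives server-side, which is also what makes behavior updates from the cloud possible.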

Barbie Chatbot Doll Powered by ToyTalk

Mattel is partnering with ToyTalk for the Hello Barbie doll. ToyTalk produces several popular mobile apps for children. ToyTalk also has speech recognition technology that’s specifically tuned to understand the higher register and more erratic speech patterns of children’s voices.

A Mattel spokesperson provided a brief demo of Hello Barbie at the recent Toy Fair 2015 in New York. Hello Barbie’s current conversational abilities are comparable to those of a chatbot that’s connected to Wikipedia or some other data source.

The talking Barbie can also store information about its conversational partners in the cloud, so that it can call on its memory to create more personalized responses. In the demo example, Hello Barbie remembers that its interlocutor enjoys being onstage. When the question of possible future jobs comes up, Hello Barbie uses this stored information to suggest a career such as dancer or politician, presumably since both jobs involve lots of time onstage.
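ToyTalk hasn’t published how its servers store or use these memories, so the following is only a guess at the general shape: a per-child fact store that gets consulted during response generation. All of the names and rules below are invented for illustration.

```python
# Minimal sketch of cloud-side conversational memory.
# The fact store and the career rule are illustrative, not ToyTalk's design.
memory = {}  # child_id -> set of remembered facts

def remember(child_id, fact):
    memory.setdefault(child_id, set()).add(fact)

def suggest_career(child_id):
    facts = memory.get(child_id, set())
    if "likes being onstage" in facts:
        # Personalized: both suggestions involve lots of time onstage.
        return "You could be a dancer, or maybe a politician!"
    return "What kinds of things do you enjoy doing?"

remember("child-42", "likes being onstage")
print(suggest_career("child-42"))
```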

According to the spokesperson, Barbie will have the ability to play simple conversational games, tell jokes and stories, and learn more about the person it’s talking to. The company hopes that offering this type of dynamic interaction with the doll will deepen the child’s relationship with it.

CogniToys by Elemental Path on Kickstarter

Elemental Path is currently blowing the roof off the Kickstarter campaign to fund their CogniToys talking toy project. The last time I looked, they had raised more than three times the $50K they were asking for, and there were still 23 days left in the campaign.

Elemental Path seems to have evolved from Majestyk Apps, which was one of the winners of the IBM Watson Mobile Developer Challenge.

Elemental Path is marketing their first production talking toy as both educational and entertaining. In their Kickstarter video, the cute dinosaur creature quizzes kids on simple math and counting exercises, answers science trivia questions, and tells knock-knock jokes.

From the demo, it’s not completely clear how the CogniToy leverages the IBM Watson technology. According to the Elemental Path website, the toy contains a dialogue engine that uses advanced language-processing algorithms.
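That said, the behaviors shown in the demo map naturally onto a routing pattern most dialogue engines share: classify the child’s utterance into an intent, then hand it off to a matching skill. The sketch below is purely illustrative; it is not Elemental Path’s code, and the naive keyword matching stands in for whatever language-processing classifier (possibly Watson-backed) they actually use.

```python
import random

# Hypothetical intent router for the behaviors shown in the demo video:
# math quizzes, science trivia, and knock-knock jokes.

def handle_math(_):
    a, b = random.randint(1, 9), random.randint(1, 9)
    return f"What is {a} plus {b}?"

def handle_trivia(_):
    return "The red planet is called Mars!"

def handle_joke(_):
    return "Knock knock! ... Interrupting dinosaur. ROAR!"

INTENTS = {
    "math": handle_math,
    "science": handle_trivia,
    "joke": handle_joke,
}

def route(utterance):
    """Naive keyword matching; a real engine would use an NLP service
    to classify the child's request."""
    for keyword, handler in INTENTS.items():
        if keyword in utterance.lower():
            return handler(utterance)
    return "Hmm, tell me more!"

print(route("Can we do a math game?"))
```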

Concerns About Conversational Toys?

In an opinion piece for ComputerWorld.com, Mike Elgan writes about the potentially darker side of both Hello Barbie and CogniToys. Elgan’s main concern about the Barbie toy is that it pulls young children into a world of total surveillance. Everything children say to their Barbie is captured and sent to the cloud to be analyzed and stored on ToyTalk’s server. According to Elgan, ToyTalk will also email conversations to the parent.

In the case of CogniToys, Elgan expresses concern that the question-answering dinosaur teaches children that knowledge is stored in the cloud and served up by artificial intelligence. Elgan fears that, in the future, children may not bother to learn or experience things on their own. Instead, they’ll just ask their intelligent assistant for the answer.

What’s the Market for Conversational Toys?

If the success of Elemental Path’s Kickstarter campaign is any indication, there might be a sizeable market for conversational toys. The fact that Mattel feels motivated to partner with ToyTalk is another sign that we need to take the trend seriously.

Will children be better off with conversational toys than without them? Only time will tell. It seems to me that there can be many positive outcomes from interactions with toys like Hello Barbie and CogniToys. It depends in large part on what producers “program” the toys to do, how interactive the toys become, and whether they spark a child’s creativity and critical thinking rather than stifle them.

Talking Hospital Robot Tells You What It’s Doing

I recently read a story in Wired about the Aethon TUG robot designed for use in hospitals. Apparently the robot has been around for several years, but it’s gotten lots of media attention since the University of California, San Francisco’s Mission Bay wing deployed 25 of the robots.

The TUGs are designed to transport things like patient meals, medicines, medical waste, and linens within the hospital. Hospital staff use a touch screen to send the robot to a specific location. The TUG has built-in sensors and maps that enable it to navigate its way autonomously through corridors, into elevators, and around corners and obstacles.

What especially intrigued me was the fact that the TUG can talk. In fact, Matt Simon, the author of the Wired article, calls the robot “chatty.” I didn’t see any mention of the robot’s speech abilities on the Aethon website, though I could have overlooked it.

I was able to locate a 2011 video of a talking TUG operating in a hospital. To say that the TUG talks is a bit of an overstatement. The robot seems to have a set of pre-recorded scripts that it can broadcast when it needs to convey what it’s doing.

In the video, the TUG announces that it has called the elevator, that it’s waiting for the elevator to arrive, and that it’s about to board the elevator, so please step aside. It looks like the TUG provides this spoken information whether or not someone is actually there to hear it.
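Based on the video, the TUG’s speech looks less like dialogue and more like a lookup from robot state to recorded clip. Here’s a rough sketch of that pattern; the state names and phrasings are mine, not Aethon’s.

```python
# State-to-utterance table in the style of the 2011 video.
# States and wording are invented for illustration.
ANNOUNCEMENTS = {
    "elevator_called":   "I have called the elevator.",
    "elevator_waiting":  "I am waiting for the elevator to arrive.",
    "elevator_boarding": "I am about to board the elevator. Please step aside.",
}

def on_state_change(new_state):
    """Broadcast the matching clip whether or not anyone is listening."""
    phrase = ANNOUNCEMENTS.get(new_state)
    if phrase:
        play_recorded_clip(phrase)

def play_recorded_clip(phrase):
    print(f"TUG: {phrase}")  # stand-in for the onboard speaker

for state in ("elevator_called", "elevator_waiting", "elevator_boarding"):
    on_state_change(state)
```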

So why does the TUG need to talk at all? Unlike the hardware store robot that I wrote about previously, the TUG is not designed to be customer-facing. It doesn’t interact with patients or answer questions. It just needs to be told where to go.

But giving the TUG a voice seems like the best way for it to communicate its intentions. The robot has to interact within a dynamic social environment. To be successful, it can’t just barge blindly through the hallways, and it can’t afford to be seen as standoffish and unpredictable. Being able to say what it’s up to goes a long way toward making it seem less alien. The TUG doesn’t come across as any less robotic for the fact that it can talk, but it does become more of an accepted part of the social fabric.

It would be interesting to see what new capabilities the TUG might gain if it were equipped with speech recognition and NLP technology. If you were a patient, you might be able to ask it what was for lunch. In the best case, if you didn’t like the hospital menu, you could send it across the street for a burger. But that’s probably wishful thinking.

Amazon Echo / Alexa and Others – Just Getting Started?

Geoffrey A. Fowler wrote a critique of Amazon’s Echo in a recent Wall Street Journal article. Fowler notes that the list of actions Alexa (the default name of Echo’s intelligent assistant) can perform is relatively short. Alexa can tell you the time and report the weather; read short entries from web data sources such as Wikipedia and the dictionary; set timers and do measurement conversions to help you in the kitchen; make shopping lists and buy music on Amazon; and play music from streaming services. It performs all these actions based on your voice input.

Fowler’s favorite use for Alexa is as an on-demand DJ. He’s disappointed, though, by the broad range of functions Alexa can’t perform. Alexa can’t yet connect to smart home devices, like the Nest thermostat. It can’t read from a book or make book recommendations. It can’t answer many of the questions that Siri or Cortana can answer. Though it can make a shopping list and display the list in a companion app, Fowler notes that it can’t execute any purchases other than buying music on Amazon.

Alexa’s biggest shortcoming, in Fowler’s opinion, is that it has no access to email, calendar, and contacts. Alexa is stuck in your living room, or wherever in your house you choose to place the Echo. That means it doesn’t reside in your smartphone, where most of the context relating to your life is stored.

I wrote about Amazon’s Echo in a previous post, where I dreamed about the day when Alexa could act as a truly powerful personal assistant. In a post where I focused on the yet-to-be-released Jibo social robot, I explored the question of whether stationary physical assistants like the Echo, the Ubi, Jibo, and others will ever be able to compete with the intelligent apps in our smartphones. And then there’s the recent Kickstarter success by Robotbase, which is a physical assistant too, but one that’s not completely stationary.

Echo and these other “physically present” assistants are so new that the jury is still out. But something tells me that in another year or two, we’ll be surprised by the functionality leap these devices have made. Once they have access to the same personal context data that our smartphones do, much of the gap between a Siri / Google Now / Cortana and an Alexa / Ubi / Jibo will be erased. The next challenge will be our desire to take them with us wherever we go.

Speech Technology Magazine’s Look at the State of Intelligent Virtual Assistants

Speech Technology Magazine’s Spring 2015 edition contains an in-depth look at the state of speech technology. Included in that coverage is an article by Michele Masterson on intelligent virtual assistants (IVAs). Masterson describes customer-focused intelligent assistants as technology that bridges the gap between web self-service and human support agents.

Self-service, Masterson explains, is best suited for helping customers answer frequently asked questions or perform basic tasks. In contrast, live service agents can assist customers with all types of complex inquiries and transactions. IVAs, it turns out, can nicely fill the critical gap between the low-touch and high-touch customer service experience.

The technology of IVAs has advanced to the point where these assistants can understand natural language questions, surmise context, search for answers, and even carry out business processes. Masterson points out that IVAs are capable of tasks that include booking an airline flight, ordering a pizza, and executing personal banking transactions.
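Masterson doesn’t describe how an IVA actually executes such a transaction, but the standard pattern in the industry is intent recognition followed by slot filling: keep prompting until every required piece of information has been collected, then call the back-end business process. A minimal sketch, using a hypothetical pizza order (none of these names come from the article):

```python
# Minimal slot-filling sketch of an IVA carrying out a business process.
# The slots and back-end call are illustrative, not any vendor's API.
REQUIRED_SLOTS = ("size", "topping", "address")

def next_prompt(filled):
    """Ask for the first slot the customer hasn't supplied yet."""
    for slot in REQUIRED_SLOTS:
        if slot not in filled:
            return f"What {slot} would you like?"
    return None  # everything collected; ready to execute

def handle_turn(filled, slot, value):
    filled[slot] = value
    prompt = next_prompt(filled)
    if prompt:
        return prompt
    return place_order(filled)  # the actual business process

def place_order(order):
    # Stand-in for a call to the company's order system.
    return f"Ordering a {order['size']} {order['topping']} pizza to {order['address']}."

order = {}
print(handle_turn(order, "size", "large"))        # asks for the topping next
print(handle_turn(order, "topping", "mushroom"))  # asks for the address next
print(handle_turn(order, "address", "12 Main St."))
```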

Masterson quotes Dan Miller of Opus Research as saying that he expects the IVA market to reach $700 million by 2016. The article also gives a glimpse into some of the IVA vendors in the marketplace and into sample deployments. You can read the full article on the Speech Technology Magazine website.

CodeBaby Creates Virtual Guide for Colorado Healthcare Exchange

Derek Top of Opus Research recently wrote about intelligent assistants in the healthcare space. One of the technologies Top profiled was CodeBaby’s intelligent assistant Kyla, which supports Colorado’s health insurance marketplace. I gave Kyla a test drive to see how she works.

CodeBaby’s Kyla differentiates itself from other self-service virtual assistants in that there is no text-based interface. In fact, you can’t ask Kyla specific questions. Though this structure sounds like it would be very constraining for the potential health insurance customer, I found that Kyla actually works quite well.

Kyla appears as an animated image of a young woman. The animation is pleasant and doesn’t mimic a human closely enough to be creepy. Kyla is more like a guide than a question-answering bot. She pops up at the lower right of the healthcare connect screen and plays a pre-recorded message to welcome the user to the site and give them a quick overview of what the site is about. All of Kyla’s statements are pre-recorded. She has a human voice, and her statements are delivered with natural intonation, which is a big plus over computer-generated speech.

Once Kyla has finished introducing the user to the site (or to a new web page), a pop-up appears with a selection of other topics that Kyla can address. Some examples of topics the user can choose are:

  • How long will it take me to enroll?
  • I want to learn more about financial options
  • What are the important deadlines I should know about?

When you select a topic, Kyla delivers her pre-recorded response. The text of her response isn’t displayed on the screen, so you must have your speakers turned on and you must be able to hear. You can start and stop Kyla’s recorded message and replay it as often as you like.
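The interaction model is simple enough to sketch: each page pairs a welcome clip with a menu of topic clips, plus replay controls. The code below is my own rough illustration of that structure; the clip file names and player call are invented, not CodeBaby’s implementation.

```python
# Sketch of a guide-style assistant: no free-text input, just a menu
# of pre-recorded clips per page. File names are invented.
PAGE_TOPICS = {
    "How long will it take me to enroll?": "clips/enroll_time.mp3",
    "I want to learn more about financial options": "clips/financial.mp3",
    "What are the important deadlines I should know about?": "clips/deadlines.mp3",
}

class GuideAssistant:
    def __init__(self, welcome_clip, topics):
        self.welcome_clip = welcome_clip
        self.topics = topics
        self.current = None

    def on_page_load(self):
        self.play(self.welcome_clip)  # greet and orient the visitor

    def on_topic_selected(self, topic):
        self.current = self.topics[topic]
        self.play(self.current)

    def replay(self):
        if self.current:
            self.play(self.current)  # users can replay as often as they like

    def play(self, clip):
        print(f"Playing {clip}")  # stand-in for the site's audio player

kyla = GuideAssistant("clips/welcome.mp3", PAGE_TOPICS)
kyla.on_page_load()
kyla.on_topic_selected("What are the important deadlines I should know about?")
```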

I think Kyla definitely works as a site guide and advisor. If a user has questions beyond those that have already been anticipated, Kyla won’t be able to assist. But she can refer the health insurance shopper to a webpage that helps them find a human broker. The intelligent assistant guide is a great fit for a website as complex and intimidating as a health insurance marketplace. It’ll be interesting to see if the intelligent-assistant-as-guide model gets expanded to other use cases.

Intelligent Assistants: Our New Co-Workers

Opus Research has published my article “Intelligent Assistance and the Inside Game: Inroads into the Enterprise.” The article focuses on two recent examples of intelligent assistants that are expanding the reach of these technologies beyond the customer-facing self-service realm. The first example is Oracle’s voice-enabled sales rep assistant, created in partnership with Nuance and aimed at users of Oracle’s CRM system. I wrote about the Oracle Voice app for Oracle Sales Cloud in an earlier post here. The second example is a collaboration between Cisco and the virtual agent technology company noHold. The result is SARA, a powerful assistant that helps simplify the work of network administrators. SARA can not only answer questions but also execute system commands. Have a look at the article on the Opus Research site for the full story.