Nuance Impresses with Domino’s Intelligent Assistant

At the recent SpeechTek 2014, I had an opportunity to sit down with Nuance’s Gregory Pal. Pal is Nuance’s Vice President, Strategy & Business Development, Enterprise Division. One thing that came across loud and clear as I spoke with Pal was the sheer size and breadth of Nuance. The intelligent assistant / virtual agent space is competitive and crowded these days, but Nuance is the undisputed 800-pound gorilla. Pal informed me that Nuance is segmented into four major divisions: Healthcare, Imaging, Mobile and Consumer, and Enterprise. The Enterprise division deals mainly with Business to Consumer solutions.

Domino's Pizza AppNuance’s influence is wide-reaching. According to Pal, 75% of all Fortune 500 companies use Nuance products. 1 in 4 adult Americans are reached at least once a year by companies using an outbound Nuance communication solution. That’s some pretty extensive influence! In the Opus Research report on Enterprise Virtual Assistants published earlier this year, Opus rated Nuance as the top vendor.

When it comes to the intelligent assistant market, which is heating up, Nuance has funneled all of its powerful technology into its Nina product. Nina has all the components required to make a successful web self-service, speech-enabled user interface for consumers. These include voice biometrics, speech recognition, natural language understanding, dialog management, and speech output services. The Nina product is also backed by a large professional services organization.

While Nina is in use by many companies for many different purposes, Pal was eager to talk about the recently released Domino’s app that features a branded assistant called “Dom.” After SpeechTek, I downloaded the Android version of the app and gave it a thorough trial. It turned out to be a fun and pleasant test experience. Dom is easy to work with. He has a pleasing voice and a patient personality: positive traits in someone you’re ordering pizza from when you can’t quite make up your mind what you want!

Dom first asks if you want to place a delivery order or a pickup order. I told him I wanted to pick up my order, and he immediately showed me a screen listing the closest Domino’s locations. I didn’t even have the GPS on my smartphone activated, but the list he provided was accurate. I picked the Domino’s just down the street from me, and he set that as my store.

I wasn’t sure what I wanted to order. Dom didn’t understand me when I asked what my choices were, but he understood when I asked for the menu. He showed me a very easy-to-read menu with photos of different food choices. I told him I wanted a pepperoni pizza, and he brought up three size choices and asked which I wanted. I like the fact that Dom’s questions are reinforced by images and text on the smartphone interface. I asked if he had any drinks and he answered enthusiastically, “We’ve got some good ones!” Again, he showed a list of choices with images for reinforcement. I ordered a bottle of water and then asked for two more. Dom added all of this to my shopping cart, and it was very easy to visually confirm what was being added. The final step was to confirm the order and then go to a screen where I needed to type in my contact information.

I was very pleasantly surprised at how smoothly Dom led me through the entire ordering process. You need to keep in mind that there are no human ‘assistants’ working in the background of this Nina-driven app. It’s all pure machine technology. It’s not only impressive, but it’s practical as well. I can definitely see myself using this app, or a similar app, to order dinner.

I asked Pal if he was concerned at all about IBM Watson and if he sees Watson as a potential threat to their top spot in the intelligent assistant market. Pal believes that Watson and Nina are more complementary than directly competitive. Watson is a great tool for mining broad subject matter expertise. Nina, on the other hand, is flexible and can be configured to serve as a company’s branded representative, to carry on targeted conversations, and to integrate with useful apps like the Domino’s app.

I also asked Pal whether he was concerned about all the competition in the intelligent agent space. He recognizes the competition, but he knows that Nuance is playing with a bit of an unfair advantage. With so much technology and the ability to invest $300 million a year in research and development, it’s really tough for smaller players to keep up. The superior performance of intelligent assistants like Domino’s Dom proves that Nuance is going to be hard to displace from the top of the heap.

Pandorabots Unveils Major Upgrades

At SpeechTek 2014, I had the opportunity to sit down with Pandorabots’ Chief Science Officer, Dr. Richard Wallace. Wallace is legendary in the chatbot community as the inventor of the popular AIML (Artificial Intelligence Markup Language) and the creator of A.L.I.C.E., a highly awarded chatbot that has won the Loebner Prize three times. Wallace has been extremely busy over the past couple of years. Not only has he been working on a major upgrade to the Pandorabots website and hosting service, but he has also completed a huge overhaul of the open-source AIML language, now available as AIML 2.0. Included in the AIML 2.0 release is a completely updated, more polished version of the A.L.I.C.E. chatbot content. This treasure trove of existing conversational content is available to anyone who wishes to create a chatbot that takes advantage of AIML 2.0 and the Pandorabots platform.

So what’s the new Pandorabots website like? The first thing to note is that the site is still in beta, so you may encounter a few minor glitches. When you visit the site, though, you’re greeted by a very polished, professional-looking single-page application that provides access to the chatbot development platform and an extensive tutorial. The Pandorabots team has always supplied good documentation, but the new tutorial is super handy and very thorough. There’s really no excuse to sit on the sidelines if you’re interested in building a conversational bot.

The new development platform has been dubbed the Playground, and you can quickly sign up for access using one of several popular email or social networking accounts. The Playground is a slicker version of the old development environment, but it contains all of the features of its predecessor, as far as I could tell. The new user interface provides a somewhat more intuitive menu for accessing the various features needed by botmasters. As with the old platform, the new interface includes an automated training capability. The training feature allows you to teach your chatbot correct responses as you’re conversing with it.

You can also access all of the chat logs from the Playground and make changes to your bots’ responses from inside the logs. If you want to create, modify, or upload .aiml files, you can still do that too. For more experienced botmasters, there are additional advanced features. If you’re ready to share your bot with the community, you can publish it to the Clubhouse.

Note that if you had a previous account on Pandorabots, all you need to do is sign in from the new Homepage and you’ll be routed directly to the old user interface, where you’ll see a list of your existing chatbots. If you want to transition to the new Playground to enjoy its features and fresher look and feel, you’ll need to create a new account and then manually port your existing AIML files over.

In my discussion with Dr. Wallace, he indicated that a marketplace for the exchange of chatbot content, capabilities, and features would be coming soon. This seems like a great idea. It looks like this exchange, along with an API that will enable you to link your chatbot to an app or to other services (à la CallMom), is in the works. Based on information in the FAQ section, it seems support for ChatScript and other chatbot development platforms is also anticipated.

What about AIML 2.0? What makes it superior to the original version? For one thing, AIML 2.0 and A.L.I.C.E. 2.0 support the mobile apps CallMom BASIC and CallMom. The BASIC version runs entirely on your device, without needing a connection to the web. AIML 2.0 facilitates mobile apps with something called out-of-band (OOB) functions, which allow the bot / intelligent assistant to access the features of a phone, such as making calls or setting reminders. AIML 2.0 also extends the capability of AIML by adding new, more powerful wildcards, adding commands that allow external services to be accessed, and providing a slew of other goodies.
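To make the OOB idea more concrete, here’s a minimal sketch (my own illustration, not Pandorabots or CallMom code) of how a host app might intercept an out-of-band directive embedded in a bot’s reply. The <oob><dial>...</dial></oob> markup follows the style of published AIML 2.0 examples; the handler functions are hypothetical stand-ins for real phone APIs.

```python
import re

# Hypothetical phone actions a host app might expose to the bot.
def dial(number):
    print(f"Dialing {number}...")

def set_reminder(text):
    print(f"Reminder set: {text}")

OOB_HANDLERS = {"dial": dial, "reminder": set_reminder}

def handle_reply(bot_reply):
    """Strip an <oob> directive from a bot reply and dispatch it to the device."""
    match = re.search(r"<oob><(\w+)>(.*?)</\1></oob>", bot_reply)
    if match:
        tag, payload = match.groups()
        handler = OOB_HANDLERS.get(tag)
        if handler:
            handler(payload)
        bot_reply = re.sub(r"<oob>.*?</oob>", "", bot_reply).strip()
    return bot_reply  # the spoken/displayed portion the user actually sees

# Example: an AIML template produced both speech and an OOB directive.
print(handle_reply("Calling Mom now. <oob><dial>5550100</dial></oob>"))
```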

But Wallace is very deliberately striving to keep AIML simple. In his view, chatbots are the domain of literary, creative types. Programmers are certainly welcome to create chatbots and they may excel at it. But the creative English-major types are often the ones who come up with truly imaginative and rich personalities that become the most convincing conversational partners (think Eugene Goostman). For this reason, Wallace is dedicated to keeping AIML a tool that is easily accessible to non-programmers.

The new Pandorabots is a great resource for independent web developers, marketing consultants, or other service providers who might want to offer their customers the addition of a virtual assistant, either on their website or as a standalone app. I can imagine many small businesses wanting to experiment with virtual assistants to engage customers and provide access to things like weekly specials or answers to frequently asked questions. Pandorabots and similar platforms make it so easy to create and maintain a chatbot that small business owners can even do the work themselves, or outsource it to a creative family member. Got a college graduate lingering around the house between the occasional job interview? Put him or her to work building a corporate virtual assistant for your business! Chances are good that they’ll actually enjoy the work.

LinguaSys – Helping Intelligent Assistants Understand Us

At the recent SpeechTek 2014 event, I had an opportunity to speak with Brian Garr, Chief Executive Officer of LinguaSys, a very interesting company in the Natural Language Understanding space. The prevalence of speech-enabled applications and devices has increased exponentially in the past five years. We can talk to our smartphones, our cars, and even our home appliances. Soon we’ll be conversing with social robots like Ubi and Jibo. Speech recognition technology has made vast improvements over the years. We’re also used to typing in text when we want a search engine, an app, or an intelligent assistant to answer a question or help us complete a transaction. But what about natural language understanding technology? All of this incoming language, whether it be spoken or typed, has to be interpreted and understood before we can get back the answers we need.

Our intelligent assistants seem to understand us pretty well when we ask simple questions about the weather or fact-based questions like “What’s the capital of Wyoming?” But can they understand more complex statements? And can they understand them when we use different languages? LinguaSys is a niche player with a unique and very powerful offering that can make intelligent assistants smarter at understanding what we say. In fact, the LinguaSys technology powers many of the smart applications we use today that involve natural language input.

In talking with Garr about the LinguaSys technology, I learned that they have the keys to a veritable gold mine. The gold mine is a proprietary treasure trove of word meanings and semantic relationships that spans thousands of concepts and over 18 languages. The LinguaSys semantic network was built up over the years during which the company offered machine translation software. The company’s products still include machine translation, but the same basic technology now enables the seamless translation and understanding of a huge range of possible conversational inputs. How does this work? In the LinguaSys database, word meanings, concepts, and relationships are stored in a language-neutral, symbolic format. That means the word “rainbow” has the same symbol whether the concept is uttered in Japanese, Urdu, or English.
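As a rough illustration of the idea (a toy sketch of my own, not the LinguaSys data model), a language-neutral lexicon might map surface words from several languages onto shared concept identifiers:

```python
# Toy language-neutral lexicon: surface forms from different languages
# map to the same concept ID. Entries and IDs are invented for illustration.
LEXICON = {
    ("en", "rainbow"): "CONCEPT_RAINBOW",
    ("ja", "虹"): "CONCEPT_RAINBOW",
    ("de", "regenbogen"): "CONCEPT_RAINBOW",
    ("en", "poodle"): "CONCEPT_POODLE",
    ("de", "pudel"): "CONCEPT_POODLE",
}

def to_concept(lang, word):
    """Resolve a surface word in any supported language to its concept ID."""
    return LEXICON.get((lang, word.lower()))

print(to_concept("ja", "虹"))      # CONCEPT_RAINBOW
print(to_concept("en", "Poodle"))  # CONCEPT_POODLE
```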

The use case example that Garr used during our discussions was of someone wanting to make a reservation at a hotel that would also accommodate their poodle. A speech recognition engine can probably do a good job of translating the sounds into the right words. But what are the chances that it’ll know that a poodle is a dog, which is a domesticated animal, also known as a pet? This type of conceptual understanding is embedded in the LinguaSys system. It would take a monumental amount of work to establish your own comprehensive semantic model capable of extracting this type of understanding. You might be able to leverage something like Freebase for some applications. But then what happens when you need to start supporting other languages?
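Continuing the toy sketch above (again, my illustration rather than the LinguaSys implementation), the poodle example boils down to walking an is-a hierarchy from the concept the user mentioned up to a concept the application actually cares about:

```python
# Toy is-a hierarchy; a real semantic network is vastly larger and richer.
IS_A = {
    "CONCEPT_POODLE": "CONCEPT_DOG",
    "CONCEPT_DOG": "CONCEPT_PET",
    "CONCEPT_PET": "CONCEPT_ANIMAL",
}

def is_kind_of(concept, target):
    """Return True if concept is target or one of its descendants."""
    while concept is not None:
        if concept == target:
            return True
        concept = IS_A.get(concept)
    return False

# "Can I bring my poodle?" -> the booking system should filter for pet-friendly rooms.
if is_kind_of("CONCEPT_POODLE", "CONCEPT_PET"):
    print("Search pet-friendly hotels only.")
```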

The Carabao Linguistic Virtual Machine, as the product offering is called, can basically be plugged into your application to give it an NLU boost. If you leverage the Carabao Linguistic VM for your hotel booking or general reservation system, the system will understand that when someone refers to their poodle, they’re looking for a pet-friendly accommodation.

Garr refers to the LinguaSys products as middleware. You can access the solution via the cloud or from your own on-premise deployment. Based on my understanding of the product set, they can be readily integrated into new or existing applications using industry standard protocols.
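I don’t know the specifics of the LinguaSys interfaces, but middleware exposed over standard web protocols is typically called along the lines of the sketch below. The endpoint, payload, and response fields here are entirely hypothetical placeholders, not the actual LinguaSys API.

```python
import json
import urllib.request

# Hypothetical NLU middleware endpoint; substitute the vendor's real API.
ENDPOINT = "https://nlu.example.com/analyze"

def analyze(text, lang="en"):
    """Send raw user text to the NLU service and return its concept analysis."""
    payload = json.dumps({"text": text, "lang": lang}).encode("utf-8")
    request = urllib.request.Request(
        ENDPOINT, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)

# e.g., analyze("I need a room that will take my poodle") might come back with
# concepts like PET and LODGING that the booking application can act on.
```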

I don’t know what the pricing model is for access to the LinguaSys middleware. The solution may not be affordable for smaller companies or independent botmaster types, but I don’t know that for sure. If your product or technology depends on being able to correctly understand language input, and especially if you’re challenged with accepting input in multiple languages, this is a product you’ll likely want to explore.


What Will Google Do with Emu’s Messaging App with Built-In Assistant?

A few months ago, I wrote about X.ai and the intelligent assistant product that they call Amy. Amy is different from most personal intelligent assistants in that you don’t talk to her directly. Instead, Amy ‘listens in’ to your email conversations and takes instructions based on the content of what you write. You can instruct her specifically to set up a meeting for you, but she’s intelligent enough to pick up the nuances of what you need by analyzing your emails.

This type of implicit understanding seems to be the latest trend in intelligent assistant technologies. TechCrunch reported last week that Google acquired Emu, an instant messaging app that appears to have the same sort of context-aware, behind-the-scenes smart assistant built into it. It was just back in April of this year, according to another TechCrunch report, that Emu exited beta with its mobile messaging app. Obviously, Google must see a lot of promise in the technology if it was so quick to snap it up.

Emu seems to have a broader range of talents than X.ai’s Amy at this point. According to the TechCrunch article, Emu can proactively offer up contextual information based on a number of different topics that you might happen to be texting with friends about. If you’re texting about a dinner date, for example, Emu can show you your calendar, as well as the location and Yelp ratings of relevant restaurants. It can offer the same type of on-the-spot info about nearby movies if the conversation turns in that direction. The app also lets you tap a button to carry out an action related to the information Emu has retrieved. For example, you can reserve a table at a restaurant or purchase movie tickets.

All of these attributes make Emu sound more like a real personal assistant than either Siri or Google Now. And it seems the importance of perfecting voice recognition is taking a back seat to an assistant’s ability to infer context and relevant data based on “ambient” information. I use the term ambient to refer to information that surrounds us in our emails, texts, and search behavior. Google Now seems to be more satisfying than Siri as an assistant precisely because you don’t have to talk to it or ask it anything. It picks up pieces of relevant information about your life by accessing the same data sources that you use routinely.

It will be interesting to see what Google does with the Emu acquisition. It’s also a fun thought experiment to consider how this type of ambient assistance could be applied to enterprise virtual assistants. Recommendation engines, like those suggesting books and movies you might like, are an example of this technology. Customer service intelligent agents that are smart enough to assist you based on a knowledge of your past purchases and preferences might be an appealing concept–as long as they can steer clear of the creepy factor.

IBM Watson Now Acts as Smart Executive Advisor

I recently wrote about IBM Watson’s new debating skills. In that post, I reported on how Watson is now able to build logical arguments both for and against specific policies or topics (using viewpoints written and published by real people). Now IBM has added new capabilities to its cognitive computing platform. Based on an article from the MIT Technology Review, Watson can now act as a smart executive advisor in meetings.

Tom Simonite, author of the MIT Technology Review article, described a test run of the new Watson feature that took place at IBM’s Cognitive Environments Lab in Yorktown Heights, New York. The lab has a conference room outfitted with microphones that pick up everything meeting participants say. Speech recognition software apparently transcribes all of the speech in real time and feeds the text to Watson, which at this point accepts only text input.

In the lab test that Simonite observed, a group of IBM folks pretended that they were at an executive strategy session where the goal was to identify strong acquisition targets for a company. Watson was directed to read over a corporate strategy memo summarizing the goal. It was then asked to find companies that would make good acquisitions and bring back a list of them.

Once Watson returned a list of potential acquisition targets, the meeting participants proceeded to discuss them. They asked Watson to shorten the list and to include key characteristics of each company in a table. Finally, they asked Watson to recommend the strongest candidate and then asked it some follow-up questions. While this executive advisor version of Watson doesn’t seem to be strongly conversational, it can answer direct questions with short, concise responses.

The demo that Simonite describes provides us with a preview of what might be on the horizon for intelligent assistants. Watson has the ability to very quickly consume and analyze large data stores and draw conclusions. This ability to rapidly read and grasp meaning would be a great trait to have in an enterprise assistant. If I know that I need to select the best material out of which to construct my product, for example, but I don’t have the ability to locate, read, and compare information on all viable materials, a Watson-like assistant would come in very handy.

It’s important to remember, though, that Watson relies completely on existing information that was developed by humans. Some of that data may be in a structured format and other content may be completely unstructured, but Watson isn’t making any of this knowledge up or inventing any new concepts. The Watson technology gives us quicker access to the content, and in a way that allows us to make better decisions based on having a broader view of the information and how it relates to other relevant data. It would be good if there were some way to give credit to the humans who created the content that Watson leverages. Ideally (or perhaps idealistically), it would be even better if the originators of the content could profit in a small way from the fact that Watson discovers their information and, in some sense, monetizes it. (This is an idea that Jaron Lanier espouses in his book Who Owns the Future?, which is about how to make the economics of a software-mediated world work.)

Hopefully, as IBM and others develop this type of smart analytical computing, they will ensure that we have a way to understand and verify the results these assistants bring back to us. A Jeopardy! answer is either right or wrong. But the types of answers an executive assistant Watson might provide should be the starting point for further human analysis, not the end of the discussion.

5-Year-Old Chats with ELIZA Chatbot. Fun Ensues.

What happens when a 5-year-old converses with an updated version of the ELIZA chatbot? The resulting conversations may be even more amusing than you imagine. Kieran Snyder, an IT professional who also happens to have a Ph.D. in linguistics, has been teaching her young daughter River about how computers work. She recently decided to help her daughter understand what it’s like to talk to a chatbot.

Snyder published the results of River’s first faltering dialogue with a new version of the ELIZA chatbot that Snyder coded up herself. The ensuing conversation is a classic example of both the highs and lows of trying to talk to an “artificial intelligence.” There’s a razor-thin edge separating a magical sense of human-to-machine understanding from the total frustration of conversing with a brick wall.
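For readers who have never peeked under the hood, ELIZA-style bots are little more than keyword rules plus pronoun reflection. The tiny sketch below (a generic illustration of the technique, not Snyder’s code) shows why the experience can flip so quickly from seemingly magical to brick wall:

```python
import random
import re

# A few ELIZA-style rules: a keyword pattern plus canned response templates.
RULES = [
    (r"\bi feel (.*)", ["Why do you feel {0}?", "How long have you felt {0}?"]),
    (r"\bi want (.*)", ["What would it mean to you to get {0}?"]),
    (r"\bbecause (.*)", ["Is that the real reason?"]),
]
FALLBACKS = ["Tell me more.", "Please go on.", "How does that make you feel?"]

# Reflect first-person words so the echoed fragment reads naturally.
REFLECTIONS = {"my": "your", "me": "you", "i": "you", "am": "are"}

def reflect(fragment):
    return " ".join(REFLECTIONS.get(word, word) for word in fragment.lower().split())

def respond(user_input):
    for pattern, templates in RULES:
        match = re.search(pattern, user_input.lower())
        if match:
            return random.choice(templates).format(reflect(match.group(1)))
    return random.choice(FALLBACKS)  # no rule matched: the "brick wall" moment

print(respond("I feel like my robot understands me"))
print(respond("What is a rainbow?"))  # falls through to a generic deflection
```

When a keyword rule fires, the echo feels eerily attentive; when nothing matches, the bot can only deflect, which is exactly the edge River kept running into.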

It’s a cute story. And it also shows that conversation is a fine art. Even humans need some time to learn how to practice the art well. Some people, if we’re honest, don’t ever learn to pass the brick wall test. I can tell that River’s going to be a great conversationalist though. I bet she’ll also be the kind of person who gets straight to the point!