Mobile Voice Conference to Focus on Intelligent Connection

The Mobile Voice Conference is just around the corner. If you haven’t registered yet, now’s the time. This is the fifth iteration of the annual conference and this year’s theme is The Intelligent Connection.

The theme has a two-fold meaning. On the one hand, mobile voice technologies connect us with the intelligence that is in the world around us, be it on the Web or in specific applications or things.

On the other hand, voice technologies allow us to leverage our inherent human intelligence to connect to apps and smart things. Our voices, and the meaning they convey through language, have become the new user interface.

The fifth annual Mobile Voice Conference takes place in San Jose, CA from April 20-21. It features two full days of sessions covering actionable information on implementing effective mobile voice strategies. You can view the full program online.

How the Apple Watch Might Benefit Siri

Dan Miller of Opus Research wrote a post earlier this month about how the Apple Watch is a perfect match for Siri. Miller points out that Siri has some apparent limitations on the Apple Watch, chief among them that she doesn’t speak back to the wearer, but only listens and carries out spoken commands.

Yet Miller views the Apple Watch as the perfect extension for speech-enabled use cases that the typical iPhone user is already accustomed to and comfortable with. iPhone users have come to depend on Siri’s reliability for controlling the clock, setting alarms, leaving reminders, or making calendar entries. Miller thinks these types of commands are closely integrated with watch functionality and that buyers of the Apple Watch will naturally use Siri to perform these operations.

Miller also points out that Siri’s range of capabilities is extensible. Siri is standing by to help execute a large number of apps and the inventory of apps with Siri integration will steadily increase.

In a post I wrote last September, I noted that the form factor of the Apple Watch might actually lead to a renewed interest in using Siri, even among the large number of iPhone users who rarely interact with Apple’s digital personal assistant now that the novelty has long since worn off. When I saw the demo of the Apple Watch following the #SpringForward reveal, those thoughts about the form factor resurfaced.

There are so many tiny app icons loaded onto the watch face that it’s difficult to tap on the app you really want to open. Wouldn’t it be a ton easier to just say “Hey Siri, open Uber” or “Hey Siri, open Facebook?” The constraints of the wearables form factor may provide a renewed raison d’être for voice interfaces in general, and intelligent assistants in particular.

I agree with Dan Miller that Apple Watch and Siri are a natural pair. It’ll be interesting to observe how this next generation of wearables impacts the intelligent personal assistant market and whether wearables force voice interfaces to the forefront.


The Year 2024, Intelligent Assistants, and the CITIA Roadmap

The Conversational Interaction Technology Innovation Alliance (CITIA) is a group of European organizations interested in the field of conversational speech technologies. On February 24-25, CITIA held its Roadmap conference in Brussels, Belgium. The purpose of the Roadmap conference is to accelerate research and innovation strategies for European public and private sector organizations in the area of conversational technologies.

I didn’t attend the conference, but I’ve taken advantage of the videos that have been posted on the CITIA Roadmap website.

One session that I found particularly interesting was that of Steve Young, Professor of Information Engineering at the University of Cambridge. Young spoke on the topic of “A Vision For Conversational Interaction Technologies” (this link takes you to the presentation page; look at the top of the “Conference Programme” section to see the video of Young’s session).

Young offers rationale for why speech-enabled interfaces are likely to become more important in the near future. The more apps, devices, and “things” that become connected to the Internet, the harder it is for the average user to understand how to interact with them all. No one wants to learn to navigate dozens of individual user interfaces. Instead, we want a single, unified conversational interface that we can talk to. The ideal interface would be an intelligent personal assistant.

The technologies enabling these assistants are rapidly evolving. Based on that pace of progress, Young postulates that 2024 will be the year of the fully capable, fully conversational virtual personal assistant.

Young provides a brief synopsis of the technologies that make the optimal virtual personal assistant work. He touches on the areas of:

  • Speech recognition and speech synthesis
  • Semantic decoding and dialogue management
  • Belief tracking and natural language generation
  • Statistical natural language processing
  • Sentiment and emotional analysis
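Belief tracking, to take one item from that list, means maintaining a probability distribution over what the user actually wants as the dialogue unfolds, rather than committing to a single guess after one noisy utterance. Here is a minimal sketch of that idea, my own illustration rather than anything from Young's talk; the candidate goals and likelihood numbers are invented:

```python
# Minimal belief-tracking sketch: keep a probability distribution over
# possible user goals and apply a Bayesian update as each (noisy) speech
# observation arrives. Goals and likelihoods are illustrative only.

def update_belief(belief, likelihoods):
    """One Bayesian update: posterior is proportional to prior x likelihood."""
    posterior = {goal: belief[goal] * likelihoods.get(goal, 1e-6)
                 for goal in belief}
    total = sum(posterior.values())
    return {goal: p / total for goal, p in posterior.items()}

# Uniform prior over three hypothetical user goals.
belief = {"book_flight": 1/3, "check_weather": 1/3, "set_alarm": 1/3}

# The recognizer scores how well each utterance matches each goal.
belief = update_belief(belief, {"book_flight": 0.7, "check_weather": 0.2,
                                "set_alarm": 0.1})
belief = update_belief(belief, {"book_flight": 0.8, "check_weather": 0.1,
                                "set_alarm": 0.1})

best = max(belief, key=belief.get)
print(best, round(belief[best], 3))  # prints: book_flight 0.949
```

Two consistent observations concentrate nearly all of the probability mass on one goal, which is exactly what lets a dialogue manager act confidently despite imperfect speech recognition.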

Young believes there is a huge market opportunity for intelligent assistants. He lists the companies that he sees as the “Goliaths” in the business and contrasts them to the current “Davids.”  Quite a few of the Davids have already been swallowed up by Goliaths in acquisitions. Can you guess who the Goliaths are? How about the Davids? Take a look at Young’s presentation to see if you guessed right.

I’m not sure exactly what the CITIA Roadmap holds, but 2024 is less than a decade away. Regardless of the outcome, the journey towards optimized conversational technologies is sure to be an interesting one over the coming years.


Can Data Centers Handle the Load of Intelligent Personal Assistants?

A flurry of press attention has surrounded the recent announcement of Sirius, an intelligent personal assistant built by a group from the University of Michigan.  Most of the articles focus on the fact that Sirius is based on open source software. The implication is that U-M has created a technology that could lead to a generation of open source competitors to Siri, Google Now, and Cortana.

The paper published by the U-M Engineering team that built Sirius, however, indicates that the open source intelligent assistant wasn’t designed to take on commercial competitors. Instead, the team needed a fully functioning intelligent assistant to experiment with the hardware server designs required to support these technologies. The U-M team’s starting hypothesis was that intelligent assistants are “approaching the computational limits of current data center architectures.”

Why do intelligent personal assistants put such a heavy burden on backend systems? It’s because of the type of data these assistants are sending from the mobile device to the backend server, and because of the way that the data has to be processed. When you speak to Siri or Google Now, compressed voice files have to be transferred to a server, where complex speech recognition functions are executed. If the request requires intensive data analysis to obtain a response, this can consume a high degree of computational power as well. And the end-to-end process of asking your question and receiving a correct response has to happen at lightning speed. In other words, the latency between question and answer needs to be kept to a minimum.
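Since end-to-end latency is the sum of every stage in that round trip, a useful way to picture it is as a timed pipeline. The sketch below is my own illustration of that structure, with trivial stand-in functions in place of a real codec, recognizer, and question-answering backend:

```python
# Sketch of the round trip described above: compress audio, ship it to a
# server, recognize speech, answer the question. The stage functions are
# stand-ins; the point is that latency accumulates across every stage.
import time

def timed(stage_fn, arg):
    """Run one pipeline stage and report its wall-clock cost in ms."""
    start = time.perf_counter()
    result = stage_fn(arg)
    return result, (time.perf_counter() - start) * 1000

def compress_audio(pcm):          # stand-in for a real audio codec
    return bytes(pcm[::2])

def recognize_speech(audio):      # stand-in for server-side ASR
    return "what is the weather"

def answer_question(text):        # stand-in for the QA backend
    return "sunny"

pcm = bytes(range(256)) * 100     # fake raw audio
result, total_ms = pcm, 0.0
for stage in (compress_audio, recognize_speech, answer_question):
    result, ms = timed(stage, result)
    total_ms += ms

print(result, f"({total_ms:.1f} ms end-to-end)")
```

In a real deployment the recognition and question-answering stages dominate, which is why the U-M team focused its hardware experiments there.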

To test out the best possible server architectures, the U-M team built Sirius with the following core capabilities:

  • speech recognition
  • question-answering
  • image matching

As noted in all the other articles about Sirius, the team leveraged open source projects for the foundation of their assistant. Refer to the team’s paper for a complete listing of all the open source components used in Sirius.

The team’s experiments showed that certain server architectures yield significant latency improvements for intelligent assistant queries. The team found that the best architecture to support these queries is one based on field-programmable gate arrays (FPGAs), a type of specialized, reconfigurable integrated circuit. FPGAs are a high-cost solution, however. If a somewhat lower-cost option is required, the study found that a design based on graphics processing units (GPUs) is also well suited to intelligent assistant query processing.

For me at least, another takeaway from the U-M experiment is that operating a commercial grade intelligent personal assistant isn’t for the faint of heart. Providing the low latency service needed to keep the user happy requires a large data center investment. Having an operational intelligent personal assistant built on open source software is only the start. You need all the backend compute power to make it work.


Aivo Brings AgentBot Virtual Assistant to the U.S. Market

I recently had an opportunity to learn more about Aivo, the makers of the AgentBot virtual assistant technology. Aivo is based in Argentina and has a broad customer base in Latin America. The company is expanding into North America, having just opened an office in New York City and planning an office in San Francisco.

I spoke with Martin Frascaroli, Founder and CEO of Aivo. Frascaroli answered my questions about the company and how their technology works. He also shared his thoughts on the vision of the company and the value their product offers to clients and their customers.

Aivo’s AgentBot operates atop Aivo’s own natural language processing technology. AgentBot can understand customer questions and intent and deliver the most appropriate response based on the information available in its knowledge repository. But Aivo doesn’t strive to make AgentBot an exceptional conversationalist or a cognitive genius. According to Frascaroli, Aivo is all about providing an engaging and satisfying self-service experience for the customer.

For the client, AgentBot is designed for ease of both the initial setup and continued operation. In fact, Frascaroli made a point of stating that Aivo doesn’t want the client to have to depend on support services. The AgentBot intelligent assistant technology is straightforward to implement, and there are monitoring and analytics tools that help the client continuously improve the system.

I asked Frascaroli to describe the typical client implementation. After a kickoff phase where the goals and scope of the project are set, the client can use AgentBot’s tools to begin creating their question and answer database. AgentBot offers interfaces to other critical customer support tools, such as live chat and ticketing systems. Once AgentBot is up and running, the client can leverage analytical insights to identify missing answers and make the agent smarter.
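The "identify missing answers" step is essentially gap analysis over the agent's interaction log. A sketch of what that could look like, using a log format I've invented for illustration (Aivo's actual tooling and data model are not public):

```python
# Sketch of knowledge-base gap analysis: scan the agent's interaction log
# for questions it couldn't answer and surface the most frequent ones as
# candidates for new Q&A entries. The log format here is invented.
from collections import Counter

log = [
    {"question": "how do i reset my password", "answered": True},
    {"question": "where is my invoice",        "answered": False},
    {"question": "where is my invoice",        "answered": False},
    {"question": "can i change my plan",       "answered": False},
    {"question": "how do i reset my password", "answered": True},
]

misses = Counter(entry["question"] for entry in log
                 if not entry["answered"])

# Most frequent unanswered questions first.
for question, count in misses.most_common():
    print(count, question)
```

Ranking misses by frequency tells the client exactly which new answer will help the most customers next, which is the feedback loop that "makes the agent smarter."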

Though most of Aivo’s customers are large companies, Frascaroli says they have pricing models that can fit the more limited budgets of smaller companies. Aivo is enjoying considerable success, but Frascaroli remembers what it’s like to be an entrepreneur with ambitions bigger than the corporate balance sheet, and he’s open to finding ways to help growing companies use AgentBot.

I tried out some of the live instances of AgentBot on a few of Aivo’s client websites. AgentBot comes in many different forms and personalities. The common features of AgentBot include a simple static or animated character and a crisp, effective user interface. I was struck by the versatility of the user interface, which seeks to assist the customer by presenting all kinds of information in a variety of different forms.

Several of the user-facing AgentBot screens I saw were divided into two distinct panels. The left-hand panel contains the agent’s dialogue box, where users communicate with AgentBot by typing questions into a text box. Answers are displayed in this same panel and can contain links to website pages. Each AgentBot answer offers a way for the user to rate whether or not it was helpful.

The right-side panel typically starts out by displaying frequently asked questions with links to the answers. But this right-hand panel can transform itself into a versatile self-service platform, offering the customer all types of information. I saw an interesting example of this variety on the Galicia Bank website. The AgentBot for Galicia Bank answers lots of questions about banking, bank accounts, and credit cards using plain text. But when I asked a specific question about opening a credit card account, AgentBot popped up a short, animated video in the right-hand panel which walked me through the basics of credit cards.

Luigi, the AgentBot on the Fiat Argentina site, knows everything about Fiat models, prices, warranties, the history of Fiat, and more. When I asked Luigi about specific Fiat models, it used the right-hand panel to launch a very handy graphical comparison of the Fiat line-up. I could go straight to the panel to easily compare any two Fiat car models and get all the specifications on each one.
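The two-panel behavior above suggests that each answer pairs a text reply with an optional right-panel payload (a video, a comparison tool, and so on). The schema below is my guess at such a structure, for illustration only; it is not Aivo's actual API:

```python
# Hypothetical answer schema for a two-panel agent UI: a chat-text reply
# plus an optional rich payload for the side panel. All names are invented.
from dataclasses import dataclass
from typing import Optional

@dataclass
class PanelPayload:
    kind: str        # e.g. "video", "comparison", "faq"
    content: str     # URL or rendered fragment for the panel

@dataclass
class AgentAnswer:
    text: str
    panel: Optional[PanelPayload] = None

def render(answer):
    """Render the chat line, plus the panel directive when present."""
    lines = [f"chat> {answer.text}"]
    if answer.panel:
        lines.append(f"panel[{answer.panel.kind}]> {answer.panel.content}")
    return "\n".join(lines)

reply = AgentAnswer(
    text="Here are the basics of opening a credit card account.",
    panel=PanelPayload(kind="video", content="https://example.com/cards.mp4"),
)
print(render(reply))
```

Keeping the panel payload optional lets plain-text answers and rich answers flow through the same pipeline, which matches the mix of behaviors I saw on the Galicia Bank and Fiat sites.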

Aivo’s AgentBot offers an easy-to-use and compelling self-service experience to customers in Latin America. After speaking with Frascaroli, it’s clear how committed the Aivo team is to expanding AgentBot’s reach into North America, where it can continue to help customers find solutions on their own terms.

H&R Block Impresses with Intelligent Scheduling Assistant

I’ve used H&R Block to prepare my tax returns for the past several years. This year I accidentally ran across their automated appointment scheduling IVR system. I found the system to be remarkably good. In fact, the whole experience is more akin to interacting with an intelligent assistant than with a traditional IVR phone tree.

Here’s how it worked. Towards the end of February, I realized it was time to make an appointment with my regular tax preparer at the local H&R Block office. I called the number and a pleasant, automated voice said something along the lines of: “It looks like you’re an existing customer and you already have an appointment. Enter the four digits of the year in which you were born.” I entered the digits and the voice confirmed that, to my surprise, I did indeed have an appointment.

The automated assistant told me the details of my existing appointment and instructed me to press 1 to change it or 2 to cancel it. But it was the perfect day and time. In fact, creepily enough, the appointment was on the exact date and time that I’d been planning to ask for. Apparently I’d made the appointment a year earlier, when I was in the office having my 2013 tax return done.

When I hung up the phone I was so amazed by the experience, I had to tell a colleague about it. Okay, the fact that the appointment was on the exact date and time I wanted was just a coincidence (or good guessing on my part a year earlier), but it sure made the intelligent assistant seem smart!

Fast forward to this week. I was supposed to go to my appointment yesterday evening. However, we had some more bad weather in the area and the local roads were impassable. The H&R Block office was closed. I made a mental note to call and reschedule my appointment. I dreaded the hassle.

Today I received a voicemail. Guess who it was from? Yep, the H&R Block intelligent scheduling assistant. I dialed the number the assistant had provided and it quickly walked me through setting up a new appointment with my regular preparer. It gave me a few options and let me pick the one I liked best. Then it repeated the information for me so that I could enter it into my calendar.
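Under the hood, a touch-tone flow like this is a small state machine: offer options, validate the keypress, confirm, and read the result back. The sketch below is my own reconstruction of the general pattern; the slots and prompts are invented, since H&R Block's actual system isn't documented publicly:

```python
# Toy DTMF scheduling flow as a state machine: offer slots, validate the
# caller's keypress, then confirm. Slots and prompts are invented.

SLOTS = {"1": "Tue 10:00 AM", "2": "Wed 2:30 PM", "3": "Fri 9:15 AM"}

def handle_keypress(state, digit):
    """Advance the dialogue one step; returns (new_state, spoken_prompt)."""
    if state == "offer":
        if digit in SLOTS:
            return "confirm", f"You chose {SLOTS[digit]}. Press 1 to confirm."
        return "offer", "Sorry, please press 1, 2, or 3."
    if state == "confirm" and digit == "1":
        return "done", "Your appointment is booked. Goodbye."
    return state, "Sorry, please try again."

state = "offer"
for digit in ("9", "2", "1"):   # caller mis-keys, then picks slot 2, confirms
    state, prompt = handle_keypress(state, digit)
    print(prompt)
```

Notice that invalid input simply re-prompts without losing the caller's place, which is a big part of why a well-built flow like this feels effortless rather than like a phone-tree maze.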

The whole transaction took place without me ever speaking to a human call agent. In fact, since I was an existing customer, I never spoke at all. That aside, the H&R Block scheduling assistant reminded me of the Hyatt Hotels automated reservation system that’s based on technology from Interactions. Both systems work seamlessly, and even though you’re intuitively aware that you’re not dealing with real people, you hardly notice.

I tried to find some information about the technology that powers H&R Block’s system. I located a brief description of an “H&R Block intelligent virtual agent” on the site, but it seems to describe a different intelligent assistant than the one I interacted with.

The whole experience with the H&R Block intelligent scheduling system made me realize that self-service is becoming ever more integrated into our daily lives. Sometimes we don’t even notice it. That, in fact, is exactly how it’s supposed to work.


Geppetto Avatars Offers Virtual Health Care Assistants

I recently ran across Geppetto Avatars, a company that creates intelligent virtual characters. The avatars have many potential use cases, but currently the company seems to be focusing on the health care space.

Sophie is a compelling 3D virtual character that interacts with patients in the role of physician’s assistant or medical advisor. The underlying technology for Sophie includes speech recognition and natural language processing that enables Sophie to understand what patients are saying. In the video demos on the Geppetto Avatars website, the virtual assistant appears on a web screen and engages in a realistic conversation with the patient.

Sophie has access to the patient’s medical records and asks prompting questions to ascertain the patient’s current condition. Based on the patient’s responses, Sophie can ask additional questions to understand details about improvements or problems with the patient’s medical situation. Sophie’s user interface includes a method for patients to take photos of problem areas, such as swollen hands or rashes, and submit them for physician examination.

Sophie can review the patient’s medications with them and ask if they are still providing the desired improvement. She can inquire about whether the patient is doing any physical therapy exercises that may have been assigned.

Sophie can also pick up on visual and intonation cues to assess the patient’s mood. If the patient is upset or unhappy, Sophie senses this and adjusts the conversation accordingly. In addition, she can issue questionnaires to the patient to gather more data about their current condition.
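A system like this presumably fuses several cue channels (word choice, vocal intonation, facial expression) into one mood estimate that steers the conversation. Here is a toy sketch of that fusion step, my own illustration rather than Geppetto's method; the channels, weights, and threshold are all invented:

```python
# Toy mood-fusion sketch: combine per-channel scores (word choice, vocal
# intonation, facial expression) into one weighted mood score that picks
# the conversational tone. All numbers below are invented.

WEIGHTS = {"words": 0.5, "intonation": 0.3, "expression": 0.2}

def mood_score(cues):
    """Weighted average of channel scores in [-1, 1]; negative = upset."""
    return sum(WEIGHTS[channel] * score for channel, score in cues.items())

def choose_tone(cues, threshold=-0.2):
    """Switch to a reassuring tone when the fused mood dips low enough."""
    return "reassuring" if mood_score(cues) < threshold else "neutral"

upset = {"words": -0.6, "intonation": -0.4, "expression": -0.1}
calm  = {"words": 0.2,  "intonation": 0.1,  "expression": 0.0}

print(choose_tone(upset), choose_tone(calm))
```

Weighting the channels separately also degrades gracefully: if the camera feed is unavailable, the words and intonation scores still produce a usable estimate.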

If Sophie works as well in real life as she does in the demo videos on the Geppetto Avatars website, the technology could be of huge benefit to health care practitioners and their patients. Typical physicians have so many patients in their practice that it’s hard for them to spend adequate time with each patient. Having a virtual assistant like Sophie would offer patients a way to get the attention they need, when they need it, while ensuring that the physician stays up to date on their condition.

Why is the company called Geppetto Avatars? I don’t know for sure, but it’s interesting that Geppetto was the name of the woodcarver in Pinocchio who made the wooden puppet boy that came to life. These intelligent avatars aren’t flesh and blood, but they can support humans by acting as a reliable proxy to their health care provider.