IBM’s Watson is not only smart enough to win at Jeopardy, it’s got what it takes to be a valuable expert resource for physicians responsible for making complex healthcare decisions. According to a recent article in Robotics Trends entitled The Robot Will See You Now, the IBM Watson team has demonstrated how Watson can apply the same skills it employed in its Jeopardy win to aid doctors in diagnosing illnesses and recommending treatments. These breakthroughs in the area of cognitive computing are providing humans with a new partner for helping with complex knowledge processing tasks.
Watson has enormous processing power. According to sources, Watson is capable of processing the equivalent of a million books per second. Combine Watson’s processing horsepower with its language-processing and question-answering skills, and it can search through data and make associations far more quickly than a human can. According to the Robotics Trends article, Memorial Sloan-Kettering, a top institution in the area of cancer research and treatment, is working with IBM to teach Watson the latest information from medical research, symptom analysis, and treatment options. Medical documentation from hundreds of sources is growing exponentially. All of the data is valuable, but a human physician has no way of keeping up with the deluge of information.
That’s where Watson comes in. With its tireless ability to ingest unstructured data and understand relationships and trends within the information, Watson can sort through whole libraries in seconds and recommend possible matches between symptoms, diagnoses, and treatments. Watson also has the ability to learn, which means that its understanding of medical literature and real-life medical cases is constantly expanding. And though none of the articles discuss this point, Watson is not subject to unconscious human bias and anchoring, which can cause physicians to make a diagnosis based on only one or two symptoms while ignoring other possibly important inputs.
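The matching step described above can be sketched, in a drastically simplified form, as an overlap score between a patient’s reported symptoms and a knowledge base of known conditions. The condition names and symptom lists below are invented for illustration; Watson’s actual DeepQA pipeline is vastly more sophisticated.

```python
# Hypothetical knowledge base: condition -> set of associated symptoms.
KNOWLEDGE_BASE = {
    "influenza": {"fever", "cough", "fatigue", "body aches"},
    "common cold": {"cough", "sneezing", "sore throat"},
    "strep throat": {"fever", "sore throat", "swollen glands"},
}

def rank_diagnoses(symptoms):
    """Rank conditions by the fraction of their known symptoms observed."""
    observed = set(symptoms)
    scored = [
        (len(observed & signs) / len(signs), condition)
        for condition, signs in KNOWLEDGE_BASE.items()
    ]
    # Highest overlap first; ties broken alphabetically for determinism.
    return [c for score, c in sorted(scored, key=lambda x: (-x[0], x[1])) if score > 0]

print(rank_diagnoses(["fever", "sore throat"]))
# -> ['strep throat', 'common cold', 'influenza']
```

The point of the sketch is simply that a machine can score every candidate in its knowledge base against every input, something no human reader of the medical literature can do at scale.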
An InformationWeek article provides a more in-depth look at how Memorial Sloan-Kettering is working with Watson.
In an upcoming post, I’ll write about the technology behind Watson that’s an offshoot of IBM’s DeepQA and the Open Advancement of Question Answering (OAQA) Initiative.
In my last few posts, I’ve delved into each of the five categories of key virtual assistant components that were outlined in the VisionMobile report Beyond Siri: Breaking Down the Virtual Assistant Market. This post takes a look at the physical presentation of the virtual agent. While many of the current mobile personal assistants do not display avatars, the majority of virtual assistants on corporate websites or online retail sites do have a physical presence. Avatar technology is still immature. Even the best avatars tend to look cartoonish and waxen.
There’s little doubt, though, that companies will continue to work on improving avatar technologies. Virtual agents of the future will more closely mimic real humans until one day, it will be difficult to distinguish between agent and human.
Amy L. Baylor, a Professor at Florida State University, has researched and written about using avatars in conjunction with virtual agents. Her work has found that the virtual presence and appearance of avatars can have a dramatic impact on how a user relates to virtual agents, or other conversational digital assistants. In her paper Promoting motivation with virtual agents and avatars: role of visual presence and appearance, Baylor describes research showing that users relate better to agents that appear like people within their own peer group. Younger users feel comfortable, and even willing to take advice, from virtual agent avatars that appear to be “cool” members of their own clique. Physical age, hair style, and attire are all important factors. These findings may influence how enterprise and customer support agents are designed.
In a recent article on PSFK.com, author Emma Hutchings describes work being performed at Toshiba’s Cambridge Research Lab on avatar technology. This is an example of important advances being made in generating more lifelike avatars that users can connect with. The Research Lab devised a collaboration between its speech and vision groups to create a digital avatar that can rotate its head, blink, and move its lips while displaying a repertoire of realistic emotions. In order to develop their avatar model, the team recorded the facial expressions of an actress and then mapped them to basic emotions such as anger, happiness, and sadness on a digital 3D head. See the interesting results for yourself in the talking head video.
Avatars are here and they’re the user interface of the future.
In my last post, I listed the five key components of a mobile virtual assistant, as outlined in the VisionMobile report Beyond Siri: The Next Frontier in User Interfaces. I wrote about Speech Recognition and Natural Language Processing. The next component of intelligent virtual assistants that I’d like to explore is User Profiling. User Profiling entails learning as much information about a user’s preferences, environment, and general context of usage as possible. The virtual agent then leverages this information to shape how it responds to the user and to his or her requests. Another term that is often used to describe similar behavior is personalization.
One example of user profiling, or personalization, that readily comes to mind are the product recommendations that online retailers offer frequent shoppers. The retailers track your previous purchases and gradually learn about your preferences. If you’re a regular purchaser of mystery novels, the recommendation engine is likely to alert you to the hottest mystery titles the next time you visit the site. This is certainly a more valuable service to you than if the site were to repeatedly annoy you with offers for romance novels, ignoring the fact that you’ve never bought a romance title.
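That kind of preference tracking can be sketched in a few lines. The catalog, genres, and titles below are hypothetical; a real recommendation engine would use collaborative filtering or similar statistical techniques rather than a simple purchase count.

```python
from collections import Counter

class UserProfile:
    """Toy user-profiling sketch: track purchases by genre and
    recommend titles from the user's most frequently bought genre."""

    def __init__(self):
        self.purchases = Counter()

    def record_purchase(self, genre):
        self.purchases[genre] += 1

    def recommend(self, catalog):
        # catalog maps genre -> list of current titles
        if not self.purchases:
            return []
        top_genre = self.purchases.most_common(1)[0][0]
        return catalog.get(top_genre, [])

profile = UserProfile()
for genre in ["mystery", "mystery", "sci-fi"]:
    profile.record_purchase(genre)

catalog = {"mystery": ["The Hollow Case"], "romance": ["Summer Hearts"]}
print(profile.recommend(catalog))  # -> ['The Hollow Case']
```

Even this minimal profile is enough to suppress the irrelevant romance offers described above: the agent only surfaces titles from genres the user has actually bought.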
Virtual assistants of the future will most likely use similar information gathering strategies in order to improve the services they offer. Your personal mobile assistant might take note of the route you take to work every day so that it can alert you to a traffic problem. A truly helpful virtual assistant may be able to order food or medicine automatically based on your preferences and usage. Customer-facing virtual agents will most likely try to determine a user’s location, past buying habits, and other demographic information in order to appropriately tailor their responses and recommendations.
In a recent white paper called Digital Personal Assistant for the Enterprise, IT@Intel describes a current in-house Intel project to develop an intelligent virtual agent that can assist Intel workers in more effectively accomplishing work-related tasks. Over time, the digital assistant will be designed to support various usage models, including acting as an executive admin and a collaborative assistant. The first release of the Intel enterprise virtual assistant takes into consideration the user’s location to offer a visual map with walking directions when the user is searching for a meeting room in an unfamiliar building. Future iterations of the digital agent will integrate more user profile information to provide a higher level of service. For example, the virtual agent is supposed to become so knowledgeable of its user’s behavior and moods that it will be able to automatically detect when the employee has entered “deep problem-solving mode” and then restrict all non-critical disturbances.
User profiling on the part of future virtual digital agents may give us concerns about loss of privacy, but the potential benefits appear to be huge.
VisionMobile published a very informative report last June on the history of Virtual Assistant technology called Beyond Siri: The Next Frontier in User Interfaces. One section of the report dissects the five core components of a mobile virtual agent. These components are relevant to web-based customer support virtual assistants as well. The five virtual assistant components are:
- Speech Recognition
- Natural Language Processing
- User Profiling
- Search & Recommendations
- Avatar Visualization
For this post, I’m going to start by looking more closely at the first two components. What’s the difference between Speech Recognition and Natural Language Processing? Based on the definitions provided in the VisionMobile paper, Speech Recognition is the ability of the software agent to understand spoken sounds and transcribe these into written language. When you speak the words “What’s the current weather in Miami?” the software has to be able to recognize these sounds as words.
Julius is an example of open-source speech recognition decoder software. This technology is so central to the success of mobile virtual assistants, or any virtual agents that rely on spoken language as their user interface, that a provider’s speech recognition software can be a major differentiator in the virtual agent marketplace.
Natural Language Processing describes the software agent’s ability to parse spoken words, put them into context, and extrapolate their meaning. To do this successfully, the agent needs to have some understanding of the speaker and the speaker’s environment and situation. The virtual agent also uses Natural Language Processing to formulate responses to the speaker’s questions. The goal of Natural Language Processing is to enable life-like back and forth dialogue. Current technology still seems to be a long way from achieving this goal, but vendors are working hard to improve the conversational ability of their virtual agent software.
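As a very simplified sketch of this parsing step, consider a rule-based intent extractor applied to the transcribed text from the weather example earlier. The intent names and patterns below are hypothetical; production systems use trained statistical models rather than hand-written rules.

```python
import re

# Hypothetical intents and patterns: after speech recognition has produced
# text, the NLP step extracts an intent and its parameters ("slots").
INTENT_PATTERNS = {
    "get_weather": re.compile(r"weather in (?P<city>[\w\s]+)", re.IGNORECASE),
    "set_alarm": re.compile(r"alarm for (?P<time>[\w:]+)", re.IGNORECASE),
}

def parse_utterance(text):
    """Return the first matching intent plus any captured parameters."""
    for intent, pattern in INTENT_PATTERNS.items():
        match = pattern.search(text)
        if match:
            return {"intent": intent, **match.groupdict()}
    return {"intent": "unknown"}

print(parse_utterance("What's the current weather in Miami?"))
# -> {'intent': 'get_weather', 'city': 'Miami'}
```

Note how the two components divide the labor: Speech Recognition turns sounds into the string, and Natural Language Processing turns the string into an actionable structure the agent can respond to.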
OpenNLP is a machine learning based toolkit for processing natural language text. Check out the OpenNLP website for more information on the software’s capabilities.
Whether you’re interested in purchasing a virtual software agent to support your business and customers or in building a web-based or mobile virtual assistant, you’ll want to understand as much as you can about both Speech Recognition and Natural Language Processing.
Nuance Communications recently announced some cool new features on their Dragon mobile assistant app for Android.
The first new capability for the Dragon mobile assistant is a location sharing and friend finder feature. If you want to let a friend know where you are, you can ask Dragon to send that friend a text that links to your current location on a map. Likewise, if you’re trying to locate someone but you’re just not finding them, Dragon can show you a map with their position on it (assuming they give their permission).
The second new Dragon feature, which really resonates with me, is the mobile assistant’s ability to dial any telephone number that’s in a calendar appointment. So if you have a conference call scheduled for 1pm, Dragon will ask you just before the meeting if you’d like it to call the number for you. Now that’s service!
End-to-end, hands-free text messaging is the last new item the press release mentions. Not only can Dragon transcribe and send text messages you dictate, but it can read aloud incoming text messages and respond to them with your dictated reply.
The race is on for a digital virtual assistant that can understand spoken language and provide users with exactly the information they need, when and how they need it. Oh, and the digital agent should be able to carry on a meaningful conversation as well. In a recent PCWorld article covering South by Southwest Interactive in Austin, Amit Singhal of Google is quoted as positioning the understanding of speech, natural language, and conversation as some of the key challenges facing the search giant today.
People’s need to search for information has become integral to our interactions with the web. When we are online, we are nearly always searching for something, be it news of the world, updates from friends, information on specific goods and services, or what’s playing at the local cinema or on TV. The list goes on. Google imagines a near-term future where smart virtual agents are embedded in our world, in wearable devices such as Google Glass, and where these agents can quickly act on our behalf to get us the information we need. Our virtual assistants should be able to understand what we’re saying when we speak to them. And they should be able to answer us in a way that’s not only informative, but natural.
Competing with Google in the arena of intelligent voice-activated agents, albeit indirectly, are companies that provide voice recognition systems for use across multiple problem spaces. In my next post, I’ll take a look at an article that examines some of the companies in this arena and that highlights how today’s most successful virtual agents are deployed. In the meantime, I hope you enjoy the referenced article with information from Google’s Amit Singhal.
A recently published survey sponsored by Nuance Communications revealed that a majority of physicians believe virtual assistant technologies will lead to positive changes in the healthcare industry within the next five years. Many doctors think that virtual agents capable of speech recognition will improve the way they access data and update medical records. The survey also shows that doctors believe mobile virtual assistants can help patients monitor their health and even modify their behaviors to support them in reaching their health goals.
This infographic on Healthcare IT News shows how survey respondents predict the influence of virtual assistant technologies on healthcare of the future.
Note that Nuance Communications is the maker of Dragon desktop dictation solutions. It will be interesting to watch for new technologies from Nuance, and other virtual agent providers, to see how intelligent virtual agents can deliver on the promise of improvements for healthcare providers and the patients they serve.