Imagine this. You crawl into bed at night and drift off into a deep sleep. While you’re floating through dreamland, your ever-wakeful personal virtual agent is sifting through hundreds of Twitter feeds to find out which books people are talking about. Not only is the agent smart enough to know when a tweet mentions a book title, it can also surmise from the tweet’s context whether the comment about the book is positive or negative.
You wake up in the morning, shuffle into the kitchen, and find a fresh pot of coffee already made.
“What’s new today?” you ask your personal virtual agent.
“It’s a lovely day,” it responds. “You don’t have any meetings until noon. And I found three books that I think you’d really be interested in reading. Would you like to see them?”
It might not be long before this scenario becomes reality. Parakweet is a new company that has launched several Twitter information-mining products. One of them, called Bookvi.be, uses natural language processing technology to parse and comprehend tweets about books. The same technology is also used to gather movie recommendations. The technical challenges Parakweet has overcome aren’t trivial. Lots of people tweet about books, but how can an NLP agent reliably detect whether what they’re saying about a book is positive or negative? Parakweet CEO Ramesh Haridas says that their technology has tackled this challenge.
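To make the challenge concrete, here’s a minimal polarity check in Python. This is only a toy sketch of the sentiment-detection idea; Parakweet’s actual models aren’t public, and the word lists and scoring rule below are invented for illustration:

```python
# Toy illustration of the kind of polarity check an NLP agent might run
# on a tweet that mentions a book. The word lists and the simple scoring
# rule are invented for illustration; real systems use trained models.

POSITIVE = {"loved", "great", "brilliant", "recommend", "fantastic"}
NEGATIVE = {"boring", "awful", "hated", "disappointing", "dull"}

def tweet_polarity(tweet: str) -> str:
    """Classify a tweet as 'positive', 'negative', or 'neutral'."""
    words = {w.strip(".,!?").lower() for w in tweet.split()}
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(tweet_polarity("Just finished The Martian -- loved it, highly recommend!"))
print(tweet_polarity("Gave up on this novel, utterly boring."))
```

A real system has to handle negation (“not boring at all”), sarcasm, and slang, which is exactly why this problem is harder than a word-list lookup suggests.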
Users can sign up for a Bookvi.be account and have book recommendations sent to them. It remains to be seen how long it’ll be before the technology is integrated into a coffee-making intelligent personal agent!
VisionMobile published a very informative report last June on the history of virtual assistant technology, *Beyond Siri: The Next Frontier in User Interfaces*. One section of the report dissects the five core components of a mobile virtual agent. These components are relevant to web-based customer support virtual assistants as well. The five virtual assistant components are:
- Speech Recognition
- Natural Language Processing
- User Profiling
- Search & Recommendations
- Avatar Visualization
For this post, I’m going to start by looking more closely at the first two components. What’s the difference between Speech Recognition and Natural Language Processing? Based on the definitions in the VisionMobile paper, Speech Recognition is the ability of a software agent to understand spoken sounds and transcribe them into written language. When you speak the words “What’s the current weather in Miami?” the software has to be able to recognize those sounds as words.
Julius is an example of open source speech recognition decoder software. This technology is so central to the success of mobile virtual assistants, or any virtual agent that relies on spoken language as its user interface, that a provider’s speech recognition software can be a major differentiator in the virtual agent marketplace.
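To get a feel for what a “decoder” does, here’s a toy Python sketch that maps a sequence of sound units (phonemes, assumed already recognized) back to written words with a greedy dictionary lookup. Real decoders like Julius score many hypotheses against acoustic models and statistical language models; this tiny lexicon and greedy matching are invented purely to make the concept concrete:

```python
# Toy sketch of the "decoding" idea at the heart of speech recognition:
# mapping a stream of sound units back to written words via a
# pronunciation dictionary. The lexicon below is invented for this
# example; real decoders use full acoustic and language models.

LEXICON = {
    ("W", "AH", "T", "S"): "what's",
    ("DH", "AH"): "the",
    ("W", "EH", "DH", "ER"): "weather",
    ("IH", "N"): "in",
    ("M", "AY", "AE", "M", "IY"): "miami",
}

def decode(phonemes):
    """Greedily match the longest known pronunciation at each position."""
    words, i = [], 0
    while i < len(phonemes):
        for length in range(len(phonemes) - i, 0, -1):
            chunk = tuple(phonemes[i:i + length])
            if chunk in LEXICON:
                words.append(LEXICON[chunk])
                i += length
                break
        else:
            i += 1  # skip an unrecognized phoneme
    return " ".join(words)

print(decode(["W", "AH", "T", "S", "DH", "AH", "W", "EH", "DH", "ER",
              "IH", "N", "M", "AY", "AE", "M", "IY"]))
```

The hard part a real decoder solves, and this sketch ignores, is that the incoming sounds are noisy and ambiguous, so the decoder must weigh competing word sequences rather than take the first match.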
Natural Language Processing describes the software agent’s ability to parse spoken words, put them into context, and extrapolate their meaning. To do this successfully, the agent needs some understanding of the speaker and of the speaker’s environment and situation. The virtual agent also uses Natural Language Processing to formulate responses to the speaker’s questions. The goal of Natural Language Processing is to enable lifelike back-and-forth dialogue. Current technology still seems to be a long way from achieving this goal, but vendors are working hard to improve the conversational ability of their virtual agent software.
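To illustrate the parsing step, here’s a toy Python sketch that turns the transcribed weather question into a structured “intent” the agent could act on. Real NLP engines use statistical parsers and context models rather than hand-written patterns; the single regex and the intent name here are invented for illustration:

```python
import re

# Toy sketch of the NLP step: turning a transcribed utterance into a
# structured intent the agent can act on. The pattern and intent name
# are invented for this example.

WEATHER_PATTERN = re.compile(
    r"what(?:'s| is) the (?:current )?weather in (?P<city>[\w\s]+)\??",
    re.IGNORECASE,
)

def parse_utterance(text: str):
    """Return a structured intent dict, or None if nothing matched."""
    match = WEATHER_PATTERN.search(text)
    if match:
        return {"intent": "get_weather", "city": match.group("city").strip(" ?")}
    return None

print(parse_utterance("What's the current weather in Miami?"))
```

The gap between this sketch and lifelike dialogue is exactly the hard part: a real agent must handle paraphrases, follow-up questions (“what about tomorrow?”), and context carried over from earlier turns.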
OpenNLP is a machine learning based toolkit for processing natural language text. Check out the OpenNLP website for more information on the software’s capabilities.
Whether you’re interested in purchasing a virtual software agent to support your business and customers or in building a web-based or mobile virtual assistant, you’ll want to understand as much as you can about both Speech Recognition and Natural Language Processing.