I first wrote about Expect Labs over a year and a half ago, when their MindMeld speech recognition and natural language processing technology drove an innovative social listening app tied into Facebook. Since that time, Expect Labs has pivoted their offering into a voice-focused Software as a Service.
MindMeld now provides app owners and developers with what Expect Labs calls an intelligent voice interface. MindMeld uses semantic mapping technology to create a knowledge graph of the application’s content. It then uses the graph to improve the accuracy of its NLP engine in understanding precisely what users are asking when they talk to a voice-enabled interface.
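To make the idea concrete, here is a minimal sketch of how knowledge about an app's content could help pick between competing transcriptions of the same utterance. It is a toy illustration only, not MindMeld's actual algorithm or API: the `CONTENT_GRAPH` data, the `known_terms` and `rerank` functions, and the word-overlap scoring are all assumptions made for the example.

```python
# Toy illustration only: not MindMeld's actual algorithm or API.
# Hypothetical content graph: each entity maps to related entities
# drawn from the application's own catalog.
CONTENT_GRAPH = {
    "flowers": {"roses", "tulips", "delivery"},
    "roses": {"flowers", "red", "bouquet"},
    "plane ticket": {"flight", "airline", "boston"},
}


def known_terms(graph):
    """Collect every word that appears anywhere in the graph."""
    terms = set()
    for entity, neighbors in graph.items():
        terms.update(entity.split())
        for neighbor in neighbors:
            terms.update(neighbor.split())
    return terms


def rerank(candidates, graph):
    """Order candidate transcripts by how many words match app content."""
    vocab = known_terms(graph)

    def score(text):
        return sum(1 for word in text.lower().split() if word in vocab)

    return sorted(candidates, key=score, reverse=True)


# Two plausible outputs from a speech recognizer for the same audio.
candidates = ["order red noses", "order red roses"]
best = rerank(candidates, CONTENT_GRAPH)[0]
print(best)
```

Because "roses" appears in the app's content and "noses" does not, the second candidate scores higher. A production system would presumably use far richer signals than word overlap.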
A recent article in Macworld reported on Fetch’s use of the MindMeld service to power their mobile concierge app. Fetch makes it easier for you to buy the things you need by easily connecting you to specialists and personal shoppers. Users can tell Fetch what they need, from a plane ticket to an order of flowers and chocolate for a significant other, and the app will set the wheels in motion to have a specialist fill the request quickly and painlessly.
Now that Fetch has partnered with MindMeld, they’ve been able to create a voice-enabled app that’s optimized for the Apple Watch. Fetch users can use voice commands for on-demand concierge services right from the Watch.
The Macworld article cites Expect Labs data showing that people spend 60% of their online time on mobile devices. In contrast, only 10% of purchases are made from mobile devices. In the article, Tim Tuttle, CEO and founder of Expect Labs, suggests that this discrepancy could be explained by how complicated it currently is to complete a purchase on a mobile device.
Intelligent voice-enabled interfaces, like those made possible by MindMeld, are aiming to simplify our interactions with mobile devices. If Fetch’s MindMeld-powered Apple Watch app is any indication, voice interfaces will transform our wearables and smart phones into the personal assistants we’ve always dreamed of. The age of voice is here, and Expect Labs is well-positioned to fuel the positive transition to intelligent natural language interfaces. To see more examples of MindMeld’s technology in action, check out the Expect Labs demo page.
Expect Labs issued a press release last week about the launch of their MindMeld app for the iPad. The press release describes MindMeld as an anticipatory intelligent assistant app. It can listen to you as you talk to it or to one or more friends on Facebook, and then go out and search for related content either within Facebook or on the web.
How does the app work? I watched the short app demonstration of MindMeld on the Expect Labs webpage. It appears that you have to be a Facebook user to take advantage of MindMeld, as the only way to log in is through your Facebook account. It seems that the idea is for you to join in a conversation with friends–either one that’s currently underway or one that you initiate. MindMeld then listens to what you are saying and it starts displaying what it believes to be relevant and helpful content on the MindMeld screen.
If you and your friends are planning a trip to the BCS Championship game to watch Auburn battle Florida State, for example, I suppose that MindMeld would show you Facebook feeds from anyone trying to sell tickets to the game or maybe even offering a place to stay in Pasadena. MindMeld would probably also show things like airline tickets, hotel specials, or driving directions.
I’m a bit confused as to how the conversations work. Are you actually carrying on a live conversation where you can hear all your friends talking, like in a Google circles call? Or are you and your friends each just talking to MindMeld separately and the app listens to each person and pulls out the things it hears that seem relevant or interesting? I’m guessing that it’s the former, and that you can actually hear your friends speaking.
It’ll be interesting to see how MindMeld functions in reality and whether people find it helpful to be bombarded with content that an app thinks you might like. If you say something like “I drank way too much last night,” will it show hangover remedies? Or will it show you the news feeds of all your other friends that recently typed or said “I drank way too much last night?” Right now when you google that same phrase, the results are mostly in the latter category. Misery loves company, so that might be helpful. But it could just as well be annoying.
I’m confident that there are valid use cases for an app like MindMeld, though. Speech-based search will definitely be a part of our normal lives in the future. The question of how often and under what circumstances we want apps listening in on our conversations remains open.
TechCrunch ran an article last week scooping the fact that Intel acquired a Spanish natural language startup back in May of this year. The acquired company was called Indisys and they specialized in computational linguistics and virtual agent (or “intelligence assistant”) technologies. Ingrid Lunden of TechCrunch speculates that Intel will use the Indisys technology to continue building out its “perceptual computing” framework.
Perceptual computing is the term that Intel seems to have coined for software that can sense a user’s motions and gestures to control the user interface. Intel offers a perceptual computing software developer kit (SDK) that developers can use in conjunction with a special camera to create gesture-based games and other interactive software.
So how does natural language fit into the vision for perceptual computing? There’s an obvious link between gesturing and speaking. One can imagine that besides just motioning at a game to get the onscreen character to move, a player would like to be able to give verbal commands as well. Interacting with software by gesturing and talking has implications beyond gaming platforms. In her TechCrunch article, Lunden mentions that Intel has demonstrated multiple devices that showcase their “gesture and natural language recognition business.”
Now that Intel has purchased Indisys, they’ll have at least the basis for advanced language recognition and even virtual agent technologies to incorporate into their product set. It remains to be seen how perceptual computing and conversational software will intersect.
VentureBeat recently reported that Google has acquired two speech technology patents from SR Tech Group LLC. A press release from SR Tech Group LLC identified the patents as U.S. Patent No. 7,742,922, titled “Speech interface for search engines” and U.S. Patent No. 8,056,070, titled “System and method for modifying and updating a speech recognition program.”
The filing date for the first patent was November 9, 2006. Judging from the abstract, the patent reads like a very generic description of voice-activated search. The user says what he/she wants to look up, the application uses speech recognition and natural language processing to determine how best to construct the search query, and the application runs the query and returns the result. The second patent describes a system that a user or system administrator can employ to make updates to the grammar (as in underlying language database) of a speech recognition program.
I’m not a patent attorney, but based on the generic flavor of both of these patents, it seems like Google may have acquired them as a defensive maneuver. Having these broad-reaching patents could give them ammunition against other companies that might want to declare future patent infringements in other technology areas. It’s not readily apparent that either patent offers breakthroughs that would drastically improve Google Now or Google’s recently demonstrated conversational search functionality for Chrome. It’ll certainly be interesting to observe how Google continues to build out speech-activated search and how other companies look to compete within the same arena. There seems little doubt that conversational search will play an important role in search, and in virtual agent technologies of the near future.
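The generic flow described in the first patent's abstract (spoken request, query construction, search, results) can be sketched in a few lines. This is only an illustration under stated assumptions, not the patented method: the transcript is assumed to have already come out of a speech recognizer, and the `FILLER` word list, the `DOCUMENTS` index, and the `build_query` and `search` functions are all hypothetical.

```python
import re

# Hypothetical filler words that a spoken request might contain.
FILLER = {"please", "find", "me", "show", "search", "for", "the",
          "a", "an", "i", "want", "to", "look", "up"}

# Hypothetical in-memory "index": document title -> document text.
DOCUMENTS = {
    "Boston dining guide": "Italian restaurants and seafood in Boston",
    "Chicago pizza": "Deep dish pizza restaurants in Chicago",
    "Gardening tips": "How to grow tomatoes",
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_query(transcript):
    """Turn a recognized utterance into search terms by dropping filler."""
    return [word for word in tokenize(transcript) if word not in FILLER]


def search(query_terms, documents):
    """Rank documents by how many query terms they contain."""
    results = []
    for title, text in documents.items():
        doc_words = set(tokenize(text))
        hits = sum(1 for term in query_terms if term in doc_words)
        if hits:
            results.append((hits, title))
    # Most hits first; ties broken alphabetically by title.
    results.sort(key=lambda item: (-item[0], item[1]))
    return [title for _, title in results]


# Assume a speech recognizer has already produced this transcript.
transcript = "Please find me Italian restaurants in Boston"
terms = build_query(transcript)
print(terms)
print(search(terms, DOCUMENTS))
```

In this toy version, updating the recognizer's "grammar" in the sense of the second patent would amount to little more than editing a word list such as `FILLER`; real speech recognition grammars are considerably more involved.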
SpeechTek 2013 is scheduled to take place from August 19-21 at the New York Marriott Marquis in New York City. This looks like an interesting conference for anyone interested in speech technologies, virtual agents, and personal digital assistants. The conference offers four tracks: Business Strategies, Voice Interaction Design, Customer Experiences, and Technology Advances.
The SpeechTek 2013 Business Strategies track focuses on how voice technology, web self-service and other related technologies can be leveraged to provide competitive advantage and improved customer service. The Voice Interaction Design track is geared toward application developers and offers technical tracks on designing, building, and testing applications that use voice technologies and natural language processing. The Customer Experiences sessions focus on using speech recognition and web self-service to improve customer interactions and gain valuable insights into customer behavior. Technology Advances is forward-looking and examines the latest breakthroughs in voice and virtual agent technologies.
SpeechTek 2013 looks like three full days of great information and knowledge sharing on all the key areas related to the topic of virtual agents. Whether you attend or not, you can follow the action using hashtag #SpeechTek on Twitter.
Imagine this. You crawl into bed at night and drift off into a deep sleep. While you’re floating through dreamland, your ever wakeful personal virtual agent is sifting through hundreds and hundreds of twitter feeds to find out what books people are talking about. Not only is the agent smart enough to know when a tweet mentions a book title, it can also surmise from the tweet context whether the comment about the book is positive or negative.
You wake up in the morning, shuffle into the kitchen, and find a fresh pot of coffee already made.
“What’s new today?” you ask your personal virtual agent.
“It’s a lovely day,” it responds. “You don’t have any meetings until noon. And I found three books that I think you’d really be interested in reading. Would you like to see them?”
It might not be long before this scenario becomes reality. Parakweet is a new company that has launched several Twitter information mining products. One of the products is called Bookvi.be and it uses natural language processing technology to parse and comprehend tweets about books. The same technology is also used to gather together movie recommendations. The technology challenges that Parakweet has overcome aren’t trivial. Lots and lots of people tweet about books. But how can an NLP agent really detect if what they’re saying about the book is positive or negative? Parakweet CEO Ramesh Haridas says that their technology has tackled this challenge.
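To get a feel for the two steps involved (spotting a book title in a tweet, then judging the tone of the comment), here is a deliberately naive sketch. It is not Parakweet's technology: the `BOOK_TITLES` catalog, the `POSITIVE` and `NEGATIVE` word lists, and the `find_book` and `classify` functions are all assumptions made for illustration, and simple word counting like this fails on sarcasm, negation, and titles that double as ordinary phrases.

```python
import re

# Hypothetical catalog of titles to look for.
BOOK_TITLES = ["Gone Girl", "The Hobbit"]

# Hypothetical sentiment word lists; a real system would need far more.
POSITIVE = {"loved", "great", "amazing", "brilliant"}
NEGATIVE = {"boring", "hated", "awful", "disappointing"}


def find_book(tweet):
    """Return the first catalog title mentioned in the tweet, if any."""
    lowered = tweet.lower()
    for title in BOOK_TITLES:
        if title.lower() in lowered:
            return title
    return None


def classify(tweet):
    """Return (title, label) for tweets about a known book, else None."""
    title = find_book(tweet)
    if title is None:
        return None
    words = set(re.findall(r"[a-z']+", tweet.lower()))
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    if score > 0:
        label = "positive"
    elif score < 0:
        label = "negative"
    else:
        label = "neutral"
    return title, label


for tweet in [
    "Just finished The Hobbit and loved it",
    "Gone Girl was so boring",
    "Nice weather today",
]:
    print(tweet, "->", classify(tweet))
```

A tweet such as "I did not love Gone Girl" shows the limits of this approach: none of its words appear in either list, so it would be labeled neutral. That gap is part of why the problem isn't trivial.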
Users can sign up for a Bookvi.be account and have book recommendations sent to them. It remains to be seen how long it’ll be before the technology is integrated into a coffee-making intelligent personal agent!
The race is on for a digital virtual assistant that can understand spoken language and provide users with exactly the information they need, when and how they need it. Oh, and the digital agent should be able to carry on a meaningful conversation as well. In a recent PCWorld article covering South by Southwest Interactive in Austin, Amit Singhal of Google is quoted as positioning the understanding of speech, natural language, and conversation as some of the key challenges facing the search giant today.
People’s need to search for information has become integral to our interactions with the web. When we are online, we are nearly always searching for something, be it news of the world, updates from friends, information on specific goods and services, or what’s playing at the local cinema or on TV. The list goes on. Google imagines a near-term future where smart virtual agents are embedded in our world, in wearable devices such as Google Glass, and where these agents can quickly act on our behalf to get us the information we need. Our virtual assistants should be able to understand what we’re saying when we speak to them. And they should be able to answer us in a way that’s not only informative, but natural.
Competing with Google in the arena of intelligent voice-activated agents, albeit indirectly, are companies that provide voice recognition systems for use across multiple problem spaces. In my next post, I’ll take a look at an article that examines some of the companies in this arena and that highlights how today’s most successful virtual agents are deployed. In the meantime, I hope you enjoy the referenced article with information from Google’s Amit Singhal.