The Sydney Morning Herald recently published an article by Iain Gillespie in their Digital Life section on advances in deep learning technologies. Gillespie quotes Tim Baldwin of the University of Melbourne as confirming that deep learning has gained new ground recently, helped along by Moore's Law and the ever-faster processors needed to successfully train multilayered neural networks.
Ray Kurzweil is quoted as predicting that 2029 will be the year when intelligent software develops both logical and emotional intelligence. Everyone probably has an opinion on Kurzweil's technology predictions, but there's certainly evidence that machine learning has made a lot of progress in just the last five years. Some of this progress is evident in speech recognition applications, recommendation engines, and control systems such as self-driving cars. Kurzweil's description of intelligent assistants of the future sounds reminiscent of the capabilities exhibited by Samantha, the intelligent talking operating system in the movie Her, which I wrote about earlier this month.
An example often used to show both the promise and current limitations of neural networks is a Google X lab experiment that fed millions of still YouTube images into Google Brain, an AI system based on neural networks. The Gillespie article mentions this example too. After evaluating the millions of data points, Google Brain was able to independently recognize images of human faces, human bodies, and, unexpectedly, cats. The cat recognition capability provided fodder for lots of geek jokes (New York Times: "How many computers to find a cat? 16,000").
The Gillespie article got me searching for more information on deep learning. There's a recent article on the topic in Nature by Nicola Jones. Jones calls deep learning a revival of an older AI technique: neural networks. A concept inspired by the architecture of the brain, neural networks consist of a hierarchy of relatively simple input/output components that can be taught to select a preferred outcome and to remember that right answer. When these simple learning components are strung together and can operate in parallel, they are capable of processing large amounts of data and performing useful analysis (such as correctly determining what someone is saying, even when there's distracting background noise).
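To make the idea of a "simple input/output component that remembers the right answer" concrete, here's a minimal sketch of a single perceptron, one of the classic building blocks of neural networks. It learns the logical OR function by nudging its weights toward the right answer after each mistake. The training data, learning rate, and epoch count are illustrative assumptions, not details from either article.

```python
# A single learning unit (perceptron) taught the logical OR function.
# When many such units are layered and run in parallel, you get the
# kind of neural network the Jones article describes.

def step(x):
    # Threshold activation: fire (1) if the weighted sum is positive
    return 1 if x > 0 else 0

def train_perceptron(samples, epochs=20, lr=0.1):
    w = [0.0, 0.0]   # one weight per input
    b = 0.0          # bias term
    for _ in range(epochs):
        for inputs, target in samples:
            out = step(sum(wi * xi for wi, xi in zip(w, inputs)) + b)
            err = target - out
            # Perceptron rule: shift weights toward the correct answer
            w = [wi + lr * err * xi for wi, xi in zip(w, inputs)]
            b += lr * err
    return w, b

or_samples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
w, b = train_perceptron(or_samples)
preds = [step(sum(wi * xi for wi, xi in zip(w, inp)) + b)
         for inp, _ in or_samples]
print(preds)  # -> [0, 1, 1, 1]
```

A single unit like this can only learn linearly separable patterns; the power comes from stacking many of them into layers.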
One of the ongoing debates in the machine learning field revolves around the effectiveness of unsupervised versus supervised learning for AI. Some researchers believe that the best way to teach an artificial intelligence system is to prime the database with facts about the world ("dolphins are mammals, marlin are fish"). Supervised learning typically refers to explicitly teaching a computer system or neural network by presenting it with labeled data sets and giving it the right answer for each input. Being able to predict the next logical output in a sequence is key to machine learning.
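As a hedged sketch of supervised learning as next-step prediction: the snippet below fits a line to a short labeled sequence (each position paired with its "right answer") using ordinary least squares, then predicts the next term. The sequence itself is an illustrative assumption.

```python
# Supervised learning in miniature: the teacher supplies the right
# answer for each position, and the model learns a rule (here, a
# line) it can use to predict the next output in the sequence.

def fit_line(xs, ys):
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return slope, intercept

seq = [2, 4, 6, 8]            # labeled training data: the "right answers"
xs = list(range(len(seq)))    # positions 0, 1, 2, 3
slope, intercept = fit_line(xs, seq)
next_value = slope * len(seq) + intercept
print(next_value)  # -> 10.0
```

Real supervised systems use far richer models than a line, but the loop is the same: compare predictions to known answers and adjust.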
Unsupervised learning involves feeding a neural network, or another system of computer algorithms, data that it analyzes to find meaningful patterns and relationships on its own. The Google Brain cat experiment described above is one example.
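The contrast with supervised learning can be sketched with a tiny k-means clustering example: the algorithm receives no labels at all, yet still discovers the groups hidden in the data. The points, starting centers, and iteration count below are illustrative assumptions.

```python
# Unsupervised learning in miniature: 1-D k-means groups unlabeled
# points into clusters with no "right answers" supplied, loosely
# analogous to Google Brain discovering cats on its own.

def kmeans_1d(points, centers, iterations=10):
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center
        groups = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            groups[idx].append(p)
        # Update step: move each center to the mean of its group
        centers = [sum(g) / len(g) if g else c
                   for g, c in zip(groups, centers)]
    return centers

points = [1.0, 1.2, 0.8, 9.9, 10.1, 10.0]
centers = kmeans_1d(points, centers=[0.0, 5.0])
print(centers)  # -> [1.0, 10.0]
```

No one told the algorithm there were two groups near 1 and 10; the structure emerged from the data, which is the essence of the unsupervised approach.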
Regardless of the techniques used, it seems evident that some form of machine learning will be a critical force, if not the force, behind advances in virtual agent and intelligent virtual assistant technologies. To achieve true conversational capability, virtual agents will have to be able to routinely understand and engage their human dialogue partners. For an in-depth and informative treatment of the topic, I recommend the article "Machine Learning, Cognition, and Big Data" by Steve Oberlin.