A flurry of press attention has surrounded the recent announcement of Sirius, an intelligent personal assistant built by a group from the University of Michigan. Most of the articles focus on the fact that Sirius is based on open source software. The implication is that U-M has created a technology that could lead to a generation of open source competitors to Siri, Google Now, and Cortana.
The paper published by the U-M Engineering team that built Sirius, however, indicates that the open source intelligent assistant wasn’t designed to take on commercial competitors. Instead, the team needed a fully functioning intelligent assistant to experiment with the hardware server designs required to support these technologies. The U-M team’s starting hypothesis was that intelligent assistants are “approaching the computational limits of current data center architectures.”
Why do intelligent personal assistants put such a heavy burden on backend systems? It comes down to the type of data these assistants send from the mobile device to the backend server, and the way that data has to be processed. When you speak to Siri or Google Now, compressed voice files are transferred to a server, where computationally expensive speech recognition functions are executed. If the request requires intensive data analysis to produce a response, that consumes significant computational power as well. And the end-to-end process of asking your question and receiving a correct response has to happen at lightning speed. In other words, the latency between question and answer needs to be kept to a minimum.
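The round trip described above can be sketched as a pair of server-side stages with latency measured end to end. A minimal illustration follows; the function names, stand-in delays, and example query are purely hypothetical, not Sirius's actual components:

```python
import time

def transcribe(audio_bytes):
    # Stand-in for server-side automatic speech recognition;
    # in a real assistant this is the compute-heavy stage.
    time.sleep(0.01)  # placeholder for model inference time
    return "what is the capital of michigan"

def answer(query_text):
    # Stand-in for the data-analysis / question-answering stage.
    time.sleep(0.01)
    return "Lansing"

def handle_query(audio_bytes):
    """End-to-end path: compressed audio in, answer plus latency out."""
    start = time.perf_counter()
    text = transcribe(audio_bytes)
    result = answer(text)
    latency_ms = (time.perf_counter() - start) * 1000
    return result, latency_ms
```

Keeping that measured latency low across both stages is exactly the constraint that drives the server-architecture question the U-M team set out to study.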
To test out the best possible server architectures, the U-M team built Sirius with the following core capabilities:
- speech recognition
- image matching
- question answering
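A front end for an assistant with these capabilities has to route each incoming request to the right service. A minimal sketch of that dispatch, with hypothetical service names rather than Sirius's actual APIs:

```python
def dispatch(query):
    """Route a request to the matching backend service.

    The query dict shape and service names here are illustrative only.
    """
    kind = query.get("type")
    if kind == "voice":
        return "speech-recognition"
    if kind == "image":
        return "image-matching"
    raise ValueError(f"unsupported query type: {kind!r}")
```

For example, `dispatch({"type": "voice"})` would hand the request to the speech recognition service.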
As noted in all the other articles about Sirius, the team leveraged open source projects for the foundation of their assistant. Refer to the team’s paper for a complete listing of all the open source components used in Sirius.
The team’s experiments showed that certain server architectures yield significant latency improvements for intelligent assistant queries. The best-performing architecture for these queries was one based on field-programmable gate arrays (FPGAs), a type of reconfigurable integrated circuit. FPGAs are a high-cost option, however. If a somewhat lower-cost solution is required, the study found that a design based on graphics processing units (GPUs) is also well suited to intelligent assistant query processing.
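Comparisons like these are typically made on mean and tail (e.g., 99th-percentile) latency, since a single slow query is what the user notices. A small sketch of that bookkeeping, using made-up numbers rather than measurements from the paper:

```python
import statistics

def tail_latency(samples_ms, pct=99):
    """Return (approximately) the pct-th percentile of per-query latencies in ms."""
    ordered = sorted(samples_ms)
    k = max(0, min(len(ordered) - 1, round(pct / 100 * len(ordered)) - 1))
    return ordered[k]

# Illustrative latencies only -- not data from the U-M study.
baseline_cpu_ms = [120, 135, 150, 180, 400]
gpu_ms = [15, 16, 18, 20, 45]

speedup = statistics.mean(baseline_cpu_ms) / statistics.mean(gpu_ms)
print(f"mean speedup: {speedup:.1f}x; 99th-percentile latency: "
      f"{tail_latency(gpu_ms)} ms vs {tail_latency(baseline_cpu_ms)} ms")
```

The point of tracking the tail as well as the mean is that an architecture can look fast on average while still missing the responsiveness users expect on the slowest queries.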
For me at least, another takeaway from the U-M experiment is that operating a commercial-grade intelligent personal assistant isn’t for the faint of heart. Providing the low-latency service needed to keep the user happy requires a large data center investment. Having an operational intelligent personal assistant built on open source software is only the start. You need all the backend compute power to make it work.