Microsoft's new artificial intelligence system, Project Adam, can identify images, including photos of a particular breed of dog. Microsoft
We’re entering a new age of artificial intelligence.
Drawing on the work of a clever cadre of academic researchers, the biggest names in tech, including Google, Facebook, Microsoft, and Apple, are embracing a more powerful form of AI known as “deep learning,” using it to improve everything from speech recognition and language translation to computer vision, the ability to identify images without human help.
In this new AI order, the general assumption is that Google is out in front. The company now employs the researcher at the heart of the deep-learning movement, the University of Toronto’s Geoff Hinton. It has openly discussed the real-world progress of its new AI technologies, including the way deep learning has revamped voice search on Android smartphones. And these technologies hold several records for accuracy in speech recognition and computer vision.
But now, Microsoft’s research arm says it has achieved new records with a deep learning system it calls Adam, which will be publicly discussed for the first time during an academic summit this morning at the company’s Redmond, Washington, headquarters. According to Microsoft, Adam is twice as adept as previous systems at recognizing images (including, say, photos of a particular breed of dog or a type of vegetation) while using 30 times fewer machines. “Adam is an exploration on how you build the biggest brain,” says Peter Lee, the head of Microsoft Research.
The Project Adam team. From left to right: Karthik Kalyanaraman, Trishul Chilimbi, Johnson Apacible, Yutaka Suzue. Microsoft
Lee boasts that, when running a benchmark test called ImageNet 22K, the Adam neural network tops the (published) performance numbers of the Google Brain, a system that provides AI calculations to services across Google’s online empire, from Android voice recognition to Google Maps. This test deals with a database of 22,000 types of images, and before Adam, only a handful of artificial intelligence models were able to handle this massive amount of input. One of them was the Google Brain.
But Adam doesn’t aim to top Google with new deep-learning algorithms. The trick is that the system better optimizes the way its machines handle data and fine-tunes the communications between them. It’s the brainchild of a Microsoft researcher named Trishul Chilimbi, someone who’s trained not in the very academic world of artificial intelligence, but in the art of massive computing systems.
How It Works
Like similar deep learning systems, Adam runs across an array of standard computer servers, in this case machines offered up by Microsoft’s Azure cloud computing service. Deep learning aims to more closely mimic the way the brain works by creating neural networks—systems that behave, at least in some respects, like the networks of neurons in your brain—and typically, these neural nets require a large number of servers. The difference is that Adam makes use of a technique called asynchrony.
As computing systems get more and more complex, it gets more and more difficult to get their various parts to trade information with each other, but asynchrony can mitigate this problem. Basically, asynchrony is about splitting a system into parts that can pretty much run independently of each other, before sharing their calculations and merging them into a whole. The trouble is that although this can work well with smartphones and laptops, where calculations are spread across many different computer chips, it hasn’t been that successful with systems that run across many different servers, as neural nets do. But various researchers and tech companies, including Google, have been playing around with large asynchronous systems for years now, and inside Adam, Microsoft is taking advantage of this work using a technology developed at the University of Wisconsin called, of all things, “HOGWILD!”
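The basic idea of asynchrony described above can be sketched in a few lines of Python. This is only an illustration, not Microsoft's code: the task (a sum of squares), the chunk size, and the function name `partial_sum_of_squares` are all assumptions chosen for the example. Each chunk is processed independently, and the partial results are merged in whatever order they happen to finish.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def partial_sum_of_squares(chunk):
    """Work done by one independent part of the system (hypothetical task)."""
    return sum(v * v for v in chunk)


data = list(range(1, 101))
# Split the work into four parts that do not need to talk to each other.
chunks = [data[i:i + 25] for i in range(0, len(data), 25)]

total = 0
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(partial_sum_of_squares, chunk) for chunk in chunks]
    # Merge results as they arrive; completion order does not matter.
    for future in as_completed(futures):
        total += future.result()

print(total)  # 338350
```

Because the merge step is a plain sum, no part has to wait for another before reporting its result, which is the property that makes this kind of split attractive.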
HOGWILD! was originally designed as something that let each processor in a machine work more independently. Different chips could even write to the same memory location, and nothing would stop them from overwriting each other. With most systems, that’s considered a bad idea because it can result in data collisions, where one machine overwrites what another has done, but it can work well in some situations. The chance of data collision is rather low in small computing systems, and as the University of Wisconsin researchers show, it can lead to significant speed-ups in a single machine. Adam then takes this idea one step further, applying the asynchrony of HOGWILD! to an entire network of machines. “We’re even wilder than HOGWILD! in that we’re even more asynchronous,” says Chilimbi, the Microsoft researcher who dreamed up the Adam project.
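To give a feel for the lock-free style of updating that HOGWILD! is known for, here is a toy sketch in Python. It is not the Wisconsin researchers' implementation and not Adam: the target values in `TRUE_WEIGHTS`, the learning rate, the step count, and the one-weight-per-example data are all made up for illustration. Several threads nudge a shared list of weights toward the target without ever taking a lock. Python's interpreter lock also means these threads do not truly run in parallel, so the sketch shows the access pattern only, not the speed-up.

```python
import random
import threading

# Hypothetical target model that the workers try to learn.
TRUE_WEIGHTS = [1.0, -2.0, 3.0, 0.5, -1.5, 2.5, 0.0, -0.5]
# Shared state, deliberately left unprotected by any lock.
weights = [0.0] * len(TRUE_WEIGHTS)


def worker(seed, steps=2000, learning_rate=0.1):
    rng = random.Random(seed)  # private random source for this thread
    for _ in range(steps):
        i = rng.randrange(len(weights))   # sparse example: touches one weight
        x = rng.uniform(0.5, 1.5)
        y = TRUE_WEIGHTS[i] * x           # noiseless label for this example
        current = weights[i]              # read the shared value, no lock
        gradient = (current * x - y) * x  # gradient of 0.5 * (current*x - y)**2
        # Write back without a lock; this may overwrite another thread's update.
        weights[i] = current - learning_rate * gradient


threads = [threading.Thread(target=worker, args=(seed,)) for seed in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

max_error = max(abs(w - target) for w, target in zip(weights, TRUE_WEIGHTS))
print(f"largest remaining error: {max_error:.6f}")
```

Even when one thread occasionally clobbers another's write, every update still pulls the affected weight toward its target, so in this toy setting the occasional collision costs a little progress rather than correctness.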
Although neural nets are extremely dense and the risk of data collision is high, this approach works because the collisions tend to result in the same calculation that would have been reached if the system had carefully avoided any collisions. This is because, when each machine updates the master server, the update tends to be additive. One machine, for instance, will decide to add a “1” to a preexisting value of “5,” while another decides to add a “3.” Rather than carefully controlling which machine updates the value first, the system just lets each of them update it whenever they can. Whichever machine goes first, the end result is still “9.”
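The arithmetic in that example is easy to check. The short Python sketch below (the function names are hypothetical) applies the two additions in every possible order and, for contrast, does the same with plain overwrites. It assumes each addition lands in full; it does not model an update being lost partway through a collision.

```python
from itertools import permutations


def apply_additions(initial, deltas):
    """Apply additive updates in the given order and return the final value."""
    value = initial
    for delta in deltas:
        value += delta
    return value


def apply_overwrites(initial, new_values):
    """Apply overwriting updates in the given order; the last writer wins."""
    value = initial
    for new_value in new_values:
        value = new_value
    return value


# One machine adds 1 and another adds 3 to a preexisting value of 5.
additive_results = {apply_additions(5, order) for order in permutations([1, 3])}
print(additive_results)  # {9}

# If the machines overwrote the value instead, the order would matter.
overwrite_results = {apply_overwrites(5, order) for order in permutations([6, 8])}
print(sorted(overwrite_results))  # [6, 8]
```

Addition is commutative, so the additive case always ends at 9, while the overwrite case ends at whatever the last machine happened to write.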
Microsoft says this setup can actually help its neural networks more quickly and more accurately train themselves to understand things like images. “It’s an aggressive strategy, but I do see why this could save a lot of computation,” says Andrew Ng, a noted deep-learning expert who now works for Chinese search giant Baidu. “It’s interesting that this turns out to be a good idea.”
An example of how Adam works. Microsoft
Ng is surprised that Adam runs on traditional computer processors and not GPUs—the chips originally designed for graphics processing that are now used for all sorts of other math-heavy calculations. Many deep learning systems are now moving to GPUs as a way of avoiding communications bottlenecks, but the whole point of Adam, says Chilimbi, is that it takes a different route.
Neural nets thrive on massive amounts of data—more data than you can typically handle with a standard computer chip, or CPU. That’s why they get spread across so many machines. Another option, however, is to run things on GPUs, which can crunch the data more quickly. The problem is that if the AI model doesn’t fit entirely on one GPU card or a single server running several GPUs, the system can stall. The communications systems in data centers aren’t fast enough to keep up with the rate at which GPUs handle information, creating data gridlocks. That’s why, some experts say, GPUs aren’t ideal right now for scaling up very large neural nets. Chilimbi, who helped design the vast array of hardware and software that underpins Microsoft’s Bing search engine, is among them.
Should We Go HOGWILD?
Microsoft is selling Adam as a “mind-blowing system,” but some deep-learning experts argue that the way the system is built really isn’t all that different from Google’s. Without knowing more details about how they optimize the network, experts say, it’s hard to know how Chilimbi and his team achieved the boosts in performance they are claiming.
Microsoft’s results are “kind of going against what people in research have been finding, but that’s what makes it interesting,” says Matt Zeiler, who worked on the Google Brain and recently started his own deep-learning company Clarifai. He’s referring to the fact that the accuracy of Adam increases as they add more machines. “I definitely think more research on HOGWILD! would be great to know if that’s the big winner here.”
Microsoft’s Lee says the project is still “embryonic.” So far, it’s only been deployed through an internal app that will identify an object after you’ve snapped a photo of it with your mobile phone. Lee has used it himself to identify dog breeds and bugs that might be poisonous. There’s not a clear plan to release the app to the public yet, but Lee sees definite uses for the underlying technology in e-commerce, robotics, and sentiment analysis. There is also talk within Microsoft of exploring whether Adam’s efficiency could improve if run on field-programmable gate arrays, or FPGAs, processors that can be modified to run custom software. Microsoft has already been experimenting with these chips to improve Bing.
Lee believes Adam could be part of what he calls an “ultimate machine intelligence,” something that could function in ways that are closer to how we humans handle different types of modalities, like speech, vision, and text, all at once. The road to that kind of technology is long (people have been working towards it since the 1950s), but we’re certainly getting closer.