AI Without the Costly GPU Chips? Seattle Startup Sees a Way

Picture a world in which cameras, sensors, watches, and other devices, equipped with commodity computer chips, recognize and understand what’s happening around them.

The basic devices are there now—billions of them—but they can’t handle the complex, resource-hungry algorithms that identify objects in pictures or translate text from one language to another, the kinds of inferences that come from state-of-the-art artificial intelligence systems.

“Our goal is to basically get AI” into these commodity computer chips, says Ali Farhadi, co-founder and CEO of Xnor, a six-person Seattle startup that just spun out of the Allen Institute for Artificial Intelligence and raised $2.6 million in seed funding, led by Madrona Venture Group. “And if we can do that, then the world will be a different world.”

Different in the sense that many of the technology trends that we call AI today, and which require powerful, expensive computers—often running in the remote data centers owned by cloud computing giants—would be accelerated, and expanded.

“Imagine at every corner of a street, there should be a $5 computer [and] camera that can see things and understand what’s happening,” Farhadi says. “Our cars would have lots of cameras that can see. My watch can actually listen and process things.”

If Xnor can deliver the technology enabling this different world, it would mean a great stride toward the democratization of AI that so many of the tech giants espouse. While the likes of Microsoft (NASDAQ: MSFT) and Google (NASDAQ: GOOG) are paying more than just lip service to this idea—publishing their research on new machine learning algorithms, open-sourcing training datasets the algorithms learn from, and selling access to machine learning capabilities that run on their own clouds—running state-of-the-art AI remains an expensive proposition.

Machine learning’s reliance on high-powered graphics processing units—originally meant for gaming and video applications—has driven business to chip makers like Nvidia (NASDAQ: NVDA). The Santa Clara, CA-based company—whose stock is trading at near-record levels, buoyed by use of its chips in AI and automotive applications—reports earnings for its fourth quarter and full fiscal 2017 on Thursday.

The success of Nvidia and its GPU-making competitors, in a way, ties back to the inspiration for Xnor.

“We were sort of sick and tired of students coming to us and asking for GPU servers every time,” says Farhadi, who holds joint appointments at the Allen Institute and University of Washington computer science department. His co-founder, Mohammad Rastegari, was a PhD student at University of Maryland before joining the Allen Institute. “Students would come and say, ‘Can I buy another GPU server?’ And that was another $30,000 machine,” Farhadi says.

The high cost of AI-capable machines—whether procured directly or rented from a cloud computing provider—points to the opportunity in creating a more efficient model for performing AI tasks. Such a model would “open up a huge pool of opportunities for AI at the hands of customers in an inexpensive way,” Farhadi says.

To grasp how Xnor aims to enable that future, you need to dig down into what actually happens when a machine performs an AI task.

When people say “AI” today, they’re often talking about the output of machine learning algorithms used in deep neural networks—and one kind in particular: convolutional neural networks, short-handed as CNNs or ConvNets.

To infer something about an image or piece of audio or text, a CNN must perform billions of convolutional operations, Farhadi says. A convolutional operation, in simple terms, is a calculation involving two floating-point numbers. Floating-point numbers are a common way of storing numbers digitally that balances precision and memory usage. In the single-precision format commonly used for neural networks, each floating-point number requires 32 bits.

So, each convolutional operation, requiring two 32-bit numbers, is “sort of expensive” in terms of computing and memory resources, Farhadi says. “And when you do billions of those, this is where basically things fall apart and you need to have GPUs and expensive machines to be able to run AI,” he says.

Xnor’s approach involves substituting binary numbers (either a zero or a one, taking up just a fraction of the memory) for floating-point numbers.

“We’re replacing those 32 bits to store each single number in these gigantic neural networks with one bit,” Farhadi says. “And imagine that you need to do this billions of times to be able to do an inference, to be able to run your AI. So a billion times 32 versus a billion times one, and that’s sort of the magic.”
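The arithmetic behind that comparison can be sketched in a few lines of Python. This is an illustration only: the function name and the figure of one billion stored numbers are assumptions made for the example, not details of Xnor's software.

```python
def weight_storage_bytes(num_weights: int, bits_per_weight: int) -> int:
    """Bytes needed to store num_weights values at bits_per_weight bits each."""
    # Round up to a whole number of bytes.
    return (num_weights * bits_per_weight + 7) // 8


# Hypothetical network size, chosen to mirror the "a billion times" in the quote.
NUM_WEIGHTS = 1_000_000_000

full_precision = weight_storage_bytes(NUM_WEIGHTS, 32)  # 32-bit floating point
binary = weight_storage_bytes(NUM_WEIGHTS, 1)           # one bit per number

print(f"32-bit storage: {full_precision / 1e9} GB")
print(f"1-bit storage:  {binary / 1e9} GB")
print(f"reduction:      {full_precision // binary}x")
```

Under these assumptions the 32-bit version needs 4 GB and the one-bit version needs 0.125 GB, a 32-fold reduction in storage for the numbers alone.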

Farhadi says the Xnor model has been shown to deliver accuracy on par with a standard neural network model, except on “very resource-constrained devices,” which “might see a 1 or 2 percent drop in the performance.”

The company borrows its name, and the name of its lightweight machine learning model, from the digital logic gate XNOR, which takes two bits and outputs a one when they match and a zero when they differ.
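To see why an XNOR gate is useful here, consider a widely described trick for binary neural networks: if every number is restricted to +1 or -1 and stored as a single bit, the multiply-and-add at the heart of a convolution can be replaced by an XNOR followed by a count of the one bits. The Python sketch below illustrates that general idea. The function names and the bit encoding (1 for +1, 0 for -1) are assumptions made for the example, and nothing here is taken from Xnor's own code.

```python
def pack_signs(values):
    """Pack a list of +1/-1 values into an integer, one bit each (1 means +1)."""
    bits = 0
    for i, v in enumerate(values):
        if v > 0:
            bits |= 1 << i
    return bits


def xnor_dot(a_bits, b_bits, n):
    """Dot product of two n-element +1/-1 vectors stored as packed bits."""
    mask = (1 << n) - 1
    # XNOR yields a 1 wherever the two bits match, i.e. where the product is +1.
    matches = bin(~(a_bits ^ b_bits) & mask).count("1")
    # Each match contributes +1 and each mismatch -1 to the sum.
    return 2 * matches - n


a = [1, -1, 1, 1]
b = [1, 1, -1, 1]

reference = sum(x * y for x, y in zip(a, b))           # ordinary multiply-and-add
fast = xnor_dot(pack_signs(a), pack_signs(b), len(a))  # bitwise version
print(reference, fast)  # both are 0 for these vectors
```

On real hardware the XNOR and the bit count each operate on many packed values at once, which is where the speed-up in this kind of scheme comes from.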

What this all means, Farhadi says, is “we’re going to basically be at least 60 times faster in terms of computation, 30 times lighter in terms of memory, and the same range in terms of the power efficiency. Those are the three main bottlenecks for getting AI into people’s hands. We oftentimes don’t have devices that are powerful enough, that are power-efficient enough, and that have enough memory” to accommodate state-of-the-art AI, he adds.

So now—or soon—your phone, or your $5 Raspberry Pi Zero, could be able to recognize images without the assistance of powerful cloud computing systems that handle this kind of work today.

There are obvious cost benefits to the new approach, but also potential data-privacy and data-ownership advantages. “No data needs to leave your device,” Farhadi says.

Xnor’s seed funding will support hiring and further development of its platform (an API and software development kit) to allow people to begin doing efficient deep learning on everyday devices. Farhadi sees the company’s customers as chip makers, app developers, and device makers.

Allen Institute CEO Oren Etzioni and Madrona managing director Matt McIlwain have joined Xnor’s board of directors as part of the company’s funding and spinout.

Benjamin Romano is editor of Xconomy Seattle. Email him at bromano [at] Follow @bromano
