SiCortex: High Performance Computing Without the High Electric Bills
network, where messages hop from node to node on the way to their final destination. So Reilly, Leonard, and Mucci started looking for a network topology—a scheme for wiring up the backplane—that would let them connect a large number of nodes without requiring messages to make too many hops to get through the network.
That, says Reilly, was when Leonard learned about a topology called a Kautz graph. First described in 1968, Kautz graphs have exactly the properties the SiCortex founders were looking for: they can have lots of nodes, but at a fairly low cost in terms of connections and hops. In fact, in a Kautz graph where every node has three outgoing links, tripling the number of nodes increases the number of hops required to traverse the graph by just one. In a 324-node Kautz graph where every node has just three outgoing and three incoming links, for instance, data can get from one node to any other node in five hops or fewer. Increase the number of nodes to 972, and it’s still only six hops, a lot fewer than the number required by other parallel-computing topologies the industry has tried, such as 3-D meshes and “hypercubes.”
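The node and hop counts above can be checked with a short sketch. It assumes the standard textbook construction of a Kautz graph: each node is a string of symbols in which no two neighboring symbols are equal, and a link goes from a node to any node obtained by dropping the first symbol and appending a new one. The function names are hypothetical, and this is only an illustration of the math, not SiCortex's actual wiring or routing code.

```python
from collections import deque
from itertools import product


def kautz_nodes(degree, length):
    """Return all strings of `length` symbols drawn from degree + 1 symbols
    in which no two adjacent symbols are equal."""
    alphabet = range(degree + 1)
    return [s for s in product(alphabet, repeat=length)
            if all(a != b for a, b in zip(s, s[1:]))]


def successors(node, degree):
    """Outgoing links: shift the string left by one symbol and append any
    symbol that differs from the current last symbol."""
    return [node[1:] + (x,) for x in range(degree + 1) if x != node[-1]]


def max_hops_from(start, degree):
    """Breadth-first search: hops needed to reach the farthest node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in successors(u, degree):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())


def diameter(degree, length):
    """Worst-case hop count between any ordered pair of nodes."""
    return max(max_hops_from(n, degree) for n in kautz_nodes(degree, length))


if __name__ == "__main__":
    for length in (5, 6):
        print(len(kautz_nodes(3, length)), "nodes,",
              diameter(3, length), "hops at most")
```

With three outgoing links per node, strings of five symbols give the 324-node graph and strings of six symbols give the 972-node graph, so each added symbol triples the node count and adds one hop to the worst case.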
The 972-node architecture was the one that Reilly, Leonard, and Mucci decided to build. The company lined up venture backing from the likes of Polaris Venture Partners, Prism Venture Partners, Flagship Ventures, JK&B Capital, and Chevron Technology Ventures. And six years and $42 million later, the result is SiCortex’s top-of-the-line machine, the SC5832. About the size of two refrigerators side by side, the computer contains 972 custom chips, each configured with a switch, six processors, and a communications engine that addresses messages going across the backplane. (6 x 972 = 5,832 processors in all, hence the name.)
Reilly says the machine has found adherents so far among two types of organizations—universities and defense intelligence agencies, both of which suffer from power and space constraints. “We have a federal customer who confesses that he is out of power,” says Reilly. “Every time he buys a new computer, an old one has to leave the room. For them, it’s all about solving more problems within the power budget they’ve got. We are talking with other people where the imperative is to fit the system into a given physical volume with limited cooling capacity. In an airplane, for example, you have more than enough electricity, because Pratt & Whitney provides this wonderful generator hanging off of each wing, but you can’t put a Winnebago-sized air conditioner on top of the plane to add more cooling.”
Ironically, making its machines run cooler and more efficiently wasn’t SiCortex’s original goal; the founders wanted the machines to be fast. Reilly says the founders knew from the beginning that they’d have to use relatively slow processors, since the machine was going to contain thousands of them, and the machine as a whole could only draw so much power from the wall socket. But the speedy backplane meant that the processors could spend more of their time computing and less waiting around for data. Other tricks also helped with efficiency: for example, the engineers tweaked the machine’s compilers and its Linux-based operating system to dispense with “speculative execution,” an often wasteful process in which some data is loaded into memory based on predictions that it will be needed later.
“It’s the absolute antithesis of everything I spent the previous 20 years of my career working on, but the simplicity is what led to the power efficiency,” says Reilly.
Novell veteran Christopher Stone was brought in as CEO this summer to ramp up SiCortex’s sales and marketing effort. He argues that the number of technical-computing customers running up against power limitations is growing. “More and more of our educational customers have power consumption as an issue—they’re being told they are going to be capped,” Stone says. And in an announcement last week, SiCortex said that recent tweaks have made its newest computer models twice as efficient as previous ones—at least when the expense of power, cooling, staffing, and space is taken into account. “You have to include those in your total-cost-of-ownership measurement,” says Stone. “Once you do that, you start to see phenomenal returns.”
Google probably won’t come calling any time soon: SiCortex’s massively parallel systems aren’t tailored for data-center operations like searching, indexing, and serving Web pages. But with SiCortex’s help, many customers may be able to keep their computer facilities going, even without a river running through them.