Intel Fires Back At Nvidia With Nervana AI Chips, New Movidius VPU
'With this next phase of AI, we're reaching a breaking point in terms of computational hardware and memory,' Intel exec Naveen Rao says of the chipmaker's new Nervana and Movidius chips.
Intel is firing back at Nvidia's growing prominence in artificial intelligence with the launch of its Nervana Neural Network processors for deep learning and the reveal of a new Movidius visual processing unit for edge media, computer vision and inference applications.
The Santa Clara, Calif.-based company announced the launch of the Nervana NNP-T1000 and NNP-I1000 chips for deep learning training and inference as well as the 2020 release window for its next-generation Movidius VPU at the Intel AI Summit on Tuesday. Development of the products stemmed from Intel's acquisitions of Nervana and Movidius in 2016.
[Related: Nvidia's Jetson Xavier NX Is 'World's Smallest Supercomputer' For AI]
Intel said its AI products are expected to generate $3.5 billion in sales for the 2019 fiscal year, and the chipmaker is hoping the new Nervana and Movidius products will continue that momentum for a portfolio that it said is "the broadest in breadth and depth in the industry."
"With this next phase of AI, we're reaching a breaking point in terms of computational hardware and memory," Naveen Rao, corporate vice president and general manager of Intel's AI Products Group, said in a statement (pictured in photo above). "Purpose-built hardware like Intel Nervana NNPs and Movidius Myriad VPUs are necessary to continue the incredible progress in AI."
The new Nervana and Movidius products are arriving as Nvidia's GPUs have gained considerable relevance for AI workloads in the data center market. The GPU powerhouse has been targeting deep learning applications with its Tesla V100 GPUs for training and T4 Tensor Core GPUs for inference in addition to selling ready-to-deploy server appliances like the DGX-2 to accelerate adoption.
Meanwhile, Intel is developing its own slate of GPUs for everything from gaming to AI. The company has previously said its first discrete GPU, which will be based on the company's 10-nanometer manufacturing process, will launch next year while a 7-nanometer data center GPU will arrive in 2021.
Intel said it's now shipping the NNP-T1000, which is built for training algorithms, and the NNP-I1000, which is built for inference, with the aim of providing customers with a "systems-level AI approach" that includes a full software stack with open components and deep learning framework integration.
The chipmaker said the NNP-T1000 provides "the right balance between computing, communication and memory," making it suitable for anything from small compute clusters to large supercomputers.
Rao said the NNP-T1000 provides 95 percent scaling on important training models like ResNet-50 and BERT, with very little degradation when running them across 32 chips. To help push the new chips, Intel has developed a new pod reference design that consists of 10 racks and 480 NNP-T1000 cards interconnected using the company's glueless fabric.
"This ASIC was purposefully designed with distributed training […] no switch required," Rao said.
The NNP-I1000, on the other hand, provides customers with a power- and budget-efficient option for running "intense, multimodal inference at real-world scale using flexible form factors."
Compared with a server rack running Nvidia's T4 inference GPUs, a rack of NNP-I1000 chips provides nearly four times the compute density, according to Rao.
"What we have is the most inferences per second you can jam in a single rack unit," he said.
Intel said the Nervana chips were developed for the AI processing needs of social media titan Facebook and Chinese tech giant Baidu.
"We are excited to be working with Intel to deploy faster and more efficient inference compute with the Intel Nervana Neural Network Processor for inference and to extend support for our state-of-the-art deep learning compiler, Glow, to the NNP-I," Misha Smelyanskiy, director of AI system co-design at Facebook, said in a statement.
With the chipmaker's next-generation Movidius Myriad VPU, code-named Keem Bay, Intel is promising more than 10 times the inference performance over the previous generation and six times the power efficiency of competing processors when the chip comes out in the first half of 2020.
Jonathan Ballon, vice president of Intel's IoT Group, called the new Movidius VPU a "ground-breaking, purpose-built" AI architecture for the edge, offering inference performance better than comparable GPUs at a fraction of the power, size and cost.
For instance, the new Movidius VPU provides four times better performance than Nvidia's TX2 chip and performance that is on par with Nvidia's Xavier chip at one-fifth of the power. Ballon said that's important because customers don't just care about performance.
"Customers care about power, size and latency," he said.
The company also announced its new Intel DevCloud for the Edge program, which, along with Intel OpenVINO, aims to make it easier for customers to prototype and test AI solutions on Intel's broad range of processors before making any purchases. Ballon said more than 2,700 customers are already using Intel DevCloud, "and they are loving it."
"Customers will now be able to model and simulate in DL Workbench the performance of their model and deploy for free in a number of hardware configurations," he said.