IBM Ramps Up AI, Analytics Via New File, Object Storage
'IBM has its own object store technology, and has added performance and new capabilities. But they haven't ignored everybody else's object stores. The data is already there. IBM is opening it all up,' says John Zawistowski, global systems solutions executive at Sycomp.
IBM Thursday introduced new storage hardware and software aimed at placing its storage at the center of large-scale data requirements for artificial intelligence and analytics workloads.
The new offerings are aimed at helping businesses build the kind of information architecture needed to get the most out of their fast-changing data, said Eric Herzog, IBM's chief marketing officer and vice president of worldwide storage channels.
"The new stuff is all about storage solutions for AI, big data and business analytics," Herzog told CRN. "IBM thinks customers need an information architecture to build AI before they can collect and analyze their data and feed it into their AI systems."
IBM storage technology has always been an important part of customers' high-performance computing, artificial intelligence and machine-learning infrastructures, said John Zawistowski, global systems solutions executive at Sycomp, a Foster City, Calif.-based solution provider and IBM channel partner.
"Why IBM? It's the way they integrated the AI software platform and storage," Zawistowski told CRN. "And the way IBM understands the importance of doing that. And the way IBM technology performs."
Herzog said IBM's latest technologies focus on two parts of the information architecture required for artificial intelligence and analytics: collecting the data and organizing it.
On the collection side, IBM Thursday introduced the IBM Elastic Storage System 5000, a new storage system for building data lakes that combines performance, density and scalability, he said.
The IBM Elastic Storage System 5000, like all arrays in the Elastic Storage System line, including the all-flash ESS 3000 introduced late last year, runs IBM Spectrum Scale software for advanced management of unstructured data for cloud, big data, analytics, object storage and more.
However, unlike the ESS 3000, the ESS 5000 is a hard disk drive-based array to provide performance and low-cost capacity, Herzog said. It is available in two versions. The SL model fits in a standard 36U rack and scales to 8.8 petabytes with six SL enclosures, while the SC model fits in a deeper rack to scale up to 13.5 petabytes with eight SC enclosures.
A single node offers bandwidth of up to 55 GBps, with each additional node in the same cluster contributing another 55 GBps, Herzog said.
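That scaling is linear: aggregate throughput is simply the per-node figure multiplied by the node count. A minimal sketch of the arithmetic, using the 55-GBps-per-node figure Herzog cited (the cluster sizes shown are purely illustrative):

```python
# Illustrative only: estimate aggregate cluster bandwidth assuming linear
# scaling at 55 GBps per ESS 5000 node, per the figure Herzog cited.
PER_NODE_GBPS = 55

def aggregate_bandwidth_gbps(node_count: int) -> int:
    """Theoretical aggregate bandwidth for a cluster of identical nodes."""
    return PER_NODE_GBPS * node_count

for nodes in (1, 4, 8):  # hypothetical cluster sizes
    print(f"{nodes} node(s): up to {aggregate_bandwidth_gbps(nodes)} GBps")
```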
With the ESS 5000, customers can build a data lake with a single namespace of up to 8 yottabytes, which Herzog said compares to 176 petabytes when using NetApp FAS hardware or 64 petabytes with Dell EMC Isilon hardware.
"To even do an exabyte, you need at least eight NetApp FAS systems and even more with Dell EMC Isilon," he said.
Also, while the ESS 3000 is based on Intel processors to target more edge workloads, the ESS 5000 is based on IBM Power9 processors, Herzog said.
The ESS 5000 seamlessly integrates with IBM's ESS 3000 NVMe nodes or older ESS systems, he said.
IBM also introduced IBM Spectrum Scale Data Acceleration for AI, software that Herzog said enables access and data movement between IBM Spectrum Scale and object storage, whether that object storage sits on-premises or in the cloud.
The key change is that IBM Spectrum Scale, which focuses on managing file data, can now also work with object data, seamlessly connecting object storage to AI and big data workloads, Herzog said. When used with the ESS 5000, IBM Spectrum Scale Data Acceleration for AI gives both file and object data up to 55 GBps of performance per node, he said.
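The general pattern being described is reaching the same dataset either through a POSIX file path, as a parallel file system such as Spectrum Scale presents it, or through an S3-compatible object interface. The sketch below illustrates that pattern generically; it is not IBM's Spectrum Scale Data Acceleration for AI implementation, and the endpoint, bucket and path names are hypothetical.

```python
# Illustrative only: reading the same dataset via a file path or via an
# S3-compatible object API. Generic sketch; names are hypothetical.
import boto3

def read_via_file(path: str) -> bytes:
    """Read through the file system namespace (e.g., a Spectrum Scale mount)."""
    with open(path, "rb") as f:
        return f.read()

def read_via_object(endpoint: str, bucket: str, key: str) -> bytes:
    """Read the same bytes through an S3-compatible object store endpoint."""
    s3 = boto3.client("s3", endpoint_url=endpoint)
    return s3.get_object(Bucket=bucket, Key=key)["Body"].read()

# Hypothetical locations for the same training file:
# read_via_file("/gpfs/datalake/train/images-0001.tar")
# read_via_object("https://objectstore.example.com", "datalake", "train/images-0001.tar")
```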
IBM Spectrum Scale Data Acceleration for AI is slated to be available in the fourth quarter, he said.
IBM also enhanced its IBM Cloud Object Storage software with up to a 300 percent increase in data read performance and up to 30 percent lower latency across geographically dispersed nodes, Herzog said. Existing installations can be upgraded in place non-disruptively, he said.
On the data organization side of the information architecture, IBM enhanced its IBM Spectrum Discover metadata management software, which provides data insight for petabyte-scale unstructured data.
While IBM Spectrum Discover previously was deployed in virtual machines, the new version can also run containerized on Red Hat OpenShift, Herzog said. It now also supports heterogeneous environments, including data sources on NetApp, Dell EMC Isilon and other platforms.
Also new is the ability to create catalogs of metadata, which is key to managing millions of files, Herzog said.
"You can find the data from large stores of data using the metadata," he said. "This allows you to go across terabytes of data to rapidly find the data needed to run your workloads. Our competitors' metadata technology works only across their own hardware. IBM lets it work across all our competitors' gear."
IBM, with its new technology, is providing access to data that customers have not been able to access easily in the past, Zawistowski said.
"Everybody has their own AI and ML [machine-learning] stories," he said. "EMC has theirs. NetApp has theirs. On the hardware side, they all have hard drives and flash. But the software part, that's where IBM stands alone because of the large global name space and the billions of files it can access. Others are much more limited. And IBM is a lot more open in terms of accessing those silos of data."
The ability to work across multiple vendors' data stores is important, Zawistowski said.
"IBM has its own object store technology, and has added performance and new capabilities," he said. "But they haven't ignored everybody else's object stores. The data is already there. IBM is opening it all up."