The Coolest Big Data System And Cloud Platform Companies Of The 2023 Big Data 100

Part 3 of CRN’s 2023 Big Data 100 takes a look at the vendors solution providers should know in the big data system and cloud platform space.

All Systems Go

Today’s “big data stack” includes databases, data management software and data analytics tools – all critical components of an effective operational or analytical data system. But all those technologies run on the foundational systems – hardware servers and cloud platforms – that are provided by some of the biggest companies in the IT industry.

As part of the CRN 2023 Big Data 100, we’ve put together the following list of big data systems and cloud platform vendors that solution providers should be familiar with.

Many of these companies, including Dell Technologies, Hewlett Packard Enterprise and IBM, are well-known names in the channel. They develop the underlying hardware and software that power big data IT, including analytics and data-intensive operational applications.

In the cloud, where many businesses are deploying big data projects, cloud service giants like Amazon Web Services, Google Cloud and Snowflake provide the platforms for those initiatives.

And long-established software giants like Microsoft, Oracle and SAP provide foundational cloud systems, databases and other supporting software for big data initiatives, in addition to offering their own portfolios of data management and data analysis software.

This week CRN is running the Big Data 100 list in a series of slide shows, organized by technology category, spotlighting vendors of business analytics software, database systems, data warehouse and data lake systems, data management and integration software, data observability tools, and big data systems and cloud platforms.

Some vendors market big data products that span multiple technology categories. They appear in the slide show for the technology segment in which they are most prominent.

Amazon Web Services

CEO: Adam Selipsky

AWS operates the cloud platform that is the big data system for many businesses, IT vendors and solution providers who rely on the cloud giant’s services to store and manage their data and run their operational and analytical big data cloud workloads.

AWS also offers its own extensive portfolio of big data services spanning databases (the Amazon Aurora and Amazon RDS managed relational databases and the Amazon DynamoDB managed NoSQL database); data analytics (Amazon Athena for SQL queries, Amazon Redshift data warehousing and Amazon Kinesis real-time streaming data and video analysis); and data management (AWS Glue, AWS Lake Formation and AWS Data Exchange).

At the AWS re:Invent conference in December the company launched AWS Data Exchange for Amazon S3, enabling data subscribers to access third-party data files from Amazon S3 buckets; and AWS Data Exchange for AWS Lake Formation, helping data subscribers to find and subscribe to third-party data sets that are managed directly through AWS Lake Formation.

Dell Technologies

CEO: Michael Dell

Dell provides the servers, storage systems and other IT infrastructure systems that many businesses and organizations rely on to run their operational and analytical big data workloads.

In January the company expanded its Dell PowerEdge server portfolio with 13 next-generation servers, incorporating 4th Gen Intel Xeon Scalable processors, for high-performance computing across data centers, public cloud and edge computing locations.

But the IT giant goes beyond just selling “big iron” for big data. It has developed “validated designs” for analytics that provide partners and customers with blueprints for big data systems including data lakehouses, real-time data streaming, and edge analytics for Industry 4.0. Many of the designs incorporate products from other vendors such as Confluent, DataStax and Splunk.

Google Cloud

CEO: Thomas Kurian

Like its rivals AWS and Microsoft Azure, the Google Cloud Platform (GCP) is the foundation for many big data initiatives for customers and partners.

Google Cloud’s big data services encompass the database arena including the company’s Cloud SQL relational database, Firebase NoSQL real-time database, and AlloyDB for PostgreSQL. In data analytics Google’s offerings include the BigQuery data warehouse, Dataflow streaming analytics, Analytics Hub for exchanging data analytics assets, and the Looker platform for business intelligence, data applications and embedded analytics.

In an example of how Google Cloud works with other big data tech developers, Google Cloud and data analytics software developer ThoughtSpot unveiled an expanded strategic partnership last month that included building ThoughtSpot’s software-as-a-service, AI-powered analytics to run natively on GCP.

Hewlett Packard Enterprise

President and CEO: Antonio Neri

Like Dell, IBM and other IT infrastructure providers, Hewlett Packard Enterprise offers a range of servers, data storage systems and other products that form the foundation for on-premises, private cloud and public cloud big data operations.

HPE GreenLake for Big Data, part of the company’s GreenLake edge-to-cloud offerings, is a complete workload system for the Hadoop lifecycle that includes hardware, software and services. The “stack” includes ProLiant DL or Apollo servers, Red Hat Linux, HPE Insight Cluster Management Utility, and Enterprise Cloudera or Hortonworks Hadoop, along with design, deployment and support services.

HPE also offers HPE AI and data transformation services. Earlier this month HPE unveiled new file, block and backup recovery data services, built on HPE Alletra Storage MP, for data-intensive workloads.

IBM

CEO: Arvind Krishna

Like Dell Technologies, Hewlett Packard Enterprise and other leading IT infrastructure providers, IBM markets mainframe computers, servers, data storage systems and other products that form the foundation for big data tasks either within a customer’s data center or in the cloud.

IBM also provides a comprehensive line of software at all levels of the big data system stack, including the Db2 database, IBM Business Analytics Enterprise (including the Cognos analytics tool, InfoSphere Information Server and Netezza Performance Server), and the IBM Cloud Pak for Data for developing a data fabric architecture.

In February IBM acquired StepZen, developer of a GraphQL server with a unique architecture for quickly developing GraphQL APIs. GraphQL is an emerging query language that helps organizations interact with data that is scattered across different data stores – such as data warehouses and data lakes – in on-premises and cloud systems.

Microsoft

CEO: Satya Nadella

Software giant Microsoft offers a broad portfolio of products that cover the big data stack from top to bottom.

Microsoft Azure, along with AWS and Google Cloud, is one of the leading go-to cloud platforms upon which many big data software developers run their products and many businesses assemble their big data operations.

The company offers Azure Big Data Services and Azure Analytics Services, packaged services on the Azure cloud platform, including Azure Synapse Analytics, Azure HDInsight, Azure Data Lake Analytics, Azure Data Factory, Azure Stream Analytics, Azure Analysis Services, Azure Data Explorer, Azure Data Share, Azure Event Hubs, and Azure Time Series Insights.

Microsoft’s SQL Server relational database is one of the industry’s foundational products for data management and the company’s Power BI is one of the most popular data visualization and business intelligence tools on the market.

NetApp

CEO: George Kurian

NetApp’s core business has traditionally been manufacturing data storage systems, and the company is a mainstay of the annual CRN Storage 100 with such products as its NetApp AFF C-Series flash storage system.

But with cloud computing and data increasingly scattered across multiple locations, the line between data storage and data management is blurring. NetApp today offers a growing portfolio of data management and monitoring software.

In June 2021 NetApp acquired Data Mechanics, which developed a managed platform for big data processing and cloud analytics.

Oracle

CEO: Safra Catz

Oracle’s flagship relational database system, currently available as the Oracle Database 19c long-term release and Oracle Database 21c “innovation release,” is at the core of many big data systems operated by businesses and organizations.

Just last month the software giant debuted a free version of its upcoming Oracle Database 23c, now in beta, for developers who build data-driven applications.

Oracle has a broad portfolio of cloud infrastructure and cloud application products that play in the big data realm including the Oracle Autonomous Database, NoSQL and MySQL databases, Oracle Analytics, Oracle Big Data Service, and data lake services on the Oracle Cloud Infrastructure (OCI).

Oracle is also working on MySQL HeatWave Lakehouse, a data lakehouse system that combines transaction processing, analytics and machine learning in a single MySQL database.

SAP

CEO: Christian Klein

SAP is best known for its enterprise software including ERP and CRM applications. But the Walldorf, Germany-based company is also a major player in the big data software space with tools for data management and analytics.

Much of what SAP offers in big data tech today is built into the SAP Business Technology Platform (SAP BTP), a system that pulls together data analytics, AI, application development, application integration and automation in a unified environment.

Specific products within SAP BTP include SAP Analytics Cloud, SAP Master Data Governance, and – just launched in March – the SAP Datasphere unified service for data integration, data cataloging, data warehousing, semantic modeling and virtualizing workloads across SAP and non-SAP data.

Snowflake

CEO: Frank Slootman

Since its official launch in 2014, Snowflake has grown beyond its initial focus on cloud data warehousing services to become a comprehensive data cloud platform for storing and managing big data, operating data warehouses and data lakes, performing data analytics and machine learning tasks, and even running data-intensive operational applications.

Over the last year the company has debuted data clouds for specific industries such as healthcare and life sciences, financial services and retail/CPG. In February it added the Telecom Data Cloud to that lineup and, just this month, a manufacturing industry offering.

Snowflake also has aggressively recruited ISVs to develop their applications for the Snowflake data cloud using the company’s Snowpark developer framework. The company also has entered the data brokerage business with its Snowflake Marketplace where third-party vendors can sell data and applications.