The considerations and requirements for data management are constantly evolving. There are new realities for managing data and data-centric workloads across the enterprise in a unified and comprehensive manner:
- Use cases were previously focused on efficiently storing and processing data in batch processes. Now there are increasing needs for integrating the entire data life cycle and for processing in both real time and batch.
- Technology infrastructure formerly demanded the co-location of compute and storage to avoid costly network transfers. Now the needs of high-performance analytics drive a move toward disaggregated compute and storage, where each can be sized and scaled independently.
- From a user experience viewpoint, it used to be acceptable to deploy and run in timeframes of weeks, months, or even quarters. Now the expectation is to be able to spin up services in minutes, give users their own clusters, and get insights quickly.
- From the privacy, security, and governance perspectives, the primary concerns were formerly about network perimeter and physical access controls. Now, with the entire data life cycle being managed, operators need fine-grained authentication and authorization at the workload and data layers.
CDP Private Cloud Base unifies Cloudera Distribution for Apache Hadoop (CDH) and Hortonworks Data Platform (HDP). It combines the best technologies from Cloudera and Hortonworks, with new features and enhancements across the stack, to form a comprehensive data platform that encompasses the entire data life cycle. This unified distribution is a scalable and customizable platform where you can securely run many types of data analytics workloads.
CDP Private Cloud Base can be a stand-alone data analytics platform. It also supports a hybrid or multicluster solution, where compute tasks can be separated from data storage, and where data can be accessed from remote clusters. In this case, the CDP Private Cloud Base cluster is deployed alongside CDP Private Cloud Experiences, a separate computing cluster running on Red Hat OpenShift Container Platform. This approach provides a foundation for containerized applications by managing storage, table schema, authentication, authorization, and governance in CDP Private Cloud Base. CDP Private Cloud Base consists of various components such as Apache HDFS, Apache Hive 3, Apache HBase, and Apache Impala, along with many other components for specialized workloads. You can select any combination of these services to create clusters that address your business requirements and workloads.
CDP Private Cloud Base contains several preconfigured packages of data services, or shapes, for common workloads, including:
- Data Engineering to ingest, transform, and analyze data.
- Data Mart to browse, query, and explore your data in an interactive way.
- Operational Database for low-latency writes, reads, and persistent access to data for online transaction processing (OLTP) use cases.
- Data Flow and Streaming with Kafka.
- The ability to create your own services.
In summary, CDP Private Cloud Base is a stand-alone instance of Cloudera Data Platform. It can also be deployed with the CDP Private Cloud Experiences cluster to form the complete CDP Private Cloud.