Cloudera Data Engineering in CDP Private Cloud 1.3 uses containers to run on a private cloud
Cloudera launched an edition of its data processing and management platform that takes advantage of containers to run on a private cloud.
Santiago Giraldo, senior director of product marketing for data services at Cloudera, says Cloudera Data Engineering (CDE) in CDP Private Cloud 1.3 is designed to be deployed in on-premises IT environments. The company also announced that CDE version 1.3 can be deployed on both the Kubernetes-based Red Hat OpenShift platform and Cloudera's own Embedded Container Service (ECS).
Cloudera launched a public cloud edition of CDE in 2020 that made it possible to deploy the Apache Spark framework for analyzing data on top of Kubernetes clusters. IT teams can now build applications on either the public or the private cloud edition and then run them across a hybrid cloud computing environment by first shifting workloads and then using tools such as Cloudera Replication Manager to migrate data when required, Giraldo notes.
IT teams can also burst workloads from an on-premises IT environment to a public cloud whenever additional capacity is needed, says Giraldo. Regardless of approach, CDE in CDP Private Cloud makes it easier to isolate noisy, data-intensive workloads so that service level agreements (SLAs) are met, he adds.
Cloudera's Shared Data Experience (SDX) layer also makes it possible to apply the same security and governance model to any instance of CDE, he notes. IT teams can also programmatically deploy complex pipelines with job dependencies using Apache Airflow, open source software that models pipelines as directed acyclic graphs (DAGs), which makes it possible to visualize and monitor pipelines running in production environments.
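To make the idea of job dependencies in a DAG concrete, here is a minimal sketch in plain Python. It does not use Airflow or any Cloudera API; the job names and the `execution_stages` helper are hypothetical, and the ordering relies only on the standard library's `graphlib` module (Python 3.9 or later). It shows how a scheduler could derive which jobs may run together once their upstream jobs have finished.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline (names are illustrative only):
# each job maps to the set of jobs it depends on.
pipeline = {
    "ingest_raw": set(),
    "clean": {"ingest_raw"},
    "enrich": {"ingest_raw"},
    "aggregate": {"clean", "enrich"},
    "publish_report": {"aggregate"},
}


def execution_stages(dag):
    """Group jobs into stages; every job in a stage has all dependencies met."""
    sorter = TopologicalSorter(dag)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # jobs whose dependencies are done
        stages.append(ready)
        sorter.done(*ready)  # mark the stage finished to unlock downstream jobs
    return stages


stages = execution_stages(pipeline)
for number, jobs in enumerate(stages, start=1):
    print(f"stage {number}: {', '.join(jobs)}")
```

In this sketch, `clean` and `enrich` land in the same stage because both depend only on `ingest_raw`, which is the kind of parallelism a DAG-based orchestrator can exploit. A dependency loop would be rejected before anything runs.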
Cloudera's application programming interfaces (APIs) and command-line interfaces (CLIs) make it simpler to integrate the platform into DevOps workflows. Workloads can scale independently of compute and storage and, thanks to Kubernetes, can be managed via the same control plane, says Giraldo. Workloads are deployed as containers in virtual clusters that connect to a storage cluster dubbed CDP Base. Collectively, those capabilities make it easier to add workloads to a multitenant environment without adversely affecting the performance of the applications already running, adds Giraldo.
As Kubernetes becomes more widely employed, it’s becoming clearer that the management of compute, data and applications will converge around a common control plane. It may still require a team of IT professionals to manage those functions but the days when IT teams needed to deploy a separate control plane for each function are coming to an end. That control plane also provides the mechanism to unify DevOps and data operations as the number of stateful applications deployed on Kubernetes clusters continues to steadily increase, notes Giraldo.
It’s not quite clear whether 2022 will see Kubernetes cross the proverbial enterprise IT chasm. The question now, however, is not whether but how long it will take for Kubernetes to be more pervasively deployed in enterprise IT environments. As that transition occurs, expect to see a massive change in the way enterprise IT environments are managed.