Cloudera Machine Learning
Cloudera Machine Learning (CML) is a web application whose UI simplifies managing workloads and resources for data scientists. It gives users a convenient way to deploy and scale their analytical pipelines and to collaborate with colleagues in a secure, compartmentalized environment.
Step 1: Provision a new workspace by clicking the ‘Provision Workspace’ button on the CML Workspaces page. Provide a workspace name and select the environment. See the Cloudera Machine Learning documentation for detailed options.
Click the ‘Provision Workspace’ button to proceed.
Step 2: Click the Workspace Name of the newly provisioned workspace to begin using it. Then provide the Kerberos authentication details for the user by navigating to ‘User Settings > Hadoop Authentication’.
Step 3: Deploy the ‘Churn Modeling with scikit-learn’ AMP to our workspace by clicking on the AMP and then clicking the ‘Configure Project’ button.
Applied ML Prototypes (AMPs) provide reference example machine learning projects in Cloudera Machine Learning. More than simplified quickstarts or tutorials, AMPs are fully developed expert solutions created by Cloudera’s research arm, Fast Forward Labs.
Note: You can also view the code and project details by clicking the ‘View on GitHub’ link.
Step 4: The next page allows you to modify the environment and runtime variables. We’re going to leave everything with the default values except for enabling Spark.
Click the ‘Launch Project’ button to proceed.
CML will run a series of steps to ingest the data and to train and deploy the model.
Using the OneFS File System Explorer, we can see that the training dataset was uploaded to PowerScale OneFS.
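To make the ingest, train, and deploy stages more concrete, here is a minimal, dependency-free Python sketch of the same three stages. It is not the AMP's code. The AMP uses scikit-learn, whereas this sketch swaps in a hand-written logistic regression so it can run anywhere. The inline sample data, the column names (`tenure_months`, `monthly_charges`, `churned`), and the function names are all assumptions made for illustration.

```python
import csv
import io
import math

# Hypothetical inline sample standing in for the churn dataset; the column
# names and values are illustrative and are not taken from the AMP.
RAW_CSV = """tenure_months,monthly_charges,churned
1,90,1
2,85,1
5,95,1
8,80,1
40,30,0
50,45,0
60,25,0
70,50,0
"""


def scale(tenure_months, monthly_charges):
    # Divide by rough maxima so both features sit on a comparable 0..1 scale.
    return [tenure_months / 72.0, monthly_charges / 100.0]


def ingest(raw_csv):
    """Ingest stage: parse CSV text into scaled feature rows and 0/1 labels."""
    features, labels = [], []
    for row in csv.DictReader(io.StringIO(raw_csv)):
        features.append(
            scale(float(row["tenure_months"]), float(row["monthly_charges"]))
        )
        labels.append(int(row["churned"]))
    return features, labels


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(features, labels, learning_rate=0.5, epochs=2000):
    """Train stage: fit logistic regression with batch gradient descent."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for i, xi in enumerate(x):
                grad_w[i] += error * xi
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


def make_predictor(weights, bias):
    """Deploy stage: return a function that plays the role of a model endpoint."""
    def predict(tenure_months, monthly_charges):
        x = scale(tenure_months, monthly_charges)
        probability = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
        return {"churn_probability": probability, "will_churn": probability >= 0.5}
    return predict


features, labels = ingest(RAW_CSV)
weights, bias = train(features, labels)
predict = make_predictor(weights, bias)

# A short-tenure, high-charge customer versus a long-tenure, low-charge one.
print(predict(3, 90))
print(predict(65, 30))
```

In this toy data, short tenure combined with high monthly charges is associated with churn, so the first call should report a higher churn probability than the second. In the real project the equivalent stages run as CML jobs and a deployed model rather than as in-process function calls.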
Step 5: Once the project is deployed, open the application either by clicking ‘Open’ in Step 7 of the AMP deployment or by clicking the ‘Applications’ link in the left navigation bar.
You can drill into a line item and dynamically update the data.