Monday, June 21, 2021

Updates to Azure Arc-enabled Machine Learning


The Azure Machine Learning (AML) team is excited to announce the public preview release of Azure Arc-enabled Machine Learning (ML). All customers of Azure Arc-enabled Kubernetes can now deploy the AzureML extension and bring AML to on-premises environments and the edge, using Kubernetes on their hardware of choice.


Azure Arc-enabled ML is designed so that IT Operators can use native Kubernetes concepts such as namespaces, node selectors, and resource requests/limits to control and optimize ML compute utilization. Because the IT Operator manages the ML compute setup, data scientists get a seamless AML experience and do not need to learn or use Kubernetes directly. They can focus on models and work with tools such as AML Studio, the AML 2.0 CLI, and the AML Python SDK, productivity tools like Jupyter notebooks, and ML frameworks like TensorFlow and PyTorch.


IT Operator experience – ML compute setup

Once the Kubernetes cluster is up and running, the IT Operator can follow the three steps below to prepare the cluster for AML workloads:

  • Connect the Kubernetes cluster to Azure via Azure Arc
  • Deploy the AzureML extension to the Azure Arc-enabled cluster
  • Create a compute target for Data Scientists to use

The first two steps take one CLI command each.
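As a minimal sketch of the first two steps, assuming the `connectedk8s` and `k8s-extension` Azure CLI extensions are installed and you are already logged in: the cluster, resource group, and extension names below are hypothetical placeholders, and the `enableTraining=True` configuration setting is an assumption based on the preview documentation, so verify it against the current docs.

```shell
# Hypothetical placeholder names; replace with your own.
CLUSTER_NAME="my-arc-cluster"
RESOURCE_GROUP="my-resource-group"
EXTENSION_NAME="amlarc-compute"

# Step 1: connect the Kubernetes cluster to Azure via Azure Arc.
connect_cluster() {
  az connectedk8s connect \
    --name "$CLUSTER_NAME" \
    --resource-group "$RESOURCE_GROUP"
}

# Step 2: deploy the AzureML extension to the Arc-enabled cluster.
# The configuration setting shown is an assumption, not confirmed by this post.
deploy_azureml_extension() {
  az k8s-extension create \
    --name "$EXTENSION_NAME" \
    --extension-type Microsoft.AzureML.Kubernetes \
    --cluster-type connectedClusters \
    --cluster-name "$CLUSTER_NAME" \
    --resource-group "$RESOURCE_GROUP" \
    --scope cluster \
    --configuration-settings enableTraining=True
}
```

Calling `connect_cluster` and then `deploy_azureml_extension` runs the two steps in order.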



Once the AzureML extension installation completes, the AzureML pods will be running inside the cluster.
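One way to check for those pods is a quick `kubectl` query. This is only a sketch: it searches every namespace and filters by a name pattern, and the default pattern `azureml` is an assumption about how the extension names its pods and namespace.

```shell
# List pods in every namespace and keep only the lines matching a pattern.
# The default pattern "azureml" is an assumption about the pod naming.
list_azureml_pods() {
  pattern="${1:-azureml}"
  kubectl get pods --all-namespaces | grep -i -- "$pattern"
}
```

Run `list_azureml_pods` with no arguments, or pass a different pattern if your pods are named differently.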



With your cluster ready to take AML workloads, you can now head over to the AML Studio portal and create a compute target for Data Scientists to use through the compute attach UI.



Note that after clicking “New->Kubernetes (preview)”, the Azure Arc-enabled Kubernetes clusters automatically appear in a dropdown list for the IT Operator to attach. During the attach operation in the Studio UI, the IT Operator can provide an optional JSON configuration file specifying the namespace, node selector, and resource requests/limits to be used for the compute target being created. With these advanced settings, the IT Operator can help Data Scientists target a subset of nodes, such as a GPU pool or a CPU pool, for a training job, which improves compute resource utilization and avoids fragmentation. For more information about creating compute targets using these custom properties, please refer to the AML documentation.
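To make the idea concrete, here is an illustrative sketch of what such a configuration file could contain. The field names and values are hypothetical, modeled on the equivalent Kubernetes concepts rather than on the documented AML schema, so consult the AML documentation for the exact format.

```shell
# Write an illustrative compute-target configuration file.
# All field names and values below are hypothetical and are NOT the
# documented AML schema; they only mirror the Kubernetes concepts named above.
cat > compute-config.json <<'EOF'
{
  "namespace": "ml-training",
  "nodeSelector": { "pool": "gpu" },
  "resources": {
    "requests": { "cpu": "4", "memory": "16Gi" },
    "limits": { "cpu": "4", "memory": "16Gi" }
  }
}
EOF
```

Here the namespace isolates the ML workload, the node selector pins jobs to a labeled GPU pool, and matching requests and limits reserve a fixed share of each node.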


In an upcoming Azure Arc-enabled ML update release, we plan to support compute target creation through a CLI command as well, which will simplify the ML compute setup experience to three CLI commands.
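As a hedged sketch of what the third command, compute target creation, might look like: later releases of the `ml` CLI extension expose `az ml compute attach`, and the flags below follow that command. Whether the planned update matches it exactly is an assumption, and every name and ID here is a hypothetical placeholder.

```shell
# Hypothetical placeholder values; replace with your own.
SUBSCRIPTION_ID="00000000-0000-0000-0000-000000000000"
RESOURCE_GROUP="my-resource-group"
WORKSPACE_NAME="my-aml-workspace"
CLUSTER_NAME="my-arc-cluster"
COMPUTE_NAME="arc-compute"
NAMESPACE="ml-training"

# Azure resource ID of the Arc-enabled (connected) cluster.
CLUSTER_RESOURCE_ID="/subscriptions/$SUBSCRIPTION_ID/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.Kubernetes/connectedClusters/$CLUSTER_NAME"

# Attach the Arc-enabled cluster to the AML workspace as a compute target.
# Assumes the flag set of `az ml compute attach` from later CLI releases.
attach_compute() {
  az ml compute attach \
    --resource-group "$RESOURCE_GROUP" \
    --workspace-name "$WORKSPACE_NAME" \
    --type Kubernetes \
    --name "$COMPUTE_NAME" \
    --resource-id "$CLUSTER_RESOURCE_ID" \
    --namespace "$NAMESPACE"
}
```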



Note that when connecting the Kubernetes cluster to Azure via Azure Arc, the IT Operator can also specify configuration settings to enable an outbound proxy server. We are pleased to announce that Azure Arc-enabled Machine Learning fully supports on-premises model training over an outbound proxy server connection.
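A minimal sketch of connecting through a proxy, assuming the proxy flags of `az connectedk8s connect`. The proxy URLs, skip range, and resource names are hypothetical placeholders.

```shell
# Hypothetical placeholder values; replace with your own.
CLUSTER_NAME="my-arc-cluster"
RESOURCE_GROUP="my-resource-group"
HTTPS_PROXY_URL="https://proxy.example.com:3128"
HTTP_PROXY_URL="http://proxy.example.com:3128"
NO_PROXY_RANGE="10.0.0.0/16,kubernetes.default.svc"

# Connect the cluster to Azure Arc, routing outbound traffic via the proxy
# and bypassing it for the in-cluster address range.
connect_cluster_with_proxy() {
  az connectedk8s connect \
    --name "$CLUSTER_NAME" \
    --resource-group "$RESOURCE_GROUP" \
    --proxy-https "$HTTPS_PROXY_URL" \
    --proxy-http "$HTTP_PROXY_URL" \
    --proxy-skip-range "$NO_PROXY_RANGE"
}
```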


Data Scientist experience – train models

Once the attached Kubernetes compute target is available, the Data Scientist can find it in the compute section of the AML Studio UI. The Data Scientist chooses a compute target suited to the training job, such as a GPU or CPU compute target whose resource requests (for example, the number of vCPU cores and the amount of memory) fit the workload. Jobs can be submitted through either the AML 2.0 CLI or the AML Python SDK; in both cases the Data Scientist specifies the compute target name at job submission time. Azure Arc-enabled ML supports AML's built-in training features seamlessly.

For Data Science professionals who have used the AML Python SDK, existing SDK examples, notebooks, and projects will work out of the box with a simple change of the compute target name in the Python script. If you are not yet familiar with the Python SDK, the AML Python SDK examples and notebooks are a good place to get started.


The AML team is extremely excited that Azure Arc-enabled ML supports training job submission with the new AML 2.0 CLI, which is also in public preview. Training a model with the AML 2.0 CLI takes a single CLI command.
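A minimal sketch of that job submission, assuming the `ml` Azure CLI extension is installed. The job file, resource group, and workspace names are hypothetical placeholders.

```shell
# Hypothetical placeholder values; replace with your own.
JOB_FILE="job.yml"
RESOURCE_GROUP="my-resource-group"
WORKSPACE_NAME="my-aml-workspace"

# Submit the training job described in the YAML file.
# --web opens the job's page in AML Studio once it is submitted.
submit_job() {
  az ml job create \
    --file "$JOB_FILE" \
    --resource-group "$RESOURCE_GROUP" \
    --workspace-name "$WORKSPACE_NAME" \
    --web
}
```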



The job itself is defined in a YAML file.
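As an illustration only, a command job file could look like the sketch below. The field layout follows later AML CLI v2 schemas and may differ from the preview release described in this post; the script, folder, environment, and compute target names are all hypothetical.

```shell
# Write an illustrative job definition.
# Field layout is an assumption based on later CLI v2 schemas; the training
# script, source folder, environment, and compute names are hypothetical.
cat > job.yml <<'EOF'
command: python train.py
code: src
environment: azureml:my-training-env:1
compute: azureml:arc-compute
EOF
```

The `compute` entry is where the Data Scientist points the job at the Kubernetes compute target that the IT Operator attached.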



Note that the job YAML file specifies all the resources and assets the training job needs, including the training scripts and the compute target. In this case, the Data Scientist is using the Azure Arc-enabled compute target created earlier by the IT Operator. Running the job creation CLI command submits the job to the Azure Arc-enabled Kubernetes cluster and opens the AML Studio UI portal, where the Data Scientist can monitor the job's running status, analyze metrics, and examine logs. Please refer to Train models with the 2.0 CLI for more information and examples.



Get started today

In this post, we provided a status update on Azure Arc-enabled Machine Learning and showed how an IT Operator can easily set up and prepare an Azure Arc-enabled Kubernetes cluster for AML workloads, and how a Data Scientist can easily train models with the AML 2.0 CLI and a Kubernetes compute target.


To get started with the Azure Arc-enabled ML training public preview, visit the Azure Arc-enabled ML Training Public Preview GitHub repository, where you can find detailed documentation for IT Operators and Data Scientists, along with examples that are easy to try out. In addition, visit the official AML documentation for more information.


Azure Arc-enabled ML also supports model training with an interactive job experience and debugging, which is in private preview. Please sign up here for the interactive job private preview.


After a Data Scientist trains a model, an ML Engineer or model deployment professional can deploy it with Azure Arc-enabled ML on the same Arc-enabled Kubernetes cluster. This capability is also in private preview. Please sign up here for the inference private preview.







