For the final step in the DBaaS platform deployment in the lab, we deployed the July 2021 release of Azure Arc-enabled data services. You can use Azure Arc-enabled data services to run Azure Arc-enabled SQL Managed Instance and Azure Arc-enabled PostgreSQL Hyperscale on any CNCF-compliant Kubernetes cluster that is running on-premises, at the edge, or in other hyperscale environments.
For the DBaaS solution architecture, we installed Azure Arc-enabled data services and created Azure Arc-enabled SQL Managed Instances and SQL databases on the aks-lab-workloads-1 cluster, which ran on AKS-HCI as depicted in the following figure.
Figure 14. Azure Arc data services architecture in the lab
The Azure Arc data controller is the orchestrator in the Azure Arc-enabled data services architecture. The data controller is a collection of Kubernetes resources that provide services for provisioning, elasticity, recoverability, monitoring, and high availability. The bootstrapper pod plays an important role in the creation of new Azure Arc-enabled SQL Managed Instances. The bootstrapper communicates with the Kubernetes API to instruct AKS-HCI to create Kubernetes resources such as pods, services, and persistent volumes. In short, the data controller uses automation to shield IT staff and developers from the complexities of Kubernetes.
Two types of connectivity modes are available for the data controller:
The first step in deploying the data controller was to install the client tools related to Azure data services like Azure Data Studio and the Azure Data Studio extensions. Other tools were already installed when deploying and configuring AKS-HCI. All the specific software versions are noted in the Management and Operations section.
The next step would be to connect the AKS-HCI cluster to Azure using Azure Arc-enabled Kubernetes, but we already completed that when we deployed the aks-lab-workloads-1 cluster.
We also registered the Microsoft.AzureArcData resource provider and created a service principal (SPN). Then, we assigned the service principal to the appropriate role (Monitoring Metrics Publisher) for uploading usage data, metrics, and logs to Azure Monitor. We ran the following Azure CLI commands to create the SPN and assign the role.
Note: For the test environment, we scoped the service principal at the subscription level. The documentation provides commands that scope the SPN to the resource group level.
az ad sp create-for-rbac --name azure-arc-metrics --role Contributor --scopes /subscriptions/<subscription ID>
az role assignment create --assignee <application ID> --role 'Monitoring Metrics Publisher' --scope /subscriptions/<subscription ID>
The first command produced the following output. These values were used during the deployment of the data controller:
{
"appId": "<application ID>",
"displayName": "azure-arc-metrics",
"name": "http://azure-arc-metrics",
"password": "<password>",
"tenant": "<Azure Active Directory tenant ID>"
}
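The values in this output map onto the SPN_* environment variables that the upload commands rely on. As a minimal sketch, assuming the JSON output has been saved to a file and that a POSIX shell is in use (the paper itself sets these variables with PowerShell `$ENV:` syntax), the fields could be extracted as follows. The `spn_file` variable, the `json_field` helper, and the sample values are hypothetical and exist only to make the sketch self-contained.

```shell
# Hypothetical stand-in for the saved JSON output of `az ad sp create-for-rbac`.
# The IDs and password below are placeholders, not real credentials.
spn_file="$(mktemp)"
cat > "$spn_file" <<'EOF'
{
  "appId": "00000000-0000-0000-0000-000000000001",
  "displayName": "azure-arc-metrics",
  "name": "http://azure-arc-metrics",
  "password": "example-password",
  "tenant": "00000000-0000-0000-0000-000000000002"
}
EOF

# json_field KEY FILE: print the string value of a top-level key.
# Assumes one "key": "value" pair per line, as in the output shown above.
json_field() {
  sed -n "s/^[[:space:]]*\"$1\":[[:space:]]*\"\([^\"]*\)\".*/\1/p" "$2"
}

# Map the output fields to the variables used for uploads.
export SPN_AUTHORITY='https://login.microsoftonline.com'
export SPN_CLIENT_ID="$(json_field appId "$spn_file")"
export SPN_CLIENT_SECRET="$(json_field password "$spn_file")"
export SPN_TENANT_ID="$(json_field tenant "$spn_file")"

# Remove the file so the secret does not linger on disk.
rm -f "$spn_file"

# Print only the non-secret values as a sanity check.
echo "client=$SPN_CLIENT_ID tenant=$SPN_TENANT_ID"
```

Because the password is shown only once when the SPN is created, capturing the output at creation time avoids having to reset the credential later.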
The data controller can be deployed in indirect mode using Azure Data Studio, Azure CLI, or Kubernetes tools. In the lab, we used the Azure portal to create a Jupyter Notebook that we downloaded and ran in Azure Data Studio against the aks-lab-workloads-1 cluster. When we performed the testing, we used the insiders build of Azure Data Studio.
Note: Currently, you can deploy only one data controller per Kubernetes workload cluster. Future releases of Azure Arc-enabled data services will provide the opportunity for more than one data controller in a workload cluster.
We browsed to the Azure Arc data controller blade in the Azure portal and chose to deploy using indirect connectivity mode. Then we opened a link to the Jupyter Notebook in Azure Data Studio. The following figures show the workflow in the Azure portal.
Figure 15. Data controller details in Azure portal
Figure 16. Open link in Azure Data Studio for data controller deployment
Note: Before attempting to deploy the data controller using Azure Data Studio, we created a custom storage class. To perform this operation, we used the Create on AKS on Azure Stack HCI article in the Microsoft documentation.
After we clicked the Open link in Azure Data Studio button, Azure Data Studio (ADS) opened on our local machine to a workflow for deploying the data controller in indirect mode. Figure 17 shows the final step in the workflow, which summarizes our configuration choices. The notebook ran after we clicked Deploy.
Figure 17. Final step in data controller deployment in ADS
The data controller did not appear in the Azure Arc blade of the Azure portal until we uploaded the usage data for the first time. We also uploaded the metrics and log data to Azure using the following commands:
$ENV:SPN_AUTHORITY='https://login.microsoftonline.com'
$ENV:WORKSPACE_ID='<workspace ID>'
$ENV:WORKSPACE_SHARED_KEY='<workspace shared key>'
$ENV:SPN_TENANT_ID='<AAD tenant ID>'
$ENV:SPN_CLIENT_ID='<application ID>'
$ENV:SPN_CLIENT_SECRET='<password>'
az arcdata dc export --type usage --path usage.json --k8s-namespace arc --use-k8s --force
az arcdata dc upload --path usage.json
az arcdata dc export --type metrics --path metrics.json --k8s-namespace arc --use-k8s --force
az arcdata dc upload --path metrics.json
az arcdata dc export --type logs --path logs.json --k8s-namespace arc --use-k8s --force
az arcdata dc upload --path logs.json
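The three export and upload pairs above differ only in the data type, so they lend themselves to a loop. The following is a minimal sketch, assuming a POSIX shell rather than the PowerShell session used for the environment variables; `arc_upload_commands` is a hypothetical helper name, and the function only prints the commands so that they can be reviewed before anything is run.

```shell
# Print the export/upload command pair for each data type.
# arc_upload_commands is a hypothetical helper; the namespace defaults to
# "arc", the value used in the commands above.
arc_upload_commands() {
  ns="${1:-arc}"
  for kind in usage metrics logs; do
    echo "az arcdata dc export --type $kind --path $kind.json --k8s-namespace $ns --use-k8s --force"
    echo "az arcdata dc upload --path $kind.json"
  done
}

# Review the generated commands first.
arc_upload_commands arc

# To execute them, pipe the output to a shell:
# arc_upload_commands arc | sh
```

Keeping usage first in the loop preserves the order used in the lab, where the first usage upload is what made the data controller visible in the Azure portal.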
Figure 18 shows how the data controller appears in the Azure portal after the first usage data upload.
Figure 18. Data controller appearing in Azure portal
The following table lists the primary tools that we used for deployment, life cycle management, and monitoring of Azure Arc data services.
Table 6. Tools used at the Azure Arc data services layer
Tool | Purpose | Version |
Azure Data Studio (including required extensions) | Cross-platform database management tool for hybrid cloud environments. The following extensions were also installed: Azure Arc (v0.9.6) and Azure CLI (v0.1.0). | Insiders build (1.32.0) |
Azure CLI (including required extensions) | Used to interact with Azure Arc data services resources. The arcdata extension v1.0.0 was also installed. | 2.27.0 |
kubectl | Used for a wide variety of management and maintenance activities in Kubernetes. It also has commands for data services management. | 1.21.2 |
Azure portal | Used to produce a notebook for creating the data controller through Azure Data Studio. | N/A |
Grafana | Used to visualize metrics, logs, and other data in a wide variety of formats. | Version deployed with data controller |
Kibana | Elasticsearch, Fluent Bit, and Kibana are used for SQL-MI logging. | Version deployed with data controller |
Depending on the testing scenario, we used different tools to create the Azure Arc-enabled SQL Managed Instances – Azure Data Studio, Azure CLI, and kubectl. Software developers and other consumers of the platform could use any of these tools to accomplish self-service provisioning depending on their use case. Two service tiers are available for Azure Arc-enabled SQL Managed Instances:
In the Management and operations of the AKS-HCI section of this white paper, we discussed using Azure Monitor Container Insights and open-source tools like Grafana to monitor performance at the workload cluster level. The data controller was also deployed with its own instance of Grafana for analyzing performance data within the managed instances, and with its own instance of Kibana for viewing logging data. The following figure shows where we found the URLs to Grafana and Kibana in Azure Data Studio.
Figure 19. Azure Arc-enabled SQL Managed Instance monitoring URLs
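The dashboard URLs can probably also be retrieved from the command line. The sketch below assumes the arcdata extension's `az arcdata dc endpoint list` command and the `arc` namespace used earlier. We did not use this command in the lab, so treat the exact output, including whether it lists the Grafana and Kibana endpoints in your release, as something to verify. The `run` wrapper and the `DRY_RUN` variable are hypothetical conveniences that print the command instead of executing it by default.

```shell
# run CMD...: print the command when DRY_RUN is 1 (the default here),
# otherwise execute it. DRY_RUN is a hypothetical switch for this sketch.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# List the data controller endpoints in the arc namespace.
run az arcdata dc endpoint list --k8s-namespace arc --use-k8s
```

Setting `DRY_RUN=0` before calling `run` executes the command against the cluster that the current kubeconfig context points to.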