by Lucas Santos and Aaron Wislang
Kubernetes has become a platform of choice for building cloud native applications. Kubernetes is highly scalable, highly available, and easy to use, and has many other advantages that make it an excellent choice for building distributed applications.
However, its distributed nature means that monitoring everything happening within the cluster can be a challenge. Prometheus and Grafana make this easier by collecting metrics from the cluster and presenting them in dashboards.
In this article, we will set up a Kubernetes cluster using Azure Kubernetes Service (AKS) and deploy Prometheus and Grafana to gather monitoring data and visualize it.
Understand the tooling
Prometheus is an open source project that was originally created at SoundCloud in 2012, and contributed to the Cloud Native Computing Foundation (CNCF) in 2016 as the second open source software project after Kubernetes itself.
Prometheus collects and stores metrics from various sources and exposes them to the user in a way that is easy to understand and consume. It’s a tool that can monitor the health of your cluster, the performance of your applications, and the availability of your services.
Prometheus uses an exporter architecture. Exporters are APIs that may collect or receive raw metrics from a service and expose them in a specific format that Prometheus consumes.
Once Prometheus discovers a new exporter (or if you configure one), it will start collecting metrics from these services and store them in persistent storage.
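To make the exporter idea more concrete, here is a small sketch of the text format an exporter serves and how a value can be read out of it. The metric name, labels, and numbers are invented for illustration, and the parsing assumes samples without a trailing timestamp and without spaces inside label values.

```shell
# Illustrative sample of the Prometheus text exposition format.
# The metric and its values are made up for this example.
sample_metrics() {
  cat <<'EOF'
# HELP http_requests_total Total number of HTTP requests.
# TYPE http_requests_total counter
http_requests_total{method="get",code="200"} 1027
http_requests_total{method="post",code="200"} 3
EOF
}

# Sum every sample of the given metric, ignoring comment lines.
sum_metric() {
  awk -v name="$1" '
    /^#/ { next }
    {
      metric = $1
      sub(/[{].*/, "", metric)          # strip the label set
      if (metric == name) total += $NF  # last field is the sample value
    }
    END { print total + 0 }
  '
}

sample_metrics | sum_metric http_requests_total   # prints 1030
```

Prometheus does this kind of scraping for you on a schedule; the sketch only shows what the raw data looks like before it is stored.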
Grafana is a web application that is used to visualize the metrics that Prometheus collects. It does not produce any metrics itself; instead, it queries them from Prometheus and displays them in a way that’s easy to understand through plots, charts and dashboards.
Prerequisites
We will be creating a Kubernetes cluster using Azure Kubernetes Service (AKS), so you will need an Azure account, the Azure CLI, kubectl and Helm.
You can install the latest versions of kubectl and Helm using the Azure CLI, or install them manually if you prefer.
Install the CLI tools on your local machine, since you will need to forward a local port to access both the Prometheus and Grafana web interfaces.
Create an Azure Kubernetes Service (AKS) Cluster
Sign into the Azure CLI by running the login command.
az login
Install kubectl.
az aks install-cli
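Install helm. Note that the az acr helm command group is deprecated in recent Azure CLI releases; if the command below is not available in your version, install Helm manually instead.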
az acr helm install-cli
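Before going further, it can help to confirm that all three tools are actually on your PATH. The following is a minimal sketch; the check_tools helper is our own name, not part of any of these CLIs.

```shell
# Report any required CLI tool that is not on the PATH.
# Returns non-zero if at least one tool is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# The three tools this article relies on.
check_tools az kubectl helm || echo "Install the missing tools before continuing."
```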
Create two bash/zsh variables which we will use in subsequent commands. You may change the syntax below if you are using another shell.
RESOURCE_GROUP=aks-prometheus
AKS_NAME=aks1
Create a resource group. We have chosen to create this in the eastus Azure region.
az group create --name $RESOURCE_GROUP --location eastus
Create a new AKS cluster using the az aks create command. Here we create a three-node cluster using the B-series burstable VM type, which is cost-effective and suitable for small test/dev workloads such as this.
az aks create --resource-group $RESOURCE_GROUP \
--name $AKS_NAME \
--node-count 3 \
--node-vm-size Standard_B2s \
--generate-ssh-keys
This may take a few minutes to complete.
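The az aks create command normally blocks until the cluster is ready. If you ran it with --no-wait, or want to check progress from another terminal, a polling loop along these lines may help. This is a sketch: the helper names, the 10-second interval, and the attempt limit are our own arbitrary choices.

```shell
# Print the provisioning state of an AKS cluster (for example "Succeeded").
cluster_state() {
  az aks show \
    --resource-group "$1" \
    --name "$2" \
    --query provisioningState \
    --output tsv
}

# Poll until the cluster reports "Succeeded", or give up after N attempts.
# POLL_INTERVAL (seconds) and the default of 60 attempts are arbitrary.
wait_for_cluster() {
  attempts="${3:-60}"
  while [ "$attempts" -gt 0 ]; do
    [ "$(cluster_state "$1" "$2")" = "Succeeded" ] && return 0
    attempts=$((attempts - 1))
    sleep "${POLL_INTERVAL:-10}"
  done
  return 1
}
```

Call it as wait_for_cluster "$RESOURCE_GROUP" "$AKS_NAME" once the variables above are set.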
Authenticate to the cluster we have just created.
az aks get-credentials \
--resource-group $RESOURCE_GROUP \
--name $AKS_NAME
We can now access our Kubernetes cluster with kubectl. Use kubectl to see the nodes we have just created.
kubectl get nodes
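If you would rather check node readiness in a script than read the table by eye, something like the following sketch could work. It assumes the default column layout of kubectl get nodes, where the second column is STATUS; the helper names are ours.

```shell
# Count nodes whose STATUS column is exactly "Ready".
count_ready_nodes() {
  kubectl get nodes --no-headers |
    awk '$2 == "Ready" { n++ } END { print n + 0 }'
}

# Succeed only if the expected number of nodes (3 in this article) are Ready.
expect_ready_nodes() {
  [ "$(count_ready_nodes)" -eq "${1:-3}" ]
}
```

For example, expect_ready_nodes 3 && echo "cluster is ready" prints the message only when all three nodes report Ready.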
Install Prometheus and Grafana
Prometheus can be installed either with Helm or by applying the official operator manifests step by step. We’ll use the Helm chart because it’s quick and easy.
The operator is part of the kube-prometheus project, a set of Kubernetes manifests that not only installs Prometheus but also configures Grafana to work with it and makes all the components highly available. Let’s install Prometheus using Helm.
Add its repository to our repository list and update it.
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
Install the Helm chart into a namespace called monitoring, which will be created automatically.
helm install prometheus \
prometheus-community/kube-prometheus-stack \
--namespace monitoring \
--create-namespace
The output of the helm command will suggest that you check the status of the deployed pods.
kubectl --namespace monitoring get pods -l "release=prometheus"
Make sure the pods are all "Running" before you continue. In the unlikely event that they do not reach the running state, you may want to troubleshoot them.
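Instead of re-running the command until everything is up, you can let kubectl block until the pods are ready. The sketch below wraps kubectl wait; the function names and the 300-second default timeout are our own choices, and the second helper may also list pods that have already completed.

```shell
# Block until every pod of the release reports the Ready condition,
# or fail once the timeout (default 300s) expires.
wait_for_monitoring_pods() {
  kubectl wait pods \
    --namespace monitoring \
    --selector "release=prometheus" \
    --for=condition=Ready \
    --timeout="${1:-300s}"
}

# List pods in the namespace that are not in the Running phase,
# as a starting point for troubleshooting.
list_unready_pods() {
  kubectl get pods \
    --namespace monitoring \
    --field-selector "status.phase!=Running"
}
```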
Explore the Prometheus and Grafana web interfaces
By default, all the monitoring options for Prometheus will be enabled. Let’s leave it this way for now.
Create a port forward to access the Prometheus query interface.
kubectl port-forward --namespace monitoring svc/prometheus-kube-prometheus-prometheus 9090
Open http://localhost:9090 in your web browser and explore the UI to see the raw metrics inside Prometheus.
Prometheus uses Prometheus Query Language (PromQL) to allow you to query time-series data.
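PromQL queries can also be sent from the command line through the Prometheus HTTP API. The following is a minimal sketch that assumes the port forward above is still running on localhost:9090; promql_query and PROMETHEUS_URL are names we made up for this example.

```shell
# Run an instant PromQL query against the Prometheus HTTP API
# and print the raw JSON response.
promql_query() {
  curl --silent --get \
    --data-urlencode "query=$1" \
    "${PROMETHEUS_URL:-http://localhost:9090}/api/v1/query"
}
```

For example, promql_query 'up' returns one sample per scrape target, with a value of 1 for targets that are reachable and 0 for those that are not.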
Need something higher-level? We can visualize these metrics in Grafana, which we can also port forward to as follows.
kubectl port-forward --namespace monitoring svc/prometheus-grafana 8080:80
You will need to stop the previous port forward command, or run this in another terminal if you would like to run them side by side.
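If you prefer a single terminal, both port forwards can run in the background. This is only a sketch: the function and variable names are ours, and the background processes keep running until you stop them.

```shell
# Start both port forwards in the background and remember their PIDs.
start_port_forwards() {
  kubectl port-forward --namespace monitoring \
    svc/prometheus-kube-prometheus-prometheus 9090 &
  PROM_PID=$!
  kubectl port-forward --namespace monitoring \
    svc/prometheus-grafana 8080:80 &
  GRAFANA_PID=$!
}

# Stop the background port forwards started above.
stop_port_forwards() {
  kill "$PROM_PID" "$GRAFANA_PID" 2>/dev/null || true
}
```

Run start_port_forwards before opening the web interfaces and stop_port_forwards when you are done.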
Open http://localhost:8080 in your web browser.
The default username for Grafana is admin and the default password is prom-operator. You can change it in the Grafana UI later.
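If the default password has been changed through the chart values, you can read the current one from the Kubernetes secret that the Grafana chart creates. The sketch below assumes the release is named prometheus, as in this article, so the secret is called prometheus-grafana; the function name is ours.

```shell
# Print the Grafana admin password stored in the release's secret.
# Secret values are base64-encoded, so decode before printing.
grafana_admin_password() {
  kubectl get secret prometheus-grafana \
    --namespace monitoring \
    --output jsonpath='{.data.admin-password}' | base64 --decode
}
```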
Note: To ensure security, do not expose your Prometheus or Grafana endpoints to the public internet using a Service or Ingress.
Go to Dashboards -> Manage where you will see many dashboards that have been created for you.
These are all created by the Prometheus operator to ease the configuration process.
Click on the etcd dashboard and you’ll see an empty dashboard. What has happened?
Since AKS is a managed Kubernetes service, it doesn’t allow you to see internal components such as the etcd store, the controller manager, the scheduler, etc. So, there’s no point in trying to get those metrics out of the cluster, because they are not accessible to us. Let's just disable these options by upgrading our Prometheus release:
helm upgrade prometheus \
prometheus-community/kube-prometheus-stack \
--namespace monitoring \
--set kubeEtcd.enabled=false \
--set kubeControllerManager.enabled=false \
--set kubeScheduler.enabled=false
Once this has run, you won’t see any visible change: the dashboard will continue to be empty, but we won’t be wasting resources trying to get its metrics.
Note: If you are running an older version of Kubernetes, it might be necessary to turn off HTTPS metrics scraping for the kubelet, since older kubelets expose their metrics over HTTP. To do this, set the kubelet.serviceMonitor.https parameter in the Helm chart to false:
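As the number of overrides grows, a values file can be easier to maintain than a long list of --set flags. The sketch below writes the same three settings to a file and passes it to helm upgrade; the file name aks-values.yaml and the helper names are arbitrary choices of ours.

```shell
# Write the AKS-specific overrides to a values file.
write_aks_values() {
  cat > "${1:-aks-values.yaml}" <<'EOF'
kubeEtcd:
  enabled: false
kubeControllerManager:
  enabled: false
kubeScheduler:
  enabled: false
EOF
}

# Apply the overrides from the file instead of individual --set flags.
upgrade_with_values() {
  helm upgrade prometheus \
    prometheus-community/kube-prometheus-stack \
    --namespace monitoring \
    --values "${1:-aks-values.yaml}"
}
```

Running write_aks_values followed by upgrade_with_values should be equivalent to the helm upgrade command shown above, and the file can be kept in version control.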
helm upgrade prometheus \
prometheus-community/kube-prometheus-stack \
--namespace monitoring \
--set kubeEtcd.enabled=false \
--set kubeControllerManager.enabled=false \
--set kubeScheduler.enabled=false \
--set kubelet.serviceMonitor.https=false
If you would like to clean up the Azure resources, run the following command which will delete everything in your resource group and avoid ongoing billing for these resources.
az group delete --name $RESOURCE_GROUP
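For a scripted cleanup, the confirmation prompt can be skipped and the local kubeconfig entry removed as well. This is a sketch under the assumption that az aks get-credentials named the context after the cluster, which is its default behavior; the function name is ours.

```shell
# Delete the resource group without prompting or waiting, then remove
# the kubeconfig context that az aks get-credentials added.
cleanup_resources() {
  az group delete --name "$1" --yes --no-wait
  kubectl config delete-context "$2"
}
```

Call it as cleanup_resources "$RESOURCE_GROUP" "$AKS_NAME". Because of --no-wait, the deletion continues in the background after the command returns.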
We hope you enjoy monitoring your cloud native applications with Prometheus and Grafana!
Next, you may wish to explore our first-party Azure Managed Grafana service, developed in partnership with Grafana Labs!
Posted at https://bit.ly/3ln5BkF