Eagle Vision with Prometheus & Grafana

Written by Vivek Matthew on August 5, 2022; tagged under prometheus, grafana, monitoring, devops, o11y

When moving from a monolith to a microservice architecture to address scaling challenges, a key point to consider is monitoring. In a microservices-based architecture, applications are deployed onto distributed, dynamic and transient containers. This makes monitoring essential for preventing the next service disruption and, when one does occur, for figuring out its cause. For a cloud-based architecture such as one running on AWS, a managed solution like Amazon CloudWatch provides a view into the state of a microservice architecture, but such managed solutions come with limitations despite the benefits they offer. That’s where Prometheus, along with Grafana, fits in. In this blog post, we cover the why and what of monitoring. Then, using Prometheus and Grafana, we build a solution to address the common monitoring requirements of a microservices-based architecture.

Monitoring

To design an insightful monitoring solution, it is important to be clear about what we are monitoring in any particular service, and why.

Why Monitor?

The first reason to monitor a service is to know when it faces a disruption or breaks. This is certainly a key reason, as alerting is an essential part of monitoring. However, with a well-designed monitoring solution, additional capabilities open up:

What to Monitor?

Monitoring a microservice-based architecture is not limited to a Kubernetes cluster or the applications running in it. Instead, there are four main components to monitor in general:

Monitoring Tools

After planning out the monitoring requirements of the microservice architecture, the next part is setting up the tools to do the monitoring. With the overwhelming list of options available, deciding which one to go with can be a challenge.

CloudWatch (Managed Monitoring Tool)

When starting off with a monitoring solution, a managed one like Amazon CloudWatch is a potential option. CloudWatch is quick to set up and integrates easily with all AWS-provided services for monitoring an AWS cloud. However, depending exclusively on CloudWatch can soon lead to a couple of problems:

Prometheus

Prometheus is open-source and built specifically for monitoring cloud-native apps deployed on dynamic cloud environments. As a popular CNCF-supported project, it has a lot of community support and integrations with other applications. As a monitoring tool built for cloud-native apps, it can natively scrape metrics and integrate with the Kubernetes API Server. Its flexible, label-based time series database can be queried with the powerful PromQL language for generating alerts and dashboards. In short, Prometheus does the following:

Notably missing from the above list are dashboards, which is where Grafana fits in to complement Prometheus.
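To make the scraping model described in this section a little more concrete, here is a minimal sketch of what a scrape target looks like from Prometheus's point of view: an HTTP endpoint that returns labeled samples in the Prometheus text exposition format. It uses only the Python standard library. The metric name `demo_http_requests_total`, its labels, its values, the port and the helper names `render_metric` and `serve` are all made up for illustration; a real service would normally use an official client library or an existing exporter instead.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical metric and samples, used purely for illustration.
METRIC_NAME = "demo_http_requests_total"
METRIC_HELP = "Total HTTP requests handled."
SAMPLES = [
    ({"method": "GET", "status": "200"}, 1027),
    ({"method": "POST", "status": "500"}, 3),
]


def escape_label_value(value):
    """Escape backslash, double quote and newline inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metric(name, help_text, metric_type, samples):
    """Render one metric family; samples is a list of (labels, value) pairs."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    for labels, value in samples:
        if labels:
            pairs = ",".join(
                f'{key}="{escape_label_value(val)}"'
                for key, val in sorted(labels.items())
            )
            lines.append(f"{name}{{{pairs}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Serves the samples above on /metrics, the conventional scrape path."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metric(METRIC_NAME, METRIC_HELP, "counter", SAMPLES)
        payload = body.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8000):
    """Block forever, answering scrape requests (port is an arbitrary choice)."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `serve()` and pointing a scrape job at the chosen port would, under these assumptions, give Prometheus one time series per label combination. Those labels are what PromQL later filters and aggregates on.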

Along with dashboards, there are a few other application monitoring features that Prometheus does not handle as well as the alternatives:

Grafana

Grafana is an open-source metric visualization tool that supports multiple data sources, with Prometheus being one of the most popular. The analytics and visualizations provided by Grafana help make sense of all the monitoring data scraped by Prometheus. The transformations available in Grafana make it easy to manipulate data queried from Prometheus, such as converting non-time-series data into tables. The intuitive editor and annotations allow for creating interactive, searchable and insightful dashboards quickly, either from scratch or by importing one of the many thousands of available community-built dashboards.
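As a rough illustration of what happens behind a table panel, the sketch below builds an instant query against the Prometheus HTTP API (`/api/v1/query`) and flattens the returned instant vector into one row per series. This is not how Grafana is implemented internally. The base URL, the sample response and the helper names `build_query_url`, `vector_to_rows` and `fetch_rows` are assumptions made for the example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    params = urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def vector_to_rows(payload):
    """Flatten an instant-vector response into a list of table-like rows.

    Each row holds the series labels plus 'timestamp' and 'value' columns
    (a label with one of those names would be overwritten in this sketch).
    """
    if payload.get("status") != "success":
        raise ValueError("query did not succeed")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError("expected an instant vector")
    rows = []
    for series in data["result"]:
        timestamp, raw_value = series["value"]
        row = dict(series["metric"])
        row["timestamp"] = timestamp
        row["value"] = float(raw_value)  # the API returns sample values as strings
        rows.append(row)
    return rows


def fetch_rows(base_url, promql):
    """Run the query against a live server (needs a reachable Prometheus)."""
    with urlopen(build_query_url(base_url, promql)) as response:
        return vector_to_rows(json.load(response))


# Hand-written sample in the shape of an instant-query response (illustrative data).
SAMPLE_RESPONSE = json.loads("""
{"status": "success",
 "data": {"resultType": "vector", "result": [
   {"metric": {"__name__": "up", "job": "blackbox", "instance": "example-a:9115"},
    "value": [1659657600.0, "1"]},
   {"metric": {"__name__": "up", "job": "blackbox", "instance": "example-b:9115"},
    "value": [1659657600.0, "0"]}
 ]}}
""")

for row in vector_to_rows(SAMPLE_RESPONSE):
    print(row["instance"], row["value"])
```

With a Prometheus server reachable at, say, `http://localhost:9090` (an assumed address), `fetch_rows("http://localhost:9090", "up")` would return the same kind of rows from live data.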

Prometheus & Grafana Setup

The kube-prometheus-stack Helm chart maintained by the Prometheus community can be used to easily deploy a Prometheus monitoring stack onto a Kubernetes cluster.

Once the stack is up, exporters can be set up depending on what needs to be monitored. A couple of examples are:

With the data scraped and stored in Prometheus, dashboards on Grafana are provisioned either from Grafana community-built dashboards or from custom-built dashboard JSON files that can be imported.
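For a sense of what a custom-built dashboard JSON contains, here is a small sketch that generates one with two time series panels. It covers only a small subset of Grafana's dashboard model, and the exact schema varies between Grafana versions, so treat the field selection as an assumption. The panels query `probe_success` and `probe_duration_seconds`, which the Blackbox exporter exposes. The dashboard title, layout and helper names are invented for the example, and no data source is set, on the assumption that the default one is Prometheus.

```python
import json


def timeseries_panel(panel_id, title, expr, x, y, width=12, height=8):
    """Describe one time series panel backed by a single PromQL expression."""
    return {
        "id": panel_id,
        "type": "timeseries",
        "title": title,
        # Grafana lays panels out on a grid that is 24 columns wide.
        "gridPos": {"x": x, "y": y, "w": width, "h": height},
        "targets": [{"refId": "A", "expr": expr}],
    }


def build_dashboard(title, panels):
    """Wrap panels in a minimal dashboard document (subset of the full model)."""
    return {
        "title": title,
        "tags": ["generated"],
        "timezone": "browser",
        "time": {"from": "now-6h", "to": "now"},
        "refresh": "30s",
        "panels": panels,
    }


dashboard = build_dashboard(
    "Blackbox Overview (example)",
    [
        timeseries_panel(1, "Probe success", "probe_success", x=0, y=0),
        timeseries_panel(2, "Probe duration", "probe_duration_seconds", x=12, y=0),
    ],
)

# The resulting JSON text is what would be saved to a file for import or provisioning.
print(json.dumps(dashboard, indent=2))
```

Generating dashboards from code like this is optional. Exporting a dashboard built in the Grafana editor produces the same kind of JSON and is often the easier starting point.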

Once all three of these are set up, a monitoring solution is ready:

The above setup is available on this Git repo

What does it look like?

With the deployment, provisioning and configuration done, metrics will be scraped from exporters, alerts will be sent to the configured receivers and dashboards will be visible on Grafana. Using these Grafana dashboards and the alerts set up in Prometheus, this monitoring solution can serve multiple purposes, including:

The screenshots below show the dashboards provisioned by the setup in the Git repo mentioned in the previous section.

Blackbox Dashboard

[Screenshot: Blackbox Dashboard]

PostgreSQL Dashboard

[Screenshot: PostgreSQL Dashboard]

PostgreSQL Query Drill-Down

[Screenshot: PostgreSQL Query Drill-Down Dashboard]

Redis Dashboard

[Screenshot: Redis Dashboard]

CloudWatch Dashboards

[Screenshot: CloudWatch Dashboard]

What next?

The Prometheus and Grafana monitoring solution built in this blog post will go a long way, but the capabilities of Prometheus extend even further:

And with the huge community and CNCF support, Prometheus gets better and better!

If you have any questions or feedback, feel free to drop us a mail at team@codemancers.com.