
Metrics in optimization process: Grafana custom dashboard tutorial (4/4)

Grzegorz Matuszak Kamil Rosenberger


You may already know how important it is to measure the performance of your app, or even how to integrate with tools such as Prometheus. But if you can’t visualize your data in an organized, easy-to-read manner, you won’t get much out of it. In the last part of the Metrics in optimization process series, we’re taking you through a Grafana custom dashboard crash course so that you can produce the kind of dashboards even the Business department will understand.

In the previous three articles, we talked about why you should measure your app, how you can integrate your app with Prometheus, and how to actually go about collecting data.

Since we now have a lot of data to show, we should think about the presentation layer. Its true purpose is to present data in a way that gives us what we need, rather than as a disorganized pile of trash. A good presentation layer doesn’t just make the data prettier, it makes it far more useful. Another important thing is that the data we collected with Prometheus doesn’t come in an easy-to-read format. Today it’s time to show ugly metrics data in a beautiful way.

As you know from previous articles, we collect data from Kubernetes and our application. We are going to show an example of how to configure a dashboard, what is important, and why we did it this way. We are not going to show and describe all options in Grafana, just the ones that we needed to configure our charts (hey, it’s a crash course, after all!).

Kubernetes dashboards

Since we have already covered a lot in previous parts, it’s a good idea to sum up what we have now. With that, we can create our own custom Grafana dashboards.

Kubernetes has many dashboards of its own. They provide information about resources. With the cluster created and Prometheus integration out of the way, it’s good to check them.

Kubernetes dashboards

As you can see, it is quite a long list. We don’t want to bore you by describing every dashboard and every metric they provide in detail. Let’s focus on “Kubernetes / Compute Resources / Namespace (Pods)”.

This dashboard displays information about CPU, memory and network usage. These pieces of information are crucial to find out if optimization helped or if the last deployment added some particularly “heavy” scripts.

Namespace

Grafana custom dashboard

In our application, we should have many custom dashboards to accommodate each type of data that we need. A good practice is to not include too much data in a single dashboard – it will be hard to read and understand. What’s more, the loading time for that page may increase dramatically.

First, we need to choose the “+” option on the upper side of the left sidebar. Next, let’s go for the “Dashboard” option from the drop-down menu.

Dashboard

Grafana gives us two starting points – chart type or data. Usually, we want to use default charts, but it’s good to try all the different charts. In our case, we are going to create a default chart. In order to do that, we need to pick the “add query” option.

Add query

Here comes the trickiest part – we need to define WHAT metrics we want to show and HOW we want to show them.

Configuration process tab 1

First, we need to discuss specific options to understand how everything will work:

  • Chart – this is where we are going to show our data.
  • Query – in our case, “Prometheus” should already be selected as the data source by default, which means queries are written in PromQL. If it’s not, please change it to the “Prometheus” option.
  • Add query – we can show more than one query in one chart. It’s useful when we need to show new data but want to keep the old one.
  • Query inspection – our debugging tool, useful when we need to find a problem with a query.
  • Query box – a place where we define what we want to see and how.

Metrics – we use them to choose the data that we want to show. In this case, we collected “app_request_execution_time_seconds”, so we can search for this metric. If the application hasn’t collected a particular type of data, it will not be on the list. Remember to call any endpoint in your app first to generate a minimum amount of data. In our case, “app_request_execution_time_seconds” is a histogram.

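For a histogram, Prometheus exposes three kinds of series:

  • Sum (“_sum”) – the sum of all observed values, in our case the total execution time in seconds.
  • Count (“_count”) – the number of observations, in our case the number of requests.
  • Bucket (“_bucket”) – cumulative counters of how many observations were less than or equal to each configured upper bound. In some cases we don’t want exact data, just an approximation. If we configured buckets in our app, we can use them to approximate data.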
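To make the three elements more concrete, here is a minimal Python sketch of how a histogram could derive them from raw observations. It is an illustration only, not the Prometheus client implementation; the function name, the sample durations and the bucket bounds are assumptions made up for this example.

```python
from bisect import bisect_left


def histogram_series(observations, upper_bounds):
    """Derive sum, count and cumulative buckets from raw observations (illustrative only)."""
    bounds = sorted(upper_bounds)
    per_bucket = [0] * (len(bounds) + 1)  # the last slot stands for the +Inf bucket
    for value in observations:
        # bisect_left returns the index of the first bound that is >= value,
        # which matches the "less than or equal" meaning of a bucket bound
        per_bucket[bisect_left(bounds, value)] += 1

    # Buckets are cumulative: each one counts every observation <= its bound
    cumulative = {}
    running = 0
    for bound, hits in zip(bounds + [float("inf")], per_bucket):
        running += hits
        cumulative[bound] = running

    return {
        "sum": sum(observations),     # total observed time
        "count": len(observations),   # number of observations
        "bucket": cumulative,
    }


# Hypothetical request durations in seconds and hypothetical bucket bounds
series = histogram_series([0.05, 0.2, 0.2, 0.7, 1.5], [0.1, 0.5, 1.0])
print(series)
```

Dividing “sum” by “count” gives the exact average, while the buckets only tell us how many requests fell under each bound.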

In our example, we are going to show the average execution time per request. We are going to use “sum” and “count” to prepare the exact data we want to see. The easiest way to show it is to divide one by the other:

app_request_execution_time_seconds_sum / app_request_execution_time_seconds_count

Unfortunately, in real projects this is not enough. Usually, we have many pods that collect data. We also want to see and analyze data per route.

First, we need to use the “sum” function, which is going to sum up data from different pods:

sum(app_request_execution_time_seconds_sum) / sum(app_request_execution_time_seconds_count)

Sum function

But now we can only see one series instead of one for each route we called! To fix this problem, we need to group our data using “by (label)”:

sum(app_request_execution_time_seconds_sum) by (router) / sum(app_request_execution_time_seconds_count) by (router)
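What the grouped query computes can be sketched in plain Python: add up the time and the request count from every pod for each value of the “router” label, then divide. This is only a model of the aggregation, not how Prometheus evaluates PromQL; the pod names, routes and numbers below are hypothetical.

```python
from collections import defaultdict


def average_time_by_router(samples):
    """samples: iterable of (labels, sum_seconds, request_count), one entry per pod and route."""
    total_seconds = defaultdict(float)
    total_requests = defaultdict(int)
    for labels, sum_seconds, request_count in samples:
        router = labels["router"]
        total_seconds[router] += sum_seconds      # sum(..._sum) by (router)
        total_requests[router] += request_count   # sum(..._count) by (router)
    # Division is done per router; routes without any request are skipped
    return {
        router: total_seconds[router] / total_requests[router]
        for router in total_seconds
        if total_requests[router]
    }


# Hypothetical samples scraped from two pods
samples = [
    ({"pod": "app-1", "router": "/users"}, 1.2, 10),
    ({"pod": "app-2", "router": "/users"}, 1.8, 20),
    ({"pod": "app-1", "router": "/orders"}, 4.0, 8),
]
averages = average_time_by_router(samples)
print(averages)
```

Without the grouping step, all three samples would collapse into a single number, which is exactly the “one element” problem described above.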

By label

Congratulations! You configured your first chart with metrics!

Still, it’s not a pretty look. We can see “ugly” routing labels instead of clear data.

Ugly labels

To fix this problem, we need to declare what to display in the “legend” field. Just put “{{router}}” there and it’s done.
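The idea behind the legend template is simple substitution: every “{{label}}” placeholder is replaced with the value of that label for the given series. The Python sketch below mimics that idea; it is not Grafana’s implementation, and rendering a missing label as an empty string is an assumption of this example.

```python
import re


def render_legend(template, labels):
    """Replace each {{label}} placeholder with the matching label value."""
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda match: labels.get(match.group(1), ""),  # unknown label -> empty string (assumption)
        template,
    )


# Hypothetical label set of a single series
labels = {"router": "/users", "pod": "app-1"}
print(render_legend("{{router}}", labels))
print(render_legend("{{router}} on {{pod}}", labels))
```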

As we discussed in the previous article, we collect data for specific periods, so we need to set a matching value in the “Min step” field (in my case it will be “2 min”).

The final query configuration is:

Final configuration

And that’s all! Now you can see the proper metrics:

Pretty visualization

In the example application which we created for this article, we have only 8 API endpoints. Everything is easy to read. The problem starts when we have 50 or more endpoints.

Time to improve our visualization. We need to move to the second tab. In this tab, we can change our chart type (in our case the best one will be “graph”).

Second tab

Draw modes

Most options depend on our visual taste. I like the defaults, so I will stay with them.

There are only two options which improve visibility that I want to mention:

  • Null value – we want to see a data point even for periods in which nothing was recorded, so I will change it to “null as zero”, which draws missing values as zero.
  • Mode – by default, when you hover over a specific point on the chart, the tooltip shows information about all endpoints within that time range. If you only want to see the series you are pointing at, for example the highest time, it will be better to change it to “single”.
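The effect of “null as zero” can be shown with a few lines of Python: points without a value are drawn as zero instead of leaving a gap. This is only a sketch of the idea under that assumption; the timestamps and values are made up.

```python
def null_as_zero(points):
    """points: list of (timestamp, value), where value is None when nothing was recorded."""
    return [(timestamp, 0.0 if value is None else value) for timestamp, value in points]


# Hypothetical series with one interval in which the endpoint was not called
points = [(0, 0.12), (120, None), (240, 0.31)]
print(null_as_zero(points))
```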

Grafana custom dashboard legend

The legend brings a lot of useful information, and the data it shows is typically used for creating business reports. The default one is not very pretty, though, so we need to change it.

We want to see everything as a table and move it to the right to improve visibility and take up less space. As far as values are concerned, the most useful statistics are the average and the max peak. For some statistics it’s good to limit the number of decimal places (we changed it to 2), since we are not interested in thousandths.
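The legend values boil down to simple statistics over each series. As a rough Python sketch, assuming that points without a value are ignored, the average and max columns with 2 decimal places could be computed like this (the function name and sample data are hypothetical):

```python
def legend_stats(series, decimals=2):
    """series: mapping of legend name -> list of values (None means no data)."""
    rows = {}
    for name, values in series.items():
        present = [value for value in values if value is not None]
        if not present:
            continue  # nothing to summarize for this series
        rows[name] = {
            "avg": round(sum(present) / len(present), decimals),
            "max": round(max(present), decimals),
        }
    return rows


# Hypothetical execution times in seconds per route
stats = legend_stats({
    "/users": [0.1234, 0.2, 0.3],
    "/orders": [0.5, None, 0.7],
})
for name, row in stats.items():
    print(name, row["avg"], row["max"])
```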

Legend

Grafana custom dashboard thresholds

Thresholds are a very nice feature that helps us find data that is greater or less than a specified value. With thresholds, it’s much easier to find routes worth optimizing.

As a first step, we want to know which endpoints are slower than 0.3 seconds. We will see it easily when we configure thresholds with “gt 0.3”. In a real project, we suggest setting it to 1.0 seconds.
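A “gt 0.3” threshold is a visual version of a simple filter. The Python sketch below applies the same rule to a set of average execution times; the route names and values are hypothetical, and the function is only an illustration of the comparison.

```python
def slower_than(average_seconds, threshold=0.3):
    """Return the routes whose average execution time is greater than the threshold."""
    return sorted(
        route for route, seconds in average_seconds.items() if seconds > threshold
    )


# Hypothetical average execution times in seconds
averages_by_route = {"/users": 0.1, "/orders": 0.5, "/reports": 1.4, "/health": 0.3}
print(slower_than(averages_by_route))        # routes above 0.3 s
print(slower_than(averages_by_route, 1.0))   # routes above the 1.0 s suggested for real projects
```

Note that “gt” means strictly greater than, so a route sitting exactly at 0.3 seconds is not reported.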

Thresholds

General options for Grafana custom dashboard

Title – the name for our new chart, the name should be short and easy to understand – e.g. when we log the execution time in an application we can name it “App request execution time”.

Description – a place where we can put more information on what we show and why.

General options

That’s it. Our new chart is configured and we can save it. We will be redirected to the new dashboard with the first chart. In our case, we needed to repeat this process for each piece of useful information.

The full configuration is available in the example app repository. The visualization will look like this:

Full configuration

Grafana custom dashboard – summary

Grafana helps us visualize and understand what happens in an application:

  • It provides information such as which actions increased/decreased speed and resource utilization. 
  • It makes it possible to prepare a business report about progress. 
  • With alerts, it’s easy to get information that something is wrong. 

Life with all this information is so much easier!

The optimization process is a long journey, hard to accomplish without additional help. It’s important to define goals and analyze exactly what should be measured. We also need software that will help solve problems. In this case, Prometheus was the best tool for us. It delivered the necessary information and provided integrations with many programming languages and Kubernetes out of the box.

With a deep understanding of how data collection and presentation layers work, it’s easy to find real, useful information about an application. Even business was satisfied when we prepared a presentation on how an application worked before and after optimization (thanks for the charts, Grafana!).

We can’t forget about end-users who stopped complaining about application speed.

Finally, it made us happy as developers, because we were able to find out exactly how our changes affect the application.

And if you are searching for a team that knows how to optimize application performance, saving your precious time and resources in the process, contact The Software House.
