Metrics in optimization process: Start to measure your app with the Prometheus tool (2/4)

5 min read

Metrics are a great way to find useful information about our application and infrastructure. Before we start using metrics and preparing reports for the business, we need to integrate our app with some tools. That’s why today our focus is on how to measure, which means taking a close look at how our application code works in the first place. We’re going to examine the Prometheus monitoring system and dive into how Prometheus monitoring actually works.

In the previous article, we described how to uncover various problems with an app by finding bottlenecks in the project, why metrics are a great way to do this, and why we chose Prometheus server monitoring to accomplish our objectives. Today, we’re going to actually start measuring the app.

Try using Prometheus metrics yourself!

Most of us prefer to check things for ourselves rather than implement them without knowing whether the whole thing works as expected. Luckily, there is a live demo that contains the default Prometheus dashboard and is integrated with Grafana.

What is Prometheus? All you need to know about integration with Prometheus

One of the best things about the Prometheus monitoring tool is the support for many languages. In most cases, we don’t need to worry about how to write integration and handle all the best practices. Here you can find a list of client libraries to work with the Prometheus software. 

Currently, there are around 20 libraries for different technologies! If your technology is not on the list, you can follow a special guide which explains how to write your own integration.

The application which we needed to optimize and integrate metrics for was written in PHP/Symfony, so we’re going to present examples in these technologies. Still, it should be easy enough to transfer them to a different stack.

Let’s get back to the application.

We wanted to save time and deliver a solution as soon as possible, so we used an existing bundle instead of writing a new one. Our choice was the tweedgolf bundle. It’s a nice, small library which has everything we needed.
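For reference, installing it is a single Composer command (we’re assuming the package name here – double-check it on Packagist before running):

composer require tweedegolf/prometheus-client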

Define Prometheus data collection method

How does Prometheus work? Let’s take a closer look at Prometheus monitoring and the Prometheus architecture.

Prometheus offers two different ways to collect data. Make sure you choose the one most suitable for your needs. More on that below.

  1. Push gateway – the application pushes data to a special gateway (collector). It’s useful when we want to capture information from short-lived jobs.
  2. Scraping – the Prometheus instance asks the application for the data it has collected.

In our case, we needed to focus on the general flow and behavior of an application. The second method was the best for us. We’re now going to talk about it in detail.

Prometheus implementation

To achieve our main goal – measuring how our code works – we need to start collecting data just before we call the controller and stop just after the code has been executed.

Luckily, the Symfony Framework uses events, so we’re able to create a subscriber that handles two of them.

Here is the basic concept for our code:

  • validate data which we get from request/event (e.g. sometimes Symfony doesn’t return the correct action),
  • check if we want to collect data for the specific endpoint (we don’t want to measure some actions because it doesn’t add any value),
  • start collecting data,
  • do the controller action,
  • prepare the collected data,
  • save the collected data to the storage.

Let’s take a look at a code example which shows how to start measuring the execution time for endpoints:
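Here is a minimal sketch of such a subscriber, assuming Symfony’s HttpKernel events; the $collector service and its observeHistogram() method are placeholders for whatever API your Prometheus client library exposes:

<?php

namespace App\EventSubscriber;

use Symfony\Component\EventDispatcher\EventSubscriberInterface;
use Symfony\Component\HttpKernel\Event\ControllerEvent;
use Symfony\Component\HttpKernel\Event\TerminateEvent;
use Symfony\Component\HttpKernel\KernelEvents;

class RequestMetricsSubscriber implements EventSubscriberInterface
{
    /** @var float|null start time of the current request */
    private $startTime;

    /** @var object collector from the Prometheus client library (placeholder) */
    private $collector;

    public function __construct($collector)
    {
        $this->collector = $collector;
    }

    public static function getSubscribedEvents(): array
    {
        return [
            // start measuring just before the controller is called
            KernelEvents::CONTROLLER => 'onController',
            // stop measuring after the response has been sent
            KernelEvents::TERMINATE => 'onTerminate',
        ];
    }

    public function onController(ControllerEvent $event): void
    {
        $route = $event->getRequest()->attributes->get('_route');

        // validate the event and skip endpoints we don't want to measure
        if (null === $route || in_array($route, ['app_metrics'], true)) {
            return;
        }

        $this->startTime = microtime(true);
    }

    public function onTerminate(TerminateEvent $event): void
    {
        if (null === $this->startTime) {
            return;
        }

        $duration = microtime(true) - $this->startTime;
        $route = $event->getRequest()->attributes->get('_route');

        // save the collected data to the storage (method name is an assumption,
        // adjust it to the client library you use)
        $this->collector->observeHistogram(
            'app_request_execution_time_seconds',
            $duration,
            ['router' => $route]
        );

        $this->startTime = null;
    }
}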


Now we need to add the configuration to know how we want to save everything:
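This part is plain Symfony service configuration. Here is a sketch, assuming the metrics are kept in Redis so they survive between requests (the class and argument names are illustrative – use whatever storage class your client library provides):

# config/services.yaml – wire the storage used to keep collected metrics between requests
services:
    App\Metrics\RedisStorage:
        arguments:
            $host: '%env(REDIS_HOST)%'
            $port: 6379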


And some additional config for subscriber:
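Registering the subscriber itself is standard Symfony configuration – with autoconfigure: true it is picked up automatically, otherwise it needs the kernel.event_subscriber tag (class names match the sketches above and are illustrative):

# config/services.yaml – register the subscriber
services:
    App\EventSubscriber\RequestMetricsSubscriber:
        arguments:
            $collector: '@App\Metrics\RedisStorage'   # the collector/storage service from the previous snippet
        tags: ['kernel.event_subscriber']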

Share the collected data with the Prometheus instance

In-memory storage is temporary, so it’s not recommended for metrics data – we want to be able to query the data many times. To that end, we’re going to hand the data over to the Prometheus instance.

By default, storage such as Redis removes old data, which causes problems – after some time the results become incorrect. On the other hand, if your storage never removes old data, metrics will eventually kill the application. We had this problem: it took only a month for our Redis storage to fill up, because the configuration didn’t allow overwriting data. Too much data also brings optimization problems of its own.

To fix all these problems, we have to remove the metrics from storage after each call to the metrics controller!

The Prometheus instance will ask our application for the data collected in the storage. To do that, we need to add an entry point. Remember to add configuration for your server so that only the Prometheus instance is allowed to ask for data – otherwise anyone could fetch (and, with our cleanup step, also wipe) the data, and our metrics would no longer be reliable.
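How you restrict access depends on your web server. With nginx, for example, it can be as simple as allowing only the Prometheus instance’s address (the IP below is just an example):

# nginx – expose /metrics only to the Prometheus instance
location /metrics {
    allow 10.0.0.5;   # address of the Prometheus instance (example)
    deny all;
}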

Basic concept:

  • get data from storage,
  • remove collected metrics from storage,
  • return collected data.
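A sketch of such an entry point as a Symfony controller might look like this (the collect() and flush() calls are placeholders for your client library’s API):

<?php

namespace App\Controller;

use Symfony\Component\HttpFoundation\Response;
use Symfony\Component\Routing\Annotation\Route;

class MetricsController
{
    /** @var object collector from the Prometheus client library (placeholder) */
    private $collector;

    public function __construct($collector)
    {
        $this->collector = $collector;
    }

    /**
     * @Route("/metrics", name="app_metrics", methods={"GET"})
     */
    public function metrics(): Response
    {
        // get data from storage (method names are assumptions – adjust to your library)
        $metrics = $this->collector->collect();

        // remove collected metrics from storage so old values don't pile up
        $this->collector->flush();

        // return collected data in the Prometheus text exposition format
        return new Response($metrics, 200, ['Content-Type' => 'text/plain; version=0.0.4']);
    }
}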

Time to explain the data returned by our controller. The first time that we saw it, we were a little confused about its meaning. But as strange as it might look, it’s actually very easy to read. We think of it as a small curiosity. In the normal flow, we won’t have to worry about it. Grafana will read this data and prepare graphs. 

Currently, our controller returns data as follows:

# HELP app_request_execution_time_seconds Request duration
# TYPE app_request_execution_time_seconds histogram
app_request_execution_time_seconds_count{router="api_get_second_example",env="PROMETHEUS_METRICS",application="example_app"} 1
app_request_execution_time_seconds_sum{router="api_get_second_example",env="PROMETHEUS_METRICS",application="example_app"} 0.07
app_request_execution_time_seconds_bucket{le="+Inf",router="api_get_second_example",env="PROMETHEUS_METRICS",application="example_app"} 1

Here is the legend:

# HELP _name_ _comment_ – our description explaining what the specific metric counts

# TYPE _name_ _type_ – information about the specific collector

_name_{_label-name_=_label-value_} _value_ – information about the collected data

We can read the result like this:

We got a histogram which collects the request duration. The histogram is named app_request_execution_time_seconds. It has additional labels to filter the data, such as the route called api_get_second_example. The environment is called PROMETHEUS_METRICS and the application’s name is example_app. Currently, there is only 1 request logged and it took 0.07 seconds.

OK, our application collects data. We’re able to share this data with the Prometheus instance. Time to tackle the last challenge – we don’t have infinite space for metrics on the disk. To solve this problem, we need to make a decision regarding the precision of the data we want.

We had many arguments about it before we found the golden mean. We can’t collect data too often because it will be hard to read. We can’t collect data too rarely because we won’t see the current values.

We wanted to show the metrics using Grafana. Grafana has an interesting feature called the step time. It defines how often the collected data should be sampled (to be presented in graph form). For a one-day range it takes data every minute, for two days every 2 minutes, for 30 days once per day. It’s good because it prevents a situation where we grab so much data that the CPU/RAM can’t handle it. For us, it’s problematic because the data is not summarized – each point is just a single sample (a “peak”) from the specified time.

To better understand this, consider these examples:

Example 1

When we select the last 30 days to show and the application collects data every 15 seconds, we will see 30 results in the graph (not 4 * 60 * 24 * 30 = 172 800 results), each taken at one specific moment, e.g.:

 

Date                  Quantity
21.02.2020 12:12:12   23
22.02.2020 12:12:12   25
23.02.2020 12:12:12   28

Example 2

We want to see the last 5 minutes with a step time of one minute, while the application collects data every 15 seconds.

Collected data:

Date                  Quantity
24.02.2020 12:12:00   12
24.02.2020 12:12:15   1
24.02.2020 12:12:30   4
24.02.2020 12:12:45   16
24.02.2020 12:13:00   8
24.02.2020 12:13:15   1
24.02.2020 12:13:30   3
24.02.2020 12:13:45   11
24.02.2020 12:14:00   5

We will see 5 results, each showing data from a single sampled peak:

Date                  Quantity
24.02.2020 12:12:00   12
24.02.2020 12:13:00   8
24.02.2020 12:14:00   5
24.02.2020 12:15:00   –
24.02.2020 12:16:00   –

not a summary like this:

Date       Quantity   Explanation
12:12:00   33         because in those 60 seconds we collected 12+1+4+16
12:13:00   23         because in those 60 seconds we collected 8+1+3+11

If data was collected between these sampled peaks, the graph will not show it.
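If what you really need is the total per interval rather than a single sample, one option is to let Prometheus sum the counter over a window directly in the query, e.g. with the increase() function (the label value comes from the earlier example):

increase(app_request_execution_time_seconds_count{router="api_get_second_example"}[1m])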

Don’t use these metrics to check whether a specific endpoint or place in the code is used at all. Just put a normal log entry there (and check after at least one month whether it was logged), then talk with the business about whether the functionality should still be supported. Also, remember that some jobs may be executed only once per year, etc.

Usually, you need to check data for the last 7 days. We decided to collect (scrape) the data every 2 minutes. To clarify how it works:

request 17:11:11 counter 1

request 17:11:12 counter 2

request 17:11:12 counter 3

request 17:12:13 counter 58

When Prometheus asks the controller for data, it gets the counter value of 58. So we have only one data row which represents 58 requests in two minutes. That makes it much easier to prepare a report for the business on how their application works.

That’s it! We have everything that is needed in the application’s code to get data using Prometheus.

Summary & Example Prometheus monitoring integration

As we promised in the first part of the series, we prepared a simple application to show you how it should all be implemented.

It contains all the required validations, blacklists for endpoints, more metrics, and other important elements needed to set up metrics for PHP with the Symfony implementation.

The repository contains two ways to do the setup. The first is based on a Docker configuration file – if you work as a developer, it should be very easy to start it and check how to integrate your application. The other way is a setup with Kubernetes and kind (the Kubernetes and Prometheus combination) – we will talk about it in the next part. On the whole, the third article will focus on the DevOps side of Prometheus. We will explain how to configure the Prometheus server and the Prometheus database which will collect all our data in one place.

So, what do you think about monitoring with the Prometheus tool and this kind of monitoring solution in general?

Sign up!