CloudKitty

Rate Prometheus metrics with Cloudkitty: a tutorial using Traefik

This blog post explains how to rate Prometheus metrics with Cloudkitty. In this tutorial, we use Traefik as the metrics source.

Intended audience: system administrators and developers interested in containers and Prometheus who want to assess and track their applications’ resource consumption.

By Martin CAMEY, CloudKitty Developer & Cloud Consultant @Objectif Libre

Cloudkitty is currently under intensive development, and new features are being added. Cloudkitty started as, and still is, the rating and chargeback solution of OpenStack. With the emergence of containers and monitoring solutions, Cloudkitty intends to widen its scope, to become container-friendly and even ‘out-of-the-box’ metrics-friendly.

Pre-requisites

In order to follow this tutorial, you will need a Linux machine with Git, Docker and Docker-compose installed, along with an Internet connection to retrieve the Docker images.
We base our tutorial on Linux distributions using apt as package manager – but feel free to adapt the following commands to use it with other distributions.

How it works

Cloudkitty’s modular architecture uses classes called “Collectors” to retrieve metrics from sources. Once retrieved, the metrics are aggregated and then valued using rating policies defined by the operators.
The valuation step means that Cloudkitty assigns a value to the consumed CPU, memory, disk or any other resource you want to rate. These values are used to generate rating reports in JSON or CSV format for each instance.

The Prometheus Collector, currently a work in progress, makes it possible to rate metrics from the well-known monitoring software. It is awaiting merge into the development branch and will then be integrated into the stable release. Coupled with Cloudkitty, usage is very simple, since Prometheus is built for time series aggregation and querying. All you need to do is give Cloudkitty the right queries and the corresponding rating formulas.

A step-by-step tutorial using Traefik

As a proof of concept, we will use Traefik, the modern load balancer and reverse proxy, as a metrics source. Traefik natively supports exporting metrics to Prometheus with minimal configuration.

Basing our example on a simple case

Let’s take a basic case: we have two different containers, both exposed through Traefik, and we want to rate the number of requests differently for each container.

The global architecture with all the deployed services and connections looks like this:

[Figure: architecture schema of the CloudKitty, Prometheus and Traefik deployment]

We will configure Cloudkitty to register a dataset (a data structure containing both rated values and metric metadata) for each hour our application has been running. Rating the number of requests by hour, we will be able to generate rated reports hourly, daily, weekly or even yearly.
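To illustrate how hourly datasets can be rolled up into longer reports, here is a minimal Python sketch. The tuple layout and the sample values are hypothetical and only stand in for Cloudkitty's real storage format; the point is simply that daily (or weekly, yearly) totals are sums of hourly rated values.

```python
from collections import defaultdict
from datetime import datetime
from decimal import Decimal

# Hypothetical hourly datasets: (period start, number of requests, rated price).
# This layout is an assumption for the sketch, not Cloudkitty's storage schema.
hourly = [
    (datetime(2018, 6, 1, 9), 150, Decimal("15.0")),
    (datetime(2018, 6, 1, 10), 90, Decimal("9.0")),
    (datetime(2018, 6, 2, 9), 40, Decimal("4.0")),
]


def daily_totals(datasets):
    """Sum the rated price of hourly datasets per calendar day."""
    totals = defaultdict(Decimal)  # Decimal() is Decimal("0")
    for begin, _quantity, price in datasets:
        totals[begin.date()] += price
    return dict(totals)


totals = daily_totals(hourly)
for day, price in sorted(totals.items()):
    print(day, price)
```

Grouping by week or year only changes the key computed from the period start.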

For this example, we run our services with the official Docker images for Prometheus, Traefik, RabbitMQ and MySQL (all used by Cloudkitty). We will also run two containers exposing simple “Hello World” static HTML files, with the official Nginx Docker image. All the above services are launched using docker-compose.

Configuration

Let’s start by defining the infrastructure for that demonstration. We use the following docker-compose.yml file to deploy our services:

docker-compose.yml

version: "2.2"

services:

  prometheus:
    image: prom/prometheus
    hostname: prometheus
    ports:
      - "9090:9090"
    volumes:
      - /path/to/prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - prometheus:/prometheus
    networks:
      - cloudkitty_prometheus_traefik_net

  traefik:
    image: traefik:1.6
    hostname: traefik
    ports:
      - "80:80"
      - "8080:8080"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - /path/to/traefik.toml:/etc/traefik/traefik.toml:ro
    networks:
      - cloudkitty_prometheus_traefik_net

  cloudkitty_db:
    image: mysql:5.7
    hostname: mysql
    environment:
      - MYSQL_ROOT_PASSWORD=asecurepassword
      - MYSQL_USER=cloudkitty
      - MYSQL_PASSWORD=anothersecurepassword
      - MYSQL_DATABASE=cloudkitty
    volumes:
      - mysql:/var/lib/mysql
    ports:
      - "3306:3306"
    networks:
      - cloudkitty_prometheus_traefik_net

  cloudkitty_queue:
    image: rabbitmq:3.7.5-alpine
    hostname: rabbitmq
    environment:
      - RABBITMQ_DEFAULT_USER=cloudkitty
      - RABBITMQ_DEFAULT_PASS=asecurepassword
    volumes:
      - rabbitmq:/var/lib/rabbitmq:rw
    ports:
      - "5672:5672"
    networks:
      - cloudkitty_prometheus_traefik_net

  app1:
    image: nginx:alpine
    labels:
      traefik.frontend.rule: "Host:app1"
    volumes:
      - /path/to/html/app1/:/usr/share/nginx/html:ro
    networks:
      - cloudkitty_prometheus_traefik_net

  app2:
    image: nginx:alpine
    labels:
      traefik.frontend.rule: "Host:app2"
    volumes:
      - /path/to/html/app2/:/usr/share/nginx/html:ro
    networks:
      - cloudkitty_prometheus_traefik_net

volumes:
  prometheus:
  mysql:
  rabbitmq:

networks:
  cloudkitty_prometheus_traefik_net:

Note that some volumes are created to prevent data loss in case the containers restart.
We can now write the configuration files needed by each component. These files are mounted into the containers created and managed by the docker-compose.yml file above.

prometheus.yml

scrape_configs:
  - job_name: 'traefik'
    scrape_interval: 5s

    static_configs:
      - targets: ['traefik:8080']
        labels:
          group: 'traefik_group'

traefik.toml

[entryPoints]
  [entryPoints.traefik]
    address = ":8080"
  [entryPoints.http]
    address = ":80"

[api]
  entryPoint = "traefik"

[docker]
  endpoint = "unix:///var/run/docker.sock"
  domain = "docker.localhost"
  watch = true

[metrics]
  [metrics.prometheus]
    entryPoint = "traefik"

Finally, we can start all the services defined in the docker-compose.yml file using this command:

$ docker-compose up -d
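Once the stack is up, it can be useful to check what Traefik exposes to Prometheus. The sketch below parses the Prometheus text exposition format and extracts the samples of one metric. The sample text, its label set and its values are made up for illustration, and the parser is deliberately simplified (it ignores optional timestamps and escaped quotes in label values).

```python
import re

# Illustrative sample of the Prometheus text exposition format; the labels
# and values are invented for this sketch.
SAMPLE = '''\
# TYPE traefik_backend_requests_total counter
traefik_backend_requests_total{backend="backend-app1",code="200"} 150
traefik_backend_requests_total{backend="backend-app2",code="200"} 90
'''

# metric_name{label="value",...} value
LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text, metric):
    """Return (labels, value) pairs for one metric, skipping comment lines."""
    samples = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match and match.group("name") == metric:
            labels = dict(LABEL_RE.findall(match.group("labels") or ""))
            samples.append((labels, float(match.group("value"))))
    return samples


samples = parse_samples(SAMPLE, "traefik_backend_requests_total")
for labels, value in samples:
    print(labels["backend"], labels["code"], value)
```

Assuming the metrics are served at /metrics on the "traefik" entry point, the same function could be fed the body of http://localhost:8080/metrics instead of the sample string.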

Installing Cloudkitty

In this tutorial, we will use a Python virtualenv to install Cloudkitty.

Let’s get the Cloudkitty sources from GitHub, install the dependencies and prepare the Cloudkitty configuration and logging folders. Adapt the ownership and permissions of these folders for your current user with chown and chmod if needed.

Virtualenv creation and Cloudkitty installation:

$ sudo apt install python-virtualenv
...
$ virtualenv ck_env
...
$ source ck_env/bin/activate
(ck_env) $ git clone https://github.com/openstack/cloudkitty.git
...
(ck_env) $ cd cloudkitty
(ck_env) $ pip install pymysql
...
(ck_env) $ pip install -r requirements.txt
...
(ck_env) $ python setup.py install
...

Then, still in the cloudkitty directory, create the configuration and logging folders:

(ck_env) $ sudo mkdir /etc/cloudkitty /var/log/cloudkitty
(ck_env) $ sudo cp etc/cloudkitty/api_paste.ini /etc/cloudkitty

Let’s configure the newly installed Cloudkitty:

/etc/cloudkitty/cloudkitty.conf

[DEFAULT]
verbose = True
log_dir = /var/log/cloudkitty
auth_strategy = noauth
transport_url = rabbit://cloudkitty:asecurepassword@localhost:5672/

[database]
connection = mysql+pymysql://cloudkitty:anothersecurepassword@localhost/cloudkitty

[collect]
fetcher = source
collector = prometheus
window = 1800
period = 3600
services = compute, volume, network.bw.in, network.bw.out, network.floating, image
metrics_conf = /etc/cloudkitty/metrics.yml

[storage]
backend = sqlalchemy

Now that Cloudkitty is installed and configured, we need to initialize the Cloudkitty database.

(ck_env) $ cloudkitty-dbsync upgrade
...
(ck_env) $ cloudkitty-storage-init

Finally, we need to define the metrics we want to rate. The metrology configuration file metrics.yml of Cloudkitty manages metrics information.

In our case, the Traefik metric we are going to rate is traefik_backend_requests_total. Prometheus stores this metric as a Counter: it counts the requests handled by Traefik for each backend and status code, so its value only increases. Cloudkitty needs to know the number of requests made during each hour our application has been running. To get the number of new requests within a given hour, we retrieve a range vector by providing both start and stop timestamp parameters, and apply the increase() function provided by Prometheus. This function calculates how much each counter time series in the range vector has increased over the range.
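The idea behind computing the increase of a counter can be sketched in a few lines of Python. This is a simplified illustration, not Prometheus's implementation: the real increase() also extrapolates to the boundaries of the range, so it can return non-integer values. The function name and the sample values are hypothetical.

```python
def counter_increase(samples):
    """Increase of a counter over time-ordered (timestamp, value) samples.

    A drop in value is treated as a counter reset (for example a Traefik
    restart): the counter is assumed to have restarted from zero.
    """
    total = 0.0
    for (_, previous), (_, current) in zip(samples, samples[1:]):
        if current >= previous:
            total += current - previous
        else:
            total += current  # reset: everything counted since zero is new
    return total


# Hypothetical samples scraped one minute apart, with a reset after t=60.
samples = [(0, 100.0), (60, 160.0), (120, 20.0), (180, 110.0)]
increase = counter_increase(samples)
print(increase)
```

With these samples, the counter gained 60 requests, was reset, then gained 20 and 90 more, which is why rating the raw counter value instead of its increase would give wrong quantities.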

Even though there is one counter per backend and per status code for the same metric, we can retrieve them all in a single query thanks to Docker and Prometheus labels. Only two things remain to be done: specify the PromQL queries in the Cloudkitty metrics.yml configuration file, and add rating formulas using the cloudkitty-api. The query is set in the query field of the following configuration file.

/etc/cloudkitty/metrics.yml

name: Prometheus

fetcher: source
collector: prometheus

period: 3600 # An hour in seconds
wait_periods: 1
window: 1800

url: http://localhost:9090/api/v1/

services_objects:
  compute: instance
  volume: volume
  network.bw.out: instance_network_interface
  network.bw.in: instance_network_interface
  network.floating: network
  image: image
  radosgw.usage: ceph_account

metrics:
  traefik_backend_requests_total:
    endpoint: query_range
    query: 'increase(traefik_backend_requests_total[$period])'
    unit: request
    metadata:
      - backend

It is time to push our rating formulas using the cloudkitty-api.

First, start the cloudkitty-api:

(ck_env) $ cloudkitty-api -p 8889
...
********************************************************************************
STARTING test server cloudkitty.api.app.build_wsgi_app
Available at http://localhost:8889/
...
********************************************************************************

Once the cloudkitty-api is running, open a new terminal. We will use the following commands to push the rating formulas, modestly priced at 10 cents per request for the first application and 30 cents per request for the second one.

Create a ‘request_number’ group for the rating rules:

$ curl -X POST -H 'Content-Type: application/json' \
> -d '{"name": "request_number"}' \
> 'http://localhost:8889/v1/rating/module_config/hashmap/groups'

It will return something like this:

{"group_id": "8bcea13d-102f-44f5-b164-152e39745865", "name": "request_number"}

Create a rating ‘service’ to apply to the retrieved metric:

$ curl -X POST -H 'Content-Type: application/json' \
> -d  '{"name": "traefik_backend_requests_total"}' \
> 'http://localhost:8889/v1/rating/module_config/hashmap/services'

The result looks like this:

{
  "service_id": "7f3a1d40-c91b-470e-bd62-930468e36dc2",
  "name": "traefik_backend_requests_total"
}

Create a ‘field’ that matches the respective backends:

$ curl -X POST -H 'Content-Type: application/json' \
> -d '{"service_id": "7f3a1d40-c91b-470e-bd62-930468e36dc2", "name": "backend"}' \
> 'http://localhost:8889/v1/rating/module_config/hashmap/fields'

Once created, the API will return something like this:

{
  "service_id": "7f3a1d40-c91b-470e-bd62-930468e36dc2",
  "field_id": "2310d614-fdaf-4040-a336-75e49a98456d",
  "name": "backend"
}

We are almost done! We just need to set the rating value of our formulas.
Let’s start with the first backend by creating a mapping that makes a link between a group, the ‘field’ value and the rating value:

$ curl -X POST -H 'Content-Type: application/json' \
> -d '{"group_id": "8bcea13d-102f-44f5-b164-152e39745865",
> "service_id": "7f3a1d40-c91b-470e-bd62-930468e36dc2",
> "field_id": "2310d614-fdaf-4040-a336-75e49a98456d",
> "value": "backend-app1-cloudkitty-prometheus-traefik", "cost": "0.1"}' \
> 'http://localhost:8889/v1/rating/module_config/hashmap/mappings'

The result of the created mapping:

{
  "tenant_id": null,
  "field_id": "2310d614-fdaf-4040-a336-75e49a98456d",
  "value": "backend-app1-cloudkitty-prometheus-traefik",
  "mapping_id": "6b4bf3a1-9d43-4c92-93d5-06c87a6e6bfb",
  "cost": "0.1000000",
  "service_id": null,
  "group_id": "8bcea13d-102f-44f5-b164-152e39745865",
  "type": "flat"
}

Same operation for the second backend:

$ curl -X POST -H 'Content-Type: application/json' \
> -d '{"group_id": "8bcea13d-102f-44f5-b164-152e39745865",
> "service_id": "7f3a1d40-c91b-470e-bd62-930468e36dc2",
> "field_id": "2310d614-fdaf-4040-a336-75e49a98456d",
> "value": "backend-app2-cloudkitty-prometheus-traefik", "cost": "0.3"}' \
> 'http://localhost:8889/v1/rating/module_config/hashmap/mappings'

And the returned result:

{
  "tenant_id": null,
  "field_id": "2310d614-fdaf-4040-a336-75e49a98456d",
  "value": "backend-app2-cloudkitty-prometheus-traefik",
  "mapping_id": "3e077906-629b-493f-9b9b-1cb32b5f405d",
  "cost": "0.3000000",
  "service_id": null,
  "group_id": "8bcea13d-102f-44f5-b164-152e39745865",
  "type": "flat"
}
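Copying identifiers by hand from one curl command to the next is error-prone. As an alternative, here is a minimal Python sketch that sends the same group, service, field and mapping requests and chains the returned IDs automatically. It uses only the endpoints and JSON bodies shown above; post_json, mapping_payload and push_rating_rules are hypothetical helper names, and the sketch assumes the cloudkitty-api is listening on localhost:8889 without authentication.

```python
import json
from urllib.request import Request, urlopen

API = "http://localhost:8889/v1/rating/module_config/hashmap"


def post_json(url, payload):
    """POST a JSON payload and return the decoded JSON response."""
    request = Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def mapping_payload(group_id, service_id, field_id, value, cost):
    """Build the same JSON body as the curl mapping commands above."""
    return {
        "group_id": group_id,
        "service_id": service_id,
        "field_id": field_id,
        "value": value,
        "cost": cost,
    }


def push_rating_rules(costs):
    """Create the group, service, field and one mapping per backend."""
    group = post_json(API + "/groups", {"name": "request_number"})
    service = post_json(
        API + "/services", {"name": "traefik_backend_requests_total"}
    )
    field = post_json(
        API + "/fields",
        {"service_id": service["service_id"], "name": "backend"},
    )
    return [
        post_json(
            API + "/mappings",
            mapping_payload(
                group["group_id"], service["service_id"],
                field["field_id"], value, cost,
            ),
        )
        for value, cost in costs.items()
    ]


# Example call, to run only while the cloudkitty-api is up:
# push_rating_rules({
#     "backend-app1-cloudkitty-prometheus-traefik": "0.1",
#     "backend-app2-cloudkitty-prometheus-traefik": "0.3",
# })
```

The call is left commented out because it creates real objects in the Cloudkitty database; running it after the curl commands would create duplicates or be rejected.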

And voilà! Our rating formulas are created and stored in the Cloudkitty database.
Finally, we just have to enable the HashMap rating module, which is responsible for applying the rating formulas:

$ curl -X PUT -H 'Content-type: application/json' \
> -d '{"module_id": "hashmap", "enabled": "true"}' \
> 'http://localhost:8889/v1/rating/modules'

Run the Cloudkitty rating software

Now that all our components are configured, let’s run the cloudkitty-processor, the software responsible for the rating and chargeback part.

We can stop the cloudkitty-api in the first terminal and use the following command to start the cloudkitty-processor:

(ck_env) $ cloudkitty-processor --config-file /etc/cloudkitty/cloudkitty.conf \
> --log-file /var/log/cloudkitty/processor.log

And we are almost done! We just need to generate some requests to the previously deployed applications, which Cloudkitty will then rate.

In this tutorial, we will use ApacheBench (a.k.a. ab), but feel free to use your favorite HTTP client.
Let’s install it and generate some requests to both our applications.

In another terminal, install the required package:

$ sudo apt install apache2-utils

Let’s send 150 requests to the first application and 90 requests to the second one:

$ ab -H "Host:app1" -n 150 -c 10 http://localhost/
$ ab -H "Host:app2" -n 90 -c 10 http://localhost/

That’s it!
Traefik uses the provided Host header to route the requests to the correct backend. We can now look at the metrics in Prometheus by opening http://localhost:9090 in a browser, and at the backends and status codes in Traefik by opening http://localhost:8080.
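As a sanity check, we can estimate what the rating should look like for the requests we just sent. The sketch below assumes that a flat mapping multiplies the measured quantity by the per-request cost, and that Prometheus reports exactly 150 and 90 requests; in practice increase() extrapolates, so the measured quantities may differ slightly.

```python
from decimal import Decimal

# Costs pushed through the cloudkitty-api earlier in this tutorial.
COST_PER_REQUEST = {
    "backend-app1-cloudkitty-prometheus-traefik": Decimal("0.1"),
    "backend-app2-cloudkitty-prometheus-traefik": Decimal("0.3"),
}

# Requests generated with ab.
REQUESTS_SENT = {
    "backend-app1-cloudkitty-prometheus-traefik": 150,
    "backend-app2-cloudkitty-prometheus-traefik": 90,
}


def expected_rating(quantities, costs):
    """Assumed flat rating: quantity multiplied by the per-unit cost."""
    return {
        backend: quantity * costs[backend]
        for backend, quantity in quantities.items()
    }


rating = expected_rating(REQUESTS_SENT, COST_PER_REQUEST)
total = sum(rating.values())
for backend, price in rating.items():
    print(backend, price)
print("total", total)
```

Decimal is used instead of float so that 150 × 0.1 stays exactly 15.0 rather than accumulating binary rounding errors.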

The rating report generation will be the subject of a new article. Stay tuned!

Conclusion

Although Cloudkitty will continue to serve as the OpenStack rating and chargeback solution, it is evolving, and its modular architecture allows it to extend to the whole cloud-native ecosystem. The Prometheus Collector for Cloudkitty reflects that evolution, even though it is still under development.

Cloudkitty needs contributors! If you need new features in the Prometheus Collector, or another collector altogether, feel free to propose and implement them!