How to Integrate Prometheus and Grafana with C++

Cpp

Published: 2024-06-05

Updated: 2024-06-05

Overview

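Monitoring is essential. Good logging can spare you from stepping through a program line by line in gdb to find a bug, and in the same way, judging how well a program behaves requires metrics from several angles. The most common and most widely used of these is QPS (queries per second).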

Prometheus

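In short, this post uses prometheus-cpp, the Prometheus Client Library for Modern C++. You can build it yourself and run make install, but the project also provides an Ubuntu PPA, so I used that. The README does not say which package to install after adding the PPA; the Ubuntu release page shows that it is called prometheus-cpp-dev, which can be installed directly.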

#include <prometheus/counter.h>
#include <prometheus/exposer.h>
#include <prometheus/registry.h>

#include <array>
#include <chrono>
#include <cstdlib>
#include <memory>
#include <string>
#include <thread>

int main() {
  using namespace prometheus;

  // create an http server running on port 8080
  Exposer exposer{"127.0.0.1:8080"};

  // create a metrics registry
  // @note it's the user's responsibility to keep the object alive
  auto registry = std::make_shared<Registry>();

  // add a new counter family to the registry (families combine values with the
  // same name, but distinct label dimensions)
  //
  // @note please follow the metric-naming best-practices:
  // https://prometheus.io/docs/practices/naming/
  auto& packet_counter = BuildCounter()
                             .Name("observed_packets_total")
                             .Help("Number of observed packets")
                             .Register(*registry);

  // add and remember dimensional data, incrementing those is very cheap
  auto& tcp_rx_counter =
      packet_counter.Add({{"protocol", "tcp"}, {"direction", "rx"}});
  auto& tcp_tx_counter =
      packet_counter.Add({{"protocol", "tcp"}, {"direction", "tx"}});
  auto& udp_rx_counter =
      packet_counter.Add({{"protocol", "udp"}, {"direction", "rx"}});
  auto& udp_tx_counter =
      packet_counter.Add({{"protocol", "udp"}, {"direction", "tx"}});

  // add a counter whose dimensional data is not known at compile time
  // nevertheless dimensional values should only occur in low cardinality:
  // https://prometheus.io/docs/practices/naming/#labels
  auto& http_requests_counter = BuildCounter()
                                    .Name("http_requests_total")
                                    .Help("Number of HTTP requests")
                                    .Register(*registry);

  // ask the exposer to scrape the registry on incoming HTTP requests
  exposer.RegisterCollectable(registry);

  for (;;) {
    std::this_thread::sleep_for(std::chrono::seconds(1));
    const auto random_value = std::rand();

    if (random_value & 1) tcp_rx_counter.Increment();
    if (random_value & 2) tcp_tx_counter.Increment();
    if (random_value & 4) udp_rx_counter.Increment();
    if (random_value & 8) udp_tx_counter.Increment();

    const std::array<std::string, 4> methods = {"GET", "PUT", "POST", "HEAD"};
    auto method = methods.at(random_value % methods.size());
    // dynamically calling Family<T>.Add() works but is slow and should be
    // avoided
    http_requests_counter.Add({{"method", method}}).Increment();
  }
  return 0;
}
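
To make it clearer what Prometheus receives when it scrapes the exposer, here is a minimal sketch that renders a counter family in the Prometheus text exposition format using only the standard library. CounterSample and RenderCounterFamily are hypothetical names made up for this illustration and are not part of prometheus-cpp. The sketch also skips the escaping that label values need in the real format, and its label order simply follows std::map, so the library's actual output may differ in details.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for one labelled counter series (not a prometheus-cpp type).
struct CounterSample {
  std::map<std::string, std::string> labels;  // e.g. {"protocol", "tcp"}
  std::uint64_t value;                        // counters only ever go up
};

// Render one counter family in the text exposition format:
//   # HELP <name> <help>
//   # TYPE <name> counter
//   <name>{<label>="<value>",...} <value>
// Assumption: label values contain no quotes, backslashes or newlines,
// so no escaping is performed here.
std::string RenderCounterFamily(const std::string& name,
                                const std::string& help,
                                const std::vector<CounterSample>& samples) {
  assert(!name.empty());
  std::ostringstream out;
  out << "# HELP " << name << ' ' << help << '\n';
  out << "# TYPE " << name << " counter\n";
  for (const auto& sample : samples) {
    out << name;
    if (!sample.labels.empty()) {
      out << '{';
      bool first = true;
      for (const auto& label : sample.labels) {
        if (!first) out << ',';
        out << label.first << "=\"" << label.second << '"';
        first = false;
      }
      out << '}';
    }
    out << ' ' << sample.value << '\n';
  }
  return out.str();
}
```

Each distinct label combination becomes its own line, and therefore its own time series on the Prometheus side, which is why the comments in the example above ask for low-cardinality label values.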

docker-compose

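To visualize the monitoring data, we also need to run the Prometheus server and Grafana, the visualization tool. A docker-compose file that pulls the latest images is enough. node_exporter is included to collect system-level metrics from the host.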

services:
  prometheus:
    container_name: prometheus
    image: prom/prometheus:latest
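    # NOTE: the Compose specification says "ports" must not be combined with
    # network_mode: "host"; keep only one of the two.
    network_mode: "host"
    volumes:
      - prometheus_yml_path:/etc/prometheus/prometheus.yml
      - /etc/localtime:/etc/localtime
    ports:
      - "port1:9090"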
    restart: on-failure

  grafana:
    container_name: grafana
    image: grafana/grafana:latest
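    # NOTE: the Compose specification says "ports" must not be combined with
    # network_mode: "host"; keep only one of the two.
    network_mode: "host"
    ports:
      - "port2:3000"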
    restart: on-failure
    volumes:
      - /etc/localtime:/etc/localtime
      - ./data/grafana:/var/lib/grafana

  # linux node_exporter
  node_exporter:
    image: quay.io/prometheus/node-exporter:latest
    restart: always
    container_name: qps_node_exporter
    command:
      - '--path.rootfs=/host'
    ports:
      - "port3:9100"
    volumes:
      - your_path

Next, write the Prometheus configuration file (the prometheus_yml_path mounted above):

global:
  scrape_interval: 5s # Scrape targets every 5 seconds. The default is every 1 minute.
  evaluation_interval: 10s # Evaluate rules every 10 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# Scrape configurations, one job per endpoint:
# Prometheus itself, the application server, and node_exporter.
scrape_configs:
  # The job name is added as a label `job=` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["ip:port1"]

  - job_name: "server"
    static_configs:
      - targets: ["ip:port2"]

  - job_name: "node"
    static_configs:
      - targets: ["ip:port3"]
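
With scrape_interval set to 5s, Prometheus stores one sample of each counter every five seconds, and a QPS figure is derived from the difference between samples rather than from the raw counter value. The sketch below illustrates that idea for just two samples. Sample and PerSecondRate are hypothetical names for this illustration, and the logic is a simplification: PromQL's rate() function, as in rate(http_requests_total[1m]), works over all samples in the window and also extrapolates to the window edges, which this sketch does not attempt.

```cpp
#include <cassert>

// One scraped value of a counter (hypothetical type for this sketch).
struct Sample {
  double timestamp_seconds;  // when the scrape happened
  double value;              // counter value at that time
};

// Average per-second increase between two samples of the same counter.
// If the value went down, the process is assumed to have restarted and the
// counter to have begun again from zero, so the newer value is the increase.
double PerSecondRate(const Sample& older, const Sample& newer) {
  assert(newer.timestamp_seconds > older.timestamp_seconds);
  const double elapsed = newer.timestamp_seconds - older.timestamp_seconds;
  const double increase = newer.value >= older.value
                              ? newer.value - older.value
                              : newer.value;  // counter reset
  return increase / elapsed;
}
```

For example, a counter that reads 100 at one scrape and 150 five seconds later corresponds to 10 requests per second over that interval. This is also why the application only needs to expose a monotonically increasing total and can leave the rate calculation to the server.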

Now start everything with docker-compose (for example, docker-compose up -d).

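If Grafana keeps restarting, the usual cause is wrong permissions on the mounted data directory (./data/grafana in the file above). Setting it to 777 is a quick way around this. A successful start typically looks like the figure below: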

Configuring Grafana

Grafana's default username and password are both admin. After logging in, configure a data source first.

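After selecting Prometheus as the data source type, set the data source's HTTP URL on the Prometheus data source configuration page. If clicking the "Save & test" button reports success, the data source is ready.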

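Next, add a node-exporter dashboard to display the metrics scraped from node-exporter as graphs. There is no need to set up each panel by hand: Grafana provides ready-made node-exporter dashboards, so entering the corresponding dashboard ID on Grafana's import page is enough to import the dashboard's settings: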