This article is part of our Smart Infrastructure monitoring series. We’ve already covered how to Install Prometheus Server on CentOS 7 and how to Install Grafana and InfluxDB on CentOS 7. We run a Ceph cluster in production and had been looking for good tools to monitor it. Luckily, we came across Prometheus and Grafana.

Monitoring a Ceph cluster with Prometheus requires a Prometheus exporter that collects metrics about the cluster. In this guide, we’ll use the DigitalOcean Ceph exporter.

Prerequisites:

  1. An installed Prometheus server.
  2. An installed Grafana server.
  3. Docker installed on the server that will run the Prometheus Ceph exporter. This server must be able to reach the Ceph cluster.
  4. A working Ceph cluster.
  5. Access to the Ceph cluster, so you can copy the ceph.conf configuration file and the ceph.<user>.keyring file that the exporter uses to authenticate to your cluster.
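
Before moving on, it can help to confirm that the two files from item 5 are in place on the exporter host. The function below is a minimal sketch, not part of the exporter project: check_ceph_files is a hypothetical helper name, and the defaults /etc/ceph and client.admin are assumptions that you should replace with your own directory and Ceph user.

```shell
# Sketch: verify that the Ceph client files the exporter needs are readable.
# Assumptions: files live in /etc/ceph and the Ceph user is client.admin.
check_ceph_files() {
    local dir="${1:-/etc/ceph}"
    local user="${2:-client.admin}"
    local missing=0
    local f
    for f in "$dir/ceph.conf" "$dir/ceph.$user.keyring"; do
        if [ -r "$f" ]; then
            echo "found: $f"
        else
            echo "missing or unreadable: $f" >&2
            missing=1
        fi
    done
    # Returns 0 when both files are readable, 1 otherwise.
    return "$missing"
}

# Usage: check_ceph_files /etc/ceph client.admin
```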

Follow the steps below to set this up.

Step 1: Install Prometheus Server and Grafana

If Prometheus and Grafana are not installed yet, follow these guides first:

Install Prometheus Server on CentOS 7 and Install Grafana and InfluxDB on CentOS 7.

Step 2: Install Docker on Prometheus Ceph exporter client

Please note that the Prometheus Ceph exporter host needs access to the Ceph cluster network so that it can pull cluster metrics. Install Docker on this server using our Docker installation guide:

How to install Docker CE on Ubuntu / Debian / Fedora / Arch / CentOS

Also install docker-compose. On CentOS 7 the docker-compose package typically comes from the EPEL repository rather than the Docker repository, so enable EPEL first if yum cannot find it. On CentOS, run:

sudo yum -y install docker-compose

On Ubuntu, run:

sudo apt-get -y install docker-compose

Step 3: Build Ceph Exporter Docker image

Once Docker Engine is installed and the service is running, you are ready to build the Docker image from the DigitalOcean Ceph exporter project. Install Git first if you don’t have it already:

sudo yum -y install git

If you’re using Ubuntu, run:

sudo apt-get -y install git

Then clone the project from GitHub:

git clone https://github.com/digitalocean/ceph_exporter.git

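Switch to the ceph_exporter directory and build the Docker image. The trailing dot tells Docker to use the current directory as the build context:

$ cd ceph_exporter
$ docker build -t ceph_exporter .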

This will build an image named ceph_exporter. It may take a while depending on your internet and disk write speeds.

Step 4: Start Prometheus ceph exporter client container

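Copy the ceph.conf configuration file and the ceph.<user>.keyring file to the /etc/ceph directory, then start the Docker container using the host’s network stack. Because of host networking, Docker ignores published ports, so the -p flag in the docker run commands below is redundant but harmless. You can use plain docker commands, docker-compose or systemd to manage the container. With the docker command line tool, run the command below. It uses the published digitalocean/ceph_exporter image, so substitute ceph_exporter to run the image built in Step 3.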

$ docker run -it -v /etc/ceph:/etc/ceph --net=host \
-p=9128:9128 digitalocean/ceph_exporter

For docker-compose, create a docker-compose.yml file with the following content:

# Example usage of exporter in use
version: '2'
services:
  ceph-exporter:
    image: ceph_exporter
    restart: always
    network_mode: "host"
    volumes:
        - /etc/ceph:/etc/ceph
    ports:
        - '9128:9128'

Then start the container from the directory that holds the file:

$ docker-compose up -d

For systemd, create a service unit file like the one below:

$ sudo vim /etc/systemd/system/ceph_exporter.service

[Unit]
Description=Prometheus Ceph exporter
After=docker.service
Requires=docker.service

[Service]
Restart=always
TimeoutStartSec=0
ExecStartPre=-/usr/bin/docker kill ceph_exporter
ExecStartPre=-/usr/bin/docker rm ceph_exporter

ExecStart=/usr/bin/docker run \
--name ceph_exporter \
-v /etc/ceph:/etc/ceph \
--net=host \
-p=9128:9128 \
ceph_exporter

ExecStop=-/usr/bin/docker kill ceph_exporter
ExecStop=-/usr/bin/docker rm ceph_exporter

[Install]
WantedBy=multi-user.target

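Reload systemd so that it picks up the new unit, start the service, then check its status:

sudo systemctl daemon-reload
sudo systemctl start ceph_exporter
sudo systemctl status ceph_exporter

If all went well, the service is reported as active (running).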

Step 5: Open port 9128 on the firewall

This server runs CentOS 7, so I use firewalld. Allow access to port 9128 from your trusted network, replacing 192.168.10.0/24 with your own subnet:

$ sudo firewall-cmd --permanent --add-rich-rule \
'rule family="ipv4" source address="192.168.10.0/24" port protocol="tcp" port="9128" accept'

$ sudo firewall-cmd --reload
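
If you manage several networks or ports, the rule text can be generated rather than typed by hand. This is only a sketch: ceph_exporter_rich_rule is a hypothetical helper, and the subnet in the usage lines is the same example network used above.

```shell
# Sketch: build the firewalld rich rule text for a source network and port.
# The port defaults to 9128, the exporter port used in this guide.
ceph_exporter_rich_rule() {
    local cidr="$1"
    local port="${2:-9128}"
    printf 'rule family="ipv4" source address="%s" port protocol="tcp" port="%s" accept\n' \
        "$cidr" "$port"
}

# Usage:
#   sudo firewall-cmd --permanent --add-rich-rule "$(ceph_exporter_rich_rule 192.168.10.0/24)"
#   sudo firewall-cmd --reload
#   sudo firewall-cmd --list-rich-rules   # confirm the rule is active
```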

Test access with the telnet or nc command:

$ telnet 127.0.0.1 9128
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.

$ nc -v 127.0.0.1 9128
Ncat: Version 6.40 ( http://nmap.org/ncat )
Ncat: Connected to 127.0.0.1:9128.
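
An open port only shows that something is listening. To check that the exporter is actually returning Ceph data, you can count the samples it serves. The sketch below assumes that the exporter exposes metrics at /metrics and that its metric names start with ceph_; count_ceph_metrics is a hypothetical helper name.

```shell
# Sketch: count exporter samples in Prometheus text-format output read from stdin.
# Assumption: the exporter's metric names start with "ceph_".
# Comment lines (# HELP, # TYPE) are not counted because they start with "#".
count_ceph_metrics() {
    grep -c '^ceph_' || true   # grep exits non-zero when the count is 0
}

# Usage on the exporter host (the /metrics path is an assumption):
#   curl -s http://127.0.0.1:9128/metrics | count_ceph_metrics
```

A result of 0 suggests the exporter is running but cannot read from the cluster, which usually points back to the files in /etc/ceph.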

Step 6: Configure Prometheus scrape target with Ceph exporter

Next, add a scrape job with a static_configs entry for the Ceph exporter container. Edit the file /etc/prometheus/prometheus.yml on your Prometheus server so that it looks like this:

scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'ceph-exporter'
    static_configs:
      - targets: ['localhost:9128']
        labels:
          alias: ceph-exporter

In the ceph-exporter job, replace localhost with the IP address of your Ceph exporter host. Remember to restart the Prometheus service after making the changes:

sudo systemctl restart prometheus
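
If you add exporters on more than one host, generating the job entry avoids copy and paste mistakes. The function below is a sketch under stated assumptions: ceph_scrape_job is a hypothetical helper, the address in the usage line is a placeholder, and the indentation of the output may need adjusting to match your existing file.

```shell
# Sketch: print a scrape job entry for a Ceph exporter on the given host.
# The port defaults to 9128, the exporter port used in this guide.
ceph_scrape_job() {
    local host="$1"
    local port="${2:-9128}"
    cat <<EOF
  - job_name: 'ceph-exporter'
    static_configs:
      - targets: ['${host}:${port}']
        labels:
          alias: ceph-exporter
EOF
}

# Usage (192.168.10.20 is a placeholder address):
#   ceph_scrape_job 192.168.10.20
```

If promtool is installed alongside Prometheus, running promtool check config /etc/prometheus/prometheus.yml before a restart catches indentation mistakes.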

Step 7: Add Prometheus Data Source to Grafana

Log in to your Grafana dashboard and add a Prometheus data source. You’ll need to provide the following information:

Name: Name given to this data source
Type: The type of data source, in our case this is Prometheus
URL: IP address and port number of Prometheus server you’re adding.
Access: Whether queries go through a proxy or directly. Proxy means the Grafana server makes the requests to Prometheus; direct means the browser makes them.

Save the settings by clicking the Save & Test button.
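
The same fields can also be sent to Grafana’s HTTP API instead of being typed into the form. This is a rough sketch only: prometheus_datasource_json is a hypothetical helper, and the credentials and addresses in the usage lines are placeholders that you must replace with your own.

```shell
# Sketch: build the JSON body for Grafana's POST /api/datasources endpoint.
# Fields mirror the form above: name, type, URL and access mode (proxy).
prometheus_datasource_json() {
    local name="$1"
    local url="$2"
    printf '{"name":"%s","type":"prometheus","url":"%s","access":"proxy"}\n' \
        "$name" "$url"
}

# Usage (admin:admin and the localhost addresses are placeholders):
#   prometheus_datasource_json Prometheus http://localhost:9090 |
#     curl -s -u admin:admin -H 'Content-Type: application/json' \
#          -X POST -d @- http://localhost:3000/api/datasources
```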

Step 8: Import Ceph Cluster Grafana Dashboards

The last step is to import the Ceph cluster Grafana dashboards. From my research, I found the following dashboards by Cristian Calin:

Ceph Cluster Overview: https://grafana.com/dashboards/917
Ceph Pools Overview: https://grafana.com/dashboards/926
Ceph OSD Overview: https://grafana.com/dashboards/923

We will use dashboard IDs 917, 926 and 923 when importing dashboards on Grafana.

Click the plus sign (+) > Import to import a dashboard, then enter the ID of the dashboard you wish to import from the list above.

To view imported dashboards, go to Dashboards and select the name of the dashboard you want to view.

For the OSD and Pools dashboards, you need to select the pool name or OSD number to view its usage and status. The SUSE team has similar dashboards available at https://github.com/SUSE/grafana-dashboards-ceph

Other Prometheus Monitoring guides:

How to Monitor Redis Server with Prometheus and Grafana in 5 minutes

How to Monitor Linux Server Performance with Prometheus and Grafana in 5 minutes

How to Monitor BIND DNS server with Prometheus and Grafana

Monitoring MySQL / MariaDB with Prometheus in five minutes

How to Monitor Apache Web Server with Prometheus and Grafana in 5 minutes