Install Prometheus and Node Exporter on Rocky Linux 10 / AlmaLinux 10

Prometheus is the go-to open source monitoring system for infrastructure and application observability. It works on a pull-based model, scraping metrics from instrumented targets at defined intervals and storing them in a built-in time-series database (TSDB). Its powerful query language, PromQL, lets you slice, aggregate, and alert on metrics with precision. This guide walks through a full Prometheus stack deployment on Rocky Linux 10 or AlmaLinux 10, covering the Prometheus server, Node Exporter, Alertmanager, alerting rules, and Grafana integration.

Original content from computingforgeeks.com - post 122717

Architecture Overview

Before jumping into installation, it helps to understand how the pieces fit together:

  • Prometheus Server – The core component. It discovers targets, scrapes their /metrics endpoints on a schedule, stores the data in its local TSDB, evaluates alerting rules, and serves PromQL queries.
  • Node Exporter – A lightweight agent that runs on every Linux server you want to monitor. It exposes hardware and OS metrics (CPU, memory, disk, network) on port 9100.
  • Alertmanager – Receives firing alerts from Prometheus, deduplicates them, groups related alerts, and routes notifications to email, Slack, PagerDuty, or other receivers.
  • Grafana – Connects to Prometheus as a data source to build dashboards and visualizations. Not part of Prometheus itself, but nearly always deployed alongside it.

The data flow is straightforward: Node Exporter exposes metrics, Prometheus scrapes and stores them, alerting rules trigger when thresholds are breached, and Alertmanager delivers the notifications.

Prerequisites

  • A Rocky Linux 10 or AlmaLinux 10 server with root or sudo access
  • Firewall access to ports 9090 (Prometheus), 9100 (Node Exporter), and 9093 (Alertmanager)
  • At least 2 GB of RAM for a small deployment (plan more for production)
  • Working DNS resolution or /etc/hosts entries if monitoring multiple hosts

Ensure your system is up to date before proceeding:

sudo dnf update -y

Step 1: Install Prometheus from Binary

Prometheus is distributed as a statically compiled binary. There is no RPM package in the base repos, so we install it manually and manage it with systemd. Start by creating a dedicated system user and the required directories.

sudo groupadd --system prometheus
sudo useradd --system -s /sbin/nologin -g prometheus prometheus

Create directories for configuration and data storage:

sudo mkdir -p /etc/prometheus /var/lib/prometheus

Download the latest Prometheus 2.x release. Replace the version number below if a newer release is available on the Prometheus releases page.

PROM_VERSION="2.54.1"
cd /tmp
curl -LO https://github.com/prometheus/prometheus/releases/download/v${PROM_VERSION}/prometheus-${PROM_VERSION}.linux-amd64.tar.gz
tar xvf prometheus-${PROM_VERSION}.linux-amd64.tar.gz

Move the binaries into /usr/local/bin and the console libraries into /etc/prometheus:

cd prometheus-${PROM_VERSION}.linux-amd64
sudo cp prometheus promtool /usr/local/bin/
sudo cp -r consoles console_libraries /etc/prometheus/

Set ownership on all Prometheus directories:

sudo chown -R prometheus:prometheus /etc/prometheus /var/lib/prometheus

Verify the installation:

prometheus --version

Step 2: Configure prometheus.yml

The main configuration file controls global settings, scrape jobs, and rule file paths. Create /etc/prometheus/prometheus.yml with the following content:

sudo tee /etc/prometheus/prometheus.yml <<'EOF'
global:
  scrape_interval: 15s
  evaluation_interval: 15s
  scrape_timeout: 10s

alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - localhost:9093

rule_files:
  - "/etc/prometheus/rules/*.yml"

scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]

  - job_name: "node_exporter"
    static_configs:
      - targets:
          - "localhost:9100"
          - "server2.example.com:9100"
          - "server3.example.com:9100"
EOF

Key parameters to understand:

  • scrape_interval – How often Prometheus pulls metrics from each target. 15 seconds is a solid default. Lower values give more granularity but increase storage and load.
  • evaluation_interval – How often alerting and recording rules are evaluated.
  • scrape_timeout – If a target does not respond within this window, the scrape is marked as failed.
  • static_configs targets – A list of host:port pairs for each scrape job. For dynamic environments, consider using file-based service discovery or Consul SD instead.

Validate the configuration before starting the service:

promtool check config /etc/prometheus/prometheus.yml

Step 3: Create Prometheus Systemd Service

Create a systemd unit file so Prometheus starts on boot and can be managed with standard systemctl commands:

sudo tee /etc/systemd/system/prometheus.service <<'EOF'
[Unit]
Description=Prometheus Monitoring System
Documentation=https://prometheus.io/docs/
Wants=network-online.target
After=network-online.target

[Service]
User=prometheus
Group=prometheus
Type=simple
ExecReload=/bin/kill -HUP $MAINPID
ExecStart=/usr/local/bin/prometheus \
  --config.file=/etc/prometheus/prometheus.yml \
  --storage.tsdb.path=/var/lib/prometheus \
  --storage.tsdb.retention.time=30d \
  --web.console.templates=/etc/prometheus/consoles \
  --web.console.libraries=/etc/prometheus/console_libraries \
  --web.listen-address=0.0.0.0:9090 \
  --web.enable-lifecycle
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

Start and enable the service:

sudo systemctl daemon-reload
sudo systemctl enable --now prometheus

Check that Prometheus is running:

sudo systemctl status prometheus

Open the firewall port if firewalld is active:

sudo firewall-cmd --permanent --add-port=9090/tcp
sudo firewall-cmd --reload

At this point, browsing to http://your-server-ip:9090 should show the Prometheus web UI. The “prometheus” target under Status > Targets should be in the UP state.

Step 4: Install Node Exporter on Target Servers

Node Exporter needs to run on every Linux server you want to collect OS-level metrics from. Repeat these steps on each target host, including the Prometheus server itself if you want host metrics from it.

Create a dedicated user for Node Exporter:

sudo useradd --system -s /sbin/nologin node_exporter

Download and install the binary:

NODE_VERSION="1.8.2"
cd /tmp
curl -LO https://github.com/prometheus/node_exporter/releases/download/v${NODE_VERSION}/node_exporter-${NODE_VERSION}.linux-amd64.tar.gz
tar xvf node_exporter-${NODE_VERSION}.linux-amd64.tar.gz
sudo cp node_exporter-${NODE_VERSION}.linux-amd64/node_exporter /usr/local/bin/

Create the systemd unit file:

sudo tee /etc/systemd/system/node_exporter.service <<'EOF'
[Unit]
Description=Prometheus Node Exporter
Documentation=https://prometheus.io/docs/guides/node-exporter/
Wants=network-online.target
After=network-online.target

[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/usr/local/bin/node_exporter \
  --collector.systemd \
  --collector.processes
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

Start and enable Node Exporter:

sudo systemctl daemon-reload
sudo systemctl enable --now node_exporter

Open the firewall port:

sudo firewall-cmd --permanent --add-port=9100/tcp
sudo firewall-cmd --reload

Confirm metrics are being exposed by curling the endpoint:

curl -s http://localhost:9100/metrics | head -20

You should see lines like node_cpu_seconds_total, node_memory_MemTotal_bytes, and hundreds of other metric families.
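The exposition format Prometheus scrapes is deliberately simple: `# HELP` and `# TYPE` comment lines followed by `metric{labels} value` samples. As a minimal illustration of what a scrape actually parses, the snippet below pipes a few representative lines through awk (the sample values are made up, not real output):

```shell
# Skip comment lines and print name=value pairs from exposition-format text
# (hypothetical sample data standing in for real Node Exporter output):
cat <<'EOF' | awk '!/^#/ { print $1 "=" $2 }'
# HELP node_memory_MemTotal_bytes Memory information field MemTotal_bytes.
# TYPE node_memory_MemTotal_bytes gauge
node_memory_MemTotal_bytes 4.294967296e+09
# HELP node_boot_time_seconds Node boot time, in unixtime.
# TYPE node_boot_time_seconds gauge
node_boot_time_seconds 1.7e+09
EOF
```

Any tool that can read this text format can consume exporter metrics, which is why debugging with curl and grep works so well.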

Step 5: Add Node Exporter Targets to Prometheus

If you already listed your targets in the prometheus.yml file from Step 2, they will be scraped automatically. To add more servers later without restarting Prometheus, use file-based service discovery. Update the node_exporter job in prometheus.yml:

  - job_name: "node_exporter"
    file_sd_configs:
      - files:
          - "/etc/prometheus/targets/nodes.yml"
        refresh_interval: 30s

Then create the targets file:

sudo mkdir -p /etc/prometheus/targets

sudo tee /etc/prometheus/targets/nodes.yml <<'EOF'
- targets:
    - "192.168.1.10:9100"
    - "192.168.1.11:9100"
    - "192.168.1.12:9100"
  labels:
    env: "production"
    dc: "us-east-1"
EOF

Prometheus picks up changes to this file automatically based on the refresh_interval, so you can add or remove targets without touching the main config. Reload the configuration to apply the job change:

sudo systemctl reload prometheus

Step 6: Verify Metrics and Run PromQL Queries

Open the Prometheus web UI at http://your-server-ip:9090. Navigate to Status > Targets and confirm all node_exporter endpoints show a State of UP with a recent Last Scrape timestamp.

Head to the Graph tab and try these PromQL queries to make sure data is flowing:

CPU usage percentage per host (1-minute average):

100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[1m])) * 100)

Available memory in GB:

node_memory_MemAvailable_bytes / 1024 / 1024 / 1024

Disk space used percentage on root filesystem:

100 - ((node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}) * 100)

Network traffic received in Mbps:

rate(node_network_receive_bytes_total{device!="lo"}[5m]) * 8 / 1024 / 1024
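To build intuition for what rate() computes in queries like the one above, here is the same arithmetic done by hand on two hypothetical counter samples: the per-second rate is simply the counter's increase divided by the elapsed time.

```shell
# rate() by hand: counter increase divided by elapsed seconds.
# Values are hypothetical samples of node_network_receive_bytes_total.
awk 'BEGIN {
  v1 = 1250000000          # counter value at t = 0s (bytes)
  v2 = 1256291456          # counter value at t = 60s
  bytes_per_sec = (v2 - v1) / 60
  mbps = bytes_per_sec * 8 / 1024 / 1024
  printf "%.2f Mbps\n", mbps    # prints "0.80 Mbps"
}'
```

Prometheus does the same calculation across the samples inside the range selector (here [5m]), with extrapolation to the window edges and automatic handling of counter resets.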

If these queries return data, your Prometheus and Node Exporter stack is working correctly. For a deeper look at monitoring Linux servers, see our guide on monitoring Linux servers with Prometheus and Grafana.

Step 7: Install Alertmanager

Alertmanager handles alert routing and notification delivery. Download and install it the same way as Prometheus:

sudo useradd --system -s /sbin/nologin alertmanager
sudo mkdir -p /etc/alertmanager /var/lib/alertmanager

AM_VERSION="0.27.0"
cd /tmp
curl -LO https://github.com/prometheus/alertmanager/releases/download/v${AM_VERSION}/alertmanager-${AM_VERSION}.linux-amd64.tar.gz
tar xvf alertmanager-${AM_VERSION}.linux-amd64.tar.gz
sudo cp alertmanager-${AM_VERSION}.linux-amd64/alertmanager /usr/local/bin/
sudo cp alertmanager-${AM_VERSION}.linux-amd64/amtool /usr/local/bin/

Create the Alertmanager configuration with email and Slack receivers:

sudo tee /etc/alertmanager/alertmanager.yml <<'EOF'
global:
  resolve_timeout: 5m
  smtp_smarthost: 'smtp.example.com:587'
  smtp_from: '[email protected]'
  smtp_auth_username: '[email protected]'
  smtp_auth_password: 'your-smtp-password'
  smtp_require_tls: true

route:
  group_by: ['alertname', 'instance']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  receiver: 'email-notifications'
  routes:
    - match:
        severity: critical
      receiver: 'slack-critical'

receivers:
  - name: 'email-notifications'
    email_configs:
      - to: '[email protected]'
        send_resolved: true

  - name: 'slack-critical'
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/YOUR/WEBHOOK/URL'
        channel: '#alerts-critical'
        send_resolved: true
        title: '{{ .GroupLabels.alertname }}'
        text: '{{ range .Alerts }}{{ .Annotations.description }}{{ end }}'

inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'instance']
EOF

Set ownership and create the systemd unit:

sudo chown -R alertmanager:alertmanager /etc/alertmanager /var/lib/alertmanager

sudo tee /etc/systemd/system/alertmanager.service <<'EOF'
[Unit]
Description=Prometheus Alertmanager
Documentation=https://prometheus.io/docs/alerting/alertmanager/
Wants=network-online.target
After=network-online.target

[Service]
User=alertmanager
Group=alertmanager
Type=simple
ExecStart=/usr/local/bin/alertmanager \
  --config.file=/etc/alertmanager/alertmanager.yml \
  --storage.path=/var/lib/alertmanager \
  --web.listen-address=0.0.0.0:9093
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

Start the service:

sudo systemctl daemon-reload
sudo systemctl enable --now alertmanager

Open the firewall port and verify the UI:

sudo firewall-cmd --permanent --add-port=9093/tcp
sudo firewall-cmd --reload

The Alertmanager UI is available at http://your-server-ip:9093. For more on configuring alerting across your infrastructure, check our guide on setting up Alertmanager for Prometheus.

Step 8: Create Recording Rules and Alerting Rules

Recording rules precompute frequently used PromQL expressions, improving dashboard performance. Alerting rules define the conditions that trigger notifications. Create a rules directory and add your rule files:

sudo mkdir -p /etc/prometheus/rules

Create a recording rules file for common aggregations:

sudo tee /etc/prometheus/rules/recording_rules.yml <<'EOF'
groups:
  - name: node_recording_rules
    interval: 15s
    rules:
      - record: instance:node_cpu_utilization:rate5m
        expr: 100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

      - record: instance:node_memory_utilization:ratio
        expr: 1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)

      - record: instance:node_filesystem_usage:ratio
        expr: 1 - (node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"})
EOF

Create an alerting rules file:

sudo tee /etc/prometheus/rules/alerting_rules.yml <<'EOF'
groups:
  - name: node_alerts
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 3m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
          description: "{{ $labels.instance }} of job {{ $labels.job }} has been unreachable for more than 3 minutes."

      - alert: HighCpuUsage
        expr: instance:node_cpu_utilization:rate5m > 85
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "High CPU on {{ $labels.instance }}"
          description: "CPU utilization on {{ $labels.instance }} has been above 85% for 10 minutes. Current value: {{ $value | printf \"%.1f\" }}%"

      - alert: HighMemoryUsage
        expr: instance:node_memory_utilization:ratio > 0.9
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High memory on {{ $labels.instance }}"
          description: "Memory utilization on {{ $labels.instance }} is above 90%. Current value: {{ $value | humanizePercentage }}"

      - alert: DiskSpaceLow
        expr: instance:node_filesystem_usage:ratio > 0.85
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Disk space low on {{ $labels.instance }}"
          description: "Root filesystem on {{ $labels.instance }} is {{ $value | humanizePercentage }} full."

      - alert: NodeExporterDown
        expr: up{job="node_exporter"} == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Node Exporter down on {{ $labels.instance }}"
          description: "Node Exporter on {{ $labels.instance }} has been unreachable for 2 minutes."
EOF

Validate the rules and reload Prometheus:

promtool check rules /etc/prometheus/rules/*.yml
sudo systemctl reload prometheus

Navigate to Alerts in the Prometheus UI to confirm your alerting rules are loaded and showing as Inactive (green). They will transition to Pending and then Firing when conditions are met.
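Alerting rules can also be unit tested offline with promtool test rules, which feeds synthetic series into the rule engine and checks which alerts fire. Below is a sketch of a test file for the InstanceDown rule; the file path and expected annotations are derived from the rules above, but treat this as a starting point and adjust to your layout:

```yaml
# /tmp/alerts_test.yml — run with: promtool test rules /tmp/alerts_test.yml
rule_files:
  - /etc/prometheus/rules/alerting_rules.yml

evaluation_interval: 1m

tests:
  - interval: 1m
    input_series:
      # Simulate a target that is down for the entire test window
      - series: 'up{job="node_exporter", instance="localhost:9100"}'
        values: '0 0 0 0 0 0'
    alert_rule_test:
      - eval_time: 5m
        alertname: InstanceDown
        exp_alerts:
          - exp_labels:
              severity: critical
              job: node_exporter
              instance: localhost:9100
            exp_annotations:
              summary: "Instance localhost:9100 is down"
              description: "localhost:9100 of job node_exporter has been unreachable for more than 3 minutes."
```

With `for: 3m` and the series down from t=0, the alert is firing by the 5-minute evaluation point, so the test passes.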

Step 9: Connect Grafana for Dashboards

Prometheus is excellent for data collection and alerting, but Grafana provides the visualization layer most teams need. If you do not have Grafana installed yet, follow our dedicated guide on installing Grafana on Rocky Linux / AlmaLinux.

Once Grafana is running, add Prometheus as a data source:

  1. Log into Grafana at http://your-server-ip:3000
  2. Go to Connections > Data Sources > Add data source
  3. Select Prometheus
  4. Set the URL to http://localhost:9090 (or the Prometheus server address if Grafana is on a different host)
  5. Click Save and Test to confirm the connection

Import the community Node Exporter dashboard for instant visibility into your fleet:

  1. Go to Dashboards > Import
  2. Enter dashboard ID 1860 (Node Exporter Full) and click Load
  3. Select your Prometheus data source and click Import

This dashboard gives you CPU, memory, disk, and network panels for every host running Node Exporter, with no manual panel creation required.

Production Tips

Running Prometheus in a lab is straightforward. Keeping it reliable in production takes some additional planning.

Retention and storage sizing. The default retention period is 15 days. The systemd unit in this guide sets it to 30 days via --storage.tsdb.retention.time=30d. You can also cap storage by size with --storage.tsdb.retention.size=50GB. After compression, Prometheus uses roughly 1 to 2 bytes per sample. Estimate disk needs as: needed_disk ≈ retention_seconds × ingested_samples_per_second × bytes_per_sample, where ingested samples per second is approximately your active series count divided by the scrape interval.
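As a concrete back-of-the-envelope sketch, assume 100,000 active series scraped every 15 seconds at 2 bytes per sample with 30-day retention (all four figures are assumptions; substitute your own):

```shell
# Rough TSDB disk sizing from assumed workload figures:
SERIES=100000                   # active time series across all targets
SCRAPE_INTERVAL=15              # seconds between scrapes
BYTES_PER_SAMPLE=2              # conservative upper bound
RETENTION=$((30 * 24 * 3600))   # 30 days in seconds

SAMPLES=$((SERIES * RETENTION / SCRAPE_INTERVAL))
echo "$((SAMPLES * BYTES_PER_SAMPLE / 1024 / 1024 / 1024)) GiB"   # prints "32 GiB"
```

Leave generous headroom on top of this estimate: compaction and the WAL need temporary space beyond the steady-state block storage.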

Federation. For large environments, use a hierarchical federation setup. Deploy per-datacenter or per-team Prometheus instances that scrape local targets, then have a global Prometheus federate aggregated metrics from each. This keeps scrape load distributed and TSDB sizes manageable.

Remote write for long-term storage. If you need metrics beyond what local TSDB retention allows, configure remote_write to send data to Thanos, Cortex, VictoriaMetrics, or Grafana Mimir. This decouples long-term storage from the Prometheus server lifecycle.

High availability. Run two identical Prometheus servers scraping the same targets. Alertmanager handles deduplication of alerts from both instances. For queries, put a load balancer in front or use Thanos Query to deduplicate and merge data from both replicas.
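If you later layer Thanos or another deduplicating query layer on top of an HA pair, give each replica a distinguishing external label. A sketch of the global block on each replica (the label names here are common conventions, not requirements):

```yaml
# prometheus.yml on replica 1; replica 2 is identical except replica: prometheus-02
global:
  external_labels:
    cluster: prod
    replica: prometheus-01
```

The query layer strips the replica label when merging results, so identical series from both servers collapse into one.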

Security. Prometheus 2.x supports TLS and basic authentication natively. For production deployments, enable TLS on the Prometheus web interface and on Node Exporter endpoints. Use a reverse proxy like Nginx if you need more advanced authentication or need to expose the UI externally.
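Native TLS and basic auth are configured through a separate web config file passed with --web.config.file, supported by both Prometheus and recent Node Exporter releases. A sketch, assuming certificates already exist at the paths shown; the bcrypt hash is a placeholder you would generate yourself (for example with htpasswd -B):

```yaml
# /etc/prometheus/web.yml — pass with --web.config.file=/etc/prometheus/web.yml
tls_server_config:
  cert_file: /etc/prometheus/tls/prometheus.crt
  key_file: /etc/prometheus/tls/prometheus.key
basic_auth_users:
  # bcrypt hash of the admin password (placeholder, generate your own)
  admin: "$2y$10$REPLACE_WITH_A_REAL_BCRYPT_HASH"
```

Remember that once Prometheus itself requires auth, scrapers and Grafana connecting to it need matching credentials in their configs.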

Troubleshooting Common Issues

Targets showing as DOWN. The most common cause is a firewall blocking port 9100. Verify with curl http://target-ip:9100/metrics from the Prometheus server. If that times out, check firewalld rules on the target host. Also confirm Node Exporter is actually running with systemctl status node_exporter.

TSDB corruption after crash. Prometheus repairs most WAL damage automatically on startup. If it still fails to start after an unclean shutdown with errors referencing the WAL or chunks, check the service logs with journalctl -u prometheus to identify the corrupt segment.

As a last resort, remove the WAL directory and let Prometheus start fresh. You will lose data not yet compacted into blocks (typically the last two to three hours):

sudo systemctl stop prometheus
sudo rm -rf /var/lib/prometheus/wal
sudo systemctl start prometheus

High memory usage. Prometheus memory consumption scales with the number of active time series. Common culprits include high-cardinality labels (like user IDs or request paths in custom metrics). Use prometheus_tsdb_head_series to check your active series count. If it is growing unbounded, audit your metric instrumentation and drop unnecessary labels with metric_relabel_configs.
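Dropping an unneeded metric family or high-cardinality label is done per scrape job with metric_relabel_configs, applied after the scrape but before ingestion. A sketch that would slot into the node_exporter job in prometheus.yml (the regex is an example; match it to your own offending metrics):

```yaml
  - job_name: "node_exporter"
    static_configs:
      - targets: ["localhost:9100"]
    metric_relabel_configs:
      # Drop per-collector scrape bookkeeping metrics we never query
      - source_labels: [__name__]
        regex: "node_scrape_collector_.*"
        action: drop
```

Dropped series never enter the TSDB, so this directly reduces both memory and disk pressure.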

Slow queries. PromQL queries over long time ranges with high cardinality will be slow. Use recording rules (Step 8) to precompute heavy aggregations. Also check if --query.max-samples needs adjustment for legitimate large queries.

Configuration reload not working. If you enabled --web.enable-lifecycle in the systemd unit, you can trigger a hot reload via HTTP:

curl -X POST http://localhost:9090/-/reload

Alternatively, systemctl reload prometheus sends a SIGHUP to the process and achieves the same result.

Conclusion

You now have a complete Prometheus monitoring stack on Rocky Linux 10 or AlmaLinux 10: Prometheus server scraping metrics, Node Exporter exposing host-level data on all targets, Alertmanager routing notifications to email and Slack, recording and alerting rules for common failure conditions, and Grafana for dashboards. From here, expand your setup with service discovery for dynamic infrastructure, remote write for long-term retention, and additional exporters for databases, web servers, and application-specific metrics.
