DailyGlimpse

Monitor Fail2ban in Real-Time with Prometheus and Grafana

May 4, 2026 · 2:55 AM

In the tenth lesson of the Fail2ban Mastery course, we tackle a critical issue: the silent failure of Fail2ban. Without proper monitoring, a stopped service can leave your server exposed. This lesson shows how to set up Prometheus and Grafana to visualize Fail2ban metrics and receive alerts.

The Silent Failure Problem

Fail2ban can stop doing its job because of log rotation, configuration errors, or resource limits, often without any obvious sign: the service may have exited, or it may still be running but no longer matching new log lines. Until someone notices, brute-force attempts go unblocked. This is where monitoring becomes essential.

Diagnostic Commands with fail2ban-client

Before diving into monitoring, learn to check Fail2ban status manually:

sudo fail2ban-client status
sudo fail2ban-client status sshd

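The first command lists the active jails; the second shows failure and ban counts for the sshd jail along with its currently banned IPs. Both report only the current state, with no history and no alerting.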
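For scripting, the per-jail status output can be reduced to a single number. The sketch below assumes the usual "Currently banned:" line in that output; banned_count is a hypothetical helper name, not part of fail2ban-client.

```shell
#!/usr/bin/env bash
# banned_count: read `fail2ban-client status <jail>` output on stdin and
# print the number that follows "Currently banned:".
# Hypothetical helper; assumes the standard status layout.
banned_count() {
  awk -F': *' '/Currently banned:/ { gsub(/[^0-9]/, "", $2); print $2; exit }'
}

# Usage (needs root and a running Fail2ban):
#   sudo fail2ban-client status sshd | banned_count
```

A cron job or health check could compare this number against a threshold, but it still gives no history, which is the gap the exporter fills.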

Reading the Fail2ban Log

The log file /var/log/fail2ban.log contains detailed information about bans, unbans, and errors. Logs are useful for troubleshooting but are not designed for real-time monitoring.
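As a quick troubleshooting aid, the ban entries in the log can be tallied per jail. This is a minimal sketch that assumes the default log line layout (date, time, component, PID, level, jail in brackets, action, IP); bans_per_jail is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# bans_per_jail: count "Ban" actions per jail in Fail2ban log lines.
# Reads the files given as arguments, or stdin when none are given.
# Assumes the default layout, e.g.:
#   2026-05-04 02:10:11,532 fail2ban.actions [812]: NOTICE [sshd] Ban 203.0.113.7
bans_per_jail() {
  awk '$3 == "fail2ban.actions" && $7 == "Ban" {
         jail = $6
         gsub(/^\[|\]$/, "", jail)   # strip the surrounding brackets
         count[jail]++
       }
       END { for (j in count) print j, count[j] }' "$@" | sort
}

# Usage:
#   sudo cat /var/log/fail2ban.log | bans_per_jail
```

Unban and "Restore Ban" lines are skipped because their seventh field is not exactly "Ban".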

Installing the Prometheus Exporter

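The fail2ban_exporter exposes Fail2ban metrics in a format Prometheus can scrape. Install a release binary from the exporter's official repository, or compile it from source. The URL below uses a placeholder repository path; substitute the one for the exporter you chose: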

wget https://github.com/example/fail2ban_exporter/releases/latest/download/fail2ban_exporter-linux-amd64.tar.gz
tar xzf fail2ban_exporter-linux-amd64.tar.gz
sudo mv fail2ban_exporter /usr/local/bin/

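Create a systemd service for the exporter so it starts at boot. The exporter reads ban data from the Fail2ban socket and serves it over HTTP; Prometheus then scrapes that HTTP endpoint through a job in prometheus.yml.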
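One way to sketch both pieces is with two small shell functions that print the files for review before installing them. Several details here are assumptions rather than facts about any particular exporter: the unit name, running the binary with no flags, running as root (the Fail2ban socket is normally readable by root only), and the listen port 9191. The job name fail2ban is chosen to match the alert rule later in this lesson.

```shell
#!/usr/bin/env bash
# exporter_unit: print a minimal systemd unit for the exporter.
# Assumptions: binary installed at /usr/local/bin/fail2ban_exporter,
# no command-line flags needed, runs as root to read the Fail2ban socket.
exporter_unit() {
  cat <<'EOF'
[Unit]
Description=Fail2ban Prometheus exporter
After=network.target fail2ban.service

[Service]
User=root
ExecStart=/usr/local/bin/fail2ban_exporter
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
}

# scrape_job: print a scrape job to merge into the existing scrape_configs
# section of prometheus.yml.
# Assumption: the exporter listens on localhost:9191; check its documentation.
scrape_job() {
  cat <<'EOF'
scrape_configs:
  - job_name: fail2ban
    static_configs:
      - targets: ["localhost:9191"]
EOF
}

# Usage:
#   exporter_unit | sudo tee /etc/systemd/system/fail2ban_exporter.service
#   sudo systemctl daemon-reload
#   sudo systemctl enable --now fail2ban_exporter
#   scrape_job    # review, then merge into prometheus.yml by hand
```

Printing to stdout first keeps the functions free of side effects, so the output can be inspected or diffed before anything under /etc changes.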

Building a Grafana Dashboard

Add Prometheus as a data source in Grafana, then create a dashboard with panels for:

  • Number of active bans per jail
  • Ban rate over time
  • Fail2ban service uptime
  • Recent failures from the log
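The data source step can also be done without the Grafana UI, through a provisioning file. The sketch below assumes Prometheus runs on the same host on its default port 9090 and that Grafana was installed from a package, so its provisioning directory is /etc/grafana/provisioning; grafana_datasource is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# grafana_datasource: print a Grafana data source provisioning file.
# Assumption: Prometheus is reachable at http://localhost:9090.
grafana_datasource() {
  cat <<'EOF'
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://localhost:9090
    isDefault: true
EOF
}

# Usage:
#   grafana_datasource | sudo tee /etc/grafana/provisioning/datasources/prometheus.yaml
#   sudo systemctl restart grafana-server
```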

Use these PromQL queries:

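# Active bans per jail
# (metric and label names differ between exporters; confirm them in the exporter's /metrics output)
fail2ban_jail_banned_total{state="banned"}

# Bans per second, averaged over the last 5 minutes
rate(fail2ban_jail_banned_total[5m])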

Alert Hygiene

Set up alerting rules in Prometheus for critical conditions:

  • Fail2ban service down
  • Ban rate spike (possible attack)
  • No bans for 24 hours (possible service failure)

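Example alert rule, which fires when Prometheus has failed to scrape the exporter for five minutes. Note that up tracks the exporter only, so a stopped Fail2ban behind a healthy exporter needs a separate check: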

groups:
  - name: fail2ban
    rules:
      - alert: Fail2banDown
        expr: up{job="fail2ban"} == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Fail2ban exporter is down"
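The other two conditions from the list can be sketched in the same style. The rules below reuse the fail2ban_jail_banned_total metric from the queries above; the jail label, the 0.5 bans-per-second threshold, and the for durations are illustrative assumptions to tune against your own traffic. extra_rules is a hypothetical helper that prints the rule file so it can be validated before Prometheus loads it.

```shell
#!/usr/bin/env bash
# extra_rules: print alert rules for a ban-rate spike and for a 24-hour
# period without any new bans.
# Assumptions: metric name as used in this lesson, a "jail" label on it,
# and placeholder thresholds.
extra_rules() {
  cat <<'EOF'
groups:
  - name: fail2ban-extra
    rules:
      - alert: Fail2banBanRateSpike
        expr: sum by (jail) (rate(fail2ban_jail_banned_total[5m])) > 0.5
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Ban rate spike in jail {{ $labels.jail }}"
      - alert: Fail2banNoBans24h
        expr: sum(increase(fail2ban_jail_banned_total[24h])) == 0
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "No new bans in the last 24 hours"
EOF
}

# Usage:
#   extra_rules > fail2ban-extra.rules.yml
#   promtool check rules fail2ban-extra.rules.yml
```

When the exporter is down, the metric is absent and the second expression returns nothing, so that case is left to the Fail2banDown rule.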

With this monitoring stack in place, a stalled or stopped Fail2ban raises an alert instead of going unnoticed, and you gain visibility into your server's security posture.

This lesson is part of the Fail2ban Mastery course. For the full series, visit the Dargslan YouTube channel.