How to Monitor Linux Servers with Prometheus and Grafana

Monitoring IT infrastructure is essential for ensuring availability and performance. Prometheus and Grafana are a modern, widely used combination for monitoring and observability. This article walks through a complete monitoring stack setup for Linux servers.

Monitoring Stack Architecture

Components:
1. Prometheus: time-series database that scrapes and stores metrics
2. Grafana: dashboards and visualization
3. Node Exporter: agent that exposes Linux system metrics
4. Alertmanager: alert routing and management

1. Installing Prometheus

Download and Set Up Prometheus

# Create a user for Prometheus
sudo useradd --no-create-home --shell /bin/false prometheus

# Download Prometheus (check the latest version at github.com/prometheus/prometheus)
cd /tmp
wget https://github.com/prometheus/prometheus/releases/download/v2.45.0/prometheus-2.45.0.linux-amd64.tar.gz

# Extract
tar xvfz prometheus-2.45.0.linux-amd64.tar.gz

# Copy binaries
sudo cp prometheus-2.45.0.linux-amd64/prometheus /usr/local/bin/
sudo cp prometheus-2.45.0.linux-amd64/promtool /usr/local/bin/

# Set ownership
sudo chown prometheus:prometheus /usr/local/bin/prometheus
sudo chown prometheus:prometheus /usr/local/bin/promtool

# Create config and data directories
sudo mkdir /etc/prometheus
sudo mkdir /var/lib/prometheus

# Copy console templates and libraries (the systemd unit below references these paths)
sudo cp -r prometheus-2.45.0.linux-amd64/consoles /etc/prometheus
sudo cp -r prometheus-2.45.0.linux-amd64/console_libraries /etc/prometheus

sudo chown -R prometheus:prometheus /etc/prometheus
sudo chown prometheus:prometheus /var/lib/prometheus
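
The version number is repeated in several of the commands above, which makes upgrades easy to get wrong. A minimal sketch that derives the archive name and download URL from a single variable is shown here. The helper names prometheus_release_name and prometheus_release_url are hypothetical (they are not part of Prometheus), and the URL pattern is simply the one used in the wget command above.

```shell
#!/bin/bash
# Hypothetical helpers: build release names from one version string,
# following the URL pattern of the wget command in this article.
prometheus_release_name() {
  printf 'prometheus-%s.linux-amd64' "$1"
}

prometheus_release_url() {
  printf 'https://github.com/prometheus/prometheus/releases/download/v%s/%s.tar.gz' \
    "$1" "$(prometheus_release_name "$1")"
}

PROM_VERSION="2.45.0"
echo "Download URL: $(prometheus_release_url "$PROM_VERSION")"
echo "Extracted directory: $(prometheus_release_name "$PROM_VERSION")"
```

With these in place, the download step could be written as wget "$(prometheus_release_url "$PROM_VERSION")", so an upgrade only touches PROM_VERSION.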

Configuring Prometheus

sudo nano /etc/prometheus/prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s
  external_labels:
    monitor: 'prometheus'

alerting:
  alertmanagers:
    - static_configs:
        - targets: ['localhost:9093']

rule_files:
  - "alert_rules.yml"

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'node_exporter'
    static_configs:
      - targets: ['localhost:9100']
  - job_name: 'remote_servers'
    static_configs:
      - targets: ['192.168.1.10:9100', '192.168.1.11:9100']

Alert Rules

sudo nano /etc/prometheus/alert_rules.yml
groups:
  - name: node_alerts
    rules:
      - alert: HighCPUUsage
        expr: 100 - (avg by(instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage detected"
          description: "CPU usage is above 80% for more than 5 minutes"

      - alert: HighMemoryUsage
        expr: (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100 > 85
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High memory usage detected"
          description: "Memory usage is above 85%"

      - alert: DiskSpaceLow
        expr: (node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"} * 100) < 10
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Low disk space"
          description: "Less than 10% disk space remaining"

      - alert: NodeDown
        expr: up{job="node_exporter"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Node exporter is down"
          description: "Node exporter has been down for more than 1 minute"
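
YAML indentation mistakes in these two files are easy to make and will stop Prometheus from starting, so it can help to validate them first. The sketch below wraps promtool's check config and check rules subcommands. The function name check_prometheus_files is a hypothetical helper, and it assumes promtool is on PATH and that the files use the names shown in this article.

```shell
#!/bin/bash
# check_prometheus_files [CONFIG_DIR]: validate the main config and the
# alert rules with promtool before (re)starting the service.
# Hypothetical helper; file names are the ones used in this article.
check_prometheus_files() {
  config_dir="${1:-/etc/prometheus}"
  if ! command -v promtool >/dev/null 2>&1; then
    echo "promtool not found; skipping validation" >&2
    return 127
  fi
  # Stop at the first failing check so the exit status reflects it.
  promtool check config "$config_dir/prometheus.yml" &&
    promtool check rules "$config_dir/alert_rules.yml"
}
```

One possible use is check_prometheus_files && sudo systemctl restart prometheus, so the service is only restarted when both files parse.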

Systemd Service

sudo nano /etc/systemd/system/prometheus.service
[Unit]
Description=Prometheus Monitoring System
Documentation=https://prometheus.io/docs/introduction/overview/
Wants=network-online.target
After=network-online.target

[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus \
  --config.file=/etc/prometheus/prometheus.yml \
  --storage.tsdb.path=/var/lib/prometheus/ \
  --storage.tsdb.retention.time=30d \
  --web.console.templates=/etc/prometheus/consoles \
  --web.console.libraries=/etc/prometheus/console_libraries \
  --web.listen-address=0.0.0.0:9090
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target

# Reload systemd
sudo systemctl daemon-reload

# Enable and start
sudo systemctl enable prometheus
sudo systemctl start prometheus

# Check status
sudo systemctl status prometheus
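
systemctl status only shows that the process is running, not that it is answering HTTP requests. A minimal sketch of a polling check against Prometheus's built-in /-/healthy endpoint follows. The function name wait_for_healthy and the default of 10 attempts are assumptions made for illustration, and curl is assumed to be installed.

```shell
#!/bin/bash
# wait_for_healthy URL [ATTEMPTS]: poll an HTTP health endpoint until it
# answers successfully or the attempts run out. Hypothetical helper.
wait_for_healthy() {
  url="$1"
  attempts="${2:-10}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f: treat HTTP errors as failure, -s: silent, --max-time: never hang
    if curl -fs --max-time 2 -o /dev/null "$url"; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    # Pause between attempts, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep 1
    fi
    i=$((i + 1))
  done
  echo "not healthy after $attempts attempt(s)" >&2
  return 1
}
```

For example, wait_for_healthy http://localhost:9090/-/healthy could be run right after the start command above.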

2. Installing Node Exporter

Node Exporter is an agent that exposes system metrics over HTTP for Prometheus to scrape.

# Download (adjust the version)
cd /tmp
wget https://github.com/prometheus/node_exporter/releases/download/v1.6.1/node_exporter-1.6.1.linux-amd64.tar.gz

# Extract and install
tar xvfz node_exporter-1.6.1.linux-amd64.tar.gz
sudo cp node_exporter-1.6.1.linux-amd64/node_exporter /usr/local/bin/
sudo chown prometheus:prometheus /usr/local/bin/node_exporter

Node Exporter Systemd Service

sudo nano /etc/systemd/system/node_exporter.service
[Unit]
Description=Node Exporter
Wants=network-online.target
After=network-online.target

[Service]
User=prometheus
ExecStart=/usr/local/bin/node_exporter \
  --path.rootfs=/host \
  --collector.filesystem.ignored-mount-points='^/(sys|proc|dev|run)($|/)' \
  --collector.netdev.ignored-devices='^(lo|docker.*|veth.*|br-.*)$'
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target

# Enable and start
sudo systemctl daemon-reload
sudo systemctl enable node_exporter
sudo systemctl start node_exporter
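
To confirm that Node Exporter is actually serving data, its /metrics endpoint on port 9100 can be fetched and searched for a known metric. The sketch below reads the text exposition format on standard input and prints the values for one metric name. The helper name metric_values is hypothetical, the sample lines are illustrative rather than real output, and it assumes samples carry no trailing timestamp (so the value is the last field).

```shell
#!/bin/bash
# metric_values NAME: read Prometheus text exposition format on stdin and
# print the value of every sample whose metric name is exactly NAME.
# Assumes no trailing timestamps, so the value is the last field.
metric_values() {
  awk -v name="$1" '
    /^#/ { next }                      # skip HELP/TYPE comment lines
    {
      metric = $1
      sub(/[{].*/, "", metric)         # drop the {label="..."} part
      if (metric == name) print $NF
    }'
}

# Illustrative sample, not real Node Exporter output.
sample='# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
node_filesystem_avail_bytes{device="/dev/sda1",mountpoint="/"} 1.5e+10'
printf '%s\n' "$sample" | metric_values node_load1   # prints 0.42
```

Against a live exporter this might be used as curl -s http://localhost:9100/metrics | metric_values node_load1.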

3. Installing Grafana

# Install dependencies
sudo apt-get install -y apt-transport-https software-properties-common wget

# Add the Grafana repository
wget -q -O - https://packages.grafana.com/gpg.key | sudo apt-key add -
echo "deb https://packages.grafana.com/oss/deb stable main" | sudo tee /etc/apt/sources.list.d/grafana.list

# Update and install
sudo apt-get update
sudo apt-get install -y grafana

# Enable and start
sudo systemctl daemon-reload
sudo systemctl enable grafana-server
sudo systemctl start grafana-server

# Check status
sudo systemctl status grafana-server

4. Configuring Grafana

Accessing Grafana

Open a browser and go to: http://your-server-ip:3000

Default credentials:
– Username: admin
– Password: admin (you will be prompted to change it on first login)

Add Prometheus Data Source

  1. Click Configuration → Data Sources
  2. Click “Add data source”
  3. Select “Prometheus”
  4. Set URL: http://localhost:9090
  5. Click “Save & Test”

Import Dashboard

  1. Click “+” → Import
  2. Import ID 1860 (Node Exporter Full)
  3. Select Prometheus data source
  4. Click Import

Other recommended dashboards:
Node Exporter: ID 1860
Docker Monitoring: ID 179
Linux Hosts: ID 10180
MySQL Overview: ID 7362

5. Setting Up Alertmanager

Installation

cd /tmp
wget https://github.com/prometheus/alertmanager/releases/download/v0.25.0/alertmanager-0.25.0.linux-amd64.tar.gz
tar xvfz alertmanager-0.25.0.linux-amd64.tar.gz
sudo cp alertmanager-0.25.0.linux-amd64/alertmanager /usr/local/bin/
sudo cp alertmanager-0.25.0.linux-amd64/amtool /usr/local/bin/
sudo chown prometheus:prometheus /usr/local/bin/alertmanager
sudo chown prometheus:prometheus /usr/local/bin/amtool

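# Create the config and data directories
# (/var/lib/alertmanager is the --storage.path used by the service below)
sudo mkdir /etc/alertmanager /var/lib/alertmanager
sudo chown prometheus:prometheus /etc/alertmanager /var/lib/alertmanager

Configuring Alertmanager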

sudo nano /etc/alertmanager/alertmanager.yml
global:
  smtp_smarthost: 'smtp.gmail.com:587'
  smtp_from: '[email protected]'
  smtp_auth_username: '[email protected]'
  smtp_auth_password: 'your-email-password'

templates:
  - '/etc/alertmanager/template/*.tmpl'

route:
  receiver: 'email-notifications'
  group_by: ['alertname', 'severity']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1h

receivers:
  - name: 'email-notifications'
    email_configs:
      # email_configs takes the subject via headers and the plain-text body via text
      - to: '[email protected]'
        headers:
          Subject: 'Prometheus Alert: {{ .GroupLabels.alertname }}'
        text: |
          {{ range .Alerts }}
          Alert: {{ .Annotations.summary }}
          Description: {{ .Annotations.description }}
          Severity: {{ .Labels.severity }}
          {{ end }}

inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'instance']

Alertmanager Service

sudo nano /etc/systemd/system/alertmanager.service
[Unit]
Description=Alertmanager
Wants=network-online.target
After=network-online.target

[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/alertmanager \
  --config.file=/etc/alertmanager/alertmanager.yml \
  --storage.path=/var/lib/alertmanager \
  --web.listen-address=0.0.0.0:9093
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target

# Enable and start
sudo systemctl daemon-reload
sudo systemctl enable alertmanager
sudo systemctl start alertmanager

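6. Custom Metrics with the Textfile Collector

Node Exporter can read custom metrics from text files. It only does so when started with --collector.textfile.directory pointing at the directory that holds them, so add that flag to ExecStart in node_exporter.service with the directory used below.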

# Create a script that generates the metrics
sudo nano /usr/local/bin/custom-metrics.sh
#!/bin/bash
TEXTFILE=/var/lib/node_exporter/textfile_collector

# Backup metrics: archives modified within the last 24 hours
backup_count=$(find /backup -name "*.tar.gz" -mtime -1 | wc -l)
echo "backups_completed_today $backup_count" > "$TEXTFILE/backup.prom"

# Application metrics: error lines among the last 100 log lines
# (overwrite with > rather than append, so the file never holds duplicate samples)
app_errors=$(tail -100 /var/log/app.log | grep -ci error)
echo "application_errors $app_errors" > "$TEXTFILE/app.prom"

# Make it executable
sudo chmod +x /usr/local/bin/custom-metrics.sh

# Create the directory
sudo mkdir -p /var/lib/node_exporter/textfile_collector
sudo chown -R prometheus:prometheus /var/lib/node_exporter

# Add to cron (runs every 5 minutes)
crontab -e

*/5 * * * * /usr/local/bin/custom-metrics.sh
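
Because the script writes straight into the .prom files, Node Exporter could scrape one while it is only half written. A common way around this is to write to a temporary file in the same directory and rename it into place, since a rename within one filesystem is atomic. The sketch below shows that pattern. The helper name write_prom_file is hypothetical, and the demo uses a throwaway directory rather than the real collector path.

```shell
#!/bin/bash
# write_prom_file DIR NAME: read metric lines on stdin and publish them as
# DIR/NAME.prom atomically (temp file in the same directory, then rename).
# Hypothetical helper; the temp file has no .prom suffix while in progress.
write_prom_file() {
  dir="$1"
  name="$2"
  tmp="$(mktemp "$dir/.$name.XXXXXX")" || return 1
  # mktemp creates the file with mode 600; make it readable by the exporter.
  if cat > "$tmp" && chmod 644 "$tmp" && mv "$tmp" "$dir/$name.prom"; then
    return 0
  fi
  rm -f "$tmp"
  return 1
}

# Demo against a throwaway directory instead of the real collector path.
demo_dir="$(mktemp -d)"
echo "backups_completed_today 3" | write_prom_file "$demo_dir" backup
cat "$demo_dir/backup.prom"   # prints: backups_completed_today 3
```

In the metrics script, each echo ... > "$TEXTFILE/x.prom" line could then become echo ... | write_prom_file "$TEXTFILE" x.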

7. Query Examples (PromQL)

CPU Usage

# CPU usage percentage
100 - (avg by(instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

# CPU usage per core

100 - (avg by(instance, cpu) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

Memory Usage

# Memory usage percentage
(node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100

# Memory usage in GB

(node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / 1024 / 1024 / 1024
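
To make the percentage formula concrete, the same arithmetic can be reproduced outside Prometheus. This is a small worked example with made-up numbers (8 GiB total, 2 GiB available), not measurements from a real host. The helper name mem_used_percent is hypothetical.

```shell
#!/bin/bash
# mem_used_percent TOTAL_BYTES AVAILABLE_BYTES
# Same arithmetic as the PromQL expression:
#   (MemTotal - MemAvailable) / MemTotal * 100
mem_used_percent() {
  awk -v total="$1" -v avail="$2" \
    'BEGIN { printf "%.1f\n", (total - avail) / total * 100 }'
}

# Illustrative numbers: 8 GiB total, 2 GiB available.
mem_used_percent 8589934592 2147483648   # prints 75.0
```

At 75% this hypothetical host would stay below the 85% threshold used by the HighMemoryUsage alert.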

Disk Usage

# Disk usage percentage
(node_filesystem_size_bytes{mountpoint="/"} - node_filesystem_avail_bytes{mountpoint="/"}) / node_filesystem_size_bytes{mountpoint="/"} * 100

# Disk free space in GB

node_filesystem_avail_bytes{mountpoint="/"} / 1024 / 1024 / 1024

Network

# Network traffic rate
irate(node_network_receive_bytes_total[5m])
irate(node_network_transmit_bytes_total[5m])

8. Backup and Maintenance

Backup Prometheus Data

#!/bin/bash
# backup-prometheus.sh

DATE=$(date +%Y%m%d)
BACKUP_DIR=/backup/prometheus
mkdir -p "$BACKUP_DIR"

# Stop Prometheus
sudo systemctl stop prometheus

# Back up data
tar -czf "$BACKUP_DIR/prometheus-data-$DATE.tar.gz" /var/lib/prometheus/

# Back up config
tar -czf "$BACKUP_DIR/prometheus-config-$DATE.tar.gz" /etc/prometheus/

# Start Prometheus
sudo systemctl start prometheus

# Keep 7 days
find "$BACKUP_DIR" -name "*.tar.gz" -mtime +7 -delete
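
The cleanup step deletes without showing what it matched, and find counts age in whole 24-hour periods, so -mtime +7 only matches files that are at least 8 days old. A small sketch for previewing the matches first is shown here. The helper name list_expired_backups is hypothetical, and the demo relies on touch -d from GNU coreutils to backdate a file in a throwaway directory.

```shell
#!/bin/bash
# list_expired_backups DIR DAYS: print the archives the cleanup step would
# delete, without deleting anything. Hypothetical helper.
list_expired_backups() {
  find "$1" -name "*.tar.gz" -mtime +"$2" -print
}

# Demo in a throwaway directory; touch -d (GNU coreutils) backdates a file.
demo_backups="$(mktemp -d)"
touch "$demo_backups/prometheus-data-new.tar.gz"
touch -d "10 days ago" "$demo_backups/prometheus-data-old.tar.gz"
list_expired_backups "$demo_backups" 7   # prints only the *-old.tar.gz path
```

Running list_expired_backups /backup/prometheus 7 before enabling the -delete line is one way to check the retention window.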

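Routine Maintenance

# Check Prometheus disk usage
du -sh /var/lib/prometheus/

# Data compaction: Prometheus does this automatically, so no manual step is needed

# Analyze the TSDB (a read-only report on label cardinality and churn; it does not compact data)
promtool tsdb analyze /var/lib/prometheus/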

Conclusion

A Prometheus and Grafana setup provides:

  1. Real-time metrics for system resources
  2. Historical data for trend analysis
  3. Alerting for proactive monitoring
  4. Dashboards for visualization
  5. A scalable architecture for multi-server monitoring

This stack is a widely used standard for monitoring modern infrastructure and is well suited to production environments.

Written by

Hendra Wijaya
