Monitoring Microservices with Prometheus and Grafana

December 28, 2023 • 14 min read • Amol Tribhuwan

Observability is the cornerstone of reliable distributed systems. In this guide, we'll set up a complete monitoring stack using Prometheus for metrics collection and Grafana for visualization.

Why Prometheus?

Prometheus has become the industry standard for cloud-native monitoring because of:

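Pull-based model: Prometheus scrapes metrics from your services over HTTP, so services do not need to know where the monitoring backend lives
Service Discovery: Automatically finds scrape targets in Kubernetes, Consul, and similar platforms
PromQL: A query language for selecting, aggregating, and rating time series
Dimensional Data: Every series carries key/value labels, which makes filtering and grouping easy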
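
To make the idea of dimensional data more concrete, here is a minimal sketch in plain JavaScript of how label matching narrows down a set of series. It only imitates what a PromQL selector such as http_requests_total{method="GET"} does; the metric name, labels, and values are made up for illustration, and the select and sum helpers are hypothetical, not part of any Prometheus library.

```javascript
// Hypothetical sample data: each series is a metric name plus a label set.
const series = [
  { name: 'http_requests_total', labels: { method: 'GET', status: '200' }, value: 1027 },
  { name: 'http_requests_total', labels: { method: 'GET', status: '500' }, value: 3 },
  { name: 'http_requests_total', labels: { method: 'POST', status: '200' }, value: 214 },
];

// Keep only series whose name matches and whose labels equal every matcher,
// roughly what the selector http_requests_total{method="GET"} expresses.
function select(allSeries, name, matchers = {}) {
  return allSeries.filter(
    (s) =>
      s.name === name &&
      Object.entries(matchers).every(([key, value]) => s.labels[key] === value)
  );
}

// Add up the selected values, similar in spirit to sum(...) in PromQL.
function sum(selected) {
  return selected.reduce((total, s) => total + s.value, 0);
}

const getRequests = sum(select(series, 'http_requests_total', { method: 'GET' }));
console.log(getRequests); // 1030
```

Because each label combination is its own series, the same metric can be sliced by method, by status, or by both without changing the instrumentation.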

Architecture

[App 1] <--- Scrape --- [Prometheus] <--- Query --- [Grafana]
[App 2] <--- Scrape ---/      |
                              | Alerts
                              v
                        [Alertmanager]

Step 1: Instrumenting Applications

To let Prometheus scrape metrics, your app needs to expose a /metrics endpoint.

Node.js Example

const express = require('express');
const client = require('prom-client');

const app = express();
const collectDefaultMetrics = client.collectDefaultMetrics;

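// Register the default Node.js process metrics (CPU, memory, event loop lag).
// Recent prom-client releases collect these on each scrape, so the old
// `timeout` polling option is no longer needed.
collectDefaultMetrics();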

app.get('/metrics', async (req, res) => {
  res.set('Content-Type', client.register.contentType);
  res.end(await client.register.metrics());
});

app.listen(3000);
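
The /metrics endpoint serves plain text in the Prometheus exposition format. As a rough illustration of what that text looks like, the sketch below renders one counter by hand. In practice prom-client produces this output for you; renderMetric and escapeLabelValue are hypothetical helpers written for this example, and the metric name and values are invented.

```javascript
// Escape backslash, double quote, and newline inside a label value,
// as the text exposition format requires.
function escapeLabelValue(value) {
  return String(value)
    .replace(/\\/g, '\\\\')
    .replace(/"/g, '\\"')
    .replace(/\n/g, '\\n');
}

// Render one metric family: HELP and TYPE comments, then one line per sample.
function renderMetric({ name, help, type, samples }) {
  const lines = [`# HELP ${name} ${help}`, `# TYPE ${name} ${type}`];
  for (const { labels = {}, value } of samples) {
    const pairs = Object.entries(labels).map(
      ([key, val]) => `${key}="${escapeLabelValue(val)}"`
    );
    const labelPart = pairs.length > 0 ? `{${pairs.join(',')}}` : '';
    lines.push(`${name}${labelPart} ${value}`);
  }
  return lines.join('\n') + '\n';
}

const text = renderMetric({
  name: 'http_requests_total',
  help: 'Total HTTP requests handled.',
  type: 'counter',
  samples: [
    { labels: { method: 'GET', status: '200' }, value: 1027 },
    { labels: { method: 'GET', status: '500' }, value: 3 },
  ],
});
console.log(text);
// # HELP http_requests_total Total HTTP requests handled.
// # TYPE http_requests_total counter
// http_requests_total{method="GET",status="200"} 1027
// http_requests_total{method="GET",status="500"} 3
```

Opening http://localhost:3000/metrics in a browser while the app is running should show output of the same shape for the default process metrics.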

Step 2: Configuring Prometheus

Create a prometheus.yml configuration file:

global:
  scrape_interval: 15s

scrape_configs:
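  - job_name: 'node-app'
    static_configs:
      # When Prometheus runs in Docker, 'localhost' is the Prometheus container itself.
      # In that setup, use an address the container can reach instead
      # (for example host.docker.internal:3000 on Docker Desktop).
      - targets: ['localhost:3000']
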
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

Step 3: Running with Docker Compose

version: '3'

services:
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
      - "9090:9090"

  grafana:
    image: grafana/grafana:latest
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=secret
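    ports:
      # Host port 3001 avoids a clash with the Node.js example app, which listens on 3000.
      # Grafana is then reachable at http://localhost:3001.
      - "3001:3000"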
    depends_on:
      - prometheus

Step 4: Visualizing in Grafana

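Log in to Grafana (admin/secret)
Add Data Source -> Prometheus -> URL: http://prometheus:9090
Import Dashboard (e.g., Node Exporter Full - ID: 1860; note that it expects host metrics from node_exporter, not the prom-client metrics exposed by the Node.js app)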

Important Metrics to Watch

RED Method

Rate: Request rate (req/s)
Errors: Error rate (%)
Duration: Request duration (latency)
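
As a rough sketch of how the three RED signals fall out of raw counters, the example below compares two scrapes taken 15 seconds apart. In a real setup you would express this in PromQL (for instance with rate() over a request counter) instead of application code; the field names requests, errors, and durationSum, and all the sample values, are assumptions made up for illustration.

```javascript
// Increase of a counter between two scrapes. A drop in value is treated as a
// counter reset (for example a process restart), so the new value is the increase.
function counterIncrease(previous, current) {
  return current >= previous ? current - previous : current;
}

// Derive Rate, Errors, and Duration from two counter snapshots `seconds` apart.
function redSnapshot(previous, current, seconds) {
  const requests = counterIncrease(previous.requests, current.requests);
  const errors = counterIncrease(previous.errors, current.errors);
  const durationSum = counterIncrease(previous.durationSum, current.durationSum);
  return {
    ratePerSecond: requests / seconds,
    errorPercent: requests > 0 ? (100 * errors) / requests : 0,
    meanDurationSeconds: requests > 0 ? durationSum / requests : 0,
  };
}

// Hypothetical scrapes, 15 seconds apart.
const red = redSnapshot(
  { requests: 1200, errors: 30, durationSum: 600 },
  { requests: 1500, errors: 45, durationSum: 750 },
  15
);
console.log(red);
// { ratePerSecond: 20, errorPercent: 5, meanDurationSeconds: 0.5 }
```

A mean hides slow outliers, so for Duration it is usually worth tracking high percentiles from a histogram as well.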

USE Method (for Infrastructure)

Utilization: % time busy
Saturation: Queue length
Errors: Count of errors
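
The first two USE signals can be sketched the same way. The example below assumes you have two snapshots of busy and idle CPU time plus a count of runnable tasks; the function names, field names, and numbers are hypothetical and only meant to show the arithmetic.

```javascript
// Utilization: share of elapsed CPU time that was not idle between two snapshots.
function utilizationPercent(before, after) {
  const busy = after.busy - before.busy;
  const idle = after.idle - before.idle;
  const total = busy + idle;
  return total > 0 ? (100 * busy) / total : 0;
}

// Saturation: work that is waiting because every CPU is already occupied.
function saturation(runnableTasks, cpuCount) {
  return Math.max(0, runnableTasks - cpuCount);
}

// Hypothetical snapshots of cumulative CPU time, in milliseconds.
const cpuUtilization = utilizationPercent(
  { busy: 100, idle: 900 },
  { busy: 400, idle: 1000 }
);
console.log(cpuUtilization); // 75
console.log(saturation(6, 4)); // 2 tasks queued behind 4 CPUs
```

High utilization with zero saturation means the resource is busy but keeping up; rising saturation is the earlier warning that work is starting to queue.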

Alerting

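Don't just look at dashboards. Set up alerts for critical issues. Prometheus evaluates alerting rules from the files listed under rule_files in prometheus.yml and sends firing alerts to Alertmanager, which handles routing and notification.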

# alert.rules.yml
groups:
- name: example
  rules:
  - alert: HighRequestLatency
    # Assumes a recording rule named job:request_latency_seconds:mean5m already exists
    expr: job:request_latency_seconds:mean5m{job="myjob"} > 0.5
    for: 10m
    labels:
      severity: page
    annotations:
      summary: High request latency
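
The for: 10m clause means the expression must stay true for ten minutes before the alert fires, which filters out short spikes. The sketch below imitates that inactive, pending, firing progression in plain JavaScript. It is a simplified model of the behavior, not Prometheus's actual implementation, and createAlertState is a hypothetical helper.

```javascript
// Track one alert rule: the condition must hold continuously for `forMs`
// milliseconds before the state moves from 'pending' to 'firing'.
function createAlertState(threshold, forMs) {
  let pendingSince = null;
  return function evaluate(value, nowMs) {
    if (!(value > threshold)) {
      pendingSince = null; // condition cleared, so the timer starts over
      return 'inactive';
    }
    if (pendingSince === null) {
      pendingSince = nowMs;
    }
    return nowMs - pendingSince >= forMs ? 'firing' : 'pending';
  };
}

const MINUTE = 60 * 1000;
const evaluate = createAlertState(0.5, 10 * MINUTE);

console.log(evaluate(0.7, 0));           // pending
console.log(evaluate(0.8, 5 * MINUTE));  // pending
console.log(evaluate(0.3, 6 * MINUTE));  // inactive (latency recovered)
console.log(evaluate(0.9, 7 * MINUTE));  // pending (timer restarted)
console.log(evaluate(0.9, 17 * MINUTE)); // firing
```

Choosing the for duration is a trade-off: a longer window means fewer false pages but slower detection.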

Conclusion

A robust monitoring stack gives you the confidence to deploy faster. By measuring what matters (RED/USE), you can detect and fix issues before users notice.
