Logging

Logging is the practice of emitting timestamped records of events from running software so operators can reconstruct what happened. Many modern systems use structured logging, where each record is a set of named fields (level, timestamp, service, request_id, message, plus arbitrary attributes), most often serialized as one JSON object per line. This makes logs queryable by field rather than just human-readable.
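
As a rough illustration of what such a record looks like in practice, here is a minimal sketch using only Python's standard logging module. The JsonFormatter class, the build_logger helper, and the names "checkout" and "checkout-api" are made up for this example; libraries like structlog provide the same idea with more features.

```python
import io
import json
import logging
from datetime import datetime, timezone

# Attributes every LogRecord carries by default; anything else came from `extra=`.
_STANDARD_ATTRS = set(vars(logging.LogRecord("", 0, "", 0, "", (), None)))


class JsonFormatter(logging.Formatter):
    """Render each record as a single JSON object (hypothetical helper)."""

    def __init__(self, service: str) -> None:
        super().__init__()
        self.service = service  # illustrative service name

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "service": self.service,
            "message": record.getMessage(),
        }
        # Copy caller-supplied attributes such as request_id or user_id.
        for key, value in vars(record).items():
            if key not in _STANDARD_ATTRS and key not in payload:
                payload[key] = value
        return json.dumps(payload, default=str)


def build_logger(name: str, service: str, stream) -> logging.Logger:
    """Attach the JSON formatter to a logger writing to `stream`."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter(service))
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


# Usage: write one record to an in-memory stream and parse it back.
buffer = io.StringIO()
log = build_logger("checkout", "checkout-api", buffer)
log.info("order placed", extra={"request_id": "req-42", "user_id": 123})
record = json.loads(buffer.getvalue())
```

Because the output is one JSON object per line, an aggregator can index request_id and user_id as fields instead of parsing them out of free text.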

Log levels

  • TRACE / DEBUG: fine-grained, developer-focused, usually off in production
  • INFO: normal operational events worth noting
  • WARN: something unexpected but recoverable
  • ERROR: a request or job failed
  • FATAL: the process cannot continue
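
The levels above act as a threshold: a logger configured at INFO discards anything below it. The sketch below shows this with Python's standard logging module, under a few assumptions. Python has no built-in TRACE level, so one is registered here at an arbitrary value of 5; Python also names WARN as WARNING and FATAL as CRITICAL. ListHandler and the logger name "levels-demo" are hypothetical helpers for the demonstration.

```python
import logging

# Assumption: register a TRACE level below DEBUG (10); the stdlib does not define one.
TRACE = 5
logging.addLevelName(TRACE, "TRACE")


class ListHandler(logging.Handler):
    """Collect the level names of emitted records so the effect is visible."""

    def __init__(self) -> None:
        super().__init__()
        self.seen = []

    def emit(self, record: logging.LogRecord) -> None:
        self.seen.append(record.levelname)


handler = ListHandler()
logger = logging.getLogger("levels-demo")
logger.handlers = [handler]
logger.propagate = False
logger.setLevel(logging.INFO)  # a typical production threshold

logger.log(TRACE, "entering parse loop")      # dropped: below INFO
logger.debug("cache miss for key=abc")        # dropped: below INFO
logger.info("server started")                 # kept
logger.warning("retrying upstream call")      # kept (WARN)
logger.error("request failed")                # kept
logger.critical("cannot bind port, exiting")  # kept (FATAL)

production_levels = list(handler.seen)

# Lowering the threshold, as a developer might locally, lets TRACE through.
logger.setLevel(TRACE)
logger.log(TRACE, "entering parse loop")
```

Keeping the threshold configurable per environment is what allows TRACE and DEBUG calls to stay in the code while remaining silent in production.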

Conventions that matter

  • Structured fields, not string interpolation. log.info({user_id, action}) beats log.info("user 123 did x") because each field can be filtered and aggregated without parsing the message text.
  • Correlation IDs. A request_id or trace_id threaded through every log line links a single request across services.
  • PII discipline. Never log credentials, tokens, or sensitive customer fields.
  • Sampling at volume. High-traffic services often keep only a fraction of INFO logs to control cost; ERROR and above should always be kept.
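
Three of these conventions (correlation IDs, PII discipline, sampling) can be sketched as filters on Python's standard logging module. This is one possible arrangement, not a reference implementation: the filter classes, SENSITIVE_KEYS, the 10% keep ratio, and the assumption that some middleware sets request_id_var once per request are all illustrative.

```python
import contextvars
import logging
import random

# Assumption: request middleware sets this once per incoming request.
request_id_var = contextvars.ContextVar("request_id", default="-")

# Illustrative list of attribute names that must never reach the log sink.
SENSITIVE_KEYS = {"password", "token", "authorization", "ssn"}


class CorrelationFilter(logging.Filter):
    """Stamp every record with the request_id of the current context."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.request_id = request_id_var.get()
        return True


class RedactionFilter(logging.Filter):
    """Mask the values of known-sensitive attributes passed via `extra=`."""

    def filter(self, record: logging.LogRecord) -> bool:
        for key in SENSITIVE_KEYS & set(vars(record)):
            setattr(record, key, "[REDACTED]")
        return True


class SamplingFilter(logging.Filter):
    """Keep a fraction of INFO-and-below records; keep everything above INFO."""

    def __init__(self, keep_ratio: float, rng: random.Random = None) -> None:
        super().__init__()
        self.keep_ratio = keep_ratio
        self.rng = rng or random.Random()

    def filter(self, record: logging.LogRecord) -> bool:
        if record.levelno > logging.INFO:
            return True
        return self.rng.random() < self.keep_ratio


class ListHandler(logging.Handler):
    """Collect records in memory so the result can be inspected."""

    def __init__(self) -> None:
        super().__init__()
        self.records = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record)


handler = ListHandler()
for f in (CorrelationFilter(), RedactionFilter(), SamplingFilter(0.1, random.Random(0))):
    handler.addFilter(f)

logger = logging.getLogger("conventions-demo")
logger.handlers = [handler]
logger.propagate = False
logger.setLevel(logging.INFO)

request_id_var.set("req-42")
for _ in range(1000):
    logger.info("cache hit")  # only a sample of these survives
logger.error("login failed", extra={"user_id": 123, "token": "abc123"})

kept_info = [r for r in handler.records if r.levelno == logging.INFO]
last = handler.records[-1]
```

Note that this redaction only covers structured attributes. A secret interpolated into the message string would pass through untouched, which is one more reason to prefer structured fields.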

Common tools

  • Libraries: pino (Node), zap (Go), structlog (Python), Logback (Java), tracing (Rust)
  • Aggregators: Loki, Elasticsearch (ELK), OpenSearch, Datadog, Splunk, Sumo Logic, Better Stack
