A curated list of Site Reliability and Production Engineering Tools
Updated Mar 18, 2026
Lightweight, self-contained Linux® server monitoring tool
A Simple Monitoring Dashboard for Docker Swarm Cluster
Data Center workload and software optimizations for Intel hardware.
Dashboard for Docker Swarm Cluster
Advanced stealth web data collection framework for security
Utility to test and wipe hard disks and SSDs
Awesome Uptime Monitoring
Identify unused resources on Google Cloud Platform through Prometheus metrics
A collection of scripts that extend EventSentry's functionality.
Network-Based Intrusion Detection System - dev/deployment
My Artificial Intelligence Log Sentinel for Postfix and beyond...
Command line client for interacting with checkson.io
Real-time log file monitoring with pattern highlighting and desktop notifications. Cross-platform Rust CLI tool with regex matching and file rotation support.
🌐 Explore VandCloud, a cross-platform app to browse, test, and monitor APIs and services with real-time status updates.
Wazuh integration to send alerts to Keep (open-source alert management and AIOps platform)
🤖 Simplify IT operations with Wuhr AI Ops, an AI-driven platform for real-time monitoring, log analysis, and seamless CI/CD management.
🖥️ Monitor RAM and CPU usage in Proxmox for hosts, LXC, and QEMU/KVM VMs with clear visuals and detailed metrics for better resource management.
EagleEye-PowerShell-Insights is a PowerShell monitoring tool that integrates with Azure Application Insights. It enables real-time tracking and visualization of script performance, providing detailed logs and insights in a neural network-style map for better debugging and analysis.