Creating Custom Scripts for Server Uptime and Latency Checking
When managing servers, relying on third-party monitoring tools can be expensive or inflexible. Writing your own custom scripts for server uptime and latency checking gives you total control over how and when you measure network health. This article walks you through practical, lightweight approaches using common scripting languages.
Why Build Custom Monitoring Scripts?
Pre-built monitoring solutions often lock you into specific metrics or dashboards. Custom scripts let you monitor exactly what matters: server uptime, latency thresholds, packet loss, and response times. You can integrate them into existing cron jobs, webhooks, or CI/CD pipelines with very little overhead.
Core Metrics to Check
- Uptime — Is the server reachable? Use ping or TCP connect.
- Latency — Round-trip time (RTT) in milliseconds.
- Service availability — Is the web server responding on port 80/443?
- DNS resolution — Verify hostname resolves correctly.
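The DNS item in the list above can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation: the function name check_dns and its injectable resolver parameter are assumptions made here so the check can be exercised without network access.

```python
import socket
import time


def check_dns(hostname, resolver=socket.getaddrinfo):
    """Resolve hostname; return (addresses, elapsed_ms). addresses is empty on failure."""
    start = time.perf_counter()
    try:
        # Port None asks only for address resolution, not a service lookup
        infos = resolver(hostname, None)
    except socket.gaierror:
        infos = []
    elapsed_ms = (time.perf_counter() - start) * 1000
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP
    addresses = sorted({info[4][0] for info in infos})
    return addresses, elapsed_ms
```

Calling check_dns("example.com") returns the resolved addresses and the lookup time in milliseconds; an empty address list means the name did not resolve.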
Script 1: Bash Ping Check with Timestamps
A simple bash script can check uptime and log latency each time it runs. Here’s a snippet that pings a target server and records the result:
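#!/bin/bash
# Send 4 pings; on Linux, -W 3 waits up to 3 seconds for each reply.
TARGET="example.com"
LOGFILE="/var/log/uptime_check.log"
if OUTPUT=$(ping -c 4 -W 3 "$TARGET" 2>/dev/null); then
    # The last line looks like: rtt min/avg/max/mdev = 10.1/11.2/12.3/0.9 ms
    # Splitting on "/" puts the average in field 5.
    AVG=$(echo "$OUTPUT" | tail -1 | awk -F'/' '{print $5}')
    echo "$(date) - UP - RTT: ${AVG} ms" >> "$LOGFILE"
else
    echo "$(date) - DOWN" >> "$LOGFILE"
fi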
This script sends four pings and waits up to 3 seconds for each reply (on Linux, -W is given in seconds; macOS interprets it as milliseconds). It logs latency only on success, avoiding noisy output. Schedule it as a cron job every minute for near-real-time checks.
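If you later want packet loss as well as latency, the text that ping prints can be parsed directly. The sketch below assumes the summary format used by Linux iputils and BSD/macOS ping; the function name parse_ping and the regular expressions are illustrative, and other ping implementations may print different text.

```python
import re

# Matches the summary line printed by Linux iputils and BSD/macOS ping, e.g.
#   rtt min/avg/max/mdev = 10.1/11.2/12.3/0.9 ms
#   round-trip min/avg/max/stddev = 10.1/11.2/12.3/0.9 ms
RTT_RE = re.compile(r"= ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms")
LOSS_RE = re.compile(r"([\d.]+)% packet loss")


def parse_ping(output):
    """Return (avg_rtt_ms, loss_percent); either value is None if not found."""
    rtt = RTT_RE.search(output)
    loss = LOSS_RE.search(output)
    avg = float(rtt.group(2)) if rtt else None
    pct = float(loss.group(1)) if loss else None
    return avg, pct
```

Feeding it the captured output of a ping run yields the average RTT and the loss percentage, which can then be compared against whatever thresholds you choose.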
Script 2: Python HTTP Latency Checker
Python’s requests library makes HTTP monitoring straightforward. This script measures response time for a web server and prints a warning if the status is not 200 or the latency exceeds a threshold:
import sys
import time

import requests

url = "https://your-server.com/health"
threshold = 500  # ms

start = time.perf_counter()
try:
    resp = requests.get(url, timeout=5)
except requests.RequestException:
    # Covers connection errors, timeouts, and other request failures
    print("CRITICAL - Server unreachable")
    sys.exit(2)

latency = (time.perf_counter() - start) * 1000
if resp.status_code == 200 and latency < threshold:
    print(f"OK - {latency:.2f}ms")
else:
    print(f"WARNING - Status {resp.status_code}, latency {latency:.2f}ms")
    sys.exit(1)
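Run this with Python 3 after installing the third-party requests package (pip install requests). You can forward the output to a notification system like Slack or PagerDuty through a small wrapper script or webhook call.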
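As one possible way to forward a result, the sketch below posts a line of checker output to a Slack incoming webhook using only the standard library. The WEBHOOK_URL value is a placeholder and the function names build_payload and notify are hypothetical; Slack incoming webhooks accept a JSON body with a text field, while other services expect different payloads.

```python
import json
import urllib.request

# Placeholder only; a real incoming-webhook URL comes from your Slack workspace
WEBHOOK_URL = "https://hooks.slack.com/services/REPLACE/WITH/YOURS"


def build_payload(check_output):
    """Wrap one line of checker output in the JSON body a Slack webhook expects."""
    return json.dumps({"text": check_output.strip()}).encode("utf-8")


def notify(check_output, url=WEBHOOK_URL, timeout=5):
    """POST the message and return the HTTP status code."""
    req = urllib.request.Request(
        url,
        data=build_payload(check_output),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status
```

A wrapper could read the checker's output from standard input and call notify only for lines that do not start with OK, so healthy checks stay quiet.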
Adding TCP Port Checks
For services that don’t use HTTP, use TCP socket checks. In Python, the socket module tests if a port is open:
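import socket

host, port = "db-server.local", 5432
# The with-block closes the socket even if the check raises
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
    sock.settimeout(3)
    try:
        # connect_ex returns 0 on success or an error code on failure
        result = sock.connect_ex((host, port))
    except socket.gaierror:
        result = None  # the hostname did not resolve
if result == 0:
    print("Port open")
else:
    print("Port closed or unreachable")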
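This works for any TCP service, including PostgreSQL, MySQL, and Redis. Keep in mind that it only confirms the port accepts connections, not that the service behind it is healthy.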
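The same idea extends to measuring latency for non-HTTP services by timing how long the TCP handshake takes. This is a rough sketch; the function name tcp_latency is an assumption, and connect time is only a lower bound on how quickly the service itself responds.

```python
import socket
import time


def tcp_latency(host, port, timeout=3.0):
    """Return the TCP connect time in ms, or None if the connection fails."""
    start = time.perf_counter()
    try:
        # create_connection resolves the host and tries each returned address
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - start) * 1000
    except OSError:
        # Covers refused connections, timeouts, and DNS failures (gaierror)
        return None
```

For example, tcp_latency("db-server.local", 5432) returns a float when the database port accepts the connection and None otherwise.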
Best Practices for Production Scripts
- Set timeouts — Always set an explicit timeout on every network call so a hung connection cannot stall the script.
- Log rotation — Use logrotate to keep log files from filling the disk.
- Alerting — Combine scripts with email or webhook notifications.
- Dry-run mode — Add a dry-run flag (and a verbose flag for debugging) so you can test without side effects.
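The timeout and dry-run practices above might look like this with argparse. The flag names and the report helper are illustrative choices, not a convention the article prescribes.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line flags for a check script."""
    parser = argparse.ArgumentParser(description="Uptime check (sketch)")
    parser.add_argument("--timeout", type=float, default=3.0,
                        help="seconds to wait before giving up")
    parser.add_argument("--dry-run", action="store_true",
                        help="run the check but skip logging and alerts")
    parser.add_argument("--verbose", action="store_true",
                        help="print each step to stdout")
    return parser.parse_args(argv)


def report(message, args, send=print):
    """Deliver a result via send(), or only describe it in dry-run mode."""
    if args.dry_run:
        print(f"[dry-run] would send: {message}")
        return False
    send(message)
    return True
```

Passing a different send callable (a log writer or webhook function) keeps the side effect in one place, which is what makes the dry-run switch reliable.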
Integrating with Monitoring Ecosystems
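Custom scripts can feed existing monitoring systems through exit codes and stdout. Nagios-style checks read the exit code: return 0 for OK, 1 for warning, 2 for critical. Prometheus does not read exit codes; it scrapes metrics, so a script typically writes them in the Prometheus text format for the node_exporter textfile collector or pushes them to a Pushgateway. Tools like Telegraf can exec your script and parse numeric output.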
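A small helper can keep the exit-code convention in one place. In this sketch the 500 ms warning threshold echoes the earlier HTTP example, while the 2000 ms critical threshold, the function names, and the metric name server_latency_ms are assumptions chosen for illustration.

```python
OK, WARNING, CRITICAL = 0, 1, 2


def classify(latency_ms, warn_ms=500.0, crit_ms=2000.0):
    """Map a latency (or None for unreachable) to an exit code and status line."""
    if latency_ms is None:
        return CRITICAL, "CRITICAL - unreachable"
    if latency_ms >= crit_ms:
        return CRITICAL, f"CRITICAL - latency {latency_ms:.1f}ms"
    if latency_ms >= warn_ms:
        return WARNING, f"WARNING - latency {latency_ms:.1f}ms"
    return OK, f"OK - latency {latency_ms:.1f}ms"


def prometheus_line(target, latency_ms):
    """Format one sample in the Prometheus text exposition format."""
    # Metric and label names are illustrative, not a standard
    return f'server_latency_ms{{target="{target}"}} {latency_ms:.1f}'
```

A script would print the status line and pass the code to sys.exit, and could separately write the Prometheus line to a file that a collector reads.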
Testing and Debugging
Always test scripts in a staging environment. Use strace or tcpdump to verify network behavior. Check for edge cases like DNS failures or intermittent packet drops.
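Some edge cases are hard to trigger on a real network but easy to simulate. The sketch below uses unittest.mock from the standard library to fake a DNS failure and a timeout; is_reachable is a hypothetical stand-in for whatever check function your script defines.

```python
import socket
from unittest import mock


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def test_dns_failure():
    # Simulate a hostname that does not resolve
    error = socket.gaierror("no such host")
    with mock.patch("socket.create_connection", side_effect=error):
        assert is_reachable("db-server.local", 5432) is False


def test_timeout():
    # Simulate dropped packets: the connect attempt never completes
    error = socket.timeout("timed out")
    with mock.patch("socket.create_connection", side_effect=error):
        assert is_reachable("db-server.local", 5432) is False
```

These functions follow the naming pattern pytest discovers automatically, though they can also be called directly.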
Custom scripts for server uptime and latency checking are simple to build, easy to maintain, and far more flexible than many SaaS alternatives. Start with a basic ping script, then layer on HTTP and TCP checks as your infrastructure grows.