nginx + Certbot Setup Health Check

You've set up SSL/TLS for your nginx server using Certbot. Congratulations! That's a critical step toward securing your web presence. But the job isn't done once the padlock appears. Certificates expire, configurations drift, and automation can fail silently. An unmonitored certificate setup is a ticking time bomb: sooner or later it leads to service outages, browser security warnings, and a scramble to restore trust.

This article will guide you through practical health checks for your nginx and Certbot setup. We'll cover how to verify that your certificates are valid, Certbot is renewing them correctly, and nginx is serving them without issue. These checks are designed to be run manually or integrated into your own monitoring scripts, giving you peace of mind long before a certificate expiry becomes a crisis.

Why Health Checks are Crucial

Imagine this: your site suddenly shows a "Your connection is not private" error. Users are fleeing, and your SEO ranking takes a hit. The culprit? An expired SSL certificate. This is not a rare occurrence; expired certificates are a common and entirely avoidable cause of outages.

While Certbot automates much of the renewal process, it's not infallible. Changes in your server environment, DNS, firewall rules, or even an update to Certbot itself can break the automated renewal. A robust health check strategy helps you:

  • Prevent Downtime: Catch potential issues before they cause an outage.
  • Maintain Trust: Ensure your visitors always see a secure connection.
  • Avoid Panic: Diagnose and fix problems calmly, rather than in a crisis.
  • Verify Automation: Confirm that Certbot's scheduled tasks are actually running and succeeding.

Key Components to Monitor

Your nginx + Certbot setup relies on several interconnected pieces. A thorough health check should look at each of them:

  1. Certificate Validity: Is the certificate currently valid, and when does it expire?
  2. Certbot Renewal Process: Is Certbot scheduled to run, and is it successfully renewing certificates?
  3. nginx Configuration: Is nginx correctly configured to use the latest certificates, and can it reload without errors?
  4. System Automation: Are the systemd timers or cron jobs responsible for Certbot running as expected?
  5. Network Accessibility: Can your server be reached on ports 80 and 443, and are DNS records pointing correctly?

Let's dive into the practical steps for checking each of these.

Checking Certbot Renewal Status

Certbot's primary job is to renew your certificates before they expire. The first step in your health check should be to verify that this process is working or would work.

1. Dry Run a Renewal

Certbot provides a --dry-run option that simulates the renewal process without actually modifying your certificates. This is an invaluable tool for testing.

sudo certbot renew --dry-run

What to look for:

  • Congratulations, all simulated renewals succeeded: This is the result you want. A dry run attempts a simulated renewal of every certificate against the Let's Encrypt staging environment, whether or not the certificate is close to expiry.
  • The following certificates are not due for renewal yet: or Cert not yet due for renewal: These appear when you run certbot renew without --dry-run and your certificates still have plenty of time left. They are normal.
  • Any error messages: This is the critical part. If you see errors related to DNS, webroot, or plugin issues, you have a problem to investigate. Common errors include Timeout during connect (likely firewall problem), DNS problem: NXDOMAIN, and The client lacks sufficient authorization.
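
For a monitoring script, the dry run can be reduced to a pass/fail verdict based on its exit status. The following is a minimal sketch, assuming certbot renew exits non-zero when a simulated renewal fails; run_check is a hypothetical helper name introduced here, not part of Certbot.

```shell
# run_check LABEL COMMAND [ARGS...]
# Runs COMMAND quietly and prints a one-line verdict based on its exit status.
# run_check is a hypothetical helper for this article, not a Certbot feature.
run_check() {
    local label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "OK: $label"
    else
        echo "FAIL: $label"
        return 1
    fi
}

# Example (needs root and a working Certbot install):
#   run_check "certbot dry run" sudo certbot renew --dry-run
```

Because the helper discards the command's output, rerun the failing command by hand to see the actual error.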

2. Inspect Certbot Logs

Certbot logs its activities, including renewal attempts and any errors, to /var/log/letsencrypt/.

sudo tail -f /var/log/letsencrypt/letsencrypt.log

You can also use grep to search for specific terms or dates:

sudo grep -i "renewal failed" /var/log/letsencrypt/letsencrypt.log
sudo grep -i "renewing an existing certificate" /var/log/letsencrypt/letsencrypt.log | tail -n 10

What to look for:

  • Messages indicating successful renewals.
  • Any ERROR or CRITICAL entries that might point to a persistent issue.
  • A pattern of failed attempts followed by success, or consistent failures.
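
To turn the log inspection into a number a script can act on, you could count the lines logged at ERROR or CRITICAL level. This sketch assumes the log lines follow a "timestamp:LEVEL:logger:message" layout; count_log_errors is a hypothetical helper name.

```shell
# count_log_errors FILE
# Prints how many lines in FILE are logged at ERROR or CRITICAL level.
# Assumption: lines look like "timestamp:LEVEL:logger:message".
count_log_errors() {
    # grep -c exits 1 when the count is 0; treat that as success so that
    # only real failures (for example an unreadable file) return non-zero.
    grep -cE ':(ERROR|CRITICAL):' "$1" || [ $? -eq 1 ]
}

# Example (the log is usually readable by root only):
#   sudo bash -c "$(declare -f count_log_errors); count_log_errors /var/log/letsencrypt/letsencrypt.log"
```

A non-zero count is a prompt to read the surrounding log lines, not a diagnosis in itself.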

Verifying nginx Configuration and Certificate Usage

After Certbot renews a certificate, it needs to tell nginx to use the new files. This usually happens via a post-hook or deploy-hook that reloads nginx.

1. Test nginx Configuration

Before reloading nginx, always test its configuration. A syntax error will make the reload fail, and a full restart with a broken configuration could bring your entire web server down.

sudo nginx -t

What to look for:

  • nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
  • nginx: configuration file /etc/nginx/nginx.conf test is successful
  • Any error messages: If there are syntax errors, you'll see them here. These must be resolved before proceeding.

2. Reload nginx

If the configuration is okay, you can safely reload nginx to pick up any new certificate files.

sudo systemctl reload nginx
# or, for older systems:
# sudo service nginx reload

What to look for:

  • No error messages. A successful reload usually returns silently.
  • If you encounter an error, check sudo journalctl -xe or sudo systemctl status nginx for details.
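
The test and reload steps can be chained so that a reload is only attempted when the configuration test passes. This is a sketch, not a drop-in hook: reload_nginx_safely is a hypothetical function name, and NGINX_BIN and SYSTEMCTL_BIN are hypothetical override variables added only to make the function easy to test.

```shell
# reload_nginx_safely
# Reloads nginx only if "nginx -t" accepts the configuration.
# NGINX_BIN and SYSTEMCTL_BIN are hypothetical overrides (useful for testing);
# by default the real nginx and systemctl commands are used.
reload_nginx_safely() {
    local nginx_bin=${NGINX_BIN:-nginx}
    local systemctl_bin=${SYSTEMCTL_BIN:-systemctl}

    if ! "$nginx_bin" -t >/dev/null 2>&1; then
        echo "nginx config test failed; not reloading" >&2
        return 1
    fi
    "$systemctl_bin" reload nginx
}
```

The function must run as root, since both nginx -t and the reload need access to the certificate files and the service manager.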

3. Check Served Certificate Expiry

Finally, verify that nginx is actually serving the correct, renewed certificate. You can do this by examining the certificate directly on your server or by connecting to your website.

On the server:

sudo openssl x509 -in /etc/letsencrypt/live/yourdomain.com/fullchain.pem -noout -dates

Replace yourdomain.com with your actual domain. This command reads the certificate file that Certbot manages.

Via your website (more robust):

echo | openssl s_client -servername yourdomain.com -connect yourdomain.com:443 2>/dev/null | openssl x509 -noout -dates

This command connects to your website over SSL and extracts the expiry dates of the certificate currently being served. This is the most reliable check as it confirms everything from DNS to nginx is working.

What to look for:

  • notBefore=... (start date)
  • notAfter=... (expiry date)
  • Ensure the notAfter date is in the future, ideally more than 30 days away. Let's Encrypt certificates are valid for 90 days, so after a successful renewal, you should see an expiry date roughly 90 days from the renewal date.
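
Reading dates by eye does not scale to a monitoring script, so it helps to convert the notAfter line into a number of days remaining. The sketch below assumes GNU date (for the -d option); days_left is a hypothetical helper name, and the 21-day threshold in the example is an arbitrary choice, not a Certbot default.

```shell
# days_left NOT_AFTER_LINE [NOW_EPOCH]
# Converts an openssl "notAfter=..." line into whole days remaining.
# Requires GNU date (for -d). NOW_EPOCH defaults to the current time.
days_left() {
    local not_after=${1#notAfter=}   # strip the "notAfter=" prefix if present
    local now=${2:-$(date +%s)}
    local expiry
    expiry=$(date -d "$not_after" +%s) || return 2
    echo $(( (expiry - now) / 86400 ))
}

# Example against the served certificate (yourdomain.com is a placeholder):
#   line=$(echo | openssl s_client -servername yourdomain.com -connect yourdomain.com:443 2>/dev/null \
#          | openssl x509 -noout -enddate)
#   [ "$(days_left "$line")" -ge 21 ] || echo "WARNING: certificate expires soon"
```

If you only need a yes/no answer for a certificate file, openssl x509 -checkend SECONDS does the comparison itself and reports the result through its exit status.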

Confirming Certbot Automation

Certbot renewals are typically automated via systemd timers or cron jobs. You need to ensure these are active and correctly configured.

For systemd (most modern Linux distributions):

sudo systemctl list-timers | grep certbot

What to look for:

  • An entry for certbot.timer (or similar; snap installations name it snap.certbot.renew.timer).
  • The NEXT column should show a future time, typically within the next 12 hours, as Certbot usually checks twice a day.
  • The LAST column should show a recent time, indicating it has run before.
  • The ACTIVATES column should show certbot.service (or the service unit matching your timer).

You can also check the status of the timer and service:

sudo systemctl status certbot.timer
sudo systemctl status certbot.service

For cron:

sudo crontab -l
# or check system-wide cron jobs:
# sudo cat /etc/cron.d/certbot

What to look for:

  • An entry like the following (or similar):

    0 */12 * * * root test -x /usr/bin/certbot -a \! -d /run/systemd/system && perl -e 'sleep int(rand(3600))' && certbot -q renew

  • This indicates Certbot is scheduled to run, often twice a day.
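
One detail in the cron entry above is easy to miss: the test ... -a \! -d /run/systemd/system part makes the job skip itself whenever the systemd runtime directory exists, leaving renewal to the timer. Finding the cron file is therefore not proof that cron is doing the renewing. The sketch below mirrors that test so a script can decide which scheduler to inspect; active_scheduler is a hypothetical helper name.

```shell
# active_scheduler [SYSTEMD_DIR]
# Prints "systemd" if the systemd runtime directory exists, otherwise "cron".
# Mirrors the "-d /run/systemd/system" test used in the cron entry.
active_scheduler() {
    local dir=${1:-/run/systemd/system}
    if [ -d "$dir" ]; then
        echo systemd
    else
        echo cron
    fi
}

# Example (the timer name may differ on your system):
#   if [ "$(active_scheduler)" = systemd ]; then
#       systemctl is-active --quiet certbot.timer || echo "certbot.timer is not active"
#   fi
```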

Network Accessibility and DNS Checks

Sometimes, the issue isn't with Certbot or nginx, but with the network path to your server.

1. DNS Resolution

Ensure your domain resolves to the correct IP address.

dig yourdomain.com A +short
dig www.yourdomain.com A +short

Or for IPv6:

dig yourdomain.com AAAA +short

What to look for:

  • Each command should print your server's public IP address.
  • Empty output means no record of that type exists for the name.
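
The comparison with the expected address can also be automated. In this sketch, resolves_to is a hypothetical helper, and 203.0.113.10 is a documentation-range address standing in for your server's real IP. Matching whole lines matters because dig +short may print CNAME targets before the final address.

```shell
# resolves_to EXPECTED_IP
# Reads "dig +short" output on stdin and succeeds if any line equals EXPECTED_IP.
resolves_to() {
    # -x: match the whole line, -F: treat the IP as a fixed string, -q: no output
    grep -qxF -- "$1"
}

# Example (replace the placeholder domain and address with your own):
#   dig yourdomain.com A +short | resolves_to 203.0.113.10 || echo "DNS mismatch"
```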

2. Port Accessibility

Verify that ports 80 (for HTTP challenges, if used) and 443 (for HTTPS) are open.

curl -vI http://yourdomain.com/.well-known/acme-challenge/test
curl -vI https://yourdomain.com/

The first command tests if the webroot is accessible over HTTP, which is often required for Certbot to perform its challenge. The second verifies HTTPS connectivity.

What to look for:

  • For the HTTP test, a successful connection (even if it's a 404 for a non-existent file) indicates port 80 is open.
  • For the HTTPS test, a successful connection and an HTTP/2 200 (or similar) status code. Look for the * SSL connection using... line to confirm TLS is working.
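
The verbose curl output is meant for humans. A script can instead ask curl for just the status code with -w '%{http_code}' and interpret that. The sketch below relies on curl printing 000 when it never received an HTTP response; http_verdict is a hypothetical helper name, and the 10-second timeout in the example is an arbitrary choice.

```shell
# http_verdict STATUS_CODE
# Interprets the value curl prints for -w '%{http_code}'.
# curl reports 000 when it never received an HTTP response.
http_verdict() {
    case $1 in
        000) echo "unreachable"; return 1 ;;
        *)   echo "reachable (HTTP $1)" ;;
    esac
}

# Example (yourdomain.com is a placeholder):
#   code=$(curl -s -o /dev/null -m 10 -w '%{http_code}' http://yourdomain.com/.well-known/acme-challenge/test)
#   http_verdict "$code"
```

As in the manual check, any real status code, including a 404, counts as reachable, because it proves the port is open and nginx answered.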

Common Pitfalls and Edge Cases

Even with these checks, things can go wrong. Here are some common pitfalls:

  • **Firewall