SSL certificate expiration alerting broken
Incident Report for Checkly
Due to a regression in the SSL certificate expiration alerting job, no alerts were sent for SSL certificates nearing expiration from December 12, 2023 to March 8, 2024. Affected customers have been notified directly, and alerting is working as expected again.

Checkly’s SSL alerting is an older, rarely changed feature. Unlike other alerts, it does not go through our regular AlertingService, which is actively monitored through metrics such as the number of alerts sent or failed. Instead, SSL alerting is based on a daily scheduled job that is monitored only with a heartbeat check. On December 12, 2023, a regression was introduced to the SSL alerting job that caused it to stop sending alerts without throwing an error. Because our monitoring only checked whether the job completed successfully, the failure was not caught and went unnoticed until March 5, 2024, when support reported a missing SSL alert to the engineering team.

We are currently reviewing the monitoring for all Checkly alerting to make sure there are no similar blind spots in other older alerting features. SSL alerting will now be actively monitored, with alerts to our team based on metrics such as the number of alerts sent rather than only on job success or failure. This should prevent such a long time to detect in the future.
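
The difference between the two monitoring approaches can be illustrated with a minimal sketch. This is not Checkly's actual code: the `JobResult` shape, its field names, and both health functions are hypothetical and exist only to show why a heartbeat on job completion misses a silent failure that an output metric catches.

```python
from dataclasses import dataclass


@dataclass
class JobResult:
    """Outcome of one daily job run (hypothetical shape, for illustration)."""
    expiring_certificates: int  # certificates found inside the alert window
    alerts_sent: int            # alerts actually delivered
    alerts_failed: int          # delivery attempts that reported an error


def heartbeat_healthy(job_raised: bool) -> bool:
    """Heartbeat-only view: healthy whenever the job finished without throwing."""
    return not job_raised


def metrics_healthy(result: JobResult) -> bool:
    """Metric-based view: every expiring certificate must yield a sent alert."""
    if result.alerts_failed > 0:
        return False
    return result.alerts_sent >= result.expiring_certificates


# A run like the one described above: the job completes, but sends nothing.
silent_failure = JobResult(expiring_certificates=3, alerts_sent=0, alerts_failed=0)

print(heartbeat_healthy(job_raised=False))  # True: the heartbeat sees nothing wrong
print(metrics_healthy(silent_failure))      # False: the metric exposes the gap
```

In this sketch the heartbeat reports a healthy run because no error was thrown, while the metric-based check flags it because three certificates were due for an alert and none was sent.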
Posted Mar 08, 2024 - 12:00 UTC