A Greenhouse Recruiting update released on September 9, 2020 at 16:45 UTC led to 19 minutes of increased error rates for Greenhouse Recruiting customers, from 17:02 to 17:15 UTC and from 18:00 to 18:06 UTC. The incident was fully resolved on September 9, 2020 at 23:51 UTC.
September 9, 2020
17:02 UTC: Increased errors started to occur for Greenhouse Recruiting
17:15 UTC: Greenhouse Recruiting performance was restored
17:38 UTC: Greenhouse Predicts was temporarily disabled to improve stability
18:00 UTC: Increased errors began again
18:06 UTC: Greenhouse Recruiting performance was restored
18:48 UTC: Interview Stats was temporarily disabled to improve stability
23:01 UTC: Interview Stats was restored
23:51 UTC: Incident was fully resolved
September 11, 2020
16:12 UTC: Greenhouse Predicts was restored
WHAT WAS THE EFFECT?
Some requests to the Greenhouse Recruiting application failed or took a long time to complete. A portion of customers experienced increased error rates for up to 19 non-contiguous minutes, from 17:02 to 17:15 UTC and from 18:00 to 18:06 UTC.
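The 19-minute figure can be checked against the two error windows in the timeline. The following is a minimal sketch for that arithmetic only; the names WINDOWS, window_minutes, and total_impact_minutes are hypothetical and chosen for illustration, not taken from any Greenhouse tooling.

```python
from datetime import datetime

# Error windows as reported in the timeline (all times UTC, September 9, 2020).
WINDOWS = [
    ("17:02", "17:15"),
    ("18:00", "18:06"),
]


def window_minutes(start: str, end: str) -> int:
    """Return the length of a same-day HH:MM window in whole minutes."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


def total_impact_minutes(windows) -> int:
    """Sum the lengths of all non-contiguous error windows."""
    return sum(window_minutes(start, end) for start, end in windows)


print(total_impact_minutes(WINDOWS))  # 13 + 6 = 19
```

The first window contributes 13 minutes and the second 6 minutes, which matches the reported total.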
WHO WAS AFFECTED?
Greenhouse Recruiting customers.
WHAT WAS THE CAUSE?
Beginning at 16:45 UTC, the release of simplified interviewing permissions increased database load, which in turn slowed requests beyond what we had the capacity to handle. Timeouts connecting to our internal Greenhouse Predicts service exacerbated the problem. Because of the high load, servers could not respond to health checks and were automatically restarted, which lowered overall capacity.
Beta tests were conducted for this roll-out, but they did not surface any performance issues.
WHAT ARE WE DOING TO PREVENT THIS FROM OCCURRING AGAIN?
We apologize for the inconvenience this incident has caused. We take the reliability of our application seriously and are actively working to prevent similar incidents from occurring in the future. If you have any questions or concerns, please reach out via: https://support.greenhouse.io/hc/en-us/requests/new.