Degraded Performance for Greenhouse Business Intelligence Connector
Incident Report for Greenhouse
Postmortem

On Monday night around 10 PM Eastern time, the provider that generates the CSV files for our back-up process declared an incident that they believed was resolving. Based on this information, we allowed the nightly BIC sync to run as normal. However, it became apparent overnight that their incident was not going to clear in time for the BI Connector to finish on schedule. As the 9 AM deadline approached, we manually failed over to our back-up system. Silos 3 and 4 recovered quickly, but the volume of customers on Silos 1 and 2 degraded overall performance for the BI system. Some customers ended up with no data for May 26th because some of the jobs had not completed by the time the loads for May 27th were about to kick off.

Going forward, we will make the BI system more resilient to failures of our CSV provider. This will include:

  1. Being more proactive in failing over when the provider declares an incident on their end.
  2. When data has not been received by an expected deadline, automatically failing over to the back-up system.
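The two rules above can be sketched as a single decision function. This is only an illustration under stated assumptions, not the actual Greenhouse implementation: the names `Action`, `decide_action`, and `expected_by`, and the 6 AM cutoff used in the example, are all hypothetical.

```python
from datetime import datetime
from enum import Enum


class Action(Enum):
    USE_PRIMARY = "use_primary"    # provider data arrived; load it as normal
    KEEP_WAITING = "keep_waiting"  # no data yet, but still before the cutoff
    FAIL_OVER = "fail_over"        # switch to the back-up system


def decide_action(now: datetime,
                  data_received: bool,
                  provider_incident_open: bool,
                  expected_by: datetime) -> Action:
    """Decide what the nightly sync should do (hypothetical sketch).

    expected_by is an assumed internal cutoff by which the provider's
    data should have arrived; it is not a value taken from the report.
    """
    if data_received:
        return Action.USE_PRIMARY
    # Rule 1: fail over proactively while the provider has an open incident.
    if provider_incident_open:
        return Action.FAIL_OVER
    # Rule 2: fail over automatically once the expected deadline has passed.
    if now >= expected_by:
        return Action.FAIL_OVER
    return Action.KEEP_WAITING


if __name__ == "__main__":
    cutoff = datetime(2021, 5, 26, 6, 0)  # hypothetical cutoff time
    # No data, no declared incident, cutoff already passed.
    print(decide_action(datetime(2021, 5, 26, 6, 30), False, False, cutoff))
```

Setting the cutoff well ahead of the customer-facing deadline would leave the back-up system time to finish its loads, which is the gap a last-minute manual failover leaves open.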

We apologize for this failure and any problems that it caused.

Posted May 27, 2021 - 12:39 EDT

Resolved
GH's database provider was unable to provide data at the expected time. This caused large delays in transmitting the nightly data for most customers on Silos 1 and 2.
Posted May 26, 2021 - 09:00 EDT