Service outage

Incident Report for iAdvize (HA)

Postmortem

On September 16th we experienced service disruptions between 10:55am and 12:55pm (UTC+2) on our platform, impacting all iAdvize services.

We quickly identified that the problem was located in our software orchestrator cluster. For a reason unknown at the time, it was causing unwanted restarts or stops of orchestrated jobs.
After further investigation, we discovered that this misbehaviour was caused by a bug in the orchestrator cluster. As a consequence of this bug, all our instances were stuck in a restart loop.

After identifying the problem, we performed the following corrective actions:

At 12:10pm (UTC+2) we managed to stabilize our orchestrator cluster by applying a bug fix
Once the orchestrator cluster was stabilized, we were able to gradually restart all the services

Once all our services had been restarted, everything went back to normal at 12:55pm (UTC+2) and agents could receive incoming contacts.

The bug fix we applied will prevent this issue from recurring. No further actions have been identified following this incident.

Posted Sep 17, 2018 - 11:02 CEST

Resolved

After monitoring the activity of our platform since our last intervention, we have not observed any new disruptions.

Posted Sep 16, 2018 - 14:03 CEST

Monitoring

The situation is now back to normal: all services have been operating normally since 12:52pm (UTC+2). We appreciate your patience as we worked through this and apologize for the inconvenience.

Our technical team continues to monitor our infrastructure.

Posted Sep 16, 2018 - 13:18 CEST

Update

Our technical team has performed an intervention. Some services are back; however, the situation is not yet back to normal.

We will update you as we have more information.

Posted Sep 16, 2018 - 12:29 CEST

Identified

Our platform is currently not fully reachable, and this incident is impacting all our services. Our technical team is currently looking into the issue; we will update you as we learn more.

Posted Sep 16, 2018 - 11:07 CEST