Elevated API Errors
Incident Report for Anodot
Resolved
Summary of the incident:
13:20 UTC - Incident started. All components were partially affected.
14:10 UTC - Data pipeline recovered.
18:00 UTC - All services were recovered.
During this incident, customers did not receive any alerts (14:10 UTC - 18:00 UTC).
From 13:20 UTC until 14:00 UTC, some customers may see gaps in dashboard graphs and in metric search.
From 18:00 UTC the system was fully functional.
Posted 19 days ago. Sep 01, 2019 - 05:47 UTC
Update
We managed to recover all our services, and they are back to normal operation.
We are continuing to monitor because this incident is not yet fully resolved on the AWS side. Here is the latest update from AWS:

10:47 AM PDT We want to give you more information on progress at this point, and what we know about the event. At 4:33 AM PDT one of 10 datacenters in one of the 6 Availability Zones in the US-EAST-1 Region saw a failure of utility power. Backup generators came online immediately, but for reasons we are still investigating, began quickly failing at around 6:00 AM PDT. This resulted in 7.5% of all instances in that Availability Zone failing by 6:10 AM PDT. Over the last few hours we have recovered most instances but still have 1.5% of the instances in that Availability Zone remaining to be recovered. Similar impact existed to EBS and we continue to recover volumes within EBS. New instance launches in this zone continue to work without issue.
Posted 19 days ago. Aug 31, 2019 - 19:57 UTC
Update
We are continuing to monitor for any further issues.
Posted 19 days ago. Aug 31, 2019 - 19:54 UTC
Monitoring
A fix has been implemented and we are monitoring the results.
Posted 19 days ago. Aug 31, 2019 - 19:49 UTC
Update
We are continuing to work on a fix for this issue.
Posted 19 days ago. Aug 31, 2019 - 19:47 UTC
Update
We are in the process of recovering the failed AWS nodes and data. The AWS incident is not yet fully resolved. Here is the latest update from AWS:
"Recovery is in progress for instance impairments and degraded EBS volume performance within a single Availability Zone in the US-EAST-1 Region. We continue to work towards recovery for all remaining affected instances and EBS volumes."
Posted 19 days ago. Aug 31, 2019 - 17:15 UTC
Update
We are continuing to work on a fix for this issue.
Posted 19 days ago. Aug 31, 2019 - 15:58 UTC
Identified
There is an issue in AWS US East that is being investigated by AWS Support. The issue is with the EC2 service.
See https://status.aws.amazon.com/
Posted 19 days ago. Aug 31, 2019 - 14:49 UTC
Update
We are continuing to investigate this issue.
Posted 19 days ago. Aug 31, 2019 - 13:42 UTC
Investigating
We're experiencing an elevated level of API errors and are currently looking into the issue.
Posted 19 days ago. Aug 31, 2019 - 13:23 UTC
This incident affected: Data Collection, Alerts, Composite Metrics, and Anomalies.