High response times on CMA and CDA - Admin interface not loading

Incident report for DatoCMS

Resolved

This incident was caused by a customer with a large user base using a private plugin that went rogue.

The plugin started making a large number of API calls and, because this customer's user base is very large, the combined traffic hit the entire platform like a DDoS. This caused general slowdowns, since the requests were authenticated as users and hit our web servers directly.

To prevent this from happening again, we have added rate limits to JWT tokens, which are used by plugins acting as users. This is a scenario we had not anticipated before.
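
To illustrate the idea of per-token rate limiting, here is a minimal sketch of a token-bucket limiter keyed by a token identifier. It is not our actual implementation: the class name, the `capacity` and `refill_per_sec` values, and the idea of keying on a token ID string are all assumptions made for the example.

```python
import time


class PerTokenRateLimiter:
    """Token-bucket limiter with one bucket per token ID (illustrative only)."""

    def __init__(self, capacity=30, refill_per_sec=10, clock=time.monotonic):
        self.capacity = capacity              # hypothetical maximum burst size
        self.refill_per_sec = refill_per_sec  # hypothetical sustained rate
        self.clock = clock                    # injectable so the logic can be tested
        self.buckets = {}                     # token_id -> [tokens_left, last_update]

    def allow(self, token_id):
        """Return True if a request made with this token may proceed."""
        now = self.clock()
        # A token seen for the first time starts with a full bucket.
        bucket = self.buckets.setdefault(token_id, [float(self.capacity), now])
        tokens_left, last_update = bucket
        # Refill in proportion to the elapsed time, capped at the capacity.
        tokens_left = min(
            self.capacity,
            tokens_left + (now - last_update) * self.refill_per_sec,
        )
        allowed = tokens_left >= 1
        if allowed:
            tokens_left -= 1
        bucket[0], bucket[1] = tokens_left, now
        return allowed
```

With a scheme like this, a single misbehaving plugin exhausts only the buckets of the tokens it uses, and requests made with other tokens are unaffected.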

Posted at Nov 6, 13:56 GMT+00:00

Resolved

We can close the incident now.

Posted at Nov 2, 15:32 GMT+00:00

Monitoring

We noticed another batch of unusual traffic at the reported time.

We will deploy some countermeasures soon.

Posted at Nov 2, 12:48 GMT+00:00

Monitoring

We experienced about 10 minutes of high response times on our CMA and CDA endpoints. This means that requests sent to those endpoints (site-api.datocms.com, graphql.datocms.com) had high latency or failed with timeout errors (503).
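
For API clients that want to tolerate short windows like this one, a common approach is to retry 503 responses and timeouts with exponential backoff and jitter. The sketch below is only an illustration: `call_with_backoff`, its parameters, and the `send` callable (assumed to return a `(status, body)` pair or raise `TimeoutError`) are hypothetical and not part of any DatoCMS client.

```python
import random
import time


def call_with_backoff(send, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call send() until it returns a non-503 status or attempts run out.

    send is a hypothetical zero-argument callable that performs one request
    and returns (status, body); it may raise TimeoutError.
    """
    status, body = None, None
    for attempt in range(max_attempts):
        try:
            status, body = send()
            if status != 503:
                return status, body
        except TimeoutError:
            status, body = None, None
        if attempt < max_attempts - 1:
            # Exponential backoff: 0.5s, 1s, 2s, ... plus random jitter so
            # that many clients do not retry at the same instant.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay))
    # All attempts failed: return the last status seen (None after a timeout).
    return status, body
```

Capping the number of attempts matters here: unbounded retries from many clients would add to the load during an incident rather than relieve it.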

The admin interface was affected too, since it uses the CMA endpoint itself.

The cause was an unusual amount of traffic, which we are going to investigate further.

We are monitoring the situation right now.

Posted at Nov 2, 11:45 GMT+00:00