runbook.cloud combines hard-won experience with state-of-the-art machine learning techniques to help identify problems inside your AWS account, whatever the cause.
When you subscribe to runbook.cloud, a role is added that gives us read-only access to metrics within your account. Using this role, we continually discover which AWS services you are using and monitor them for problems. Using machine learning, we learn what is and isn't a problem for your infrastructure, and produce alerts accordingly.
The ML Difference
Traditionally, systems monitoring has been a process of setting thresholds which indicate a problem - for example, "raise a warning if CPU is above 85%, raise a critical alarm if it is above 95%".
This leads to many unnecessary alerts, because whilst this condition may be indicative of a problem for one application, it may be perfectly normal for another. Alerts are then either ignored or disabled, leading to missed opportunities to spot problems.
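To make the threshold problem concrete, here is a minimal sketch of the rule quoted above (warning above 85%, critical above 95%). The function name and the sample CPU readings are hypothetical, chosen only for illustration; they do not come from any real system.

```python
# Static thresholds taken from the example rule in the text.
WARNING_THRESHOLD = 85.0
CRITICAL_THRESHOLD = 95.0


def classify_cpu(cpu_percent: float) -> str:
    """Classify a single CPU reading using fixed thresholds."""
    if cpu_percent > CRITICAL_THRESHOLD:
        return "critical"
    if cpu_percent > WARNING_THRESHOLD:
        return "warning"
    return "ok"


# Hypothetical readings: a batch worker that is normally busy,
# and a web server that is normally idle but spikes once.
batch_worker = [92.0, 91.5, 93.0, 92.4]
web_server = [20.0, 22.0, 21.0, 92.0]

batch_alerts = [classify_cpu(x) for x in batch_worker]
web_alerts = [classify_cpu(x) for x in web_server]

print(batch_alerts)
print(web_alerts)
```

The batch worker raises a warning on every reading even though this load is normal for it, while the web server's one unusual spike produces exactly the same warning. The fixed rule cannot tell the two situations apart.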
runbook.cloud works differently, learning from the patterns in your infrastructure. Our systems learn what normal usage looks like and consider data points from across your entire AWS estate in combination, rather than judging each metric in isolation.
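As a rough illustration of learning from normal usage, the sketch below compares a new reading against a baseline computed from that resource's own history. This is an assumption made for illustration only: it uses a simple z-score on a single metric, which is far simpler than combining data points across an estate, and it is not a description of runbook.cloud's actual models. The function name, the `z_limit` parameter, and the sample data are all hypothetical.

```python
from statistics import mean, stdev


def is_anomalous(history: list[float], value: float, z_limit: float = 3.0) -> bool:
    """Flag a reading that deviates strongly from this resource's own history.

    Illustrative z-score check only; not runbook.cloud's real method.
    """
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # A perfectly flat history: any change at all is unusual.
        return value != baseline
    return abs(value - baseline) / spread > z_limit


# Hypothetical CPU histories for two different workloads.
batch_history = [92.0, 91.5, 93.0, 92.4, 91.8]  # busy is normal here
web_history = [20.0, 22.0, 21.0, 19.5, 21.5]    # idle is normal here

# The same 92% reading is judged against each workload's own baseline.
print(is_anomalous(batch_history, 92.0))
print(is_anomalous(web_history, 92.0))
```

Because the baseline is learned per resource, a 92% reading is treated as ordinary for the busy batch worker but as an anomaly for the normally idle web server.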