Datadog’s new AI/ML capabilities enable IT teams to resolve application performance issues
Datadog announced two new capabilities for Watchdog, its AI engine: Log Anomaly Detection and Root Cause Analysis.
In today’s highly dynamic application environments, engineers cannot anticipate, and write detection rules for, every anomalous application behavior that could affect performance and availability. Embedded across Datadog’s observability platform, Watchdog analyzes billions of events and learns what “normal” behavior looks like so that it can proactively alert users to anomalies they did not anticipate. The two new Watchdog capabilities take this one step further.
Log Anomaly Detection automatically understands and baselines normal patterns in logs, and proactively discovers abnormalities such as new text patterns, meaningful changes in data volumes of existing patterns and error outliers. With this new capability, Datadog Log Management users are able to quickly see and address hidden issues before they turn into critical incidents.
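As a rough illustration of the idea of baselining log patterns, the sketch below masks variable tokens so that similar lines share a pattern, then compares a current window against a baseline window of the same duration. It is not Datadog's implementation: the function names, the masking rules, and the `ratio` threshold are all assumptions chosen for the example.

```python
import re
from collections import Counter


def to_pattern(line: str) -> str:
    """Collapse variable tokens so similar log lines share one pattern."""
    line = re.sub(r"0x[0-9a-fA-F]+", "<HEX>", line)  # mask hex literals first
    line = re.sub(r"\d+", "<NUM>", line)             # then mask remaining digit runs
    return line.strip()


def detect_log_anomalies(baseline, current, ratio=3.0):
    """Compare two windows of log lines (assumed to cover equal durations).

    Returns (kind, pattern) tuples for patterns that are absent from the
    baseline, or whose count grew by at least `ratio` (a hypothetical threshold).
    """
    base = Counter(to_pattern(line) for line in baseline)
    cur = Counter(to_pattern(line) for line in current)
    findings = []
    for pattern, count in cur.items():
        if pattern not in base:
            findings.append(("new_pattern", pattern))
        elif count >= ratio * base[pattern]:
            findings.append(("volume_spike", pattern))
    return findings


baseline = ["GET /users/1 200", "GET /users/2 200", "cache miss key 5"]
current = [
    "GET /users/3 200",
    "cache miss key 7",
    "cache miss key 8",
    "cache miss key 9",
    "timeout contacting db 12",
]
for kind, pattern in detect_log_anomalies(baseline, current):
    print(kind, "->", pattern)
```

A production system would presumably learn patterns and thresholds statistically rather than rely on fixed regular expressions and a fixed ratio; the sketch only shows the shape of the comparison.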
Root Cause Analysis works with Datadog’s APM products to automatically identify causal relationships between symptoms of an issue across an organization’s services. By doing so, it pinpoints the precise service where an issue originated. Additionally, this capability identifies the business impact of an issue when Datadog’s Real User Monitoring (RUM) is deployed in the environment. This unique new capability can often establish causality and real-user impact in minutes, each of which often takes hours or days to determine through manual troubleshooting.
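One simple way to picture tracing symptoms back to an originating service is a walk over a service dependency graph: a service showing symptoms is a root-cause candidate if none of the services it calls are also showing symptoms. The sketch below uses that heuristic only. It is an assumption made for illustration, not a description of how Watchdog works, and the service names and the `root_cause_candidates` function are hypothetical.

```python
def root_cause_candidates(calls, unhealthy):
    """Return unhealthy services whose direct dependencies are all healthy.

    calls: dict mapping a service name to the list of services it calls.
    unhealthy: set of service names currently showing symptoms.
    """
    roots = []
    for service in sorted(unhealthy):
        dependencies = calls.get(service, [])
        # If a dependency is also unhealthy, this service's symptoms are
        # more likely a consequence than the origin.
        if not any(dep in unhealthy for dep in dependencies):
            roots.append(service)
    return roots


# Hypothetical call graph: web -> checkout -> payments -> db
calls = {
    "web": ["checkout"],
    "checkout": ["payments", "inventory"],
    "payments": ["db"],
}
unhealthy = {"web", "checkout", "payments"}
print(root_cause_candidates(calls, unhealthy))
```

In this example the errors seen in `web` and `checkout` are attributed to `payments`, the deepest unhealthy service in the call chain.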
“The constant challenge with AI is balancing alert volume. If the alert volume is too high, it may overload your monitoring systems and lead to alert fatigue; if it’s too low, you might miss something that could critically impact your business,” said Brent Montague, Site Reliability Architect at Cvent. “Watchdog helps our teams focus on the signals that matter by surfacing events that typically aren’t caught by traditional monitors. Looking at Watchdog every morning helps me gain a better understanding of everything happening across our entire technology stack. With the help of Root Cause Analysis, we have all the vital information we need so that our teams are able to investigate and address business-critical issues quickly and efficiently.”
“With the increasing complexity of cloud-based environments and the constantly growing volumes of telemetry data, businesses are finding it challenging to separate key signals from all the noise when they are monitoring their technology stack,” said Omri Sass, Group Product Manager of Application Performance Monitoring at Datadog. “We built Watchdog as a ubiquitous layer of intelligence that serves in-context insights directly in the user’s workflow and points them to the areas that need their attention the most.”
Both Root Cause Analysis and Log Anomaly Detection require no additional configuration and are available to Datadog APM and Log Management users out of the box.