How to combat alert fatigue in cybersecurity

In this Help Net Security interview, Ken Gramley, CEO at Stamus Networks, discusses the primary causes of alert fatigue in cybersecurity and DevOps environments. Alert fatigue results from the overwhelming volume of event data generated by security tools, the prevalence of false positives, and the lack of clear event prioritization and actionable guidance.

What are the primary causes of alert fatigue in cybersecurity and DevOps environments?

Alert fatigue is the result of several related factors.

First, today’s security tools generate an incredible volume of event data. This makes it difficult for security practitioners to distinguish between background noise and serious threats.

Second, many systems are prone to false positives, which are triggered either by harmless activity or by overly sensitive anomaly thresholds. This can desensitize defenders who may end up missing important attack signals.

The third factor contributing to alert fatigue is the lack of clear prioritization. The systems generating these alerts often don’t have mechanisms that triage and prioritize the events. This can lead to paralyzing inaction because the practitioners don’t know where to begin.

Finally, when alert records or logs do not contain sufficient evidence and response guidance, defenders are unsure of the next actionable steps. This confusion wastes valuable time and contributes to frustration and fatigue.

Reducing alert fatigue is a significant challenge for organizations. How can they optimize their security tech stack to overcome this challenge?

It really is a challenge. We’ve seen organizations that, unfortunately, have simply decided to log all the alerts and only inspect them when an incident has been detected by a more trusted system. This logged alert data often does contain a substantial trail of evidence that can be critical in an incident investigation, but the “store and ignore” approach is not the ideal solution.

The three most important components of a modern security operations center (SOC) are the network detection and response (NDR) system, the endpoint detection and response (EDR) system, and the central analytics engine (usually a security information and event management (SIEM) system). Each of these elements of the so-called “SOC visibility triad” plays an important role in reducing alert fatigue.

Your NDR and EDR systems must have a reliable mechanism for identifying serious and imminent threats in their respective domains with ultra-high precision – that is, with near-zero false positives. This drives confidence in the toolset and can provide a starting point for the security analyst to begin an investigation. In addition, they should provide some form of automated event triage or prioritization, which can serve to highlight the next tier of events that the SOC team must investigate.
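To make the idea of automated triage concrete, here is a minimal Python sketch. It is an illustration only, not how any particular NDR or EDR product works: the `Alert` fields, the 1-to-3 scales, the scoring formula, and the `min_confidence` cutoff are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    signature: str
    severity: int           # hypothetical scale: 1 (low) to 3 (high)
    confidence: float       # hypothetical detector confidence, 0.0 to 1.0
    asset_criticality: int  # hypothetical scale: 1 (low) to 3 (high)


def priority(alert: Alert) -> float:
    """Score an alert; higher means 'look at this first' (assumed formula)."""
    return alert.severity * alert.asset_criticality * alert.confidence


def triage(alerts, top_n=10, min_confidence=0.5):
    """Drop low-confidence alerts, then return the top_n by priority."""
    kept = [a for a in alerts if a.confidence >= min_confidence]
    return sorted(kept, key=priority, reverse=True)[:top_n]


sample = [
    Alert("port-scan", severity=1, confidence=0.9, asset_criticality=1),
    Alert("c2-beacon", severity=3, confidence=0.95, asset_criticality=3),
    Alert("odd-dns", severity=2, confidence=0.3, asset_criticality=2),
]
for alert in triage(sample):
    print(alert.signature, round(priority(alert), 2))
```

In this sketch the low-confidence alert is filtered out before ranking, so the analyst starts from a short list ordered by likely impact rather than from the raw event stream.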

Lastly, the NDR and EDR must collect all relevant artifacts associated with a given security event, and if possible, correlate and organize them into an incident timeline to accelerate the investigation and allow defenders to eradicate the threat before it’s able to cause any damage.
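As a rough sketch of what organizing artifacts into an incident timeline could look like, the Python below groups event records by host and orders them by time. The record layout (`host`, `timestamp`, `type` keys and ISO 8601 timestamps without a timezone suffix) is an assumption for illustration; real NDR and EDR records will differ.

```python
from collections import defaultdict
from datetime import datetime


def build_timelines(events):
    """Group events by host and sort each group chronologically.

    Each event is assumed to be a dict with hypothetical keys
    'host' and 'timestamp' (ISO 8601, e.g. '2024-05-01T10:05:00').
    """
    timelines = defaultdict(list)
    for event in events:
        timelines[event["host"]].append(event)
    for host_events in timelines.values():
        host_events.sort(key=lambda e: datetime.fromisoformat(e["timestamp"]))
    return dict(timelines)


events = [
    {"host": "web-01", "timestamp": "2024-05-01T10:05:00", "type": "alert"},
    {"host": "web-01", "timestamp": "2024-05-01T10:01:00", "type": "dns"},
    {"host": "db-02", "timestamp": "2024-05-01T09:00:00", "type": "flow"},
]
for host, timeline in build_timelines(events).items():
    print(host, [e["type"] for e in timeline])
```

Grouping by host is only one possible correlation key; a session identifier or user account could serve the same purpose.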

The NDR and EDR are critical sources of security telemetry into your SIEM, so that’s where the next level of alert fatigue reduction takes place. Each individual event record or log that the NDR and EDR send to the SIEM should be enriched with substantial metadata that provides the SIEM analytics engine and its users with all the relevant evidence and related information needed to inform the incident response efforts. In addition, these detailed event records can feed an additional layer of correlated threat detection in the SIEM itself.

How can organizations use contextual information to enrich alerts and make them more actionable?

This is critical. There are several types of context that can be helpful here. Organization-specific information, such as host names and familiar network names, makes it much easier to identify the assets under attack, or those being used to propagate malware, than an IP address alone. If this context is not included in the alert record or log, an analyst has to pivot into a different system to look it up.
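A minimal Python sketch of this kind of enrichment follows. The inventory contents, the alert field names (`src_ip`, `dest_ip`), and the added `*_hostname` and `*_network` keys are all hypothetical; in practice the lookup would typically come from a CMDB, DHCP logs, or a similar source.

```python
# Hypothetical asset inventory mapping IP addresses to organization context.
ASSET_INVENTORY = {
    "10.0.4.17": {"hostname": "hr-laptop-22", "network": "HR-VLAN"},
    "10.0.9.5": {"hostname": "build-server-01", "network": "DevOps-VLAN"},
}


def enrich(alert, inventory=ASSET_INVENTORY):
    """Return a copy of the alert with host and network names added.

    Addresses that are not in the inventory are left without extra fields.
    """
    enriched = dict(alert)
    for side in ("src", "dest"):
        info = inventory.get(alert.get(f"{side}_ip"))
        if info:
            enriched[f"{side}_hostname"] = info["hostname"]
            enriched[f"{side}_network"] = info["network"]
    return enriched


raw_alert = {
    "signature": "suspicious-outbound-connection",
    "src_ip": "10.0.4.17",
    "dest_ip": "203.0.113.9",
}
print(enrich(raw_alert))
```

With the names attached to the record itself, the analyst reading the alert does not need to leave it to learn which machine and network segment are involved.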

Another kind of context is the associated metadata and artifacts. What I’m talking about here are things like protocol transaction logs, file attachments, and even complete packet captures (PCAPs) of the session during which the alert took place.

This additional information helps SOC personnel assess the severity, sources, and causes of an incident more quickly, making these alerts much more actionable.

How can organizations balance the need for transparency with the potential risks of exposing sensitive information?

This topic is near and dear to my heart. At Stamus Networks, we have a very strong commitment to extreme transparency and data sovereignty – both of which factor into this question. That point notwithstanding, balancing transparency with information security is a difficult tightrope walk for organizations. There are a number of strategies that organizations can employ, but here are some that stand out to me and that we tend to see in the practices of successful security leaders.

First, they build a program of controls based on a recognized security framework – such as NIST or ISO 27001. This not only creates a defensible program, but also ensures they are considering the big picture and not forgetting important controls.

Next, they place a strong emphasis on instrumenting their systems and networks with extensive security monitoring. This allows them to spot serious threats and unauthorized activity earlier in the kill chain.

Additionally, these organizations develop a clear and transparent communication plan that outlines what information can and cannot be shared. This builds trust and avoids confusion within the organization and with stakeholders.

Finally, these organizations pay particular attention to where their data resides and where it is processed – and they exercise what I refer to as “extreme data sovereignty,” which is maintaining tight control over data residency and processing.

What role do regulatory requirements and industry standards play in promoting transparency and accountability in cybersecurity?

Regulatory requirements and industry standards play an important role in promoting transparency and accountability by driving both breach disclosure and the implementation of strong cybersecurity controls. Regulations such as the SEC’s Form 8-K filings in the U.S. and GDPR in the European Union mandate reporting data breaches to authorities and, in some cases, affected individuals. This compels the organizations to be upfront about security incidents, fostering public awareness and preventing potential cover-ups.

The 10-K filing requirement from the SEC requires public companies to disclose details about their cybersecurity programs. Similarly, the EU’s NIS Directive, which focuses on essential service providers, compels them to implement risk management measures. Because these controls are made visible, stakeholders (and shareholders) can assess an organization’s cybersecurity posture and hold it accountable for maintaining strong defenses.

How can organizations leverage new technologies and frameworks to improve transparency and accountability?

The elements of the “SOC visibility triad” I mentioned earlier – NDR, EDR, and SIEM – are among the critical new technologies that can help. These systems constantly monitor networks for suspicious activity, allowing for faster identification and mitigation of threats. Real-time threat detection fosters transparency as organizations can communicate about ongoing threats and the actions being taken.

I’ve already mentioned the importance of cybersecurity frameworks – these help organizations identify, protect, detect, respond to, and recover from cyberattacks. By publicly outlining their approach based on the frameworks, organizations demonstrate their commitment to cybersecurity and can be held accountable for following their established processes.
