The slow death of static security detections: Beginning of SIEM deployments
Machines, both mechanical and electric, have always been good at counting things. Ask anyone from an earlier generation who still uses a Victor Champion adding machine from the early 1950s, even though replacement paper rolls and ink ribbon are required. One may wonder why someone wouldn’t just use a battery-operated calculator, but we all know that letting go of old, familiar paradigms is hard.
Static security detections—using a single piece of what might be labeled “critical” information, or even several pieces combined, and making a decision to act based on it—have gone the way of the Victor Champion. We’ve been outsmarted and we appear to be in denial. Every large data breach starts with some form of social engineering. Whether it’s an email link, a webpage that looks like a bank website but really belongs to an attacker, or a piece of malicious code we picked up that pops a message on our screen telling us to change our password, someone someplace will click it. All of us are the weakest link in the chain. As long as humans are involved, they will be socially engineered (and they’ll make configuration errors, but more on that later).
We are certainly in denial. More and more of us know someone, or know someone who knows someone, who has been on a security team that has had a data breach. The corporate reaction is now fairly predictable. The CEO apologizes to a number of interchangeable constituencies—shareholders, employees, customers, and others—tells us that the data breach is due to unauthorized access and that “we’re getting to the bottom of it,” and pledges to throw money at the problem in the form of more people and better detection technology.
The problem is that there’s a cyber security skills shortage. How many security colleagues do you know who have been through a data breach? In large enterprises, the security information and event management (SIEM) system collects 10,000 alerts per day from security point solutions, and of those, maybe 8 or 9 percent are not false positives or false negatives. This means that about 800 alerts per day should be followed up on by the incident response (IR) team. But who does that? The security team in most organizations simply isn’t large enough to handle the volume, so the IR team has to manually prioritize 800 alerts per day.
Static detections and the SIEM
In most modern security infrastructures, the SIEM is the key starting point for investigations. The SIEM takes in the static detection data it collects from all the various data sources and has its own set of specific rules (or questions to be asked of the data) to perform correlations that eliminate false positives and false negatives. These rules also support specific high-risk security scenarios defined by the vendor or the user (or a contractor).
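To make the idea concrete, a static correlation rule is essentially a fixed predicate evaluated against incoming events. The sketch below is hypothetical; the field names, threshold, and window are illustrative rather than any vendor’s rule language, but it captures the shape of the question a typical rule asks:

```python
# Hypothetical static correlation rule: raise one alert when the same user
# fails to authenticate at least N times from an external IP inside a fixed
# time window. Field names and thresholds are illustrative only.
from collections import defaultdict
from datetime import timedelta

FAILED_LOGIN_THRESHOLD = 5          # fixed, analyst- or vendor-chosen number
WINDOW = timedelta(minutes=10)      # fixed correlation window

def correlate_failed_logins(auth_events, external_ips):
    """auth_events: dicts with 'time' (datetime), 'user', 'src_ip', 'outcome'."""
    failures = defaultdict(list)
    for e in auth_events:
        if e["outcome"] == "failure" and e["src_ip"] in external_ips:
            failures[(e["user"], e["src_ip"])].append(e["time"])

    alerts = []
    for (user, ip), times in failures.items():
        times.sort()
        for start in times:
            in_window = [t for t in times if start <= t <= start + WINDOW]
            if len(in_window) >= FAILED_LOGIN_THRESHOLD:
                alerts.append({"user": user, "src_ip": ip, "first_seen": start})
                break
    return alerts
```

The rule answers exactly one question, with numbers someone chose in advance; anything the question doesn’t cover never produces an alert.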
In addition, it supports key performance metrics reporting, compliance reporting, and ticketing and investigation workflows. SIEM customers generally fall into three categories. From the least prevalent to the most prevalent, these are:
1. Active SIEM deployments – These customers actively curate and manage between 300 and 600 rules, with 600 being the very upper edge of the range to which most SIEMs can scale. Most of the out-of-the-box rule content has been removed and replaced by the security team. Many of the rules have been customized internally based on red-team/blue-team exercises. New scenarios and data sources are added at a regular cadence.
2. Passive SIEM deployments – For these customers, once they got their data into the SIEM, they stayed mostly reliant on the out-of-the-box rule content supplied by the vendor. I am aware of several SIEM customers with as few as seven or eight rules turned on. Very little tuning of the SIEM takes place to match the security and IT environments.
3. Combination of active and passive – This is a very wide range of SIEM deployments that can include some occasional active engagement by a consultant or changes as part of staff augmentation. In some cases, a reduction in security personnel through sudden attrition forces the team to see if there is further value to be gained through tuning. However, most of the time the team only reactively tunes their SIEM. Not much in the way of new data gets added.
There are several things that are common across SIEM platforms and deployments. Once a rule is created (or enabled out of the box), it asks a single, independent question about the data. This could involve one or more data sources. The questions or rules don’t build upon one another, nor do they dynamically ask different questions in different sequences depending on the circumstances. And if you are asking 300-600 questions of your data and the attacker’s methodology is outside the scope of your questions and scenarios, the attacker won’t be seen. In other words, this approach doesn’t cater to new types of attacks or scenarios not seen before.
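The limitation is easier to see in a sketch. The two rules below are hypothetical, but each is evaluated in isolation, with no state carried from one to the other, which is exactly why an attack that stays just under each individual threshold never surfaces:

```python
# Hypothetical illustration of independent static rules: each rule is a
# standalone predicate over a single event, and no rule sees what another
# rule has matched.
BLACKLIST = {"203.0.113.7"}  # illustrative "known bad" destination

RULES = [
    ("excessive_failed_logins",
     lambda e: e["type"] == "auth" and e.get("failures", 0) >= 5),
    ("blacklisted_destination",
     lambda e: e["type"] == "netflow" and e.get("dst_ip") in BLACKLIST),
]

def evaluate(event):
    # Every rule is asked in isolation; there is no state linking a
    # near-threshold login failure to a later suspicious outbound connection.
    return [name for name, predicate in RULES if predicate(event)]

# An attacker who fails four logins and then connects to an address that is
# not on the list trips neither rule, so the sequence is never surfaced.
print(evaluate({"type": "auth", "failures": 4}))                 # []
print(evaluate({"type": "netflow", "dst_ip": "198.51.100.9"}))   # []
```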
Finally, counting on your SIEM to detect attackers actively impersonating users with valid credentials is futile. Doing so requires your SIEM to analyze the log data of normal IT system usage, learn behaviors, and understand when enough normal behaviors are “off” to know that this isn’t the credential owner performing these operations. The most widely used SIEMs (and your other traditional security point solutions, for that matter) do not use normal data as context for abnormal behaviors—they just weren’t engineered that way.
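For contrast, the sketch below shows, in hypothetical and greatly simplified form, the kind of per-user baseline that this sort of behavioral detection requires: learning what “normal” looks like for each credential and scoring how many attributes of a new event fall outside it. This is precisely the machinery most SIEMs were never engineered to build.

```python
# Minimal, hypothetical sketch of per-user baselining: learn the credential
# owner's normal behavior, then count how many attributes of a new event
# fall outside that baseline. One oddity is routine; several together are
# what suggests someone else is using the credential.
from collections import Counter

class UserBaseline:
    def __init__(self):
        self.login_hours = Counter()   # hour of day -> count
        self.countries = Counter()     # source country -> count
        self.hosts = Counter()         # accessed host -> count

    def learn(self, event):
        self.login_hours[event["hour"]] += 1
        self.countries[event["country"]] += 1
        self.hosts[event["host"]] += 1

    def oddness(self, event):
        score = 0
        if self.login_hours[event["hour"]] == 0:
            score += 1
        if self.countries[event["country"]] == 0:
            score += 1
        if self.hosts[event["host"]] == 0:
            score += 1
        return score
```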
Some SIEMs have added statistical analysis capabilities to find activities that may not be normal. Examples are algorithms for computing means, averages, and standard deviations. While counting things is important, looking at user behavior this way in a dynamic business environment, in the absence of context, introduces a different set of false positives.
Using standard deviation to look at URL length over time may indicate embedded command and control, but an unusually long URL could also simply be a web services API call between two machines. Similarly, looking at session length without asset and application context also introduces the prospect of false positives. The business is constantly changing, and information about users, applications, services, and security detection systems and software is managed and monitored by different sets of people in different departments with different objectives.
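A quick illustration of the problem (the threshold and data here are made up): a standard-deviation test over URL lengths flags a long but perfectly benign web-services API call just as readily as it would flag embedded command and control.

```python
# Sketch of the standard-deviation approach described above, with
# hypothetical data and threshold: flag URLs whose length is far from the
# mean. A long but benign API call scores as "anomalous" with no context
# to say otherwise.
import statistics

def url_length_outliers(urls, z_threshold=3.0):
    lengths = [len(u) for u in urls]
    mean = statistics.mean(lengths)
    stdev = statistics.pstdev(lengths) or 1.0   # guard against zero variance
    return [u for u in urls if (len(u) - mean) / stdev > z_threshold]

normal = ["https://example.com/home"] * 200
api_call = ("https://api.example.com/v2/report?fields="
            + ",".join(f"col{i}" for i in range(60)))
print(url_length_outliers(normal + [api_call]))   # the benign API call is flagged
```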
Business roles and responsibilities change, people’s hours change, they go on vacation, and they log in from strange places. Most organizations are partly or entirely BYOD. Off-the-shelf algorithms expect linear data from static environments, and they don’t cope well with gaps in coverage caused by missing data.
The biggest takeaway here is the difference between the marketing perception and reality: SIEMs are marketed and sold as providing “security intelligence” and using “big data analytics.” Your SIEM is doing the job it was designed to do. It asks questions of your data, but it is up to the person viewing the results to supply the insight or intelligence: to understand what the answers mean in aggregate, what they might mean in a different order, and what they mean in context.
In summary, while the comparison breaks down pretty quickly, just like the Victor Champion, today’s SIEM is best at processing mathematical data inputs. Computers take this one step further because they can be programmed to perform statistical and mathematical correlations: understanding traffic spikes, watching for beaconing hosts, or seeing pre-defined relationships in data. This is what the SIEM was built for. It was not built to handle logical data correlations. These are best handled by a human being, because the human brain is best at taking into account identities, additional circumstances, and context.
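Spotting a beaconing host is a good example of the purely mathematical correlation a computer handles well, because it reduces to measuring the regularity of the intervals between connections. The sketch below uses made-up thresholds, not any particular product’s logic:

```python
# Sketch of a purely mathematical check a SIEM can do well: a host that
# "beacons" to a destination at near-constant intervals stands out because
# the variance of its inter-connection times is tiny. Thresholds are
# illustrative only.
import statistics

def looks_like_beaconing(timestamps, min_events=10, max_jitter_seconds=5.0):
    """timestamps: sorted connection times (in seconds) from one host to one destination."""
    if len(timestamps) < min_events:
        return False
    intervals = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return statistics.pstdev(intervals) <= max_jitter_seconds

# A host calling home every 60 seconds, give or take a second, is flagged;
# normal interactive browsing produces highly irregular intervals and is not.
```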