Policy automation to eliminate configuration errors
Far too often, major security breaches can be traced back to a configuration error. Changes and adjustments to network and security configurations are unavoidable; they are a necessary part of managing a company’s technology environment. But it’s important to recognize that they are also risky and can have unexpected consequences – from service interruptions, performance degradation and unintended downtime to security breaches and violations of compliance requirements.
A complex environment
On the surface, it might seem like configuration errors should be an easily solvable problem: organizations should simply pay more attention to any changes and manually make sure all settings are correct every time a change is made. They should set up policies, ensure they’re followed, and require that any new adjustments are checked against those policies. Many organizations adopted the Four Eyes principle to reduce mistakes: one person designs and requests a change, and another person approves it. Sometimes a third person then implements it.
Unfortunately, this is all easy on paper but not in practice. Today’s large organizations are complex, with many moving parts to deal with at any given time. A variety of teams also have the ability to make changes and adjustments, which makes it far more difficult to ensure correct configurations. Many of these teams use different terminology and tooling as well.
Visibility into the entire environment is also an issue. If you want to review every change and ensure it complies with security policy, you need to see, or be alerted to, every change or adjustment. And even with full visibility, manually reviewing and approving all changes is simply not humanly possible.
When all the variables are added up, the workload becomes overwhelming. There are far too many tasks to accomplish and too many potential gaps to cover. To successfully control how every update, change and addition is implemented – and to understand how each change affects the environment and other changes that are already “in flight” – the only solution is to embrace automation.
Automation enables agility
No one wants to see a security breach happen because configurations and changes weren’t watched closely enough, but if you spend all your time on this issue, other important work goes undone. It’s also a waste of critical resources to have security teams focus on mundane tasks instead of more strategic activities, especially when a solution is available. The key to accomplishing more than seems possible is to fully embrace automation for configuration and change management.
There are several key functions where automation can be applied immediately to help gain control over configuration changes:
Automatic change analysis and design: There is no such thing as a simple configuration change. Even what appears to be the easiest, most benign change could cause an error. For example, suppose you’re adding a host to a network group to provide access, and you are unaware that the same group is used in a different place to block traffic. If you aren’t paying close attention, you’ve let a potential issue slip by. A simple issue like this could increase the attack surface and leave your systems overexposed, or block access to a critical system or service. Your team would then need to spend a great deal of time troubleshooting the problem and figuring out where a misstep was made – or worse, how to mitigate a breach!
By adding automation to network visibility, you get an overview of the entire organization with areas of critical importance highlighted, so you can see recent changes, pending requests and potential issues, and know where to concentrate your time.
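The network group example above can be sketched in a few lines of Python. This is a minimal illustration, not any product's method: the `Rule` type, the rule names and the `analyze_group_change` helper are all hypothetical. The idea is simply to list every rule that references a group before a host is added to it, and to warn when the same group is used to both allow and block traffic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical, simplified firewall rule keyed on a source group."""
    name: str
    action: str        # "allow" or "deny"
    source_group: str
    destination: str


def analyze_group_change(rules, group):
    """Report every rule that would be affected by adding a host to `group`."""
    affected = [r for r in rules if r.source_group == group]
    actions = {r.action for r in affected}
    return {
        "affected_rules": [r.name for r in affected],
        # True when the group is used to grant access in one place
        # and to block traffic in another.
        "conflicting_use": {"allow", "deny"} <= actions,
    }


# Illustrative rule base (assumed data, not from a real environment).
RULES = [
    Rule("app-access", "allow", "partners", "app-server"),
    Rule("db-lockout", "deny", "partners", "finance-db"),
    Rule("web-access", "allow", "staff", "intranet"),
]

report = analyze_group_change(RULES, "partners")
print(report)
```

Here the request to add a host to `partners` would surface both the access rule and the blocking rule, which is exactly the side effect a manual reviewer is likely to miss.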
Guardrails and policy compliance: With automation, all requests can be automatically reviewed against security policy and standards, showing what potential effect they’ll have on the overall environment. You can also easily prove compliance – or see where changes could put compliance at risk. Change requirements, or developer guardrails, can be established to ensure nothing is approved that could create a security issue or affect normal operations.
Will making the change live potentially cause a problem? Is there another element of your environment that will need to be tweaked to support the change? Automation can answer these questions and automatically approve or deny a request or flag a change as needing direct review and adjustment to maintain compliance.
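As a rough sketch of the approve, deny or flag decision described above, the following Python checks a change request against two guardrails. Everything here is an assumption made for illustration: the zone names, the forbidden zone pair, the list of ports that require human review, and the `evaluate` function itself. A real policy would be far richer.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    APPROVE = "approve"
    DENY = "deny"
    REVIEW = "review"   # flagged for direct human review


@dataclass(frozen=True)
class ChangeRequest:
    """Hypothetical request to open traffic between two zones."""
    source_zone: str
    dest_zone: str
    port: int


# Assumed policy: traffic that must never be allowed,
# and ports sensitive enough to need a person's sign-off.
FORBIDDEN_ZONE_PAIRS = {("internet", "database")}
REVIEW_PORTS = {22, 3389}


def evaluate(request):
    """Return the guardrail decision for a single change request."""
    if (request.source_zone, request.dest_zone) in FORBIDDEN_ZONE_PAIRS:
        return Decision.DENY
    if request.port in REVIEW_PORTS:
        return Decision.REVIEW
    return Decision.APPROVE


print(evaluate(ChangeRequest("internet", "database", 5432)).value)
print(evaluate(ChangeRequest("office", "dmz", 22)).value)
print(evaluate(ChangeRequest("office", "dmz", 443)).value)
```

Checking the hard prohibition first means a forbidden request is denied outright rather than being routed to a reviewer who might approve it by mistake.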
Automated reporting, documentation, and auditing: All changes, reworked configurations and requests should be logged and documented. This task alone could be a full-time job for a member of your team. Instead, look to automated tools to maintain accessible and actionable audit information. A comprehensive audit trail should include the device or platform whose configuration was changed, the exact time of the change, the configuration details, the people who were involved (requestor, approvers, implementer), and the change context such as the project or application.
The goal is to enable continuous improvement of your security policies, management processes and, ultimately, continuous reduction of your attack surface. The only way to truly be successful in doing so is to learn from the past and apply lessons learned. Auditing and documentation of changes are key to having a more robust security posture.
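The audit trail fields listed above map naturally onto a small record type. The sketch below is one possible shape, with hypothetical field names and sample values; it serializes each record as a JSON line so the trail stays searchable.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditRecord:
    """One configuration change, with the fields an audit trail should keep."""
    device: str        # device or platform whose configuration changed
    changed_at: str    # exact time of the change, ISO 8601, UTC
    details: str       # the configuration details
    requestor: str
    approvers: tuple
    implementer: str
    context: str       # project or application behind the change


def make_record(device, details, requestor, approvers, implementer,
                context, changed_at=None):
    """Build a record, stamping the current UTC time if none is given."""
    when = changed_at or datetime.now(timezone.utc)
    return AuditRecord(device, when.isoformat(), details, requestor,
                       tuple(approvers), implementer, context)


def to_log_line(record):
    """Serialize a record as a single JSON line with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)


# Illustrative values only.
record = make_record(
    device="fw-edge-01",
    details="add host 10.0.4.17 to group partners",
    requestor="alice",
    approvers=["bob"],
    implementer="carol",
    context="partner-portal",
    changed_at=datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc),
)
print(to_log_line(record))
```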
Preventing issues and speeding up recovery
Experts agree that a significant part of the recovery time during incident response or when an error is uncovered is actually spent figuring out what configuration was changed, when, why and by whom. If you have already set up these control processes and have embraced automation, you’ll then quickly have the necessary information in front of you when there’s a crisis – and can concentrate on rolling back any changes, halting a breach and speeding the path to repair or recovery.
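With an audit trail in place, the "what changed, when, why and by whom" question becomes a simple lookup. The following sketch assumes audit records are dictionaries with hypothetical `device`, `changed_at`, `requestor` and `context` keys, and returns the changes made to a device in a window before an incident, most recent first.

```python
from datetime import datetime, timedelta, timezone


def changes_before_incident(records, device, incident_time, window):
    """Return changes to `device` within `window` before the incident,
    newest first, as rollback candidates."""
    start = incident_time - window
    hits = [
        r for r in records
        if r["device"] == device and start <= r["changed_at"] <= incident_time
    ]
    return sorted(hits, key=lambda r: r["changed_at"], reverse=True)


def utc(day, hour):
    """Helper for the sample data below (January 2024, UTC)."""
    return datetime(2024, 1, day, hour, tzinfo=timezone.utc)


# Illustrative audit records (assumed data).
RECORDS = [
    {"device": "fw-edge-01", "changed_at": utc(10, 8),
     "requestor": "alice", "context": "partner-portal"},
    {"device": "fw-edge-01", "changed_at": utc(15, 9),
     "requestor": "dave", "context": "vpn-migration"},
    {"device": "fw-core-02", "changed_at": utc(15, 10),
     "requestor": "erin", "context": "vpn-migration"},
    {"device": "fw-edge-01", "changed_at": utc(15, 11),
     "requestor": "alice", "context": "partner-portal"},
]

suspects = changes_before_incident(
    RECORDS, "fw-edge-01", utc(15, 12), timedelta(hours=24)
)
for change in suspects:
    print(change["changed_at"].isoformat(), change["requestor"], change["context"])
```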
In addition, by integrating your policy automation with an incident response plan or system, you can decrease dwell time and speed up incident response. The risk of a missed configuration error causing a security breach is significantly reduced when you implement policy-based automation.
Implementing solutions that automate these error-prone, repetitive tasks and maintain a vigilant, 24/7 watch over your environment will go a long way toward helping you prevent, and more easily recover from, configuration errors.