Incident response in the cloud can be simple if you are prepared
If your business has moved toward off-premises computing, there’s a bonus on top of the flexibility and scalability that services such as AWS and Microsoft 365 provide: incident response (IR) in the cloud can be far simpler than on-premises incident response.
There is a catch, though: All the tools you need to do IR reside in the platform of your favorite cloud providers and SaaS products, so you need to do some initial setup to be prepared for an incident.
Centralize your logging
Default log dashboards in the cloud are not built for incident response investigations. This is why SIEM solutions such as GCP Chronicle and Azure Sentinel exist on each of the major platforms. But these solutions only enhance the native features that can make IR in the cloud straightforward, if—and only if—those features are engaged.
Taking advantage of the cloud’s built-in incident response begins with centralizing all your logging.
In general, two sorts of actions can be logged:
- “Read” actions reveal information about the cloud environment and its components without modifying it.
- “Write” actions make changes to the environment, such as creating new accounts, adding new users and deploying services.
Logging “write” actions that modify the cloud account is crucial for detection. But for an incident investigation, it’s not enough. Thorough incident response requires the ability to see the full scope of actions taken by a threat actor, both “read” and “write” events.
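To illustrate the distinction, here is a minimal Python sketch that splits exported events into the two groups. It assumes CloudTrail-style JSON records, which carry a boolean `readOnly` field; other platforms name this differently, and the sample events are invented for illustration.

```python
def split_read_write(events):
    """Partition CloudTrail-style records into read events and write events."""
    reads, writes = [], []
    for event in events:
        # Some exports store the flag as the string "true" rather than a boolean.
        # Records without the flag are treated as writes (an assumption of this
        # sketch) so that nothing potentially state-changing is overlooked.
        if event.get("readOnly") in (True, "true"):
            reads.append(event)
        else:
            writes.append(event)
    return reads, writes


# Invented sample records; real ones carry many more fields.
sample_events = [
    {"eventName": "ListBuckets", "readOnly": True},
    {"eventName": "CreateUser", "readOnly": False},
    {"eventName": "GetObject", "readOnly": True},
]

reads, writes = split_read_write(sample_events)
print(len(reads), "read events,", len(writes), "write events")
```

If the read list comes back empty for an account you know is active, that is a hint that read events are being filtered out before they reach your central store.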
Setting up fully centralized logging is crucial and so is maintaining it. This requires checking the health and coverage of the feed.
Often we find that logging to the central logging solution was limited to cut costs. Meanwhile, the client team may assume it has a complete set of logs, because knowledge of these limitations was lost as people left the company. If your organization faces cost-related problems, we advise implementing procedures to store the filtered-out logs in cold storage, which allows ingestion to the centralized logging solution if required.
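One simple way to check the health of the feed is to look for unexpected silences in it. The sketch below assumes you can export event timestamps from your central store in the ISO 8601 UTC format that CloudTrail uses for `eventTime`. The 15-minute threshold is an arbitrary illustration; a quiet account may legitimately have longer gaps.

```python
from datetime import datetime, timedelta


def parse_time(value):
    """Parse a UTC timestamp such as "2024-01-01T10:00:00Z"."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")


def find_gaps(timestamps, max_gap=timedelta(minutes=15)):
    """Return (start, end) pairs where consecutive events are more than max_gap apart."""
    ordered = sorted(timestamps)
    gaps = []
    for earlier, later in zip(ordered, ordered[1:]):
        if later - earlier > max_gap:
            gaps.append((earlier, later))
    return gaps


# Invented timestamps standing in for an export from the central log store.
observed = [
    parse_time(t)
    for t in ("2024-01-01T10:00:00Z", "2024-01-01T10:05:00Z", "2024-01-01T11:30:00Z")
]

for start, end in find_gaps(observed):
    print("No events between", start.isoformat(), "and", end.isoformat())
```

Running a check like this per account or per log source, rather than over the whole feed at once, makes it easier to spot a single source that has stopped reporting.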
You can’t count on the platforms
Almost all providers allow you to download actions from a specific timespan using their default log portals. But we have found that these portals, although updated constantly, have limitations on the integrity of downloads that cover longer periods. And after downloading the logs, you still have to process them for analysis in some manner, which can create a major obstacle if you’re responding to an active incident.
Furthermore, most cloud log portals have throttling for the on-demand downloads to protect the overall availability of the log services for all clients. This can be a big problem if you’re investigating a sizeable cloud environment.
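If you do have to pull logs on demand, your tooling needs to cope with that throttling. The following is a generic retry-with-backoff sketch, not any provider's API: `ThrottledError` and `flaky_fetch` are hypothetical stand-ins for whatever your log client raises and calls.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical error a log client raises when the provider throttles a request."""


def fetch_with_backoff(fetch_page, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch_page(), retrying with exponential backoff when it is throttled."""
    for attempt in range(max_attempts):
        try:
            return fetch_page()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # give up after the final attempt
            # Wait roughly 1s, 2s, 4s, ... plus random jitter between retries.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)


# Simulated log source that is throttled twice before it answers.
attempts = {"count": 0}


def flaky_fetch():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ThrottledError("rate limit exceeded")
    return [{"eventName": "ConsoleLogin"}]


# sleep is stubbed out here so the example finishes instantly.
events = fetch_with_backoff(flaky_fetch, sleep=lambda seconds: None)
print(attempts["count"], "attempts,", len(events), "event(s) fetched")
```

Every retry is time spent waiting rather than investigating, which is the practical cost of relying on on-demand downloads during an incident.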
In the best-case scenario, a lack of centralized logs puts you an hour behind—an hour that may be crucial for your response. That’s why all the cloud services still need centralized logging.
Default logging is not enough
Often the services stacked on top of your cloud account are where you’ll suffer most of the impact of a cyber incident. Unfortunately, few of these services have logging turned on by default.
Logging for each service you use in the cloud should therefore be considered specifically. Enabling it usually requires only simple configuration, but defining that configuration takes time. And the costs of failing to set up logging on these services can be severe.
Consider a case where logs are not enabled, and an AWS S3 storage bucket has been made public by mistake. When a regulator asks you who accessed this data, you won’t be able to answer as the evidence does not exist. This can lead to larger fines and more consequences for your organization.
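A periodic check can catch this gap before a regulator does. The sketch below assumes you have already collected each bucket's logging status in the shape of the S3 `GetBucketLogging` response, where a `LoggingEnabled` element is present only when server access logging is turned on. The bucket names are invented, and the collection step itself (for example with an SDK) is left out.

```python
def buckets_without_logging(logging_status):
    """Return the names of buckets whose status lacks a LoggingEnabled element.

    logging_status maps bucket name -> dict shaped like a GetBucketLogging response.
    """
    return sorted(
        name
        for name, response in logging_status.items()
        if "LoggingEnabled" not in response
    )


# Invented example data; real responses also contain response metadata.
collected_status = {
    "public-assets": {},
    "billing-exports": {
        "LoggingEnabled": {
            "TargetBucket": "central-logs",
            "TargetPrefix": "billing-exports/",
        }
    },
}

for bucket in buckets_without_logging(collected_status):
    print("Access logging is not enabled on bucket:", bucket)
```

The same pattern applies to other services: collect the logging configuration for each one and flag anything where it is absent.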
Tag and map your assets
One of the most difficult parts of on-premises IR is tracking assets. This often hinders responders as they try to prioritize which computers to secure or investigate first.
In the cloud, mapping an environment is far easier than in an on-premises network, and you can do it from anywhere. Evidence collection is also simplified. Using cloud-native tools instead of third-party tooling, you can capture evidence such as volume snapshots from the comfort of your home or office without having to send someone to a data center. However, these snapshots may be almost worthless if they aren’t properly tagged to give the investigation team the context around them.
At a bare minimum, cloud resources should be tagged with the cost center, the person responsible, the relevant service and the resource’s role within that service. Without this information, valuable time will be lost trying to derive the context around the resource.
Volume snapshots without proper tagging, for instance, rarely provide the evidence necessary for your investigation. Investigation of a single volume snapshot may quickly become a review of all volume snapshots. And again, crucial time is lost.
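The minimum tag set described above is easy to enforce with a small audit. This sketch assumes resource tags are available as plain key/value dictionaries; the tag key names (`cost-center`, `owner`, `service`, `role`) and the resource IDs are hypothetical, so substitute your own naming convention.

```python
# Hypothetical tag keys for cost center, person responsible, service and role.
REQUIRED_TAGS = {"cost-center", "owner", "service", "role"}


def missing_tags(resource_tags):
    """Return the required tag keys that are absent or blank on one resource."""
    present = {
        key.lower() for key, value in resource_tags.items() if value.strip()
    }
    return sorted(REQUIRED_TAGS - present)


def audit(resources):
    """Map each under-tagged resource ID to the list of tags it is missing."""
    report = {}
    for resource_id, tags in resources.items():
        gaps = missing_tags(tags)
        if gaps:
            report[resource_id] = gaps
    return report


# Invented inventory of two volume snapshots.
inventory = {
    "snap-0001": {
        "cost-center": "cc-42",
        "owner": "alice",
        "service": "billing",
        "role": "db-volume",
    },
    "snap-0002": {"owner": "bob", "service": ""},
}

for resource_id, gaps in audit(inventory).items():
    print(resource_id, "is missing tags:", ", ".join(gaps))
```

Treating blank values as missing matters in practice, since an empty `owner` tag gives a responder no more context than no tag at all.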
Establish responder accounts
Even if you have all the logs you need, your security team may not be able to access them. Therefore, you need responder accounts for your cloud environment created before an incident begins. These accounts become critical if you need to share logs with an external vendor for third-party assurance or support.
With indirect, or read-only, access, these responder accounts can access logs and log dashboards and begin an investigation. These accounts won’t be able to make changes to the environment, so responders will need to contact the cloud administrators to remediate the threat actor’s activity. However, if your security team understands the implications of changing policies and resetting credentials in the cloud environment, direct access for responder accounts can make sense.
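Before an incident, it is worth confirming that a responder account intended to be read-only really is. The sketch below runs a rough heuristic over an AWS-style IAM policy document: it flags any allowed action whose operation name does not start with a typical read verb. The prefix list is our assumption, not an official classification, and some "Get" operations (reading secrets, for example) are still sensitive, so treat the result as a starting point for review.

```python
# Assumed read-style verb prefixes; this is a heuristic, not an official list.
READ_PREFIXES = ("Get", "List", "Describe", "Lookup", "Filter")


def non_read_actions(policy):
    """Return allowed actions in an IAM-style policy that do not look read-only."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may not be wrapped in a list
        statements = [statements]
    flagged = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        for action in actions:
            # Actions look like "service:OperationName"; wildcards such as
            # "s3:*" have no read prefix and are therefore flagged as well.
            operation = action.split(":", 1)[-1]
            if not operation.startswith(READ_PREFIXES):
                flagged.append(action)
    return flagged


# Example responder policy with one action that does not belong.
responder_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "cloudtrail:LookupEvents",
                "logs:FilterLogEvents",
                "logs:GetLogEvents",
                "iam:UpdateAccessKey",
            ],
            "Resource": "*",
        }
    ],
}

print("Actions to review:", non_read_actions(responder_policy))
```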
Take advantage of the cloud’s advantages
Traditional IR was born in the first decade of this century, when operating systems were not designed with security in mind. This required investigators to rely upon evidence that was unintentionally left on the system.
With cloud solutions, there is a baseline of data there, waiting to be investigated. When you’re examining an AWS compromise, for instance, that investigation relies almost entirely on logs. Usually, you’re not doing the kind of digital forensics that involves parsing files to find out how the compromise took place. This is because threat actors, like any user, are limited in which actions they can take in a cloud environment. And almost all of those actions are in the logs. Thus, the investigation relies on a data source that is relatively complete and easy to parse.
Compare this to on-premises IR where the evidence may be spotty and in varying formats that require specific parsing—work that can require days, if not weeks.
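To show how far structured logs get you, here is a minimal sketch that builds a timeline for one compromised credential. It assumes CloudTrail-style records with `eventTime`, `eventSource`, `eventName`, `sourceIPAddress` and `userIdentity.accessKeyId` fields. The sample records are invented, the access key is AWS's documentation placeholder, and the IP address comes from a range reserved for documentation.

```python
def actor_timeline(events, access_key_id):
    """Return one actor's events as (time, source, name, ip) tuples, oldest first."""
    matching = [
        event
        for event in events
        if event.get("userIdentity", {}).get("accessKeyId") == access_key_id
    ]
    # Uniformly formatted ISO 8601 UTC timestamps sort correctly as strings.
    matching.sort(key=lambda event: event["eventTime"])
    return [
        (
            event["eventTime"],
            event["eventSource"],
            event["eventName"],
            event.get("sourceIPAddress"),
        )
        for event in matching
    ]


SUSPECT_KEY = "AKIAIOSFODNN7EXAMPLE"  # documentation placeholder, not a real key

# Invented records; real exports contain many more fields per event.
exported_events = [
    {
        "eventTime": "2024-01-01T10:07:00Z",
        "eventSource": "iam.amazonaws.com",
        "eventName": "CreateUser",
        "sourceIPAddress": "203.0.113.10",
        "userIdentity": {"accessKeyId": SUSPECT_KEY},
    },
    {
        "eventTime": "2024-01-01T10:06:00Z",
        "eventSource": "s3.amazonaws.com",
        "eventName": "PutObject",
        "sourceIPAddress": "198.51.100.7",
        "userIdentity": {"accessKeyId": "AKIAI44QH8DHBEXAMPLE"},
    },
    {
        "eventTime": "2024-01-01T10:02:00Z",
        "eventSource": "s3.amazonaws.com",
        "eventName": "ListBuckets",
        "sourceIPAddress": "203.0.113.10",
        "userIdentity": {"accessKeyId": SUSPECT_KEY},
    },
]

for entry in actor_timeline(exported_events, SUSPECT_KEY):
    print(*entry)
```

Note that the first event in this timeline is a read action, which only appears if read events were logged in the first place.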
Sooner or later, you’ll suffer a compromise, and the time and resources you save by having prepared your cloud for IR will more than repay the effort. And the regret that comes from not taking these steps may last longer than any incident.