Aim for a modern data security approach
Risk, compliance, governance, and security professionals are finally realizing the importance of subjecting sensitive workloads to robust data governance and protection the moment the data begins traversing the data pipeline. Many organizations no longer feel it’s adequate to secure data only once it “settles” in a cloud data warehouse, and know that they must begin safeguarding data at source systems, data transformation subsystems, and analytical stores.
Why current data security approaches are falling short
Most enterprise organizations store their data workloads in the cloud. Unfortunately, this is where the modernization ends for many, as most still rely on traditional data security methods that were built for on-premises environments, where data sources were both small and manageable.
Yet the modern data stack comprises an ever-growing number of data sources, data consumers, and use cases. Securing data is a primary concern for data and infosec teams because information is no longer limited to what trusted data consumers generated or what sat behind the organization’s own firewall.
Moreover, modern architectures combine centralized data infrastructures, disaggregated compute engines, and distributed applications, which makes older approaches to data security untenable. Like the modern data architecture itself, a modern data security approach must be flexible, scalable, and able to support numerous hybrid data ecosystems so that consumers can use multiple data consumption approaches.
Then there are interoperability factors: databases, files, events, and APIs may be offered by different vendors, each with its own style and approach to data security. When data is handed off at various points via ELT (extract, load, transform) or ETL (extract, transform, load), each system applies its own security posture, which is often neither used nor known by the others.
Complicating matters further is the lack of common security standards, as traditional products are designed for either operational or analytical data stores and may be unable to interoperate with one another.
Hence, organizations are now rethinking data security by examining the numerous layers of their legacy data stack and determining interoperability, scalability, and security needs without applying any pre-existing assumptions.
The need to implement flexible and scalable data security before data lands in the cloud data warehouse is forcing many data teams to adopt a “shift left” approach to data security where data is safeguarded early in its journey from the source system.
“Shift left” data security defined
The “shift left” idea originated in software engineering, where monitoring and testing are moved earlier in the software development lifecycle. “Shift left” data security applies the same idea to data: it addresses potential data security issues sooner in the data journey. “Shift left” data governance allows policies to be attached to data workloads as soon as they leave source systems and to remain attached all the way to the cloud and to data consumers.
By identifying, preventing, and tackling data governance and security issues earlier, to the left of the cloud data warehouse, teams can take the strong access governance and security capabilities already available on cloud data warehouses and extend them back to data as it leaves source systems. It also lets data users ensure that the proper policies are attached and applied while data is in motion and at rest.
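To make the idea of policies that stay attached to data more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not any vendor's implementation: the names ColumnPolicy, GovernedBatch, transform, and read are hypothetical, and masking with "***" stands in for whatever protection a real platform would apply.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, FrozenSet, List

# All names below are hypothetical and exist only for illustration.

@dataclass(frozen=True)
class ColumnPolicy:
    column: str
    classification: str            # e.g. "pii" or "public"
    allowed_roles: FrozenSet[str]  # roles that may see raw values


@dataclass
class GovernedBatch:
    rows: List[Dict[str, object]]
    policies: Dict[str, ColumnPolicy] = field(default_factory=dict)

    def transform(self, fn: Callable[[Dict[str, object]], Dict[str, object]]) -> "GovernedBatch":
        """Apply a row-level transform; the policies are copied to the result."""
        return GovernedBatch([fn(row) for row in self.rows], dict(self.policies))

    def read(self, role: str) -> List[Dict[str, object]]:
        """Return rows, masking any column the role is not allowed to see."""
        masked_rows = []
        for row in self.rows:
            out = {}
            for col, value in row.items():
                policy = self.policies.get(col)
                if policy is not None and role not in policy.allowed_roles:
                    out[col] = "***"
                else:
                    out[col] = value
            masked_rows.append(out)
        return masked_rows


# Policy is attached at the source system...
source = GovernedBatch(
    rows=[{"email": "a@example.com", "country": "DE"}],
    policies={"email": ColumnPolicy("email", "pii", frozenset({"privacy_officer"}))},
)

# ...and is still in force after a pipeline step has reshaped the data.
loaded = source.transform(lambda r: {**r, "country": str(r["country"]).lower()})
analyst_view = loaded.read("analyst")
officer_view = loaded.read("privacy_officer")
```

In this sketch the analyst sees a masked email column while the privacy officer sees the raw value, even though the policy was declared only once, at the source.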
How to “shift left” data security
“Shift left” data security has two important components: expanding data observability and establishing comprehensive data governance.
Beginning with data observability, a “shift left” implementation requires that data security be addressed before any application is put into production. Instead of confining data observability to data quality or data reliability, security needs to become another use case of the underlying data and be unified with the rest of the data observability subsystem. Data security then benefits from the alerts and notifications that data observability offerings already produce.
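One way to picture security as another observability use case is a check that runs next to the usual data quality checks and feeds the same alert stream. The following Python sketch assumes a hypothetical Alert type and two hypothetical checks; the email pattern is deliberately simple and is not a complete detector for sensitive data.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical names for illustration; not a real observability product's API.

@dataclass
class Alert:
    check: str
    message: str


Rows = List[Dict[str, object]]
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")  # simplistic on purpose


def null_rate_check(rows: Rows) -> List[Alert]:
    """Data quality: alert when a column contains missing values."""
    alerts = []
    for col in sorted({c for r in rows for c in r}):
        missing = sum(1 for r in rows if r.get(col) is None)
        if missing:
            alerts.append(Alert("null_rate", f"{col}: {missing} of {len(rows)} values missing"))
    return alerts


def unmasked_email_check(rows: Rows) -> List[Alert]:
    """Data security: alert when a column holds raw email addresses."""
    alerts = []
    for col in sorted({c for r in rows for c in r}):
        values = [r.get(col) for r in rows]
        if any(isinstance(v, str) and EMAIL_RE.fullmatch(v) for v in values):
            alerts.append(Alert("unmasked_email", f"{col}: raw email addresses found"))
    return alerts


def run_checks(rows: Rows, checks: List[Callable[[Rows], List[Alert]]]) -> List[Alert]:
    """Run quality and security checks through one shared alerting path."""
    alerts: List[Alert] = []
    for check in checks:
        alerts.extend(check(rows))
    return alerts


sample = [
    {"user": "a@example.com", "plan": "pro"},
    {"user": "b@example.com", "plan": None},
]
alerts = run_checks(sample, [null_rate_check, unmasked_email_check])
```

Because both checks return the same Alert type, whatever notification channel already handles quality alerts would handle the security alert too.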
Data governance platform capabilities typically include business glossaries, catalogs, and data lineage. They also leverage metadata to accelerate and govern analytics. In “shift left” data governance, the same metadata is augmented with data security policies and user access rights to further increase trust and allow appropriate users to access data. Establishing comprehensive data observability and governance is the key to data democratization. These proactive and transparent views of the security of critical data elements also accelerate application development and improve productivity.
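As a rough illustration of governance metadata augmented with security policies and access rights, the Python sketch below adds a classification and a set of granted roles to a catalog entry, then uses lineage so that a downstream dataset inherits the most restrictive classification of its sources. The CatalogEntry fields, the three sensitivity levels, and the dataset names are assumptions made for the example, and the lineage graph is assumed to be acyclic.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

# Hypothetical catalog model for illustration only.
SENSITIVITY = {"public": 0, "internal": 1, "restricted": 2}


@dataclass
class CatalogEntry:
    name: str
    glossary_term: str                                     # business glossary link
    upstream: List[str] = field(default_factory=list)      # data lineage
    classification: str = "public"                         # security policy
    granted_roles: Set[str] = field(default_factory=set)   # user access rights


def effective_classification(name: str, catalog: Dict[str, CatalogEntry]) -> str:
    """Most restrictive classification of the entry and all of its upstream sources."""
    entry = catalog[name]
    levels = [entry.classification]
    for parent in entry.upstream:
        levels.append(effective_classification(parent, catalog))
    return max(levels, key=SENSITIVITY.__getitem__)


def can_access(role: str, name: str, catalog: Dict[str, CatalogEntry]) -> bool:
    """Restricted data needs an explicit grant; other data is open in this sketch."""
    if effective_classification(name, catalog) == "restricted":
        return role in catalog[name].granted_roles
    return True


catalog = {
    "crm.customers": CatalogEntry(
        "crm.customers", "Customer",
        classification="restricted", granted_roles={"support"},
    ),
    "dw.customer_summary": CatalogEntry(
        "dw.customer_summary", "Customer",
        upstream=["crm.customers"], granted_roles={"finance"},
    ),
}

summary_level = effective_classification("dw.customer_summary", catalog)
finance_ok = can_access("finance", "dw.customer_summary", catalog)
marketing_ok = can_access("marketing", "dw.customer_summary", catalog)
```

Here the summary table is never labeled by hand, yet it is treated as restricted because its lineage leads back to a restricted source, so only roles with an explicit grant on it can read it.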
The “shift left” approach for data management is the new north star for data quality, observability, and now data security.
Sensitive and regulated data that is left unprotected before it reaches the cloud data warehouse is at high risk of exposure. The concept of data mesh, and initiatives such as data products, are pushing accountability for data to the business domain teams that reside on the left. By applying these “shift left” principles, organizations can strengthen their security and governance, work toward complete regulatory compliance, and give non-technical users faster access to operational data via intuitive self-service.