
Presented by Capital One
Data security remains one of the least mature areas in enterprise cybersecurity. According to IBM, 35% of breaches in 2025 involved unmanaged data sources, or “shadow data,” which reveals a systemic lack of basic information awareness. It is not for lack of tools or investment: many organizations still struggle with the most fundamental questions. What data do we have? Where does it live? How does it move? And who is responsible for it?
In an increasingly complex ecosystem of data sources, cloud platforms, SaaS applications, APIs and AI models, answering these questions becomes even more difficult. Closing the data security maturity gap requires a cultural change in which security is no longer an afterthought. Instead, protection is embedded throughout the entire data lifecycle, built on a robust inventory, clear classification, and scalable mechanisms that translate policy into automated safeguards.
Visibility as the foundation
The most persistent barrier to information security maturity is basic visibility. Organizations often focus on how much data they store, but not what that data contains. Does it contain personally identifiable information (PII)? Financial information? Health information? Intellectual property? Without this level of understanding and inventory, it is more difficult to implement meaningful protection.
This can be addressed by adopting enterprise capabilities that detect sensitive data across large and diverse environments. Detection should be integrated into day-to-day workflows, data should be deleted when it is no longer needed, and applications should enforce data security against a well-defined policy.
Mature organizations approach information security as an “understand your environment” challenge: take inventory, classify what is in the ecosystem, and align protections with classification at scale rather than relying solely on perimeter controls or point solutions.
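The inventory-and-classify step described above can be sketched in code. This is a minimal illustration, not a production detection engine: the pattern names and regular expressions are assumptions for the example, and real classification engines use far richer signals (checksums, context, ML models).

```python
import re

# Illustrative sensitivity patterns; labels and regexes are examples only.
PATTERNS = {
    "PII_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PII_EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "FINANCIAL_CARD": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def classify(text: str) -> set[str]:
    """Return the set of sensitivity labels detected in a text field."""
    return {label for label, pattern in PATTERNS.items() if pattern.search(text)}

# A freeform comment field can carry sensitive values the schema never declared.
record = {"comment": "Call me at alice@example.com re: card 4111 1111 1111 1111"}
labels = classify(record["comment"])
```

Once every field carries labels like these, downstream protections can be driven by classification rather than by where the data happens to sit.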
Protecting chaotic data
One reason data security lags behind other security domains is that data itself is chaotic. Unlike perimeter security, which relies on defined ports and borders, data is largely unpredictable: the same underlying data can appear in many different formats, including structured databases, unstructured documents, conversation transcripts, and analytical pipelines. Each may apply a slightly different encoding or transformation that introduces unexpected, often undetectable changes to the data itself.
Human behavior exacerbates the problem, introducing risks in ways that perimeter controls simply cannot anticipate: a credit card number copied into a freeform comment field, a spreadsheet emailed beyond its intended audience, or a dataset repurposed for a new workflow.
When protection is bolted on at the end of the workflow, organizations create blind spots: they rely on downstream inspection to catch upstream design flaws. Complexity accumulates over time, and exposure becomes a question of when, not if.
A more resilient model assumes that sensitive data will be exposed in unexpected places and formats, so protection is built in from the moment the data is captured. Defense in depth becomes a design principle: segmentation, encryption at rest and in transit, tokenization, and layered access control.
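One of the layered safeguards named above, tokenization, can be illustrated with a minimal sketch. The `TokenVault` class and its in-memory store are assumptions for this example; production systems use a hardened, audited tokenization service rather than a Python dictionary.

```python
import secrets

class TokenVault:
    """Minimal tokenization sketch: sensitive values are swapped for random
    tokens, and the real value lives only in a protected store."""

    def __init__(self) -> None:
        self._vault: dict[str, str] = {}

    def tokenize(self, value: str) -> str:
        # Random token carries no information about the original value.
        token = "tok_" + secrets.token_hex(8)
        self._vault[token] = value
        return token

    def detokenize(self, token: str) -> str:
        # Only services authorized to reach the vault can reverse a token.
        return self._vault[token]

vault = TokenVault()
token = vault.tokenize("4111-1111-1111-1111")
# Downstream analytics and logs see only the token; the real card number
# is recoverable solely through the vault.
```

Because the token travels with the record through processing and analytics while the raw value stays in one controlled place, the safeguard follows the data rather than the perimeter.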
Critically, these safeguards travel with the data through its lifecycle, from ingestion to processing, analytics, and publication. Instead of trying to enforce order, organizations design for chaos: they accept variability as a given and build systems that remain secure even when data deviates from expectations.
Scaling governance with automation
Data security is operationally sustainable when governance is automated at the source. Combined with clear expectations, this creates bounded contexts: teams understand what is allowed, under what conditions, and with what safeguards data can be used effectively.
This is more important today than ever. AI systems often require access to large amounts of data across domains, which makes policy enforcement particularly difficult. Doing it effectively and safely requires deep understanding, strong governance policies, and automated protection.
Security techniques such as synthetic data and tokenization allow organizations to preserve analytic context while making sensitive values unreadable. Policy-as-code, APIs, and automation can handle tokenization, deletion, retention limits, and dynamic access controls. With guardrails built into the platforms they use, engineers can focus on innovating with more data and securely improving business outcomes.
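The policy-as-code idea can be sketched as a simple lookup from classification to required safeguards, so pipelines apply protections automatically instead of relying on manual review. The labels, safeguard names, and retention periods below are illustrative assumptions, not a real policy.

```python
# Illustrative policy-as-code table: each classification maps to the
# safeguards a pipeline must apply. Values here are examples only.
POLICY = {
    "PII": {"tokenize": True, "retention_days": 365, "encrypt_at_rest": True},
    "FINANCIAL": {"tokenize": True, "retention_days": 2555, "encrypt_at_rest": True},
    "PUBLIC": {"tokenize": False, "retention_days": None, "encrypt_at_rest": False},
}

def required_safeguards(classification: str) -> dict:
    """Fail closed: an unknown classification gets the strictest policy."""
    return POLICY.get(classification, POLICY["PII"])
```

Expressing policy as data like this means retention limits and tokenization rules change in one place and take effect everywhere the table is consulted, which is what lets governance scale with automation.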
AI systems must also operate within the same governance and monitoring expectations as human workflows. Permissions, telemetry, and controls over what models can access matter, along with controls over what they can publish. Governance will always introduce some friction; the goal is to understand, navigate, and increasingly automate it. Purpose validation, use case registration, and dynamic provisioning of access based on role and need should be clear, repeatable processes.
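Purpose validation and use case registration can also be expressed as a repeatable check rather than a manual approval. This is a minimal sketch under stated assumptions: the roles, use cases, and registry contents are invented for illustration, and a real system would back the registry with an approval workflow and audit trail.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AccessRequest:
    role: str
    use_case: str
    classification: str

# Illustrative registry: (role, registered use case) -> classifications
# that the use case was approved to touch.
REGISTERED_USE_CASES = {
    ("fraud-analyst", "fraud-model-training"): {"PII", "FINANCIAL"},
    ("marketing", "campaign-analytics"): {"PUBLIC"},
}

def is_allowed(req: AccessRequest) -> bool:
    """Grant access only for a registered (role, use case) pair whose
    approval covers the requested data classification."""
    allowed = REGISTERED_USE_CASES.get((req.role, req.use_case), set())
    return req.classification in allowed
```

The same check applies whether the requester is a person or a model pipeline, which is how AI workloads can be held to the same governance expectations as human workflows.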
At enterprise scale, this requires centralized capabilities that enforce cybersecurity policy in the data domain: detection and classification engines, tokenization and detokenization services, and ownership and taxonomy mechanisms that carry retention enforcement and risk management expectations into day-to-day execution.
When done well, governance becomes an enabler rather than a bottleneck. Metadata and classification automatically drive protection decisions while accelerating business discovery and use. Data is protected throughout its lifecycle with strong safeguards such as tokenization, and deleted when required by regulation or internal policy. Teams shouldn’t need to manually touch the data for every control decision; policy is enforced by design.
Building for the future
Simply put, bridging the information security maturity gap is less about adopting a single breakthrough technology than about operational discipline. Build the map. Classify what you have. Embed security into workflows so it can be replicated at scale.
Three priorities stand out for business leaders looking for measurable progress over the next 18 to 24 months.
First, create a robust inventory and a metadata-rich map of the data ecosystem; visibility is non-negotiable. Second, define clear, actionable policy expectations for each category, clarifying what protection each classification requires. And finally, invest in scalable, automated protection mechanisms that integrate directly into development and data workflows.
As protection moves from reactive, bolt-on controls to proactive, built-in safeguards, compliance is simplified, control is strengthened, and AI training can proceed without compromising rigor.
Learn more about how Capital One Databolt, an enterprise data security solution from Capital One Software, can help your business become AI-ready by securing sensitive data at scale.
Andrew Seaton is Vice President, Data Engineering – Enterprise Data Detection & Protection, Capital One.
Sponsored articles are content produced by a company that has paid for the post or has a business relationship with VentureBeat, and they are always clearly marked. For more information, contact sales@venturebeat.com.




