Large UK companies are finding themselves in an uncomfortable position as artificial intelligence moves from a promising technology to a routine business tool: many senior technology and data leaders say they do not fully understand what happens to their data once it leaves the UK, particularly when it is used to train or run AI systems overseas.
A new survey of senior technology and data executives points to a gap that is both operational and strategic. While organisations may have policies on paper—covering privacy, security, retention and consent—the survey suggests that day-to-day visibility into cross-border data handling is often weaker than expected. The result is a mismatch between how confidently companies talk about governance and how clearly they can explain the real-world journey of information across jurisdictions.
This matters because AI is rarely “contained.” Even when a company’s core systems are hosted domestically, AI workflows frequently involve external vendors, cloud platforms, model providers, analytics services and third-party integrations. Each handoff introduces uncertainty: where exactly does the data go, who can access it, how long is it retained, and what transformations occur before it returns as an output? For many organisations, the survey indicates that these questions are not consistently answered with the level of detail that risk teams, regulators and customers increasingly expect.
The survey’s central finding is straightforward but significant: a substantial number of respondents reported lacking a clear understanding of how information is stored, processed and used outside the UK. That lack of clarity is not limited to junior staff or isolated teams. It appears among senior technology and data executives—people who, in theory, should have oversight of architecture, vendor contracts, data lineage and compliance controls.
So why does this happen in large organisations that invest heavily in governance?
One reason is that AI supply chains are complex by design. Modern AI systems are built from layers: data ingestion pipelines, feature engineering, storage layers, orchestration tools, model training environments, inference endpoints, monitoring dashboards and logging systems. Data can be copied for redundancy, cached for performance, replicated for disaster recovery, and routed through multiple services. Even if a company intends to keep data within the UK, the underlying infrastructure may still involve global components—especially when using widely adopted cloud services or managed AI platforms.
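To make this concrete, here is a minimal sketch in Python, with entirely hypothetical component names and regions, of what mapping pipeline layers to their actual hosting locations might reveal about a nominally UK-hosted system:

```python
# Hypothetical inventory: each pipeline layer mapped to where it actually runs.
# Component names and regions are illustrative, not drawn from any real deployment.
PIPELINE_REGIONS = {
    "ingestion":       ["uk-south"],
    "feature_store":   ["uk-south"],
    "training":        ["uk-south"],
    "inference":       ["uk-south", "eu-west"],  # failover endpoint
    "monitoring_logs": ["us-east"],              # vendor's central logging
    "backups":         ["uk-south", "us-west"],  # cross-region disaster recovery
}

def non_uk_exposure(regions_by_component: dict[str, list[str]]) -> dict[str, list[str]]:
    """Return the components whose data leaves UK regions, and where it goes."""
    return {
        component: [r for r in regions if not r.startswith("uk-")]
        for component, regions in regions_by_component.items()
        if any(not r.startswith("uk-") for r in regions)
    }

for component, regions in non_uk_exposure(PIPELINE_REGIONS).items():
    print(f"{component}: data also present in {', '.join(regions)}")
```

Even this toy version surfaces the pattern the survey describes: the core workloads stay domestic, while the operational plumbing quietly does not.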
Another reason is that “data use” is broader than many people assume. When executives think about cross-border transfers, they often focus on the moment data is sent to a vendor. But AI usage can involve additional processing steps that are less visible: preprocessing and normalisation, embedding generation, tokenisation, indexing, evaluation, and ongoing model improvement. Some of these steps may occur automatically in ways that are not obvious to the business owner of the dataset.
There is also the question of what counts as “the data.” In AI contexts, organisations may treat raw personal data as the primary concern, but AI systems frequently work with derived representations—such as embeddings, summaries, classifications and other intermediate outputs. These derivatives can still be linked back to individuals depending on the context and the technical setup. If a company cannot map where those derivatives are created and stored, it may underestimate the extent of cross-border exposure.
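A toy illustration of that linkage risk follows, assuming a crude stand-in for a real embedding model: once a derivative is indexed against a subject identifier, deleting the raw record does not break the link.

```python
import math
from collections import Counter

def toy_embed(text: str) -> Counter:
    """A stand-in for a real embedding model: character-trigram counts.
    Real embeddings are dense vectors, but the linkage risk is the same."""
    t = text.lower()
    return Counter(t[i:i + 3] for i in range(len(t) - 2))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Index derived vectors keyed by subject ID, then delete the raw inputs.
raw_records = {"subject-17": "Jane Doe, 12 High Street, complained about billing"}
vector_index = {sid: toy_embed(text) for sid, text in raw_records.items()}
raw_records.clear()  # the "primary" personal data is gone...

# ...but a similar query still resolves to the individual via the derivative.
query = toy_embed("billing complaint from Jane Doe at 12 High Street")
best_match = max(vector_index, key=lambda sid: cosine(query, vector_index[sid]))
print(best_match)  # -> subject-17: the embedding alone re-links to the person
```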
The survey’s findings raise a further issue: confidence without clarity. Many organisations have mature compliance frameworks for traditional data processing—think customer databases, marketing lists, HR records and transactional logs. Yet AI introduces new processing patterns that can outpace existing governance. A company might know that it has a lawful basis for processing personal data, but still struggle to answer more granular questions such as whether data is used for training by a third party, whether it is retained for debugging, whether it is accessible to support teams abroad, or whether it is used to improve models across customers.
In other words, legal compliance is not the same as operational transparency. A contract may state that data will not be used for training, but the organisation may not have the technical evidence to verify how the system behaves in practice. Conversely, a contract may allow certain uses, but the company may not have internal visibility into the exact scope of those uses. Both scenarios create risk: regulatory, reputational, or both.
For UK businesses, the stakes are heightened by the expectations around international data transfers. Cross-border data flows are not merely a technical matter; they are a governance and accountability matter. Regulators and customers increasingly want to see that organisations can demonstrate control: that they understand where data goes, how it is protected, and how they manage third parties. When senior executives report limited understanding, it suggests that accountability may be distributed in ways that are hard to audit.
This is where the survey becomes more than a snapshot of confusion. It points to a structural challenge in how organisations manage AI governance. Many companies treat data governance as a set of local policies—what happens in the UK, what happens in the company’s own systems, what happens under domestic oversight. But AI governance is inherently global. Even if the company’s decision-making is UK-based, the infrastructure and service ecosystem may be international. That means governance must be designed to follow the data wherever it travels, not just where it originates.
What makes the current moment distinctive is that AI adoption is outpacing governance maturity. Organisations are under pressure to deliver AI-driven productivity, customer service improvements, fraud detection enhancements, and automation benefits. Procurement teams may move quickly to secure vendor capabilities. Product teams may integrate AI features into customer-facing workflows. Meanwhile, governance teams may be asked to “catch up” after the fact—reviewing contracts, assessing risks, and trying to retrofit controls into architectures that are already live.
This creates a predictable pattern: governance becomes reactive rather than proactive. Instead of mapping data flows before deployment, companies may discover cross-border processing only after integration. Instead of validating vendor behaviour with technical tests, they may rely on documentation that is difficult to interpret or verify. Instead of maintaining a living inventory of data processing activities, they may have static records that become outdated as systems evolve.
The survey suggests that even senior executives can be caught in this cycle. That does not necessarily mean they are unaware of the existence of cross-border transfers. It means they may not have enough detail to be confident about the full lifecycle of data in AI contexts.
What does “lack of understanding” look like in practice?
It can show up as uncertainty about storage locations and replication practices. Data may be stored in one region but replicated elsewhere for resilience. Logs may be shipped to a central monitoring system located abroad. Backups may be encrypted and stored in different geographies. Support tooling may require access to data for troubleshooting. Without a clear map of these mechanisms, executives may know that “it’s handled securely,” but not know precisely where and how.
It can also show up as uncertainty about processing purposes. AI vendors may offer multiple modes: some configurations are designed for strict isolation, while others allow broader processing for service improvement. Even when training is disabled, vendors may still process data for operational reasons such as performance monitoring, quality assurance, and incident response. Those operational uses can still involve cross-border access.
A third form of uncertainty concerns retention. Data retention policies are often described at a high level—“we retain for X days”—but AI systems can retain data longer than expected due to caching, indexing, or model evaluation workflows. Derived data may persist even after raw inputs are deleted. If retention is not tracked across all layers, the organisation may not know what remains accessible and for how long.
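A short sketch shows why per-layer tracking matters. The stores below are hypothetical, but the failure mode they illustrate, a deletion routine that only touches the primary database, is a common one:

```python
# Hypothetical layered stores; in practice each would be a separate system
# (database, cache, vector index, log pipeline, backup snapshots).
stores = {
    "primary_db":   {"rec-42": "raw customer text"},
    "cache":        {"rec-42": "raw customer text"},
    "vector_index": {"rec-42": [0.12, 0.98, 0.33]},  # derived embedding
    "debug_logs":   {"rec-42": "request payload captured for troubleshooting"},
    "backups":      {"rec-42": "raw customer text (snapshot)"},
}

def delete_record(record_id: str) -> None:
    """A naive deletion routine that only touches the primary store."""
    stores["primary_db"].pop(record_id, None)

def retention_audit(record_id: str) -> list[str]:
    """Report every layer in which a 'deleted' record still survives."""
    return [layer for layer, data in stores.items() if record_id in data]

delete_record("rec-42")
print(retention_audit("rec-42"))
# -> ['cache', 'vector_index', 'debug_logs', 'backups']
```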
Finally, lack of understanding can appear as uncertainty about downstream sharing. AI outputs can be fed into other systems: ticketing platforms, CRM tools, analytics pipelines, or human review workflows. If those downstream systems are hosted abroad or integrated with foreign services, the original data may effectively continue its journey beyond the initial vendor relationship.
The survey’s implications extend beyond compliance checklists. They touch on trust. Customers, employees and partners increasingly expect transparency about how their data is used, especially when AI is involved. If a company cannot explain cross-border handling clearly, it may struggle to respond to customer inquiries, procurement questionnaires, or regulatory investigations. Even when the company is acting responsibly, the inability to articulate details can undermine confidence.
There is also a competitive dimension. Organisations that can demonstrate robust AI data governance—clear data lineage, documented transfer mechanisms, verified vendor controls, and measurable safeguards—will likely find it easier to win enterprise deals. Large customers often require evidence, not assurances. They want to see how data is protected end-to-end, including when it is processed by AI systems and third parties.
So what can companies do to close the gap suggested by the survey?
First, they need a living inventory of AI-related data flows. Traditional data inventories often focus on structured datasets and known systems. AI workflows require a broader approach: identifying every point where data is ingested, transformed, stored, logged, embedded, and used for inference or training. This includes not only the main dataset but also derived artefacts and metadata.
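In code terms, one entry in such an inventory might look like the following sketch. The field names are illustrative rather than a standard schema; the point is that storage locations, derived artefacts and retention are tracked per flow, not assumed globally:

```python
from dataclasses import dataclass, field

@dataclass
class DataFlowRecord:
    """One entry in a living inventory of AI-related data flows."""
    name: str                     # e.g. "support tickets -> summarisation"
    source_system: str
    processing_steps: list[str]   # ingestion, embedding, inference, logging...
    storage_locations: list[str]  # every region where data or derivatives rest
    derived_artefacts: list[str]  # embeddings, summaries, eval sets, logs
    retention: dict[str, str]     # per-layer retention, not one global figure
    third_parties: list[str] = field(default_factory=list)
    used_for_training: bool = False
    last_reviewed: str = ""       # inventories go stale; track review dates

flow = DataFlowRecord(
    name="support tickets -> summarisation",
    source_system="crm",
    processing_steps=["ingestion", "embedding", "inference", "logging"],
    storage_locations=["uk-south", "us-east"],
    derived_artefacts=["embeddings", "summaries", "debug logs"],
    retention={"raw": "30 days", "embeddings": "indefinite", "logs": "90 days"},
    third_parties=["model-provider-x"],
)
```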
Second, they need to treat vendor documentation as a starting point, not the finish line. Contracts and policy statements should be complemented with technical validation. Where possible, organisations should test whether data is used for training, confirm retention behaviour, and verify access patterns. This can include reviewing vendor logs, conducting controlled experiments, and requiring transparency reports or audit artefacts.
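One way to run such a controlled experiment is a “canary” probe: submit a unique marker through the vendor's interface, then check later whether it can be elicited. The sketch below uses a stub client standing in for whatever SDK the vendor actually provides; `complete` is a hypothetical method name:

```python
import uuid

class StubClient:
    """Stand-in for a real vendor SDK, so the sketch runs end to end."""
    def complete(self, prompt: str) -> str:
        return "stub response"  # a real client would return model output

def run_canary_test(vendor_client) -> dict:
    """Send a unique marker through the vendor, then probe for retention."""
    marker = f"CANARY-{uuid.uuid4().hex}"  # globally unique, never reused

    # Step 1: submit the marker as ordinary-looking input.
    vendor_client.complete(f"Summarise this ticket: customer ref {marker}")

    # Step 2: later (hours or days in practice), probe whether it resurfaces.
    probe = vendor_client.complete(f"What do you know about {marker}?")
    return {"marker": marker, "resurfaced": marker in probe}

print(run_canary_test(StubClient()))
```

A marker that resurfaces is strong evidence of retention or cross-customer model improvement; a marker that does not is not proof of the opposite. Canary results should therefore complement contractual and audit evidence, not replace it.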
Third, governance should be integrated into procurement and deployment processes. If AI governance is bolted on after integration, uncertainty will persist. Instead, organisations should embed data transfer and AI-specific risk assessments into vendor selection and architecture design. That means asking detailed questions early: where data is stored, how it is processed, whether it is used for model improvement, what support access looks like, and how deletion works across all layers.
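A lightweight procurement gate can enforce that discipline. The sketch below encodes those questions as required fields and reports any that a vendor assessment leaves unanswered; the field names are illustrative:

```python
REQUIRED_ANSWERS = [
    "storage_locations",    # where data is stored, including replicas and backups
    "processing_purposes",  # inference only, or service improvement too?
    "training_use",         # is customer data used to improve models?
    "support_access",       # who can see data during troubleshooting, and from where?
    "deletion_scope",       # does deletion cover caches, indexes, logs and backups?
]

def procurement_gate(vendor_answers: dict[str, str]) -> list[str]:
    """Return the unanswered questions; an empty list means the assessment
    is complete enough to proceed to architecture review."""
    return [q for q in REQUIRED_ANSWERS if not vendor_answers.get(q)]

gaps = procurement_gate({
    "storage_locations": "uk-south, with us-east replicas",
    "training_use": "disabled by contract",
})
print(gaps)  # -> ['processing_purposes', 'support_access', 'deletion_scope']
```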
Fourth, companies should strengthen internal ownership. The survey indicates that even senior executives may not have full visibility. That suggests a need for clearer accountability across functions: technology teams for architecture and data lineage, legal and compliance teams for transfer mechanisms and contractual terms, security teams for controls and monitoring, and product teams for how AI features are used in practice. When responsibilities are blurred, knowledge becomes fragmented.
Fifth, organisations should build governance that follows the data internationally. This is the most important conceptual shift. Data does not stop at the border, and governance designed only around domestic systems and local policies will always be one step behind it.
