
AI vs Bad Data: Rebuilding Trust in Financial Crime Compliance

Writer: Elizabeth Travis

For over two decades, the financial services industry has invested heavily in the architecture of compliance. Transaction monitoring systems, sanctions screening engines, customer risk models and suspicious activity reporting workflows have been built, tested, upgraded and rebuilt. Regulatory expectations have sharpened. Enforcement penalties have escalated. In 2025 alone, anti-money laundering (AML)-related fines exceeded $1.1 billion in the US, with crypto exchanges, money transmitters and securities firms bearing the heaviest regulatory consequences, according to the Institute for Financial Integrity (IFI). The machinery of financial crime prevention has never been more extensive.


It has also never been more fragile. The single greatest vulnerability in that machinery is not a gap in regulation or a shortage of technology. It is the quality of the data on which everything else depends. Poor data corrupts risk models, weakens controls, generates false positives at industrial scale and creates operational blind spots that criminals are adept at exploiting.


The compliance industry has spent billions on systems designed to detect financial crime. Those systems are only as reliable as the information they consume. When the data is wrong, the controls do not fail gracefully. They fail silently.


Bad data is the sector’s most persistent vulnerability


The problem is not new, but its consequences are deepening. Financial institutions operate with data drawn from multiple sources: customer onboarding records, payment messages, third-party screening databases, beneficial ownership registers and regulatory watchlists. Each source carries its own inconsistencies. Names are transliterated differently across jurisdictions. Addresses are incomplete, outdated or formatted in ways that defeat automated matching. Dates of birth are missing or unverifiable. Entity identifiers are absent or duplicated. The result is a compliance environment built on foundations that are structurally unreliable.


The effects are measurable. Research consistently shows that financial institutions report false-positive rates exceeding 95 per cent when using traditional sanctions screening tools, according to Sanctions.io. A study by the Bank Policy Institute (BPI) found that name-based screening typically yields zero true matches while overwhelming compliance teams with alerts that require manual review. These are not marginal inefficiencies. They represent a systemic failure to distinguish signal from noise, one that drains investigative capacity, delays legitimate transactions and allows genuine risks to pass undetected beneath the volume of irrelevant alerts.
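

The failure mode is easy to demonstrate. The sketch below, with an invented watchlist and threshold (not any screening vendor's actual algorithm), shows name-only fuzzy matching doing exactly what it is built to do: a character-similarity score that catches a transliteration variant of a listed party will, by the same logic, fire on an unrelated person whose name merely looks similar.

```python
from difflib import SequenceMatcher

# Toy watchlist and threshold, invented purely for illustration.
WATCHLIST = ["Mohammed Al-Rashid", "Viktor Petrov", "Jian Wei Chen"]

def name_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]: no identifiers, no context."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def screen(name: str, threshold: float = 0.75) -> list[tuple[str, float]]:
    """Alert on every watchlist entry whose string shape resembles the input."""
    return [(entry, round(name_similarity(name, entry), 2))
            for entry in WATCHLIST
            if name_similarity(name, entry) >= threshold]

# The same score that catches a transliteration variant of a listed party...
print(screen("Mohamed Al Rasheed"))
# ...also fires on an unrelated person whose name merely looks similar.
print(screen("Viktoria Petrova"))
```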


The regulatory response has been to demand more data, not necessarily better data. The Financial Action Task Force’s (FATF) revised Recommendation 16 (R16), adopted in June 2025, introduces enhanced originator and beneficiary information requirements for cross-border payments, including geographic data, structured identifiers and alignment verification. The European Union’s (EU) Instant Payments Regulation, fully live since October 2025, requires real-time payee verification.


These are necessary reforms. They will also amplify the consequences of poor data quality for any institution that has not addressed its underlying data infrastructure. More fields in a payment message do not improve compliance if the information populating those fields is inaccurate, incomplete or unverifiable. Regulation cannot outrun bad data.


Rule-based systems amplify what they cannot correct


Traditional compliance systems operate on fixed rules and static thresholds. They are designed to flag transactions that match predetermined patterns: amounts above a certain value, transfers to high-risk jurisdictions, names that approximate an entry on a sanctions list. These systems do not interpret context. They do not learn from past decisions. When the data feeding these systems is inconsistent, the rules do not compensate; they compound the inconsistency.
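

Stripped to its essentials, such a control looks something like the sketch below. The rule values, field names and jurisdiction codes are invented for illustration, but the structure is representative: each rule fires independently, and a defective input simply propagates through it.

```python
from dataclasses import dataclass

HIGH_RISK_JURISDICTIONS = {"XX", "YY"}   # illustrative codes, not a real list
AMOUNT_THRESHOLD = 10_000                # fixed value, never recalibrated

@dataclass
class Payment:
    amount: float
    beneficiary_country: str
    beneficiary_name: str

def rule_based_alerts(payment: Payment, watchlist: set[str]) -> list[str]:
    """Static rules: each fires independently, with no notion of context
    and no memory of how often it has fired wrongly before."""
    alerts = []
    if payment.amount >= AMOUNT_THRESHOLD:
        alerts.append("AMOUNT_OVER_THRESHOLD")
    if payment.beneficiary_country in HIGH_RISK_JURISDICTIONS:
        alerts.append("HIGH_RISK_JURISDICTION")
    # The name rule inherits every upstream data defect verbatim: a
    # misspelling either slips past the list or collides with the wrong entry.
    if payment.beneficiary_name.strip().lower() in watchlist:
        alerts.append("WATCHLIST_NAME_MATCH")
    return alerts
```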


A misspelled beneficiary name triggers a sanctions alert. A truncated address prevents automated screening from completing. An absent date of birth forces a manual review that adds hours to a process designed to operate in seconds. The operational cost is significant, but the strategic cost is worse.


Compliance teams buried under false positives cannot allocate investigative attention to the cases that matter. Siloed systems, where AML, know-your-customer (KYC) and payment workflows operate on separate data sets, produce fragmented risk assessments that miss connections visible only in aggregate. Regulators now expect proof that controls work in practice, not merely that they exist on paper. A compliance programme that generates thousands of alerts but effectively investigates none is not a programme; it is an audit liability. Volume is not vigilance.


AI can strengthen data integrity, not just process volume


This is where artificial intelligence enters the conversation, and where precision matters. The promise of AI in financial crime compliance is not simply that it processes more data more quickly. It is that it can impose coherence on data that was previously incoherent. Machine learning models can evaluate dozens of variables simultaneously, distinguishing between names that are genuinely similar and names that share superficial characteristics. Natural language processing (NLP) can interpret unstructured data, extracting risk-relevant information from news articles, corporate filings and correspondence that rule-based systems cannot parse.
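

The difference is easiest to see in miniature. The sketch below combines name similarity with corroborating identifiers; the weights are invented, where a real model would learn them from labelled alert dispositions, but it shows why additional variables separate a genuine match from a superficial one.

```python
from difflib import SequenceMatcher

def feature_score(candidate: dict, watchlist_entry: dict) -> float:
    """Combine several weak signals instead of name shape alone. Weights
    are illustrative; a production model would learn them from labelled
    alert outcomes."""
    name_sim = SequenceMatcher(
        None, candidate["name"].lower(), watchlist_entry["name"].lower()
    ).ratio()
    dob_match = 1.0 if candidate.get("dob") == watchlist_entry.get("dob") else 0.0
    country_match = 1.0 if candidate.get("country") == watchlist_entry.get("country") else 0.0
    return 0.4 * name_sim + 0.4 * dob_match + 0.2 * country_match

entry = {"name": "Viktor Petrov", "dob": "1968-03-14", "country": "RU"}
same_person = {"name": "Viktor Petrov", "dob": "1968-03-14", "country": "RU"}
lookalike = {"name": "Viktoria Petrova", "dob": "1991-07-02", "country": "BG"}

print(feature_score(same_person, entry))  # high: corroborated on every signal
print(feature_score(lookalike, entry))    # low: similar string, nothing else matches
```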


The evidence is compelling. Research from the Federal Reserve Board, cited by KPMG in its Financial Crime Fall 2025 report, found that large language models reduced sanctions screening false positives by 92 per cent. Detection rates improved by 11 per cent compared to traditional fuzzy matching. The Sanction Scanner Financial Crime and Compliance Report for 2025–2026 noted that 73 per cent of financial institutions had implemented AI in fraud detection, up from 49 per cent in 2024.


Crucially, the value of AI in this context is not limited to detection. It extends to data remediation. AI-powered systems can identify inconsistencies across customer records, reconcile conflicting data points from multiple sources and flag records that require enrichment before they enter a screening workflow. This upstream function, cleaning and structuring data before it reaches the compliance engine, may prove more consequential than any downstream improvement in alert quality. The institutions that treat AI as a tool for data integrity rather than merely a faster screening engine will be the ones that achieve genuine compliance effectiveness.
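

The upstream pattern can be sketched as a simple triage step, with field names and rules invented for the example: complete records pass through, while gappy or conflicting ones are routed to enrichment and reconciliation instead of generating noise downstream.

```python
REQUIRED_FIELDS = ("full_name", "dob", "country", "entity_id")

def remediation_triage(records: list[dict]) -> dict:
    """Partition customer records before screening: complete records pass,
    incomplete ones go to enrichment, and identifier clashes are surfaced
    for reconciliation rather than silently overwritten."""
    clean, enrich, conflicts = [], [], []
    seen: dict[str, dict] = {}
    for rec in records:
        missing = [f for f in REQUIRED_FIELDS if not rec.get(f)]
        if missing:
            enrich.append({**rec, "missing": missing})
            continue
        prior = seen.get(rec["entity_id"])
        if prior and prior["dob"] != rec["dob"]:
            conflicts.append((prior, rec))  # same identifier, different person?
            continue
        seen[rec["entity_id"]] = rec
        clean.append(rec)
    return {"clean": clean, "enrich": enrich, "conflicts": conflicts}
```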


The hallucination problem is real but manageable


No honest assessment of AI in compliance can ignore the hallucination risk. Large language models (LLMs) generate plausible but incorrect outputs with disconcerting confidence. In compliance contexts, an AI system might fabricate regulatory references, misattribute enforcement actions or generate risk assessments grounded in data that does not exist. The Financial Industry Regulatory Authority’s (FINRA) 2026 Regulatory Oversight Report explicitly identified hallucinations, bias and cybersecurity risks as concerns that firms must address before deploying generative AI. The EU AI Act, which came progressively into force throughout 2025, imposes transparency, explainability and human oversight obligations on AI systems deployed in regulated financial services contexts.


The risk is real. It is also manageable, provided firms approach AI deployment with the same rigour they would apply to any other critical control. The safeguards are well understood. Retrieval-augmented generation (RAG) grounds AI outputs in verified reference data rather than relying on the model’s internal training data. Human-in-the-loop oversight ensures that no AI-generated compliance decision is acted upon without expert validation.
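

As an illustration of the RAG pattern rather than of any particular product, a grounded drafting step might look like the sketch below, where `retrieve_verified_references` and `generate` stand in for a firm's curated reference store and approved model call.

```python
def retrieve_verified_references(subject: str) -> list[str]:
    """Placeholder for the retrieval layer: a curated, versioned store of
    list data, regulatory texts and internal records; never the open web."""
    return ["[verified source passages would be returned here]"]  # stub

def draft_case_narrative(alert: dict, generate) -> dict:
    """Sketch of the RAG pattern: retrieve verified passages first, then ask
    the model to draft strictly from them. `generate` stands in for whatever
    approved LLM call a firm has validated."""
    sources = retrieve_verified_references(alert["subject"])
    prompt = (
        "Draft a case narrative using ONLY the numbered sources below. "
        "Cite a source for every factual claim, and write 'not in sources' "
        "rather than inferring anything they do not contain.\n\n"
        + "\n".join(f"[{i}] {s}" for i, s in enumerate(sources, 1))
    )
    draft = generate(prompt)
    # Human-in-the-loop: the output is a reviewable draft, never a decision.
    return {"draft": draft, "sources": sources, "status": "PENDING_ANALYST_REVIEW"}
```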


Explainability frameworks document the rationale behind every alert, escalation and disposition. Continuous model monitoring tracks performance drift, bias emergence and output accuracy over time.


These are not theoretical aspirations. They are operational requirements. Institutions that deploy AI without them are replacing one form of unreliable output with another.
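

To make one of them concrete: continuous monitoring can start as simply as a rolling check on analyst dispositions. The window size and precision floor below are invented policy values, not recommendations.

```python
from collections import deque

class DriftMonitor:
    """Rolling check on analyst dispositions for one model version. If the
    share of alerts analysts confirm falls below an agreed floor across a
    full window, the model is flagged for review rather than left to
    degrade silently."""

    def __init__(self, window: int = 500, precision_floor: float = 0.10):
        self.outcomes: deque[bool] = deque(maxlen=window)
        self.precision_floor = precision_floor

    def record(self, analyst_confirmed: bool) -> None:
        self.outcomes.append(analyst_confirmed)

    def precision(self) -> float:
        return sum(self.outcomes) / len(self.outcomes) if self.outcomes else 1.0

    def needs_review(self) -> bool:
        full_window = len(self.outcomes) == self.outcomes.maxlen
        return full_window and self.precision() < self.precision_floor
```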


The distinction that matters is between AI as an autonomous decision-maker and AI as an augmentation of human judgement. In financial crime compliance, the latter is the appropriate model. AI should surface, structure and prioritise information. The compliance professional should interpret, challenge and act on it. The EU AI Act’s insistence on explainability is not a bureaucratic constraint; it is an acknowledgement that decisions affecting individuals and institutions must be traceable to a reasoning process that a regulator can examine.


Governance is not an afterthought; it is the architecture


The firms that will extract genuine value from AI in compliance are those that treat governance as a foundational design principle rather than a retrospective control. This means establishing clear ownership of AI models within the three lines of defence. It means defining acceptable confidence thresholds below which human review is mandatory. It means conducting periodic audits that assess not only whether the AI is performing but whether it is performing for the right reasons.
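

What a confidence-threshold policy might look like in code, with cut-off values that are illustrative policy choices rather than model defaults:

```python
def route_alert(confidence: float) -> str:
    """Map model confidence to a handling tier. The cut-offs are governance
    decisions owned within the three lines of defence; the values here are
    purely illustrative."""
    if confidence >= 0.98:
        return "auto-disposition, subject to sampled second-line QA"
    if confidence >= 0.70:
        return "analyst review, with the model's rationale attached"
    return "full manual investigation; model output is advisory only"
```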


Guidehouse’s 2026 analysis of AI in financial services emphasised that governance complexity is now one of the primary barriers to responsible deployment. Too many voices slow approvals and dilute focus, while too few create accountability gaps that regulators will identify.


The governance framework must also address the data pipeline itself. AI models trained on poor data will reproduce the biases and errors embedded in that data with greater efficiency and at greater scale. Data quality governance, including lineage tracking, validation protocols and reconciliation between internal and external sources, is a precondition for effective AI deployment. Without clean inputs, no model is trustworthy.
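

A minimal sketch of such a validation gate, with illustrative checks and field names, that no record should bypass on its way into training or screening:

```python
from datetime import date

def validate_record(rec: dict) -> list[str]:
    """Pre-model validation gate: failures are logged against the record's
    source system so lineage is preserved. Checks are illustrative."""
    errors = []
    if not rec.get("source_system"):
        errors.append("no source system recorded (lineage broken)")
    if not rec.get("full_name", "").strip():
        errors.append("empty name")
    dob = rec.get("dob")
    if not dob:
        errors.append("missing date of birth")
    else:
        try:
            if date.fromisoformat(dob) >= date.today():
                errors.append("date of birth not in the past")
        except ValueError:
            errors.append("date of birth not ISO 8601")
    return errors
```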


The direction of travel is clear. The FATF’s revised R16 emphasises structured data. The EU mandates ISO 20022 messaging formats. The Legal Entity Identifier (LEI) is gaining traction for corporate transactions globally.


Each of these developments converges on one regulatory expectation: data should be accurate, verifiable and machine-readable. Institutions that achieve this standard will find that AI amplifies their compliance capability. Institutions that do not will find that AI amplifies their exposure.
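

The LEI illustrates what verifiable and machine-readable means in practice: it embeds ISO 7064 MOD 97-10 check digits, so a corrupted or mistyped identifier can be rejected at the point of entry rather than polluting downstream screening. A minimal validator:

```python
import re

def lei_is_valid(lei: str) -> bool:
    """Validate an LEI's ISO 7064 MOD 97-10 check digits: 20 alphanumeric
    characters whose numeric expansion (A=10 .. Z=35) must equal 1 mod 97."""
    lei = lei.strip().upper()
    if not re.fullmatch(r"[0-9A-Z]{20}", lei):
        return False
    numeric = "".join(str(int(c, 36)) for c in lei)  # letters -> two digits
    return int(numeric) % 97 == 1

# A transposed character breaks the check, so corruption is caught at entry:
print(lei_is_valid("5493001KJTIIGC8Y1R12"))  # True  (checksum-valid example)
print(lei_is_valid("5493001KJTIIGC8Y1R21"))  # False (final two characters swapped)
```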


Turning unreliable data into a strategic advantage


The compliance teams that will navigate this transition most effectively are those that recognise three interconnected priorities. First, data remediation must precede AI deployment. Implementing machine learning on top of fragmented, inconsistent data sets does not solve the data problem; it scales it. Before investing in AI-powered screening or monitoring, firms should audit their data architecture, identify structural weaknesses and invest in the unglamorous but essential work of cleaning, standardising and enriching their records.
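

That audit can begin with something as unglamorous as counting the defects. A sketch, with invented field names:

```python
from collections import Counter

def profile_records(records: list[dict]) -> dict:
    """Opening step of a remediation audit: measure the defects before
    buying the model."""
    n = len(records) or 1  # avoid division by zero on an empty extract
    missing = Counter()
    for rec in records:
        for field in ("full_name", "dob", "address", "entity_id"):
            if not rec.get(field):
                missing[field] += 1
    ids = [r["entity_id"] for r in records if r.get("entity_id")]
    duplicate_ids = sum(count - 1 for count in Counter(ids).values() if count > 1)
    return {
        "records": len(records),
        "missing_rate": {f: round(m / n, 3) for f, m in missing.items()},
        "duplicate_entity_ids": duplicate_ids,
    }
```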


Second, AI procurement decisions must be governed by explainability criteria as rigorously as they are governed by performance metrics. A model that reduces false positives by 90 per cent but cannot explain why it dismissed a particular alert is not a compliance asset; it is a regulatory risk. Third, human oversight must be designed into the workflow, not bolted on after deployment. The compliance analyst reviewing an AI-generated case narrative is not a formality. That analyst is the control.


Trust is rebuilt from the data up


The financial crime compliance sector has spent two decades building systems on data it never fully trusted. The arrival of AI does not resolve that tension; it sharpens it. AI deployed on unreliable data will produce unreliable outputs with greater speed and apparent confidence. AI deployed on clean, structured, verified data, with appropriate governance and genuine human oversight, can do what no rule-based system ever could: it can learn, adapt and improve.


The question for institutions is not whether to adopt AI. It is whether they have built the data foundations on which AI can be trusted to perform. Compliance credibility has always depended on the integrity of the information that underpins it. That principle has not changed. The tools have.


Do you know whether your compliance data is reliable enough to support AI-driven controls?


At OpusDatum, we help firms assess the integrity of their compliance data and build governance frameworks that ensure AI is deployed responsibly, effectively and in alignment with regulatory expectations.


If you are evaluating AI for financial crime compliance or reviewing the quality of your underlying data, contact us to discuss how we can help.
