The Impossible Standard: Perfection, Agentic AI & Financial Crime Compliance

Elizabeth Travis
May 18

[Image: an autonomous Zoox car turning on a city street, with "TURN" painted on the road, tall buildings and a clear sky in the background.]

In January 2026, the Office of Financial Sanctions Implementation (OFSI) published a penalty notice imposing a £160,000 monetary penalty on Bank of Scotland plc for breaches of the Russia (Sanctions) (EU Exit) Regulations 2019. The facts were unremarkable by the standards of sanctions enforcement. A designated Russian individual, subject to UK sanctions since 2017, opened a personal current account using a UK passport. The spelling of the name on the passport differed from the entry on the OFSI Consolidated List because of common Russian-to-English transliteration variations.

The bank’s automated screening did not reconcile those character differences. A separate politically exposed person alert was generated two weeks later, but human review failed to escalate the match. Twenty-four payments totalling over £77,000 were processed before the account was restricted.

Yet the regulatory response was measured, not existential. OFSI did not treat the failure as evidence that screening technology is inherently unsafe. It did not call for automation to be rolled back. The penalty notice focused on governance, reasonable enhancement expectations, escalation discipline and the absence of a commercial sanctions list. A 50 per cent reduction was applied for voluntary disclosure and cooperation. The message was one of proportionate accountability.

This is how human-centred compliance failure is treated: serious, but intelligible; correctable, not disqualifying. When agentic AI enters the conversation, that pragmatism evaporates.

Zero error in compliance has always been a fiction

Financial crime compliance has long struggled with what might be described as the fiction of zero tolerance. On paper, breaches are unacceptable. In practice, every experienced compliance professional understands that errors occur daily.

Alerts are mis-prioritised. Sanctions matches are missed. Cases are closed prematurely. Suspicious activity reports are submitted late or with weak narratives. Regulators do not expect the absence of error. They expect reasonable systems and controls, proportionate to risk, supported by governance, escalation and remediation.

The Financial Action Task Force (FATF) has been explicit on this point for over two decades. The risk-based approach was never intended to eliminate risk, but to prioritise resources where harm is most likely. The UK Financial Conduct Authority (FCA) has repeatedly reinforced that compliance failures are assessed in context, with a focus on whether firms took reasonable steps rather than whether outcomes were flawless.

The Bank of Scotland case illustrates this precisely. OFSI’s penalty notice made clear that the breach arose from a combination of systems limitations and human handling within routine operations. The designated individual had been subject to UK sanctions since 2017 and held a UK passport obtained in January 2023. The bank’s screening system lacked sufficient enhancement to reconcile transliteration variants.
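
To make the transliteration point concrete, here is a minimal sketch in Python of variant-tolerant name comparison. It is illustrative only and does not describe the bank's screening system or any vendor product. The rule table `VARIANT_RULES`, the 0.9 threshold, the function names and the example names are all assumptions, and a real screening engine would need far richer rules and tuning.

```python
import re
import unicodedata
from difflib import SequenceMatcher

# Hypothetical, deliberately small table of Russian-to-English transliteration
# variants, applied in order. A production rule set would be much larger.
VARIANT_RULES = [
    ("iu", "yu"), ("ju", "yu"),
    ("ia", "ya"), ("ja", "ya"),
    ("kh", "h"),
    ("ks", "x"),
    ("ii", "y"), ("iy", "y"), ("yi", "y"),
    ("j", "y"),
]


def normalise(name: str) -> str:
    """Reduce a name to a canonical form before comparison."""
    # Strip diacritics, lowercase, and keep only letters and spaces.
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch)).lower()
    text = re.sub(r"[^a-z ]", "", text)
    # Map each assumed spelling variant onto one canonical spelling.
    for variant, canonical in VARIANT_RULES:
        text = text.replace(variant, canonical)
    # Collapse repeated characters (for example "ee" becomes "e").
    text = re.sub(r"(.)\1+", r"\1", text)
    # Sort tokens so that word order does not affect the comparison.
    return " ".join(sorted(text.split()))


def similarity(a: str, b: str) -> float:
    """Similarity ratio between 0.0 and 1.0 on the normalised forms."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def is_potential_match(candidate: str, listed: str, threshold: float = 0.9) -> bool:
    """Flag a candidate for review if it is close enough to a listed name."""
    return similarity(candidate, listed) >= threshold


# Hypothetical names, not taken from any sanctions list.
passport_name = "Yuliya Sergeyevna Ivanova"
listed_name = "Iuliia Sergeevna Ivanova"
print(passport_name.lower() == listed_name.lower())    # exact comparison
print(is_potential_match(passport_name, listed_name))  # tolerant comparison
```

Under these assumed rules, the two hypothetical spellings differ as raw strings but score above the threshold once normalised. The threshold is a risk decision rather than a technical one: lowering it catches more variants at the cost of more alerts for human review.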

A politically exposed person (PEP) review, commenced on 20 February 2023, identified adverse media linking the customer to sanctions designations. However, human error led to the incorrect assessment that the individual had been removed from the UK sanctions list. The account remained unrestricted for a further four days, during which £75,000 was credited. OFSI assessed the case as serious, but not most serious. The regulatory framework absorbed the failure without questioning the legitimacy of automation itself.

SAR quality failures are normalised; AI failures are not

The same inconsistency is visible in the treatment of suspicious activity reporting. For years, UK law enforcement has drawn attention to the variable quality of suspicious activity reports (SARs). The National Crime Agency’s (NCA) UK Financial Intelligence Unit published updated best practice guidance in November 2025, structured around what constitutes a high-quality SAR. The guidance emphasises that reports require complete, structured data entered in the correct fields, clear articulation of suspicion and sufficient context for intelligence to be actionable. The very existence of such guidance confirms that quality remains an unresolved challenge.

The FCA has echoed these concerns consistently, noting that weak narratives undermine the intelligence value of reporting and that firms must demonstrate improvement trajectories. The NCA’s SARs Annual Report for 2023–24, published in March 2025, recorded over 872,000 SARs received in the reporting period. Volume is not the problem. Utility is.

Yet despite this persistent and well-documented shortcoming, SAR quality failures are treated as a systemic issue, not a disqualifying one. Banks are not expected to submit perfect reports. They are expected to demonstrate reasonable efforts, governance oversight and a credible trajectory of improvement. The regime tolerates imperfection because it recognises the complexity of human judgement under time pressure and data constraint.

If an agentic AI system were to produce an imperfect SAR narrative today, the reaction would be markedly different. Questions would immediately surface about explainability, legal risk and whether such systems should be permitted at all. The fact that many human-authored SARs fall short of regulatory expectations is quietly accepted as operational reality. Machines are not afforded the same contextual understanding.

The expectation of machine objectivity is the root of the problem

There are several reasons why technology is held to a higher standard than human operators, and few of them are entirely rational.

First, there is a deeply embedded belief that machines are objective. Humans are messy, biased and inconsistent, so the instinctive expectation is that automated systems should be clean, neutral and predictable. When AI behaves in ways that mirror human limitations, such as making probabilistic judgements based on imperfect data, it violates this expectation. The discomfort stems not from the error itself, but from the collapse of a false assumption.

Second, accountability feels harder when decisions are automated. A human analyst who exercises poor judgement can be retrained, supervised or disciplined. An AI model feels abstract and opaque, even when its behaviour is well documented. This creates governance anxiety, which often manifests as unrealistic performance expectations. If every decision cannot be exhaustively explained, the system is judged unsafe.

Third, technology failures scale. A human analyst can make one mistake at a time. An agentic system, if poorly designed or governed, can replicate an error across thousands of decisions. This is a legitimate concern, but it is not an argument for perfection. It is an argument for robust controls, testing, monitoring and human oversight: precisely the same principles applied to any other high-risk operational process.

The Bank of Scotland case is instructive on this point. The failure did not arise because technology acted autonomously. It arose because screening logic, data handling and human escalation failed to align. The PEP review identified the risk. The human reviewer misinterpreted it. Expecting AI to be perfect while accepting human imperfection of this kind obscures the real sources of operational risk.

The self-driving car problem applies to compliance

The analogy with self-driving vehicles is uncomfortable but instructive. Human drivers cause fatal accidents every day. These incidents are tragic, yet largely accepted as an inherent cost of mobility. Society does not question the legitimacy of driving as an activity because humans make mistakes.

When an autonomous vehicle is involved in a fatal incident, the reaction is fundamentally different. Public outrage is intense. Regulatory scrutiny is immediate. The technology itself is called into question. The absolute number of incidents may be far lower than the number involving human drivers, yet tolerance is markedly lower.

This reaction is not driven by comparative risk assessment. It is driven by a sense of moral violation. The expectation is not parity with human performance, but moral superiority.

Financial crime compliance exhibits the same pattern. Human analysts miss sanctions matches, fail to escalate alerts and misinterpret risk signals, sometimes for extended periods. These failures are treated as serious but understandable. When an AI system misclassifies a transaction or produces an imperfect risk assessment, it is framed as evidence that the technology cannot be trusted. The standard applied is not equivalence. It is perfection.

This is an impossible standard, and it guarantees disappointment.

Agentic AI must be governed as a participant in the risk-based approach

Agentic AI should not be framed as a replacement for human judgement, nor as an infallible arbiter of compliance decisions. It should be understood as a participant in the risk-based approach, subject to the same principles of proportionality, oversight and accountability as any other control.

Performance should be benchmarked against human outcomes, not theoretical perfection. If an agentic system demonstrably reduces false positives, improves detection of crime typologies or enhances case quality compared to existing processes, it is adding value, even if it occasionally errs.
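
As a rough illustration of what benchmarking against human outcomes could look like, the sketch below compares two sets of results on the same labelled sample of alerts. The class name `Outcomes`, the function `outperforms` and every figure used are hypothetical. The two metrics chosen (miss rate and false positive rate) are one reasonable assumption, not a regulatory definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcomes:
    """Confusion-matrix counts for one reviewer (human team or agentic system)."""
    true_positives: int
    false_positives: int
    false_negatives: int
    true_negatives: int

    @property
    def miss_rate(self) -> float:
        # Share of genuinely suspicious cases that were not detected.
        actual_positives = self.true_positives + self.false_negatives
        return self.false_negatives / actual_positives if actual_positives else 0.0

    @property
    def false_positive_rate(self) -> float:
        # Share of legitimate cases that were wrongly flagged.
        actual_negatives = self.false_positives + self.true_negatives
        return self.false_positives / actual_negatives if actual_negatives else 0.0


def outperforms(candidate: Outcomes, baseline: Outcomes) -> bool:
    """True if the candidate is no worse on both metrics and better on at least one."""
    no_worse = (
        candidate.miss_rate <= baseline.miss_rate
        and candidate.false_positive_rate <= baseline.false_positive_rate
    )
    better = (
        candidate.miss_rate < baseline.miss_rate
        or candidate.false_positive_rate < baseline.false_positive_rate
    )
    return no_worse and better


# Hypothetical figures for the same 10,000-alert sample.
human_team = Outcomes(true_positives=80, false_positives=900,
                      false_negatives=20, true_negatives=9000)
agentic_system = Outcomes(true_positives=90, false_positives=450,
                          false_negatives=10, true_negatives=9450)

print(f"Human miss rate: {human_team.miss_rate:.1%}")
print(f"Agent miss rate: {agentic_system.miss_rate:.1%}")
print(outperforms(agentic_system, human_team))
```

In this invented example the agentic system still misses 10 of 100 suspicious cases. It is not perfect, but it is measurably better than the baseline it would replace, which is the comparison that matters.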

Governance should focus on behaviour rather than mystique. Clear rules about where autonomy begins and ends, when escalation is required and how decisions are reviewed matter far more than abstract debates about whether AI should ever act independently. Many compliance processes already rely on delegated authority within defined thresholds. Agentic AI is not conceptually different.
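
The idea of delegated authority within defined thresholds can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a recommended policy: the `Alert` fields, the `AuthorityLimits` values and the routing rules are all hypothetical, and a real firm would set them through its own risk appetite and governance process.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    AUTO_CLOSE = "auto_close"      # the agent may close the alert itself
    HUMAN_REVIEW = "human_review"  # an analyst must decide
    ESCALATE = "escalate"          # goes straight to a senior reviewer


@dataclass(frozen=True)
class Alert:
    alert_id: str
    match_score: float        # 0.0 to 1.0, produced by the screening model
    is_sanctions_related: bool
    amount_gbp: float


@dataclass(frozen=True)
class AuthorityLimits:
    # Hypothetical limits; in practice these would be approved and documented.
    auto_close_below: float = 0.20
    escalate_at_or_above: float = 0.80
    max_auto_amount_gbp: float = 10_000.0


def route_alert(alert: Alert, limits: AuthorityLimits = AuthorityLimits()) -> tuple[Route, str]:
    """Return the route and a plain-language reason suitable for an audit trail."""
    if alert.match_score >= limits.escalate_at_or_above:
        return Route.ESCALATE, "score at or above escalation threshold"
    if alert.is_sanctions_related:
        return Route.HUMAN_REVIEW, "sanctions alerts are outside the agent's authority"
    if (alert.match_score < limits.auto_close_below
            and alert.amount_gbp <= limits.max_auto_amount_gbp):
        return Route.AUTO_CLOSE, "low score and low value, within delegated authority"
    return Route.HUMAN_REVIEW, "outside auto-close limits"


route, reason = route_alert(Alert("A-001", 0.05, False, 250.0))
print(route.value, "-", reason)
```

Every route comes back with a reason, so each decision leaves a reviewable record. Where autonomy begins and ends is then a matter of changing documented limits, not of debating whether the system should act at all.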

Errors, when they occur, should be treated as learning signals rather than existential threats. Human analysts improve through feedback, supervision and experience. AI systems improve through training, monitoring and recalibration. Demanding that systems never err is incompatible with how learning, whether human or machine, actually works.

Crucially, accountability must remain human. Firms, not algorithms, are regulated. Boards and senior managers remain responsible for outcomes, regardless of whether decisions are supported by people or by technology. This is not a weakness of agentic AI. It is a foundational principle of governance.

Regulators are not demanding perfection from AI

Importantly, the regulatory direction of travel supports this position. The FCA published its consolidated approach to AI in financial services in September 2025, confirming that it will not introduce AI-specific regulation. Instead, existing frameworks apply regardless of whether decisions are made by humans or supported by algorithms. These include the Senior Managers and Certification Regime (SM&CR), model risk management expectations and operational resilience requirements. In December 2025, FCA Chief Executive Nikhil Rathi reaffirmed that the regulator would rely on its principles-based, outcomes-focused approach rather than prescriptive AI rules.

The emphasis is on control, testing, explainability appropriate to the use case, and the ability to intervene. There is no expectation that AI-enabled systems will never fail. The expectation is that firms understand the limitations of their systems, enhance those systems where risk justifies it, and respond effectively when failures occur.

OFSI’s own guidance on compliance focuses similarly on reasonable cause, systems and controls, and proportionality. The Bank of Scotland penalty notice did not demand that screening systems achieve zero false negatives. It demanded that firms with greater sanctions exposure take reasonable steps to enhance their screening capability. The principle is the same whether the system is rule-based, AI-enabled or hybrid.

In other words, regulatory expectations for AI are consistent with long-standing compliance principles. The zero-tolerance narrative is largely self-imposed.

Refusing to deploy effective technology is itself an ethical failure

There is an ethical inconsistency in how the sector frames responsibility for compliance outcomes. When humans fail, the response emphasises context, workload and cognitive limitation. When machines fail, the response emphasises unacceptable risk. The underlying harm may be identical, yet the moral calculus shifts dramatically depending on the actor.

This inconsistency shapes investment decisions. Firms hesitate to deploy advanced AI not because it is ineffective, but because they fear being judged against an impossible standard. As a result, they continue to rely on overstretched human teams and legacy processes that regulators already know are flawed.

From an ethical perspective, this is difficult to defend. If agentic AI can materially reduce harm, even imperfectly, refusing to deploy it because it cannot guarantee perfection may itself constitute an ethical failure. Compliance is not about purity of process. It is about reducing real-world harm.

The standard should not be perfection; it should be progress

The question is not whether agentic AI will make mistakes. It will. The relevant question is whether those mistakes are fewer, more visible and more remediable than those made by humans operating within existing constraints.

Financial crime compliance has spent decades arguing that zero risk is an illusion. The FATF codified this position. The FCA operationalised it. OFSI enforces it with proportionate accountability. It would be deeply ironic if the sector abandoned that hard-won realism at the very moment technology offers a genuine opportunity to improve outcomes.

The perfection trap is not a technology problem. It is a governance failure dressed as caution. If the compliance profession accepts that humans can make serious errors without invalidating the entire framework, it must extend the same proportionality to agentic AI. The standard should not be perfection. It should be defensible progress.

As agentic AI enters financial crime compliance (FCC), is your firm caught in the perfection trap?

At OpusDatum, we help firms design, test and govern advanced financial crime systems with realism, proportionality and regulatory credibility. From AI-enabled sanctions screening and SAR optimisation to model governance and control effectiveness frameworks, our focus is not on perfection, but on defensible progress.

If you would like to discuss how agentic AI can be deployed responsibly within your financial crime framework, we would be pleased to talk. Contact us now.