The gap between what AI can achieve in a sandbox and what it can safely achieve in a live, regulated investment setting has never been larger. That gap is not a temporary inconvenience. It is the primary lens through which asset allocators and manager research professionals should evaluate any firm claiming AI is central to its process.
When it comes to manager selection, the AI arms race is in play. But practitioners beware: AI capability in isolation is not a reliable signal of edge. What matters is whether it holds up under governance constraints, imperfect data, and live market conditions. In today’s environment, due diligence must move beyond assessing models to assessing processes, including how AI behaves and how it fails in production.
On 31 March, the WBS Gillmore Centre, in partnership with CFA Institute, CFA UK Society, and the Research and Policy Center, convened a summit on the real limits of AI transformation in an investment business.
The discussion featured Simon Legrand-Green, CFA, head of Multi-Asset and Systematic Strategies Research at WTW; Julan Al-Yassin, director of Learning Content at CFA Institute; James Hadfield, a senior risk leader formerly at CVC Capital Partners, Ninety One, and Close Brothers; and Carlos Salas, a portfolio manager and data scientist.
This post synthesizes the key takeaways on where AI is delivering, where governance constrains real-world deployment, and where promotional claims have outrun the evidence. It also develops a Modern Manager Due Diligence Checklist to help practitioners evaluate these claims in practice.
Where AI Is Delivering
Legrand-Green rates investment strategies across systematic and multi-asset approaches, giving him broad visibility across the industry. His account of where AI is adding demonstrable value was grounded in conversations with asset managers.
Enhanced NLP: Transcript and document analysis is the clearest proven use case being deployed today. Large language models (LLMs) understand context in ways that crude word-count sentiment techniques never could. Managers are deploying models across earnings transcripts, regulatory filings, and the contrast between scripted remarks and the more candid Q&A section of earnings calls. The models are extracting tonal and linguistic signals that carry informational content.
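The tonal-contrast idea can be made concrete with a stylised sketch. Real deployments rely on LLMs rather than word lists; the toy scorer below, whose term lists and scoring rule are illustrative assumptions, simply shows the shape of the signal: a gap between scripted confidence and off-script hedging.

```python
# Toy sketch of the scripted-vs-Q&A tone gap. Real deployments use
# LLM classifiers; a word-list scorer here is only a stand-in.

HEDGING_TERMS = {"maybe", "uncertain", "challenging", "headwinds", "cautious"}
CONFIDENT_TERMS = {"strong", "record", "confident", "growth", "momentum"}

def tone_score(text: str) -> float:
    """Crude tone score: share of confident minus share of hedging terms."""
    words = [w.strip(".,") for w in text.lower().split()]
    if not words:
        return 0.0
    pos = sum(w in CONFIDENT_TERMS for w in words)
    neg = sum(w in HEDGING_TERMS for w in words)
    return (pos - neg) / len(words)

def tone_gap(scripted: str, qa: str) -> float:
    """Positive gap: management sounds less confident off-script."""
    return tone_score(scripted) - tone_score(qa)

scripted = "We delivered record growth and remain confident in our strong momentum."
qa = "Demand is uncertain and margins face challenging headwinds, so we stay cautious."
gap = tone_gap(scripted, qa)  # positive: the candid Q&A is more hedged
```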
Enhanced nowcasting: Missing data estimation was performed credibly during last year's 43-day US government shutdown, which suspended releases of GDP, inflation, and labour market series. Managers used AI to ingest alternative datasets, estimate the missing figures, and calibrate those estimates against eventual official releases. Decision-making continued without interruption.
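A minimal sketch of that nowcasting loop, using invented illustrative figures: fit a simple mapping from an alternative indicator to the official series, fill the gap while releases are suspended, then score the estimate once the official figure finally appears. Production systems use far richer models; this shows only the calibrate-against-eventual-release discipline.

```python
# Nowcasting sketch under assumed illustrative data: map an alternative
# indicator (e.g. a card-spend index) onto an official series.

def ols_fit(x, y):
    """Least-squares slope and intercept for y ≈ a*x + b."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    a = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    return a, my - a * mx

# History in which both series were published (illustrative numbers).
alt_hist = [1.0, 1.2, 0.9, 1.5, 1.1]
official_hist = [2.1, 2.5, 1.9, 3.0, 2.3]
a, b = ols_fit(alt_hist, official_hist)

alt_during_shutdown = 1.3
nowcast = a * alt_during_shutdown + b   # stand-in while releases are suspended

official_eventual = 2.65                # published once the shutdown ends
calibration_error = abs(nowcast - official_eventual)
```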
Enhanced model validation: Live model behaviour monitoring is helping managers build track-record confidence more quickly. AI-assisted validation of whether live returns match backtested predictions allows conviction to be built faster and more rigorously, supporting earlier capital deployment behind new models.
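That validation step can be sketched as a parallel-running check with an illustrative tracking-error threshold. The 2% figure and the return series below are assumptions for the example, not the panel's numbers.

```python
# Sketch of live-vs-backtest validation: compare realised returns with
# backtested predictions and flag drift once tracking error exceeds a
# threshold. Threshold and data are illustrative.
import statistics

def validate_live(predicted, realised, max_tracking_error=0.02):
    """Return (tracking_error, ok) for a parallel-running window."""
    diffs = [r - p for p, r in zip(predicted, realised)]
    te = statistics.stdev(diffs)
    return te, te <= max_tracking_error

predicted = [0.010, -0.004, 0.006, 0.002, -0.001]
realised  = [0.012, -0.005, 0.004, 0.003,  0.000]
te, ok = validate_live(predicted, realised)  # ok: live matches backtest
```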
Challenger model execution and routing: Continuous challenge-and-verify loops are being used to route decisions under varying market conditions whilst deliberately sending a fraction of trades down sub-optimal routes to test whether assumptions still hold. The aim is to keep the system calibrated to real market behaviour rather than historical patterns.
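A challenge-and-verify routing loop might look like the following sketch, in which a small fixed fraction of orders is deliberately sent down the challenger route to keep its live cost estimate fresh. The 5% exploration fraction and the route names are illustrative assumptions.

```python
# Sketch of a challenge-and-verify order router. A fixed exploration
# fraction keeps the challenger's assumptions testable against live
# market behaviour rather than historical patterns.
import random

def route_orders(orders, challenger_fraction=0.05, seed=7):
    rng = random.Random(seed)
    routed = {"champion": [], "challenger": []}
    for order in orders:
        # Deliberately sub-optimal routing for a small fraction of flow.
        route = "challenger" if rng.random() < challenger_fraction else "champion"
        routed[route].append(order)
    return routed

routed = route_orders(list(range(1000)))
exploration_share = len(routed["challenger"]) / 1000
```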
Evolution not revolution: Live deployments of AI tend to be incremental enhancements of peripheral elements rather than a rethinking of alpha functions or full investment processes.
A common comfort zone: Bounded inputs, observable outputs, human checkpoints, and measurable performance define deployable AI architecture.
Professional Standards, Governance and Firm Infrastructure
Al-Yassin outlined the professional standards practitioners need to embed. Standard V(A) of the CFA Institute Code and Standards already encompasses AI by requiring a reasonable basis for any investment recommendation. In the past, this standard has been operationalized as reproducibility. LLMs are probabilistic by default, however: the same prompt posed twice can produce different outputs. Reproducibility in the traditional sense is therefore no longer available, and a different approach is required.
To stay within the Code, transparency and disclosure of AI use are needed. Practitioners must state how AI was used, which steps were human-driven, which were AI-handled, and how outputs were validated. Where that disclosure cannot be provided, existing professional standards are likely already in breach. Notably, the layers of checking and validation this requires are expensive and laborious, creating a clear AI-heightened point of failure in investment processes.
The Risk Officer’s Conundrum
Hadfield pointed out that the risk officer's problem is not AI capability. Rather, it is infrastructure maturity relative to the rate of change. The failure mode is not dramatic collapse but quiet degradation: Powerful models fed by poor data, bottom-up tool proliferation without architectural coherence, and approved tools migrating into unapproved applications.
Bias toward AI and away from critical underpinnings: A CEO will readily sign a $10 million cheque for a new AI system. Budget to clean, tag, and govern the underlying data, by contrast, is far harder to secure. A model is only as good as what feeds it.
Vendor concentration compounds the risk: The apparent diversity of AI products masks a narrow set of underlying infrastructure providers. Regulators under DORA are beginning to require full supply chain visibility. The accountability position is unequivocal: delegation to AI changes the workflow, not the liability.
AI Outside the Comfort Zone
Synthetic data generation for market simulation has attracted significant promotional attention. The evidence from live deployment is discouraging. Insights derived from synthetic training have repeatedly failed to transfer to real data and in some cases have created actively misleading signals.
However, synthetic data generation for risk control and for counterfactual analysis has shown real promise. In all cases, allocators and manager researchers should require specific out-of-sample evidence before accepting claims of novel AI capability.
On agentic AI — systems that act rather than merely generate outputs — the panel was direct. Recent hype around fully autonomous investment funds should be treated with skepticism. The risk is not that agentic systems will be obviously wrong; it is that errors propagate through decision chains before any human checkpoint intervenes. Current governance frameworks are not designed to catch that failure mode.
A Modern Manager Due Diligence Checklist
Process integrity: Can the manager decompose its AI-assisted workflow into AI-handled and human-driven steps, with documented validation at each stage?
Data governance: What is the quality, tagging, and audit status of model inputs?
Budget allocation: What is the cost of data infrastructure relative to model infrastructure? This is a useful proxy for organisational seriousness.
Vendor risk: What is the AI vendor concentration across the full stack? Has operational resilience been tested against provider failure or commercial disruption?
Out-of-sample rigor: What is the parallel-running period before an AI-assisted workflow receives full operational weight, and what are the pass/fail criteria?
Synthetic data: If used, what is the specific evidence of transfer from synthetic to real conditions?
Use-case control: How does the manager prevent approved tools drifting into unapproved applications, and who owns that boundary?
Agentic deployment: If agentic AI is in use, where are the human decision points and what is the accountability structure when the system acts erroneously?
Accountability: Can leadership articulate, without deflecting to the technology, who is responsible for AI-assisted decisions?
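For practitioners who want to operationalise the checklist above, a minimal scorecard sketch follows. The field names and summary logic are illustrative assumptions, not a formal CFA Institute template.

```python
# Minimal due-diligence scorecard: record each checklist item as
# pass/fail, then summarise the pass rate and open items.
CHECKLIST = [
    "process_integrity", "data_governance", "budget_allocation",
    "vendor_risk", "out_of_sample_rigor", "synthetic_data",
    "use_case_control", "agentic_deployment", "accountability",
]

def score_manager(answers: dict) -> dict:
    """Summarise pass rate and list items needing follow-up."""
    failed = [item for item in CHECKLIST if not answers.get(item, False)]
    return {"pass_rate": 1 - len(failed) / len(CHECKLIST),
            "open_items": failed}

example = {item: True for item in CHECKLIST}
example["synthetic_data"] = False   # e.g. no synthetic-to-real transfer evidence
summary = score_manager(example)
```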
What AI Adoption Means for Manager Selection
Process integrity, demonstrable edge, and governance have always been the discriminating characteristics in manager selection. The Modern Manager Due Diligence Checklist provides a structured basis for those involved in it. Historically, AI capability was not evaluated as a distinct dimension of the investment process; manager research now demands that it is.
More broadly, does a manager you are considering hiring genuinely understand the limitations and opportunities of the AI tools they intend to deploy? Are you hearing an elegant marketing narrative layered over a superficial, non-existent, or risky AI deployment? Or is this a highly competent team with a fully guard-railed production AI?
These are important considerations because the AI arms race is characterized by a deploy first, govern later mindset. The question is not whether a manager uses AI. It is whether they can explain, justify, and stand behind every step of the process.
All posts are the opinion of the author. As such, they should not be construed as investment advice, nor do the opinions expressed necessarily reflect the views of CFA Institute or the author’s employer.