15 June 2026 Enterprising Investor Blog

Did the Manager Change the Model or Just the Settings?

One question that helps institutional investors evaluate how systematic managers learn from failure

Milos Maricic

In the past two years, I have read dozens of drawdown letters from systematic managers. Their architecture rarely varies. Market conditions are described as unusual. Specific factors are identified as headwinds. The list of refinements that follows is almost always parametric, that is, changes to model settings (e.g., a shorter lookback, a tighter risk target), rather than changes to the model's underlying assumptions.

Performance often recovers. Eighteen months later, a different drawdown produces a different letter with the same architecture. The pattern is industry standard. More importantly, it reveals whether the response reflects genuine learning about the model or simply a retuning of its settings.

The question allocators should ask is simple:

“Walk me through a time your model was wrong. What did you learn about its structural assumptions, and what did you change architecturally rather than parametrically?”

This question surfaces the difference between calibration and learning. A manager who changed only a lookback window, risk target, or signal weight may have improved the model’s settings without re-examining its assumptions. A manager who changed the way signals interact, how conflicting information is weighted, or which market assumptions the model depends on may have learned something more fundamental.
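One way to make the distinction concrete is to treat a model description as a record with two kinds of fields and ask which kind changed between versions. The sketch below is illustrative only: `ModelSpec`, its field names, and the labels "additive", "gated", and "average" are hypothetical, not drawn from any manager's actual system.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class ModelSpec:
    # Settings (parametric): tuning these is calibration.
    lookback_days: int
    risk_target: float
    signal_weight: float
    # Structure (architectural): changing these revisits assumptions.
    signal_interaction: str  # hypothetical labels, e.g. "additive" or "gated"
    conflict_rule: str       # hypothetical labels, e.g. "average"


# Fields assumed, for this sketch, to be settings rather than structure.
PARAMETRIC_FIELDS = {"lookback_days", "risk_target", "signal_weight"}


def classify_change(before: ModelSpec, after: ModelSpec) -> str:
    """Label the difference between two model versions."""
    changed = {
        f.name
        for f in fields(ModelSpec)
        if getattr(before, f.name) != getattr(after, f.name)
    }
    if not changed:
        return "no change"
    # Only settings moved: calibration, not a change in assumptions.
    if changed <= PARAMETRIC_FIELDS:
        return "parametric"
    return "architectural"


v1 = ModelSpec(120, 0.10, 0.5, "additive", "average")
v2 = replace(v1, lookback_days=60, risk_target=0.08)  # retuned settings
v3 = replace(v1, signal_interaction="gated")          # changed signal interaction

print(classify_change(v1, v2))  # parametric
print(classify_change(v1, v3))  # architectural
```

In practice the boundary is a judgment call, and deciding which fields belong in each group is itself part of the due diligence conversation.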

This piece is the second in a series outlining the four dimensions of a broader specification risk framework (SPEC). Following “The Question That Exposes Weak Quant Models,” which examined how variables enter a model, this piece explores what happens when the model is wrong.

The Pattern

What is consistently missing from these letters is a discussion of whether the model's underlying structural assumptions held, whether the failure revealed something new about how signals interact, or whether the response addressed the architecture rather than its calibration. That distinction often separates managers who learn at the level of model design from those who learn primarily at the level of calibration.

Recent research highlights why this distinction matters. Cortesi and collaborators at MIT and Princeton tested twelve combinations of model architectures and optimizers on identical data. The combinations produced similar prediction error and similar Sharpe ratios, yet portfolio turnover differed by a factor of three. Each optimizer encoded a different prior about market structure. Some assumed short memory, in which recent dynamics matter most. Others assumed long memory and complex structural patterns. Two managers running statistically equivalent models can hold opposite implicit beliefs about how markets work.
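A toy illustration of how an implicit memory assumption shows up in turnover is sketched below. It does not reproduce the study described above. It applies a simple sign-of-trailing-return rule to one synthetic return series, with a short and a long lookback standing in for short-memory and long-memory priors. In the cited research the prior came from the optimizer rather than a visible setting, so the lookback here is only a stand-in, and all function names and numbers are assumptions chosen for illustration.

```python
import random


def momentum_positions(returns, lookback):
    """+1 or -1 position from the sign of the trailing `lookback`-period return sum."""
    positions = []
    for t in range(lookback, len(returns)):
        trailing = sum(returns[t - lookback:t])
        positions.append(1.0 if trailing >= 0 else -1.0)
    return positions


def average_turnover(positions):
    """Mean absolute change in position from one period to the next."""
    changes = [abs(b - a) for a, b in zip(positions, positions[1:])]
    return sum(changes) / len(changes)


random.seed(0)
# Synthetic daily returns: identical data for both "priors".
returns = [random.gauss(0.0, 0.01) for _ in range(2000)]

short_memory = average_turnover(momentum_positions(returns, lookback=5))
long_memory = average_turnover(momentum_positions(returns, lookback=60))

print(f"short-memory turnover: {short_memory:.3f}")
print(f"long-memory turnover:  {long_memory:.3f}")
```

On the same data, the short-memory rule flips position far more often. The point for an allocator is that turnover can reveal a belief about market memory that headline statistics such as the Sharpe ratio do not.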

When such a model underperforms, a parametric response adjusts the settings without revisiting the assumptions behind them. The model fails again, in a different way, when conditions shift. Each failure gets attributed to a regime that was unanticipated, while the structural assumption that made the model regime-fragile remains in place.

Where Current Frameworks Stop Short

Standard post-mortems focus on attribution. Which factors detracted. Which positions hurt. Which signals fired incorrectly. These are useful but limited. Attribution describes what happened. It does not test whether the team's understanding of the model itself has changed.

Industry due diligence questionnaires (DDQs) typically ask managers to describe their research process and risk controls. They rarely ask managers to walk through a specific failure and trace what was learned about the model's structural assumptions from it. That gap allows a manager to maintain the same model architecture across multiple drawdowns while presenting changes to parameters and settings as evidence of learning.

One Question That Changes the Conversation

The phrasing of the question carries the work. You are requesting a specific failure, a specific lesson, and a specific structural change. In conversations with allocators and managers across institutional contexts, the responses cluster into three categories.

A strong answer: The manager identifies a specific drawdown episode and describes which structural assumption proved wrong. They distinguish clearly between changes to model settings, such as a lookback window or position-sizing parameter, and changes to the model's underlying assumptions, such as reformulating how signals interact, restructuring how conflicting information is weighted, or replacing a component whose implicit prior the team could no longer defend. They explain why the same failure mode is less likely to recur, and they connect the lesson to a broader view about what their model assumes the world to be.

A standard answer: The manager describes a difficult period and focuses on the changes made to lookback windows, risk targets, or signal weights. This is the industry baseline. A useful follow-up surfaces whether anything deeper happened: “Was the underlying logic of the model changed, or only its settings?” Honest managers will tell you. Unprepared managers will reach for the language of structural change without the substance, at which point the gap becomes audible.

A concerning answer takes one of three forms. The first is an inability to recall a meaningful failure, which suggests either a short track record or a research process without the discipline of structural post-mortem. The second is attribution of every difficult period to external regime change, with no reflection on the model's contribution to the loss. The third is a defense of the model's continued correctness despite the failure. A manager who has never identified a structural assumption they got wrong has either built a model without structural assumptions, which is impossible, or has chosen not to examine them.

Why This Matters Now

Two forces have raised the cost of architectural blindness.

AI claims have made it harder to evaluate manager learning from track record alone. When every systematic manager describes their process as adaptive, surface-level evidence of learning increases while the underlying evidence may remain unchanged. Allocators need a way to test whether adaptation occurred at the architectural level or only at the level of settings.

The academic literature on factor decay has documented systematic patterns of performance erosion that no parametric adjustment can repair. McLean and Pontiff found that factor returns are about 26% lower out of sample and 58% lower after publication. A manager whose model was specified during the high-return period of a factor and who has updated only parameters since is running a strategy whose foundation has shifted while the team has been working on the surface.
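To put rough numbers on that erosion, the arithmetic below applies the two decay figures cited above to a hypothetical in-sample premium. The 6% starting premium and the function name are assumptions for illustration only, not figures from the research.

```python
def post_decay_premium(in_sample_premium: float, decay: float) -> float:
    """Premium remaining after a fractional decay (0.58 means 58% lower)."""
    return in_sample_premium * (1.0 - decay)


in_sample = 0.06  # hypothetical 6% annual premium measured in the original sample

for decay in (0.26, 0.58):
    remaining = post_decay_premium(in_sample, decay)
    print(f"decay {decay:.0%}: expected premium {remaining:.2%}")
# decay 26%: expected premium 4.44%
# decay 58%: expected premium 2.52%
```

A model sized and risk-budgeted around the original 6% is, under these assumptions, working with less than half the premium it was built for. Adjusting a lookback or a risk target does not restore that premium.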

The most rigorous research institutions are moving to architectural standards of evaluation. ADIA Lab's investment in causal inference research, López de Prado's work on factor mirages, and the broader shift toward structural validation in quantitative research all point in the same direction. The institutional bar for what counts as model learning is rising.

CFA Institute's Standard V(A) requires members to have a reasonable and adequate basis for investment recommendations, including ongoing assessment of the assumptions and limitations of quantitative models. A manager who cannot describe what they learned at the architectural level from a past failure cannot demonstrate ongoing assessment of those assumptions.

Before Your Next Meeting

Ask the question. Listen for the distinction between “we extended the lookback” and “we discovered our assumption about signal interaction was wrong.” The first is calibration. The second is learning.

Each of the four “SPEC” dimensions fails independently. The manager who answers all four well is rare. The manager who answers all four poorly is operating a system the allocator cannot meaningfully evaluate. The documented cases behind each dimension are collected in an open research catalog.

All posts are the opinion of the author. As such, they should not be construed as investment advice, nor do the opinions expressed necessarily reflect the views of CFA Institute or the author’s employer.

Publisher Information

CFA Institute