18 November 2025 Research Foundation

Chapter 8: Machine Learning in Commodity Futures: Bridging Data, Theory, and Return Predictability

How data-driven models improve forecasting accuracy and risk management in global commodity markets

Commodity futures are an underexplored frontier for machine learning. This chapter shows how theory-grounded signals make them interpretable and investable.

Executive Summary

Machine learning (ML) has transformed equity investing, powering advances in factor discovery, portfolio construction, and systematic strategies. But commodities have not kept pace. Although commodities are widely traded and play a growing role in institutional portfolios, ML applied to commodity futures remains an underexplored frontier.

Commodity futures present different challenges than equities, but they also provide fertile ground for ML to add value. This chapter of AI in Asset Management: Tools, Applications, and Frontiers illustrates how ML, when grounded in commodity theory, can uncover persistent return patterns and convert the complexities of commodity markets into systematic, interpretable, and investable strategies.

Why Commodities Are Different from Equities

Unlike equities, commodities are not capital assets with financial statements or earnings reports. They are physical goods whose pricing reflects inventories, supply chains, seasonality, and exogenous shocks. Agricultural markets respond to weather, energy to geopolitical risk, and metals to industrial demand. These differences mean that commodity return drivers are less standardized, less transparent, and often harder to model.

Traditional approaches to modeling commodities have leaned heavily on reduced-form econometric models, often drawing only on price data. These models have tended to focus on narrow subsets of contracts, such as a particular group of metals or agricultural futures, and typically rely on momentum or other price-based indicators. Although such methods capture short-term effects, they miss the deeper economic structure of commodity markets.

This summary is based on the CFA Institute Research Foundation and CFA Institute Research and Policy Center chapter “Machine Learning in Commodity Futures: Bridging Data, Theory, and Return Predictability,” by Tony Guida.

Why Machine Learning Creates Opportunities in Commodity Futures

The same factors that complicate commodity modeling also make commodities an ideal laboratory for predictive modeling. Commodities are structurally heterogeneous: energy, metals, and agriculture move on different cycles and often show weak or negative correlations with one another. This heterogeneity enhances their role as diversifiers in portfolios.

ML is particularly well-suited to this setting. It can absorb diverse inputs, identify nonlinear relationships, and adapt to regime shifts that occur when supply disruptions, geopolitical changes, or inventory cycles alter price dynamics. Just as important is the fact that commodities benefit from a surprisingly rich theoretical foundation.

  • The theory of storage (Kaldor 1939; Working 1949) links inventories to futures curves. When inventories are low, futures markets move into backwardation, and holding physical commodities earns a “convenience yield.” When inventories are high, futures curves slope upward into contango, signaling weaker returns.
  • The hedging pressure hypothesis (Keynes 1930; Hirshleifer 1988) explains how commercial hedgers and speculators interact. If producers crowd one side of the market, speculators taking the other side demand compensation in the form of risk premia.

These theories are not abstract. They yield observable signals such as carry, basis, and momentum, which are the very features ML can embed in predictive models. Grounding ML in these features ensures interpretability and connects results to established commodity economics.
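
To make the link between the futures curve and an observable signal concrete, here is a minimal sketch of a basis (carry-style) feature. It is illustrative only and is not taken from the chapter: the function names, the log-slope definition, the sign convention (positive means backwardation), and the prices are all assumptions.

```python
import math


def annualized_basis(near_price, far_price, years_between):
    """Annualized log slope of the futures curve between two maturities.

    Assumed sign convention: positive when the nearer contract trades above
    the farther one (backwardation), negative in contango.
    """
    if near_price <= 0 or far_price <= 0 or years_between <= 0:
        raise ValueError("prices and maturity gap must be positive")
    return math.log(near_price / far_price) / years_between


def curve_state(basis, tol=1e-9):
    """Label the curve shape implied by the sign of the basis."""
    if basis > tol:
        return "backwardation"
    if basis < -tol:
        return "contango"
    return "flat"


# Hypothetical prices for contracts one month apart.
tight_market = annualized_basis(near_price=80.0, far_price=78.0, years_between=1 / 12)
ample_market = annualized_basis(near_price=80.0, far_price=81.0, years_between=1 / 12)

print(curve_state(tight_market))  # nearer contract is more expensive
print(curve_state(ample_market))  # farther contract is more expensive
```

Under the theory of storage, the first case is the pattern associated with low inventories and the second with high inventories. Other definitions of basis and carry exist, so the formula here should be read as one possible choice.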

How Machine Learning Is Applied to Commodity Futures

The chapter adapts the equity ML playbook to commodities in three important ways:

  • Theory-grounded features. Inputs include momentum, basis, carry, skewness, and open interest, all of which stem from established commodity economics. Unlike purely technical signals, these features reflect real market frictions such as storage constraints and risk transfer.
  • Cross-sectional modeling. Rather than forecasting the level of a single commodity, models rank commodities relative to one another. Predictions are translated into long–short portfolios, echoing factor investing logic from equities but adapted to the commodity context.
  • Ensemble modeling across horizons. Instead of relying on a single predictive horizon, the framework models returns across multiple timeframes and aggregates them. This approach reduces model risk, smooths performance, and moderates turnover.

Taken together, these elements demonstrate that ML in commodity futures is not about opaque forecasting. It is about embedding theory into supervised learning, applying rankings across the cross-section, and diversifying signals through ensembles to produce robust, tradable strategies.
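
The cross-sectional step can be sketched as follows: rank commodities by predicted return, go long the top names, and go short the bottom names. This is a simplified illustration, not the chapter's portfolio construction. The equal weighting, the `n_per_side` parameter, the commodity names, and the predicted values are hypothetical.

```python
def long_short_weights(predicted_returns, n_per_side):
    """Turn cross-sectional predictions into equal-weighted long-short weights.

    predicted_returns: dict mapping commodity name -> model prediction.
    n_per_side: number of commodities held on each side (an assumed parameter).
    Long and short legs each sum to 1 in absolute value, so net exposure is zero.
    """
    if n_per_side < 1 or 2 * n_per_side > len(predicted_returns):
        raise ValueError("need 1 <= n_per_side <= half the universe")
    # Sort names from lowest to highest predicted return.
    ranked = sorted(predicted_returns, key=predicted_returns.get)
    weights = {name: 0.0 for name in ranked}
    for name in ranked[:n_per_side]:      # lowest predictions: short leg
        weights[name] = -1.0 / n_per_side
    for name in ranked[-n_per_side:]:     # highest predictions: long leg
        weights[name] = 1.0 / n_per_side
    return weights


# Hypothetical one-period predictions for a six-commodity universe.
predictions = {
    "crude_oil": 0.021,
    "copper": 0.008,
    "gold": 0.002,
    "corn": -0.004,
    "natural_gas": -0.015,
    "wheat": 0.012,
}
weights = long_short_weights(predictions, n_per_side=2)
for name, weight in weights.items():
    print(f"{name:12s} {weight:+.2f}")
```

Only the ordering of the predictions matters here, not their level, which is the sense in which the approach ranks commodities relative to one another rather than forecasting any single price.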

Key Findings

  • Momentum dominates. Time-series momentum consistently emerges as the strongest predictive feature across horizons. This aligns with the long-established role of trend-following in commodity markets.
  • Skewness matters at short horizons. Skewness-based features contribute especially in the near term, capturing reversal and sentiment effects linked to tail risks.
  • Performance varies by horizon. Short-term predictions can be powerful but face turnover and capacity constraints. Mid-term models cluster around overlapping signals, whereas long-term models provide distinct, decorrelated perspectives.
  • Ensembles deliver robustness. Aggregating predictions across horizons produces smoother results, lower volatility, and improved drawdown control compared with any single-horizon model.
  • Portfolios align with economic reality. Exposure tilts shift in ways that track macro cycles: long oil in periods of geopolitical stress, long grains during supply shortages, and metals positions during inflationary super cycles. These outcomes confirm that models are learning meaningful structure, not noise.
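
For readers who want to see what the momentum and skewness features named above might look like in practice, here is a minimal sketch. The chapter's exact definitions are not given in this summary, so the trailing log return and the population skewness used below, along with the function names and sample data, are assumptions.

```python
import math


def trailing_momentum(prices, lookback):
    """Log return over the last `lookback` observations (assumed definition)."""
    if lookback < 1 or lookback >= len(prices):
        raise ValueError("lookback must be between 1 and len(prices) - 1")
    return math.log(prices[-1] / prices[-1 - lookback])


def sample_skewness(returns):
    """Population skewness: third central moment over variance to the 1.5 power."""
    n = len(returns)
    if n < 2:
        raise ValueError("need at least two returns")
    mean = sum(returns) / n
    m2 = sum((r - mean) ** 2 for r in returns) / n
    m3 = sum((r - mean) ** 3 for r in returns) / n
    if m2 == 0:
        return 0.0  # constant series: treat skewness as zero
    return m3 / m2 ** 1.5


# Hypothetical price path and return sample.
prices = [100.0, 101.0, 103.0, 102.0, 106.0]
momentum_4 = trailing_momentum(prices, lookback=4)

right_tailed = [0.0, 0.0, 0.0, 4.0]  # one large positive outlier
print(f"momentum: {momentum_4:.4f}")
print(f"skewness: {sample_skewness(right_tailed):.4f}")
```

Computing the same features over several lookback windows is one natural way to feed models that target different horizons.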

Implications of Machine Learning for Investment Professionals

The results translate into practical guidance for different stakeholder groups:

  • Portfolio managers can deploy ML signals in systematic long–short strategies. Horizon choice is crucial: Short-term alpha is offset by higher turnover, while ensembles offer more balanced, scalable performance.
  • Asset allocators can view commodities as more than diversifiers or inflation hedges. When modeled cross-sectionally, they become a source of systematic alpha that complements traditional equity and fixed-income exposures.
  • Risk managers can appreciate the resilience of ensemble approaches, which reduce fragility and better control drawdowns during market stress.
  • Governance teams gain from the interpretability of theory-based features, ensuring that models are transparent and aligned with fiduciary standards.

Broader Perspective: Bridging Data, Theory, and Investment Practice

The broader contribution of the chapter is methodological. It shows how to close the gap between the maturity of ML in equities and its relative neglect in commodities. The very properties often considered obstacles (inventories, storage constraints, supply shocks, or hedging pressure) become strengths when reframed as inputs to machine learning.

By grounding features in theory, applying cross-sectional rankings, and diversifying across horizons, the chapter demonstrates that commodities can be systematically modeled and invested in using AI. The result is not a black box but an interpretable, theory-consistent strategy that aligns with real macroeconomic dynamics.

Conclusion

ML in commodity futures works when grounded in theory and implemented through disciplined portfolio design. Three enduring lessons stand out:

  • Theory matters. Features such as carry, basis, momentum, and skewness, rooted in storage and hedging dynamics, form the foundation of predictive modeling.
  • Horizon diversification matters. Ensembles that blend short-, medium-, and long-term signals outperform single-horizon approaches, delivering smoother and more reliable performance.
  • Interpretability matters. Linking ML outputs to economic theory ensures transparency, builds confidence, and meets institutional governance standards.

Commodities are not only diversifiers or inflation hedges; they are fertile ground for systematic alpha when modeled with ML. By embedding theory into feature construction, using cross-sectional portfolios, and employing ensembles, ML in commodity futures makes these markets more transparent, predictable, and investable.

Frequently Asked Questions

What is machine learning in commodity futures?
It is the application of supervised learning models to forecast returns in commodity futures markets. By grounding features in established theories such as the theory of storage and the hedging pressure hypothesis, ML identifies signals such as momentum, basis, carry, and skewness and translates them into long–short portfolio strategies.

Why is ensemble modeling important in commodities?
Ensemble modeling combines predictions from multiple horizons (short, medium, and long term) into a single signal. This approach reduces model risk, lowers volatility, and improves drawdown control compared with single-horizon models.
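
As a rough illustration of combining horizons into a single signal, the sketch below converts each horizon's predictions to cross-sectional ranks and averages them with equal weights. The summary does not specify how the chapter aggregates horizons, so the rank scaling, the equal weighting, and all names and numbers here are assumptions.

```python
def to_ranks(scores):
    """Map scores to cross-sectional ranks scaled to the range [-0.5, 0.5]."""
    ordered = sorted(scores, key=scores.get)  # lowest score first
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two commodities to rank")
    return {name: i / (n - 1) - 0.5 for i, name in enumerate(ordered)}


def ensemble_signal(predictions_by_horizon):
    """Equal-weight average of per-horizon ranks (an assumed aggregation rule)."""
    ranked = [to_ranks(preds) for preds in predictions_by_horizon.values()]
    names = ranked[0].keys()
    return {name: sum(r[name] for r in ranked) / len(ranked) for name in names}


# Hypothetical predictions from three single-horizon models.
predictions_by_horizon = {
    "short": {"crude_oil": 0.010, "copper": -0.002, "wheat": 0.004},
    "medium": {"crude_oil": 0.020, "copper": 0.015, "wheat": -0.005},
    "long": {"crude_oil": 0.030, "copper": 0.050, "wheat": -0.010},
}
signal = ensemble_signal(predictions_by_horizon)
for name, value in sorted(signal.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:10s} {value:+.3f}")
```

Averaging ranks rather than raw predictions keeps any one horizon from dominating simply because its forecasts are larger in magnitude.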

Can machine learning generate alpha in commodity markets?
Yes. When features are carefully designed and portfolios are constructed cross-sectionally, machine learning can uncover persistent patterns in commodity prices. These patterns align with macroeconomic cycles and provide systematic sources of alpha.

Are the results interpretable for institutional investors?
Yes. Because the features are drawn from established commodity economics, the models are not “black boxes.” They remain transparent, interpretable, and consistent with fiduciary and governance requirements.

Recommended Chapter References

Angelidis, Timotheos, Athanasios Sakkas, and Nikolaos Tessaromatis. 2025. “Predicting Commodity Returns: Time Series vs. Cross Sectional Prediction Models.” Journal of Commodity Markets 38 (June). doi:10.1016/j.jcomm.2025.100475.

Blitz, David, Matthias X. Hanauer, Tobias Hoogteijling, and Clint Howard. 2023. “The Term Structure of Machine Learning Alpha.” Working paper (19 July). doi:10.2139/ssrn.4474637.

Gu, Shihao, Bryan Kelly, and Dacheng Xiu. 2020. “Empirical Asset Pricing via Machine Learning.” Review of Financial Studies 33 (5): 2223–73. doi:10.1093/rfs/hhaa009.

Wang, Shirui, and Tianyang Zhang. 2024. “Predictability of Commodity Futures Returns with Machine Learning Models.” Journal of Futures Markets 44 (2): 302–22. doi:10.1002/fut.22471.
