18 November 2025 Research Foundation

Chapter 7: Natural Language Processing

How AI and large language models transform investment research, compliance, and risk management

This chapter explores how natural language processing (NLP) and large language models (LLMs) are transforming financial analysis, investment research, and risk monitoring across the finance industry.


Executive Summary

Language has always shaped markets. From earnings calls and regulatory filings to breaking news and social media, words move capital and influence decisions as powerfully as numbers. What has changed is the ability to systematically process and act upon this vast flow of unstructured text. 

Natural language processing (NLP), once a niche academic field, has evolved over decades into a central tool for modern financial analysis and broader adoption of AI in asset management. The rise of large language models (LLMs) marks a paradigm shift. LLMs are broadening what NLP can achieve, moving it beyond narrow, task-specific tools to general-purpose systems that parse nuance, synthesize information, and automate at scale. For financial executives, understanding this trajectory is no longer optional; it is central to strategy, risk management, and competitive positioning.

This summary is based on “Natural Language Processing,” by Francesco A. Fabozzi, PhD, a chapter of AI in Asset Management: Tools, Applications, and Frontiers.

From Dictionaries to Deep Learning: The Evolution of NLP in Finance

Early attempts at NLP were crude but intuitive: simple word lists and dictionaries that counted “positive” and “negative” terms. These methods produced insights but faltered in the face of context, ambiguity, and domain-specific language. Financial texts made these challenges especially clear: words like “liability” or “charge” do not neatly map to everyday sentiment.
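A dictionary-based scorer can be sketched in a few lines. The word lists below are toy examples (real implementations typically use finance-specific lexicons), and the sketch also exposes the limitation the text describes: a general-purpose list would miscount terms like “charge.”

```python
# Toy dictionary-based sentiment scorer. The word lists are illustrative
# only; production systems use domain lexicons tuned to financial text.
import re

POSITIVE = {"growth", "profit", "strong", "beat", "record"}
NEGATIVE = {"loss", "decline", "weak", "impairment", "charge"}

def dictionary_sentiment(text: str) -> float:
    """Return (positive - negative) counts over total tokens; 0.0 if empty."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)
```

Note that “a one-time charge funded record profit growth” would score positive here despite the negative accounting event, which is exactly the context problem dictionary methods cannot solve.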

Statistical approaches improved matters by treating documents as bags of words and assigning weights based on frequency and co-occurrence. These models allowed more data-driven inference but still missed context and nuance. The introduction of word embeddings and neural networks pushed NLP closer to true language understanding, because words could now be mapped into multidimensional spaces that captured relationships and meaning.
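The bag-of-words weighting described above is typified by TF-IDF: a term's weight rises with its frequency in a document and falls with how common it is across the corpus. A minimal sketch, using only the standard library:

```python
# Minimal bag-of-words TF-IDF: weight each term by its in-document
# frequency, scaled down by how widespread it is across the corpus.
import math
from collections import Counter

def tf_idf(docs: list[list[str]]) -> list[dict[str, float]]:
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n / df[term])
            for term, count in tf.items()
        })
    return weights
```

A term that appears in every document gets weight zero (log 1 = 0), which is how the scheme discounts boilerplate language common to all filings.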

The real breakthrough came with transformers. By enabling models to consider all words in a sequence simultaneously, transformers solved longstanding issues of context and efficiency. They became the foundation for today’s LLMs, which combine vast pretraining with fine-tuning and prompting flexibility. What began as dictionary lookups has evolved into systems capable of reasoning across domains and tasks, a transformation with direct consequences for finance.
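The core mechanism that lets transformers consider all words at once is scaled dot-product attention (Vaswani et al. 2017). A NumPy sketch:

```python
# Scaled dot-product attention: every position attends to every other
# position in a single matrix product, rather than sequentially.
import numpy as np

def attention(Q, K, V):
    """Q, K: (seq_len, d_k) arrays; V: (seq_len, d_v). Returns (seq_len, d_v)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # pairwise similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # weighted mix of values
```

Each output row is a weighted blend of all value vectors, so context from anywhere in the sequence can inform each position in one step.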

Large Language Models in Finance: Scale, Flexibility, and Trade-Offs

LLMs are powerful because of their generality. Trained on enormous corpora, they can perform classification, summarization, translation, and question-answering without bespoke architectures or large labeled datasets. This flexibility is invaluable in finance, where structured data is plentiful but annotated text is scarce. Equally important is adaptability. With a well-crafted prompt, a single LLM can shift seamlessly from scoring sentiment to summarizing a filing to extracting environmental, social, and governance (ESG) risks. And when prompts alone are not enough, firms can turn to lightweight fine-tuning techniques that adapt general-purpose models to specialized financial domains without the cost or complexity of retraining from scratch.
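This task-switching via prompts can be made concrete. In the sketch below, `call_llm` is a hypothetical stand-in for whatever commercial API or self-hosted model a firm uses; the point is that only the prompt template changes across tasks, not the model:

```python
# One general-purpose model, many tasks: only the prompt changes.
# `call_llm` is a hypothetical stand-in for an API or self-hosted model.
PROMPTS = {
    "sentiment": "Classify the sentiment of this filing excerpt as "
                 "positive, negative, or neutral:\n\n{text}",
    "summary":   "Summarize the key points of this filing excerpt in "
                 "three bullet points:\n\n{text}",
    "esg_risks": "List any ESG risks disclosed in this excerpt:\n\n{text}",
}

def run_task(task: str, text: str, call_llm) -> str:
    """Format the task-specific prompt and dispatch it to the model."""
    prompt = PROMPTS[task].format(text=text)
    return call_llm(prompt)
```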

But scale has its price. The largest models require extraordinary computing resources and come with latency and cost challenges. Smaller, fine-tuned models or hybrid approaches often deliver better performance-to-cost ratios. Senior executives take note: Bigger is not always better. Strategic alignment of model size, deployment strategy, and business need is what matters.

Strategic Applications of NLP and Generative AI in Finance

NLP in finance has long gone beyond sentiment, and classical techniques still play a role for efficient, targeted tasks. What LLMs add is breadth: the ability to unify and extend these methods across multiple domains. Executives are now deploying both NLP and LLMs across the financial enterprise to perform the following:

  • Monitor compliance by checking filings and reports against evolving regulatory standards.
  • Analyze ESG disclosures to extract material risks and opportunities.
  • Track news and events in real time to flag risks that could impact portfolios.
  • Summarize lengthy documents such as 10-Ks or earnings call transcripts, reducing analyst workload and speeding decision-making.
  • Support quantitative investing by embedding text into structured factors that feed models of return, volatility, and risk.

Together, these use cases represent some of the most impactful generative AI applications in finance, moving beyond narrow sentiment analysis into core areas of compliance, risk, and investment strategy.
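The last use case, turning text into structured factors, can be sketched simply: per-firm scores derived from text (the values below are hypothetical) are cross-sectionally standardized into a factor that could feed a return or risk model.

```python
# Text-derived scores -> structured factor: cross-sectional z-score of
# per-firm sentiment. Tickers and score values are hypothetical.
import numpy as np

def sentiment_factor(scores: dict[str, float]) -> dict[str, float]:
    """Standardize scores across firms so the factor has mean 0, std 1."""
    vals = np.array(list(scores.values()), dtype=float)
    z = (vals - vals.mean()) / vals.std()
    return dict(zip(scores.keys(), z))

factor = sentiment_factor({"AAA": 0.6, "BBB": -0.2, "CCC": 0.1})
```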

Deployment and Infrastructure: How Financial Firms Implement NLP and LLMs

One of the most important decisions executives face is how to deploy NLP and LLM capabilities.

  • Application programming interface (API)-based access offers speed, ease of integration, and frontier performance but raises concerns about data security, transparency, and long-term cost.
  • Self-hosting open-source models provides control, compliance, and customizability but demands significant investment in infrastructure and talent.
  • Hybrid approaches are emerging as the pragmatic middle ground, using commercial APIs for low-risk, external-facing tasks while keeping sensitive analysis in-house.
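The hybrid pattern often reduces, at its core, to a routing rule: sensitive workloads stay on self-hosted infrastructure, everything else goes to a commercial API. A minimal sketch, with illustrative task labels:

```python
# Hybrid deployment sketch: route sensitive workloads to a self-hosted
# model and low-risk tasks to a commercial API. Labels are illustrative.
SENSITIVE_TASKS = {"client_data_analysis", "pre_trade_research"}

def route(task: str) -> str:
    """Return which backend should serve this task."""
    return "self_hosted" if task in SENSITIVE_TASKS else "commercial_api"
```

In practice the routing criteria would cover data classification, latency, and cost, but the decision point itself belongs in code, where it can be audited.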

These choices are not purely technical; they are strategic. They intersect directly with regulatory compliance, intellectual property protection, and cost management. Executives must ensure alignment between deployment strategy and organizational priorities.

Risks and Governance Considerations for NLP in Finance

The opportunities of NLP and LLMs come with equally serious challenges:

  • Hallucinations and reliability. LLMs can produce fluent but incorrect outputs. Without safeguards, these errors can propagate unnoticed through workflows.
  • Evaluation gaps. Unlike image recognition, financial NLP lacks standardized benchmarks. Firms must design their own evaluation frameworks, grounded in domain expertise.
  • Forward-looking bias. Pretraining on data that includes future outcomes risks contaminating backtests and overstating predictive power.
  • Compliance and legal risks. Using third-party APIs raises questions of data ownership, leakage, and liability. IP disputes and unclear regulatory guidance add to the uncertainty.
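Because standard benchmarks are lacking, a firm's own evaluation framework can start as simply as a golden set: inputs labeled by domain experts, against which any candidate model is scored. A minimal sketch:

```python
# Minimal in-house evaluation harness: score a model against a
# firm-labeled golden set of (input, expected_label) pairs.
def evaluate(model, golden_set: list[tuple[str, str]]) -> float:
    """Return the fraction of golden-set examples the model labels correctly."""
    correct = sum(model(text) == label for text, label in golden_set)
    return correct / len(golden_set)
```

The same harness can gate deployments: a model update ships only if its golden-set accuracy does not regress, giving the audit trail the governance discussion below calls for.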

For financial leaders, the lesson is to treat NLP not just as a technical system but as part of a governed enterprise architecture, central to AI governance in finance. Success depends on building safeguards, review processes, and audit trails. This approach aligns with emerging AI governance frameworks in finance, where compliance, oversight, and auditability are as critical as raw performance.

The Future of Natural Language Processing and Large Language Models in Finance

The future of NLP in finance will be defined not by monolithic models but by hybrid systems. Classical NLP methods will continue to support efficient, well-bounded tasks, while LLMs will increasingly drive workflows requiring flexibility, reasoning, and real-time adaptation. Retrieval-augmented generation (RAG) will draw on fresh, proprietary data at inference, and multi-agent systems will orchestrate chains of tasks — from scanning filings to drafting alerts to flagging issues for human oversight.
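RAG in miniature: embed the query, retrieve the closest proprietary documents, and prepend them to the prompt at inference time. The `embed` function here is a toy bag-of-words stand-in for a real embedding model, and the documents are illustrative.

```python
# Retrieval-augmented generation sketch: retrieve the most similar
# documents by cosine similarity, then build a grounded prompt.
# `embed` is a toy bag-of-words stand-in for a real embedding model.
import math
from collections import Counter

def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    context = "\n".join(retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

Because retrieval happens at inference, the model can cite filings published after its training cutoff, which is what makes RAG attractive for time-sensitive financial workflows.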

We should also expect greater multimodality: models that can process not only text but also tables, charts, and audio from earnings calls. This opens the door to richer analysis and more intuitive human–AI collaboration.

Governance and evaluation will be the real differentiators. As models converge in raw capability, the firms that succeed will be those that can deploy responsibly, aligning technology with compliance, risk appetite, and business goals. Talent strategy will also be decisive: The intersection of financial expertise, data science, and AI engineering is where enduring value will be created.

Takeaways for Senior Executives

  • Recognize language as data. Text is no longer unstructured noise; it is a primary input to competitive advantage.
  • Invest in governance as much as in models. Reliability, evaluation, and compliance frameworks will define success.
  • Adopt a portfolio mindset. Blend classical NLP, API access, open-source models, and hybrid pipelines to balance cost, control, and performance.
  • Focus on augmentation, not automation. LLMs amplify human expertise but should not replace judgment in high-stakes decisions.
  • Prepare for the future. Multimodal and agentic systems will reshape workflows; organizations that experiment early will be best positioned to capture value.

Recommended Chapter References

Brown, Tom B., Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, et al. 2020. “Language Models Are Few-Shot Learners.” In NIPS’20: Proceedings of the 34th International Conference on Neural Information Processing Systems, 1877–901. doi:10.48550/arXiv.2005.14165.

Chen, Yifei, Bryan T. Kelly, and Dacheng Xiu. 2024. “Expected Returns and Large Language Models.” Working paper (23 August). https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4416687.

Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. “Attention Is All You Need.” In NIPS’17: Proceedings of the 31st International Conference on Neural Information Processing Systems, 6000–10.