Automating Financial Risk Analysis for the Bank of England
Financial regulators face the immense challenge of sifting through thousands of pages of unstructured data from corporate earnings calls to identify potential risks. This project demonstrates an AI-driven multi-agent solution built around a Bank of England use case.
The system mimics the workflow of a human analyst: a primary AI agent extracts key metrics, and a supervisor LLM audits those outputs for relevance and quality, providing a second layer of assurance.

My Solution: A Multi-Agent Evaluation Framework
I designed and implemented a structured evaluation pipeline in which one LLM supervises another. For each metric the primary agent extracts, the supervisor works through three steps:
- Scoring relevance — each extracted metric is scored against Prudential Regulation Authority (PRA) 2025 priorities.
- Justifying the score — the supervisor provides a written rationale.
- Recommending action — each metric is flagged as Keep, Revise, or Remove, so every evaluation ends in a concrete next step.
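The three steps above can be sketched as a small validation layer around the supervisor's reply. This is a minimal illustration, not the project's actual code: the JSON reply format, the 1-5 score scale, the `Evaluation` class, and the `parse_evaluation` helper are all assumptions made for the example. Only the Keep / Revise / Remove labels come from the description above.

```python
import json
from dataclasses import dataclass

# Action labels taken from the Keep / Revise / Remove step described above.
ALLOWED_ACTIONS = {"Keep", "Revise", "Remove"}


@dataclass
class Evaluation:
    metric: str
    score: int        # assumed 1-5 relevance scale (hypothetical)
    rationale: str
    action: str


def parse_evaluation(metric: str, raw_reply: str) -> Evaluation:
    """Validate one supervisor reply, assumed to be a JSON object."""
    data = json.loads(raw_reply)

    score = int(data["score"])
    if not 1 <= score <= 5:
        raise ValueError(f"score out of range: {score}")

    # Normalise casing so "keep" and "KEEP" both map to "Keep".
    action = str(data["action"]).strip().capitalize()
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")

    rationale = str(data["rationale"]).strip()
    if not rationale:
        raise ValueError("rationale must not be empty")

    return Evaluation(metric, score, rationale, action)


# Hypothetical reply, for illustration only.
reply = '{"score": 4, "rationale": "Relates to capital adequacy.", "action": "keep"}'
result = parse_evaluation("CET1 ratio", reply)
print(result.action, result.score)
```

Rejecting malformed replies at this point keeps free-form supervisor text from leaking into the aggregated scores later in the pipeline.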
Key Features
- Agentic Workflow — multi-agent architecture where a GPT-4o “meta-agent” supervises Claude.
- Context-Aware Prompting — PRA regulatory context injected directly into prompts.
- Automated Data Pipelines — scripts parse .csv, structure into JSON, and batch-process outputs.
- Insight Visualization — aggregated scores highlight patterns, strengths, and weaknesses across metrics.
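As a rough sketch of how the prompting, pipeline, and aggregation features might fit together, the example below parses CSV text into JSON-ready records, injects a regulatory context string into a prompt, and averages scores per recommended action. It uses only the Python standard library rather than Pandas, and every name in it (`PRA_CONTEXT`, `csv_to_records`, `build_prompt`, `mean_score_by_action`, the column names, and the sample rows) is hypothetical. The model call itself is left out.

```python
import csv
import io
import json
from collections import defaultdict
from statistics import mean

# Hypothetical placeholder; the real PRA priority text is not reproduced here.
PRA_CONTEXT = "<PRA 2025 priorities text goes here>"


def csv_to_records(csv_text: str) -> list[dict]:
    """Parse CSV text into a list of JSON-serialisable dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def build_prompt(record: dict, context: str = PRA_CONTEXT) -> str:
    """Place the regulatory context ahead of the metric to be audited."""
    return (
        "Regulatory context:\n"
        f"{context}\n\n"
        "Metric to evaluate:\n"
        f"{json.dumps(record, indent=2)}\n\n"
        'Reply as JSON with keys "score", "rationale", "action".'
    )


def mean_score_by_action(evaluations: list[dict]) -> dict:
    """Average the supervisor scores within each recommended action."""
    grouped = defaultdict(list)
    for ev in evaluations:
        grouped[ev["action"]].append(ev["score"])
    return {action: mean(scores) for action, scores in grouped.items()}


# Made-up sample rows and evaluations, for illustration only.
sample_csv = "metric,value\nCET1 ratio,14.2\nNet interest margin,3.1\n"
records = csv_to_records(sample_csv)
prompt = build_prompt(records[0])

evaluations = [
    {"action": "Keep", "score": 4},
    {"action": "Keep", "score": 5},
    {"action": "Revise", "score": 2},
]
summary = mean_score_by_action(evaluations)
print(json.dumps(summary))
```

The per-action averages are the kind of aggregate that could feed a bar chart or heatmap when looking for patterns across metrics.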
Tech Stack
- Core AI: OpenAI GPT-4o, Anthropic Claude
- Languages & Libraries: Python, Pandas, NLTK, Gensim
- Environment: Google Colab, Jupyter Notebook