LLMs Enable Bayesian Causal Mapping of Iran Conflict

Buy-side risk managers face limits in conventional scenario stress-testing for the Iran conflict. Alexander Denev of Turnleaf Analytics used Anthropic's Claude to generate a Bayesian causal network in roughly 10 minutes with under 15 prompts. The model assigned a 45% probability of continued disruption to oil flow through the Strait of Hormuz even if the strait remains open, and encoded a ceasefire branch that could push oil prices to $100-$130 if Gulf infrastructure is severely damaged and U.S. withdrawal does not occur. The exercise demonstrates that Bayesian networks produced by large language models can rapidly scale causal scenario mapping, but outputs remain experimental and require expert verification, calibration, and robust sensitivity testing before use in live risk systems.
What happened
Alexander Denev of Turnleaf Analytics used Anthropic's Claude to generate a Bayesian causal map of the Iran conflict in about 10 minutes, using no more than 15 prompts. The generated network placed a 45% probability on continued disruption to oil flow through the Strait of Hormuz even with the strait open, and produced a ceasefire branch that drives oil to $100-$130 in scenarios with severe Gulf infrastructure damage and no U.S. withdrawal. Qatar's energy minister warned LNG repairs could take three to five years, amplifying tail risks.

Technical details: Bayesian networks (causal directed acyclic graphs parameterized by conditional probability tables) were constructed quickly by iteratively prompting Claude. Even a simple 10-node binary network encodes 2^10 = 1,024 distinct joint states, illustrating the combinatorial scenario coverage. Key procedural facts: Denev paused after roughly 15 prompts; the model outputs nodes, causal links, and probability assignments; comparable manual efforts previously took months (a 2014 model and a late-2010s IHS Markit project). Important practitioner caveats: LLM-generated structure and CPTs require expert review, calibration against priors and data, sensitivity analysis, and backtesting. Risks include hallucinated causal links, mis-specified priors, and overconfident probability assignments without empirical grounding.

Why it matters: The exercise shows LLMs can convert domain judgment into formal causal models at scale, addressing a core weakness of snapshot scenario stress tests. For risk teams, a scalable causal approach offers greater transparency and probabilistic narratives than a handful of deterministic scenarios. Operationalizing these outputs, however, requires causal-modeling best practices: explicit priors, robustness checks (e.g., do-calculus-based sensitivity analysis), Monte Carlo propagation, and versioned provenance for auditability.
Competitors and internal risk teams will need tooling to convert prompt outputs into validated, production-grade Bayesian networks.
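To make the mechanics concrete, here is a minimal hand-rolled sketch of the kind of binary Bayesian network described above. The node names and every probability are illustrative assumptions, not the figures from Denev's Claude-generated model; a real network would have far more nodes and expert-calibrated conditional probability tables.

```python
from itertools import product

# Hypothetical three-node causal chain: Ceasefire -> Disruption -> OilPriceSpike.
# All probabilities below are made-up placeholders for illustration only.
p_ceasefire = 0.40                          # P(Ceasefire = True)
p_disruption = {True: 0.45, False: 0.80}    # P(Disruption = True | Ceasefire)
p_spike = {True: 0.70, False: 0.10}         # P(OilPriceSpike = True | Disruption)

def joint(ceasefire: bool, disruption: bool, spike: bool) -> float:
    """Chain-rule factorisation of the joint over the three binary nodes."""
    p = p_ceasefire if ceasefire else 1 - p_ceasefire
    p *= p_disruption[ceasefire] if disruption else 1 - p_disruption[ceasefire]
    p *= p_spike[disruption] if spike else 1 - p_spike[disruption]
    return p

# Marginal probability of an oil-price spike, summing out the other nodes.
p_spike_marginal = sum(joint(c, d, True) for c, d in product([True, False], repeat=2))
print(f"P(spike) = {p_spike_marginal:.3f}")  # 0.496 with these placeholder numbers

# The combinatorial coverage mentioned in the text: a 10-node binary
# network already spans 2**10 = 1024 distinct joint states.
print(2 ** 10)
```

The chain-rule factorisation is what makes the scenario coverage tractable: instead of enumerating 1,024 scenarios by hand, an analyst specifies a handful of local conditional tables and lets the network imply the rest.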
What to watch
Adoption hinges on tooling for verification, standard methods for prompting and calibration, and demonstrable backtesting against market moves. Expect rapid prototyping in quant and macro teams, but slow regulatory and production uptake until validation frameworks mature.
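One of the verification steps named above, sensitivity analysis, can be prototyped cheaply. The sketch below (all numbers hypothetical, same illustrative three-node chain as before) perturbs each conditional probability one at a time and records the swing in the target marginal; parameters with large swings are the ones an expert should scrutinise first.

```python
# One-at-a-time sensitivity check on a tiny illustrative network.
# Every probability here is a made-up placeholder, not a real estimate.

def spike_marginal(p_c, p_d_given_c, p_d_given_not_c,
                   p_s_given_d, p_s_given_not_d):
    """Closed-form marginal P(spike) for the Ceasefire -> Disruption -> Spike chain."""
    p_d = p_c * p_d_given_c + (1 - p_c) * p_d_given_not_c
    return p_d * p_s_given_d + (1 - p_d) * p_s_given_not_d

base = dict(p_c=0.40, p_d_given_c=0.45, p_d_given_not_c=0.80,
            p_s_given_d=0.70, p_s_given_not_d=0.10)
baseline = spike_marginal(**base)

swings = {}
for name in base:
    for delta in (-0.05, +0.05):
        perturbed = dict(base)
        # Clamp so perturbed values stay valid probabilities.
        perturbed[name] = min(1.0, max(0.0, perturbed[name] + delta))
        swings[(name, delta)] = spike_marginal(**perturbed) - baseline

for (name, delta), swing in sorted(swings.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name} {delta:+.2f} -> swing {swing:+.4f}")
```

The same loop structure scales to larger networks; production tooling would replace the closed-form marginal with an inference engine and add Monte Carlo propagation over parameter uncertainty.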
Scoring Rationale
This story demonstrates a practical, potentially scalable use of LLMs for causal scenario generation that matters to risk teams and quant practitioners. It is significant but experimental, requiring validation frameworks before production adoption. Because the story broke only today and remains unvalidated, its freshness reduces the score slightly.


