Power posteriors enable robust mutational signature discovery

June 11, 2026 | By LDS Team

Relevance Score: 5.6

Researchers from Harvard Biostatistics, Boston University, and Dana-Farber Cancer Institute have published BayesPowerNMF, a Bayesian non-negative matrix factorization (NMF) method for discovering mutational signatures in cancer genomes, in PLOS Computational Biology. The approach uses a power posterior to improve robustness when the standard NMF model is misspecified, and a sparsity-inducing prior to infer the number of active signatures automatically - removing a manual model-selection step that competing tools require. In simulation studies, BayesPowerNMF recovers more true signatures with lower cosine error than the leading methods SigProfilerExtractor and SignatureAnalyzer. Applied to whole-genome sequencing data for six cancer types from the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium (PCAWG), the method recovers more signatures than current state-of-the-art methods. Unlike point-estimate methods, BayesPowerNMF provides credible intervals that quantify uncertainty in each inferred signature.

Background

Mutational signature analysis decomposes a cancer genome's mutation catalog into characteristic frequency profiles - "signatures" - left by distinct mutational processes such as carcinogenic exposures, defective DNA repair, or APOBEC deaminase activity. Non-negative matrix factorization (NMF) is the dominant computational framework for this task, underpinning widely used tools like SigProfilerExtractor and SignatureAnalyzer. However, the standard NMF model is an approximation, and even modest departures from the assumed probability model can cause methods to miss real signatures or infer spurious ones, per the PLOS Computational Biology paper (Xue et al.).
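To make the decomposition concrete, here is a minimal sketch of plain (non-Bayesian) NMF on a synthetic mutation-count matrix. It is not BayesPowerNMF and not the algorithm used by SigProfilerExtractor or SignatureAnalyzer; it uses the classic Lee-Seung multiplicative updates for the KL (Poisson) objective. The matrix sizes (96 mutation types, 50 samples, 3 signatures), the function name nmf_kl, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def nmf_kl(V, k, n_iter=200, seed=0, eps=1e-12):
    """Plain NMF via Lee-Seung multiplicative updates (KL / Poisson objective).

    V : (n_types, n_samples) non-negative mutation-count matrix.
    Returns W (n_types, k), whose columns are signatures summing to 1,
    and H (k, n_samples), the per-sample exposures.
    """
    rng = np.random.default_rng(seed)
    n_types, n_samples = V.shape
    W = rng.random((n_types, k)) + eps
    H = rng.random((k, n_samples)) + eps
    for _ in range(n_iter):
        R = V / (W @ H + eps)                      # element-wise data / fit ratio
        W *= (R @ H.T) / (H.sum(axis=1) + eps)     # update signatures
        R = V / (W @ H + eps)
        H *= (W.T @ R) / (W.sum(axis=0)[:, None] + eps)  # update exposures
    # Rescale so each signature is a probability profile; W @ H is unchanged.
    scale = W.sum(axis=0)
    return W / scale, H * scale[:, None]

# Synthetic example (hypothetical sizes): 96 mutation types, 50 samples, 3 signatures.
rng = np.random.default_rng(1)
W_true = rng.dirichlet(np.full(96, 0.2), size=3).T       # (96, 3)
H_true = rng.gamma(shape=2.0, scale=500.0, size=(3, 50))  # (3, 50)
V = rng.poisson(W_true @ H_true).astype(float)

W_hat, H_hat = nmf_kl(V, k=3)
print("signature matrix:", W_hat.shape, "exposure matrix:", H_hat.shape)
```

Note that the number of signatures k has to be supplied by hand here; that is the manual choice the sparsity-inducing prior is meant to replace.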

Method

Per the paper, BayesPowerNMF uses a power posterior for a fully Bayesian NMF model. A power posterior raises the likelihood to a fractional exponent (the "temperature"), which down-weights the data relative to the prior and makes inference less sensitive to model misspecification without abandoning the Bayesian framework. The method also uses a sparsity-inducing prior to infer the number of active signatures automatically, removing the separate model-selection step that competing tools require. As a fully Bayesian approach, BayesPowerNMF produces credible intervals that quantify uncertainty in each inferred signature - a feature absent in point-estimate methods like SigProfilerExtractor and SignatureAnalyzer.
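The effect of the exponent can be seen in a toy model that is much simpler than NMF. In general form, a power posterior is proportional to p(x | θ)^ξ p(θ) with 0 < ξ ≤ 1. The sketch below assumes a single Poisson rate with a Gamma(a, b) prior, for which the power posterior is again a Gamma distribution with shape a + ξ·Σx and rate b + ξ·n. The symbol ξ, the prior values, the counts, and the function names are illustrative assumptions; none of them come from the paper, and this is not the paper's inference algorithm.

```python
import numpy as np

def power_posterior_gamma_poisson(counts, xi, a=1.0, b=1.0):
    """Gamma(shape, rate) power posterior for a Poisson rate.

    The Poisson likelihood is raised to the power xi before being combined
    with a Gamma(a, b) prior; xi = 1 gives the ordinary posterior.
    """
    counts = np.asarray(counts, dtype=float)
    shape = a + xi * counts.sum()
    rate = b + xi * counts.size
    return shape, rate

def credible_interval(shape, rate, level=0.95, n_draws=100_000, seed=0):
    """Equal-tailed credible interval estimated from posterior draws."""
    rng = np.random.default_rng(seed)
    draws = rng.gamma(shape, 1.0 / rate, size=n_draws)  # numpy uses scale = 1 / rate
    lo, hi = np.quantile(draws, [(1 - level) / 2, (1 + level) / 2])
    return float(lo), float(hi)

# Hypothetical counts containing one outlier (30) that a Poisson model fits poorly.
counts = np.array([12, 9, 15, 11, 30, 10, 13, 8])

intervals = {}
for xi in (1.0, 0.25):
    shape, rate = power_posterior_gamma_poisson(counts, xi)
    intervals[xi] = credible_interval(shape, rate)
    print(f"xi={xi}: 95% credible interval = {intervals[xi]}")
```

With ξ below 1 the data count for less, so the interval widens and the posterior leans more on the prior. This is the trade-off behind the robustness claim: the model is less confident about a likelihood it does not fully trust.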

Evaluation

Per the paper, extensive simulation studies show that, when the generating model is misspecified, BayesPowerNMF recovers more true signatures, and with lower cosine error, than the leading methods. On real whole-genome sequencing data for six cancer types from the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium (PCAWG), the method recovers more signatures than current state-of-the-art methods.
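Cosine error is commonly taken to be one minus the cosine similarity between a true and an inferred signature. The sketch below assumes that definition and scores each true signature against its best-matching inferred signature. The paper's exact matching and scoring procedure is not described in this report, so the helper names and the best-match rule are assumptions.

```python
import numpy as np

def cosine_error(u, v):
    """1 - cosine similarity between two signature profiles."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return 1.0 - float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def best_match_errors(true_sigs, inferred_sigs):
    """For each true signature (column), the smallest cosine error
    to any inferred signature (column)."""
    T = true_sigs / np.linalg.norm(true_sigs, axis=0)
    S = inferred_sigs / np.linalg.norm(inferred_sigs, axis=0)
    similarity = T.T @ S                  # (n_true, n_inferred)
    return 1.0 - similarity.max(axis=1)

# Hypothetical check: inferred signatures are the true ones in a different
# column order with small noise, so every best-match error should be near zero.
rng = np.random.default_rng(0)
true_sigs = rng.dirichlet(np.full(96, 0.2), size=3).T   # (96, 3)
inferred = true_sigs[:, [2, 0, 1]] + 1e-4 * rng.random((96, 3))
errors = best_match_errors(true_sigs, inferred)
print("best-match cosine errors:", errors)
```

A true signature would then count as "recovered" if its best-match error falls below a chosen threshold; any such threshold is an analyst's choice and is not specified here.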

Significance for practitioners

For data scientists and computational biologists working in cancer genomics, a more robust signature extractor with automatic signature-count inference reduces a common failure mode - choosing the wrong number of NMF components - and provides principled uncertainty estimates for downstream annotation and attribution tasks. The PLOS Computational Biology publication (peer-reviewed) increases confidence in the method relative to preprint-only tools.

What to watch

Independent benchmarks comparing BayesPowerNMF against newer tools and against updated versions of SigProfiler and SignatureAnalyzer; community uptake via citation and software reuse; and practical guidance on tuning the power posterior temperature hyperparameter for new cancer cohorts.

Key Points

1. BayesPowerNMF uses a power posterior for Bayesian NMF, improving robustness to model misspecification and reducing missed or spurious mutational signatures compared to SigProfilerExtractor and SignatureAnalyzer.
2. A sparsity-inducing prior automatically infers the number of active signatures, removing the manual model-selection step that is a common error source in competing tools.
3. Evaluated on ICGC/TCGA PCAWG data across six cancer types, BayesPowerNMF recovers more signatures than state-of-the-art methods; the method, published in the peer-reviewed PLOS Computational Biology, also provides Bayesian credible intervals for uncertainty quantification.

Scoring Rationale

A published, peer-reviewed Bayesian methods contribution to cancer mutational signature analysis with a clear methodological advantage over SigProfilerExtractor and SignatureAnalyzer. Relevant primarily to computational biologists and cancer genomics practitioners; not broadly impactful across AI/ML.

Sources

Public references used for this report.

journals.plos.org: Robust discovery of mutational signatures using power posteriors

pmc.ncbi.nlm.nih.gov: Robust discovery of mutational signatures using power posteriors - PMC

biorxiv.org: Robust discovery of mutational signatures using power posteriors | bioRxiv
