A research team from Johns Hopkins University and the National Cancer Institute published results in March 2026 demonstrating MangroveGS, a machine learning framework that predicts cancer metastasis with approximately 80% accuracy. The system analyzes gene expression patterns from tumor biopsy samples and identifies signatures associated with metastatic potential, the likelihood that cancer cells will spread to other organs.
Early detection of metastatic risk could fundamentally change cancer treatment planning. Current methods rely heavily on tumor staging, which captures size and local spread but often misses the molecular signals that predict distant metastasis.
How MangroveGS Works
- Analyzes RNA sequencing data from standard tumor biopsy samples
- Identifies a panel of 847 gene expression markers associated with metastatic behavior
- Uses gradient-boosted decision trees trained on over 200,000 patient samples
- Predicts metastatic risk with approximately 80% accuracy across 12 cancer types
- Provides interpretable output showing which gene pathways drive the prediction
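The steps above can be illustrated with a toy gradient-boosting sketch. This is not the MangroveGS code: the function names (`fit_stump`, `train`, `predict_risk`) are hypothetical, one-split stumps stand in for full decision trees, and five random synthetic "genes" stand in for real RNA sequencing data. Each round fits a stump to the gradient of the log-loss, and the gain accumulated per feature gives a rough, interpretable ranking of which inputs drive the prediction.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, residuals):
    """Find the one-feature threshold split that best fits the residuals (least squares)."""
    n, n_features = len(X), len(X[0])
    total = sum(residuals)
    best = None  # (gain, feature, threshold, left_value, right_value)
    for j in range(n_features):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += residuals[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:
                continue  # cannot split between identical values
            right_sum = total - left_sum
            # Reduction in squared error versus predicting the overall mean
            gain = left_sum**2 / k + right_sum**2 / (n - k) - total**2 / n
            if best is None or gain > best[0]:
                best = (gain, j, (lo + hi) / 2, left_sum / k, right_sum / (n - k))
    return best


def train(X, y, n_rounds=20, learning_rate=0.5):
    """Boost stumps on the log-loss gradient (y - p); return the model and per-feature gain."""
    pos = sum(y) / len(y)
    base = math.log(pos / (1 - pos))  # start from the log-odds of the positive class
    scores = [base] * len(X)
    stumps, importance = [], [0.0] * len(X[0])
    for _ in range(n_rounds):
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        gain, j, thr, left, right = stump
        stumps.append((j, thr, left, right))
        importance[j] += gain
        scores = [s + learning_rate * (left if row[j] <= thr else right)
                  for s, row in zip(scores, X)]
    return {"base": base, "stumps": stumps, "lr": learning_rate}, importance


def predict_risk(model, row):
    """Return the predicted probability of the positive (high-risk) class."""
    score = model["base"] + model["lr"] * sum(
        left if row[j] <= thr else right for j, thr, left, right in model["stumps"])
    return sigmoid(score)


# Synthetic stand-in for expression data: only "gene" 2 determines the label.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(200)]
y = [1 if row[2] > 0 else 0 for row in X]

model, importance = train(X, y)
top_feature = importance.index(max(importance))
train_accuracy = sum((predict_risk(model, r) >= 0.5) == bool(t)
                     for r, t in zip(X, y)) / len(y)
print(top_feature, train_accuracy)
```

A production system would more likely rely on an established gradient-boosting library and report accuracy on held-out patients rather than on the training data used here.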
Why Gene-Level Prediction Matters
Two patients with the same tumor type, stage, and size can have dramatically different outcomes. One patient’s cancer remains localized and responds well to treatment. The other’s cancer spreads to distant organs within months. Traditional staging systems cannot reliably distinguish between these patients at the time of diagnosis.
MangroveGS addresses this gap by examining the molecular machinery inside tumor cells. Certain gene expression patterns correlate strongly with cellular behaviors that enable metastasis, such as migration, angiogenesis, and immune evasion. By detecting these patterns early, oncologists could intensify treatment for high-risk patients before the cancer spreads while sparing low-risk patients from unnecessarily aggressive therapies.
Validation Across Cancer Types
The researchers validated MangroveGS on 12 cancer types, including breast, lung, colorectal, prostate, and pancreatic cancer. Accuracy ranged from 74% for pancreatic cancer to 86% for breast cancer. The model performed best on cancer types with larger training datasets, which suggests that accuracy may improve as more genomic data becomes available.
An independent external validation on data from three European cancer registries confirmed the results, with an overall area under the receiver operating characteristic curve of 0.83.
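Accuracy and area under the ROC curve measure different things, so the 0.83 figure is not directly comparable to the accuracy percentages above. The sketch below computes both on a small made-up set of labels and risk scores (not data from the study). The function names are illustrative. AUC is computed as the probability that a randomly chosen metastatic case receives a higher score than a randomly chosen non-metastatic case, with ties counted as half.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores (ties count half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases classified correctly at a fixed risk threshold."""
    correct = sum((s >= threshold) == (l == 1) for l, s in zip(labels, scores))
    return correct / len(labels)


# Made-up example: 1 = metastasized, 0 = did not; scores are predicted risks.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]

auc_value = roc_auc(labels, scores)
acc_value = accuracy(labels, scores)
print(f"AUC = {auc_value:.3f}, accuracy at 0.5 = {acc_value:.3f}")
```

Accuracy depends on the chosen threshold, whereas AUC summarizes how well the scores rank cases across all thresholds. The pairwise loop is quadratic in the number of cases, so registry-sized datasets would call for a rank-based implementation instead.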
Path to Clinical Use
MangroveGS is not yet approved for clinical use. The research team is pursuing a prospective clinical trial to validate the model’s predictions against actual patient outcomes over five years. If validated, the tool could be integrated into existing tumor profiling workflows, adding metastatic risk assessment alongside current genomic tests such as those offered by Foundation Medicine and Tempus.
The model code and pre-trained weights are available on GitHub under an academic license. The team is working with diagnostic companies to develop a commercial version that integrates with standard laboratory information systems.