AI Tool Reads Individual Tumor Cells to Predict Who Will Survive Cancer

Two patients walk into an oncology clinic with what looks, on paper, like the same melanoma. Same stage, same tumor size, comparable bloodwork. One lives for a decade. The other is gone in eighteen months. For as long as oncologists have been grading cancers, this quiet cruelty has haunted the field, and for most of that history the explanation has been a shrug and the word “heterogeneity.” Tumors are mixed bags of cells. Some matter more than others. The trouble has always been telling which ones.

A team at Oregon Health & Science University reckons it now has a way to ask that question directly, one cell at a time. In a paper published today in Cancer Discovery, the group unveils scSurvival, a machine-learning tool that reads the genetic activity of individual tumor cells and ties each cell back to a patient’s odds of survival. Not the tumor as a whole. Not even a cell type. The cell.

“This is the first kind of single-cell survival analysis that directly links individual tumor cells to patient outcomes,” says Tao Ren, a postdoctoral fellow in mathematics at OHSU and co-lead author of the study. “It allows us to see which cells are really driving disease progression instead of treating all cells the same.”

The averaging problem

To understand why this is a bigger deal than it might sound, it helps to know how survival analysis has worked for the past couple of decades. Researchers would sequence RNA from a tumor biopsy, mash all the signals together into one patient-level average, and try to correlate that average with how long the patient lived. A crude approach, necessarily. Within any given tumor, there might be a million cells, some helping the cancer, some trying to kill it, some just loitering. Blending them is a bit like pureeing a wedding cake to figure out the flavor.

Single-cell sequencing changed the raw material. Since the mid-2010s, labs have been able to profile thousands or even millions of cells from a single tumor individually, reading off the genes each one expresses. What hadn’t changed was the survival-analysis machinery pointed at that data. Most tools still collapsed the single-cell richness back into patient-level averages before running the numbers. Or they grouped cells by type, assumed cells of the same type behaved identically, and worked from there.

“Tumors are very complex, and important signals can be lost when data are averaged across thousands or millions of cells,” says Faming Zhao, the paper’s other co-lead author, a postdoctoral fellow in cancer biology. “By looking at survival at single-cell resolution, we can better understand why patients with the same cancer can have very different outcomes.”
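A toy calculation makes the averaging problem concrete. The sketch below uses made-up expression values for a single gene in two hypothetical tumors; the numbers, the threshold, and the function names are illustrative assumptions, not data or code from the study.

```python
from statistics import mean

# Made-up expression of one gene across six cells in each of two tumors.
tumor_a = [2.0, 2.0, 2.0, 2.0, 2.0, 2.0]   # every cell expresses a little
tumor_b = [0.0, 0.0, 0.0, 0.0, 0.0, 12.0]  # one cell expresses a lot

def patient_average(cells):
    """Collapse all cells into one patient-level number."""
    return mean(cells)

def fraction_high(cells, threshold):
    """Share of cells whose expression exceeds the threshold."""
    return sum(value > threshold for value in cells) / len(cells)

avg_a, avg_b = patient_average(tumor_a), patient_average(tumor_b)
high_a, high_b = fraction_high(tumor_a, 5.0), fraction_high(tumor_b, 5.0)
print(avg_a, avg_b, high_a, high_b)
```

The two tumors have identical averages, yet only the second contains a strongly expressing cell. A method that sees only the average cannot tell them apart.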

How the model learns what to pay attention to

scSurvival borrows an idea from a branch of AI called multiple-instance learning, which handles problems where a label is attached to a whole collection of items rather than to each item, such as classifying an image that contains many objects at once. The trick is that the model is given the patient-level outcomes, which are known, and asked to work out which cells in each tumor best predict those outcomes, which is not known in advance. An attention mechanism, essentially a learned spotlight, sweeps across the cells and turns up the gain on the ones that seem to carry the signal. The rest are dimmed. Over thousands of training cycles, the spotlight gets very specific about where to look.
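For readers who want to see the spotlight in code, here is a minimal sketch of generic attention pooling, the usual building block of attention-based multiple-instance learning. It is not the scSurvival implementation: the cell embeddings, the hand-picked attention vector, and the function names are all assumptions chosen for illustration. In a trained model the attention vector would be learned from survival data.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(cells, attention_vector):
    """Weight each cell by its attention score and average the embeddings."""
    # One raw score per cell: dot product of the cell with the attention vector.
    scores = [sum(a * x for a, x in zip(attention_vector, cell)) for cell in cells]
    weights = softmax(scores)
    dim = len(cells[0])
    pooled = [sum(w * cell[d] for w, cell in zip(weights, cells)) for d in range(dim)]
    return weights, pooled

# Hypothetical 2-D embeddings for three cells from one patient.
cells = [[0.1, 0.0], [0.0, 0.2], [3.0, 0.1]]
# Hand-picked for the example; a real model learns this during training.
attention_vector = [2.0, 0.0]

weights, patient_embedding = attention_pool(cells, attention_vector)
top_cell = max(range(len(cells)), key=lambda i: weights[i])
print(top_cell, [round(w, 3) for w in weights])
```

Nearly all of the weight lands on the third cell, so the patient-level summary is dominated by it. A survival model fed that summary is, in effect, listening mostly to the cells the attention step picked out.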

Layered underneath is a variational autoencoder, a second neural network that compresses each cell’s gene-expression profile into a compact representation while scrubbing out the technical noise that bedevils single-cell data: dropouts, batch effects, the fact that every sequencing run has its own quirks. “This work uses artificial intelligence to develop a new way to study survival using single-cell data,” says Zheng Xia, the study’s senior author and an associate professor of biomedical engineering at OHSU’s Knight Cancer Institute. “The model is more complex than traditional machine learning approaches, and it allows us to capture information that was not accessible before.”
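The two ingredients that make an autoencoder "variational" can be sketched in a few lines. This is textbook material rather than anything taken from the scSurvival code: the encoder outputs `mu` and `log_var` are invented values, and the function names are assumptions.

```python
import math
import random

def reparameterize(mu, log_var, rng):
    """Sample a latent vector z = mu + sigma * noise, with sigma = exp(log_var / 2)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL divergence between N(mu, sigma^2) and N(0, 1), summed over dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var))

rng = random.Random(0)   # fixed seed so the sample is reproducible
mu = [1.0, 0.0]          # hypothetical encoder output for one cell
log_var = [0.0, 0.0]     # log-variance of 0 means sigma = 1

z = reparameterize(mu, log_var, rng)
kl = kl_to_standard_normal(mu, log_var)
print(z, kl)
```

During training, the KL term pulls each cell's compact representation toward a standard normal distribution, while a separate reconstruction term (not shown) rewards representations from which the original expression profile can be rebuilt.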

On melanoma data from 32 patients treated with immunotherapy, the tool achieved a concordance index of 0.812 for predicting patient survival. On this metric, 0.5 corresponds to random guessing and 1.0 to a perfect ranking of patients by risk, and 0.812 is higher than the score of any benchmark method the OHSU group tested, including models based on cell-type proportions, bulk expression averages, or the widely used CXCL9/SPP1 macrophage ratio. On a much larger liver-cancer dataset covering more than a million cells from 124 patients, it hit 0.719. Not perfect, by any stretch. But useful.
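The concordance index is simple enough to compute by hand. The sketch below implements the standard pairwise definition (often called Harrell's C) on four invented patients; the survival times, event flags, and risk scores are made up, and the article does not say which software the authors used for this calculation.

```python
def concordance_index(times, events, risks):
    """Fraction of comparable patient pairs that the risk scores rank correctly.

    A pair is comparable when the patient with the shorter follow-up time
    actually had the event (was not censored). Tied risk scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Invented cohort: survival in months, 1 = death observed, 0 = censored.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.5, 0.1]   # higher score = predicted to do worse

c_index = concordance_index(times, events, risks)
perfect = concordance_index(times, events, [4, 3, 2, 1])
reversed_ranking = concordance_index(times, events, [1, 2, 3, 4])
print(c_index, perfect, reversed_ranking)
```

A score of 0.5 is what random ordering produces on average. A model that ranks patients systematically backwards falls below it, which is why 0.5, not zero, is the practical floor for a useful predictor.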

What the cells are saying

The more interesting results, arguably, are not the prediction accuracies but the biological stories the model tells along the way. In the melanoma cohort, scSurvival flagged two distinct populations of macrophages that looked almost identical on standard type-labels but carried wildly different prognostic weight. One subpopulation, marked by high expression of a gene called SPP1, showed up in patients who did poorly on immunotherapy. The other, marked by CXCL9, showed up in responders. The ratio between these two states has been popping up in recent cancer literature as a clue to who benefits from checkpoint blockade, and scSurvival landed on it without being told to look.

A similar story emerged in the T cells. Patients doing well had T cells expressing markers of stem-like memory, the kind of cell that can keep replenishing an immune response over time. Patients doing poorly had T cells with a stress signature, studded with heat-shock genes and exhaustion markers. Crucially, these weren’t different cell types. They were different functional states of the same types, the sort of distinction that gets flattened in any approach based on cell-type proportions.

“A risk assessment tool that not only tells you who may be at higher risk, but also provides clues as to why, could really help in these difficult cancers,” says Anthony Letai, director of the National Cancer Institute, which funded much of the work.

Not yet, but getting closer

scSurvival is not ready for the clinic. The authors are careful about saying so. Validation on larger independent cohorts is still needed, and the computational cost, though modest by modern AI standards (about 17 minutes to process a million cells on a high-end GPU), is still a barrier in most pathology labs. There’s also the question of what a clinician actually does with the information that a patient’s tumor contains a suspicious-looking SPP1+ macrophage population. Right now, not much. The matching therapies haven’t been developed.

But the direction of travel is becoming hard to miss. Single-cell sequencing costs keep falling. Cohort studies keep getting bigger. Tools like scSurvival, which is freely available on GitHub, are beginning to turn what was once an academic resource into something that could, in principle, inform treatment decisions. The gap between sequencing a tumor and understanding it just got a little narrower, one cell at a time.

Source: Ren, T., Zhao, F., Chen, C. et al. “scSurvival: single-cell survival analysis of clinical cancer cohort data at cellular resolution.” Cancer Discovery (2026). DOI: 10.1158/2159-8290.CD-25-0965

Frequently Asked Questions

Why do two patients with the same cancer diagnosis often have such different outcomes?

Tumors are not uniform tissues. Within a single cancer, different cells can behave in very different ways, some driving disease progression, others resisting it. Standard diagnostic tools average across all of these cells, which hides the individual populations that most influence whether a patient responds to treatment or not.

How is scSurvival different from other cancer prediction tools?

Most survival-prediction models treat a tumor as one lump of data, either averaging all its cells together or grouping them by type. scSurvival instead assigns each individual cell a weight based on how closely it ties to patient outcomes, letting the most prognostically important cells drive the prediction. According to the authors, it is the first tool to apply single-cell resolution directly to survival analysis in this way.

Can scSurvival be used to treat patients today?

Not yet. The tool is a research framework, not a clinical product, and will need validation across larger, more diverse patient cohorts before it could be used in care. The researchers also note that identifying a high-risk cell population is only useful if therapies exist to target it, and that pipeline is still being developed.

What does the tool reveal about the tumor microenvironment?

In melanoma, scSurvival identified two macrophage populations with nearly identical cell-type labels but opposite prognostic signals, distinguished by expression of SPP1 versus CXCL9. In T cells, it separated stem-like memory states linked to favorable outcomes from stress-adapted exhausted states linked to poor ones. These distinctions are typically invisible to methods that group cells only by type.

Is the software available for other researchers to use?

Yes. The OHSU team has released scSurvival as open-source software on GitHub, Zenodo, and Code Ocean, along with tutorials. Any lab with single-cell sequencing data and matched patient survival records can run the analysis on a reasonably well-equipped GPU.
