A team of researchers led by Memorial Sloan-Kettering Cancer Center published work Thursday describing how they used machine learning, a form of artificial intelligence (AI), to predict tumor type and tissue of origin from targeted DNA sequence data, a finding that may have implications for improving diagnosis and clinical care.
Writing in JAMA Oncology, the researchers noted that pinpointing a tumor's origin is key to understanding its biologic characteristics and response to treatment. The study sought to determine whether data derived from routine clinical DNA tumor sequencing could complement conventional diagnostic approaches.1
The training data set was derived from Memorial Sloan-Kettering Integrated Mutation Profiling of Actionable Cancer Targets (MSK-IMPACT), an FDA-authorized 468-gene tumor sequencing panel. The classifier drew on mutations and indels (hotspots and gene level), focal amplifications and deletions, broad copy number gains and losses, structural rearrangements, mutation signatures, mutation rate, and sex. Classifier scores were calibrated using multinomial logistic regression to match empirically observed classification probabilities.
The researchers constructed and trained a random forest classifier on a group of 7791 patients with advanced cancer representing 22 cancer types: non-small cell lung, breast, colon, prostate, glioma, bladder, pancreatic, renal cell, melanoma, esophagogastric, germ cell tumor, thyroid, ovarian, endometrial, cholangiocarcinoma, head and neck, gastrointestinal stromal tumor, mesothelioma, small cell lung, pancreatic neuroendocrine tumor, neuroblastoma, and uveal melanoma.
The correct tumor type was predicted for 5748 (73.8%) of the patients. When applied to an independent set, the algorithm achieved 74.1% accuracy.
The predictions were assigned probabilities that reflected empirical accuracy; 43.5% of the cases were high-confidence predictions, with a probability of more than 95%.
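To make these figures concrete, the following is a minimal Python sketch of how accuracy, the share of high-confidence predictions, and a simple calibration check might be computed from a list of predictions. It is not the researchers' code: the `Prediction` record, the helper functions, and the four toy cases are hypothetical and for illustration only. The 95% threshold and the 5748 of 7791 figure come from the study as reported here.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    true_type: str       # tumor type recorded at diagnosis (hypothetical field)
    predicted_type: str  # type with the highest calibrated probability
    probability: float   # calibrated probability of the predicted type


HIGH_CONFIDENCE = 0.95  # threshold reported in the article


def accuracy(preds):
    """Fraction of cases where the predicted type matches the true type."""
    return sum(p.true_type == p.predicted_type for p in preds) / len(preds)


def high_confidence_fraction(preds, threshold=HIGH_CONFIDENCE):
    """Fraction of cases whose probability exceeds the threshold."""
    return sum(p.probability > threshold for p in preds) / len(preds)


def calibration_gap(preds, lo, hi):
    """Mean predicted probability minus observed accuracy for cases with
    probability in [lo, hi). A value near 0 means the probabilities in
    that range reflect empirical accuracy. Returns None for an empty bin."""
    in_bin = [p for p in preds if lo <= p.probability < hi]
    if not in_bin:
        return None
    mean_probability = sum(p.probability for p in in_bin) / len(in_bin)
    return mean_probability - accuracy(in_bin)


# Hypothetical toy cases, not data from the study.
toy = [
    Prediction("breast", "breast", 0.99),
    Prediction("colon", "colon", 0.97),
    Prediction("glioma", "glioma", 0.80),
    Prediction("bladder", "breast", 0.60),
]

# Accuracy reported in the article: 5748 correct out of 7791 patients.
reported_accuracy = 5748 / 7791

print(f"toy accuracy: {accuracy(toy):.2f}")
print(f"toy high-confidence share: {high_confidence_fraction(toy):.2f}")
print(f"reported accuracy: {reported_accuracy:.1%}")
```

In a well-calibrated classifier, the gap returned by `calibration_gap` stays close to zero across probability ranges, which is the sense in which the reported probabilities reflect empirical accuracy.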
“Likely tissues of origin” were predicted from targeted tumor sequencing in 67.4% of patients with cancers of unknown primary location.
Genomic analysis of plasma cell-free DNA yielded accurate predictions in three-fourths of 60 cases; the researchers said the results suggest that the approach may be used both as an adjunct to cancer screening and in diverse clinical settings.
The machine learning method was also applied to 2 patients under active treatment. Both were initially presumed to have metastatic breast cancer, but the genome-directed reassessment suggested more targeted treatments, which led to clinical responses.
The sequencing could be used to inform cancer diagnosis in addition to conventional techniques, the authors said.
The authors of an accompanying editorial said that knowing the tumor's origin does not change patient outcome, but that the findings could be helpful when placed in the broader context of new uses of sequencing technologies.2
One use is in challenging diagnoses, where computational predictions from genomic data might exclude possibilities even if the predictions are not 100% certain. In other cases, a high-confidence prediction that is contrary to the suspected diagnosis can spur a reevaluation.
1. Penson A, Camacho N, Zheng Y, et al. Development of genome-derived tumor type prediction to inform clinical cancer care [published online November 14, 2019]. JAMA Oncol. doi: 10.1001/jamaoncol.2019.3985.
2. Liu ET, Mockus SM. Tumor origins through genomic profiles [published online November 14, 2019]. JAMA Oncol.