Proteomics is rapidly becoming a promising field for discovering predictive cancer biomarkers. Recently, ProCan and the Wellcome Sanger Institute published the world’s largest pan-cancer proteomic dataset of 949 cell lines, treated with 625 anti-cancer drugs. This dataset is a powerful tool for identifying proteomic biomarkers of drug response, which can be used to build drug susceptibility profiles for tumours. Such profiles could serve as a clinical adjunct to determine the optimal treatment for a patient, while avoiding administration of ineffective and toxic drugs.
Recent studies have identified single proteins in cell line proteomic data that correlate with drug susceptibility. However, due to computational demand, identifying pairwise (doublet) and higher-order (triplet and quadruplet) combinations that synergistically modulate drug susceptibility is beyond the scope of current methods. Here, we use a novel machine learning method to identify combinations of up to four proteins that have a greater-than-additive effect on drug susceptibility, uncovering higher-order proteomic networks underlying drug response.
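As a rough illustration of the two ideas above (the combinatorial cost of higher-order searches, and what "greater than additive" means for a protein pair), the following minimal sketch may help. It is not the method used in this work: it assumes a plain least-squares model with a product interaction term, a hypothetical panel of 5,000 proteins (the panel size is not stated here), and purely synthetic data. The function names `n_combinations` and `interaction_coefficient` are illustrative only.

```python
import math
import numpy as np

def n_combinations(n_proteins: int, order: int) -> int:
    """Number of distinct protein combinations of a given order."""
    return math.comb(n_proteins, order)

def interaction_coefficient(x1, x2, y):
    """Fit y ~ b0 + b1*x1 + b2*x2 + b3*x1*x2 by ordinary least squares.

    Returns b3. A b3 clearly different from zero means the pair's joint
    effect departs from the sum of the two single-protein effects.
    """
    X = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[3]

# Search-space growth for a HYPOTHETICAL panel of 5,000 proteins.
for order in (1, 2, 3, 4):
    print(order, n_combinations(5000, order))

# SYNTHETIC data only: 949 simulated cell lines, two simulated proteins,
# and a drug response built with a known interaction strength of 1.0.
rng = np.random.default_rng(0)
n_lines = 949
x1 = rng.normal(size=n_lines)
x2 = rng.normal(size=n_lines)
y = 0.5 * x1 + 0.5 * x2 + 1.0 * x1 * x2 + rng.normal(scale=0.1, size=n_lines)

b3 = interaction_coefficient(x1, x2, y)
print("estimated interaction coefficient:", round(float(b3), 3))
```

The loop shows why exhaustive testing becomes impractical beyond pairs: the number of candidate combinations grows roughly as the panel size raised to the combination order.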
We present a comprehensive catalogue of proteomic signatures in cancer that correlate with drug susceptibility, enabling insight into biologically relevant pathways with predictive value. Our method uncovers “global” baseline signatures that recurrently predict drug susceptibility across all cell lines, and “local” signatures that exclusively predict susceptibility to specific drug classes. For example, high baseline expression of MCM family proteins exclusively predicts increased sensitivity to microtubule inhibitors, while protein hubs centred around PAIRB and LMNB2 confer “global” resistance across most drug classes. To validate these findings, we replicate the resistance signatures in an independent dataset.
Our method provides a scalable framework for harnessing complex, multi-omic datasets to develop diagnostic and predictive panels. Taken together, our findings contribute towards the goal of leveraging omics data to guide cancer precision medicine, leading to more effective, personalised treatments for cancer patients.