Lightning Talk and Poster Presentation GENEMAPPERS 2024

Context-Adjusted Proportion of Singletons (CAPS): A robust metric for assessing negative selection and benchmarking variant pathogenicity predictors (#11)

Mikhail Gudkov 1, Loic Thibaut 2, Steven Monger 1, Debjani Das 1, David S Winlaw 3, Sally L Dunwoodie 1, Eleni Giannoulatou 1
  1. Victor Chang Cardiac Research Institute, Darlinghurst, NSW, Australia
  2. Institute for Molecular Bioscience, Saint Lucia, Australia
  3. Ann & Robert H. Lurie Children's Hospital of Chicago, Chicago, Illinois, United States

Interpreting genetic variants remains challenging, especially for understudied variant classes. Population genetics methods, such as the Mutability-Adjusted Proportion of Singletons (MAPS) metric, assess variant effects from the distribution of variants in the population. However, MAPS is sensitive to the calibration of its singletons-by-mutability model, which may bias estimates for specific variant sets. This study introduces the Context-Adjusted Proportion of Singletons (CAPS), a novel metric that refines the MAPS methodology by eliminating the mutability layer, making the assessment of negative selection in the human genome more robust. CAPS derives the expected level of rare variation on a per-context basis from synonymous variants. For missing contexts, the expected proportions of singletons were approximated from intronic variants using probit regression. Leveraging gnomAD genomic data, we used CAPS to benchmark pathogenicity predictors without relying on known pathogenic variant sets, and compared our estimates to ClinVar variant classifications. We showed that CAPS outperforms MAPS, yielding more accurate estimates of negative selection. Using CAPS, we identified poorly calibrated predictors; CADD and REVEL emerged as the best-performing methods, with REVEL exhibiting better calibration. CAPS emerges as a promising metric for studying negative selection and a valuable benchmarking tool for pathogenicity predictors. Our findings underscore the importance of cautious predictor integration into variant interpretation pipelines and emphasise biases associated with variant classification.
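The per-context idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formula, so the score below assumes, by analogy with MAPS, that the metric is the observed proportion of singletons minus the proportion expected from synonymous variants in the same contexts. The function names, the `(context, allele_count)` input format and the toy data are all hypothetical, and the probit-regression step for missing contexts is not implemented.

```python
from collections import Counter


def singleton_proportions(variants):
    """Proportion of singletons per context.

    `variants` is an iterable of (context, allele_count) pairs
    (hypothetical input format, e.g. taken from synonymous variants).
    """
    totals = Counter()
    singletons = Counter()
    for context, allele_count in variants:
        totals[context] += 1
        if allele_count == 1:
            singletons[context] += 1
    return {ctx: singletons[ctx] / totals[ctx] for ctx in totals}


def caps_like_score(variant_set, expected_by_context):
    """Observed minus expected proportion of singletons (assumed form).

    Raises KeyError for a context absent from `expected_by_context`;
    the abstract handles such contexts with probit regression on
    intronic variants, which this sketch does not attempt.
    """
    variant_set = list(variant_set)
    if not variant_set:
        raise ValueError("variant_set must not be empty")
    observed = sum(1 for _, allele_count in variant_set if allele_count == 1)
    expected = sum(expected_by_context[ctx] for ctx, _ in variant_set)
    return (observed - expected) / len(variant_set)


# Toy data, invented for illustration only.
synonymous = [
    ("ACG>ATG", 1), ("ACG>ATG", 2), ("ACG>ATG", 5), ("ACG>ATG", 1),
    ("TTA>TCA", 1), ("TTA>TCA", 3), ("TTA>TCA", 7), ("TTA>TCA", 2),
]
expected = singleton_proportions(synonymous)

candidate_set = [
    ("ACG>ATG", 1), ("ACG>ATG", 1), ("TTA>TCA", 1), ("TTA>TCA", 4),
]
score = caps_like_score(candidate_set, expected)
```

Under this assumed definition, a positive score means the variant set contains more singletons than synonymous variants in the same contexts would predict, which is the signal conventionally read as stronger negative selection.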