Exploring predictive features of peptide immunogenicity for the design of immunotherapeutics: A study in the interface between immunopeptidomics and immunoinformatics
Abstract
A fundamental characteristic of effective T cell-based vaccines and immunotherapeutics is the ability of an MHC-restricted peptide to elicit a T cell response. Despite the multitude of peptides contained within a protein antigen, only a fraction of these are presented on the cell surface in complex with MHC. Identifying the immunogenic peptides, the T cell epitopes, poses an immense challenge, particularly in the context of cancer immunotherapy based on tumour-specific epitopes, termed neoepitopes. Computational prediction algorithms used to select putative epitopes traditionally relied on peptide-MHC (pMHC) binding affinity data. However, although the integration of mass spectrometry (MS)-based datasets has increased accuracy in predicting peptide binding, these algorithms still perform poorly in predicting peptide immunogenicity. This thesis explores features of peptide immunogenicity, with a particular emphasis on the development of assays to probe pMHC stability, which has been linked to immunogenicity in several studies, and on the utilisation of this information to improve prediction algorithms. Specifically, the work integrates experimental methods from the field of immunopeptidomics with computational approaches from the field of immunoinformatics to study peptide binding to MHC class I (MHC I). We initially studied a large MS dataset of MHC I peptides and show that MS hotspots, defined as MHC-associated peptide-enriched areas in protein sequences, although largely captured by pMHC affinity prediction, also contain a signal of peptide processing which lies beyond what an affinity predictor can capture. Next, we investigated the feature of pMHC stability and trained a neural network model based on a limited dataset of MHC I peptides obtained from an in vitro stability assay.
We show that these stability data provide a means to train a model that can predict immunogenic peptides, but that the trained stability predictor does not outperform a model trained using experimental affinity data. Inspired by these findings, we developed an MS-based assay to profile both the kinetic and thermal stability of MHC I immunopeptidomes in an unbiased manner. We show that the added dimensionality of the thermostability data facilitated the training of a neural network model that demonstrates superior performance in predicting cancer neoepitopes compared to state-of-the-art prediction tools. This thesis underlines the need for high-quality data to train accurate immunoinformatics models. Further, the work described contributes to the development of methods to improve our fundamental understanding of peptide immunogenicity and features thereof, providing an aid for the design of future vaccines and immunotherapeutics.