
Abstract 3383: EPIC: MHC-I epitope prediction integrating mass spectrometry derived motifs and tissue-specific expression profiles


Bibliographic Details
Published in:Cancer research (Chicago, Ill.), 2019-07, Vol.79 (13_Supplement), p.3383-3383
Main Authors: Hu, Weipeng; Qiu, Si; Li, Youping; Liu, Geng; Zhang, Xiuqing; Lee, Leo J
Format: Article
Language:English
Description
Summary:Background: Accurate prediction of epitopes presented by human leukocyte antigen (HLA) is crucial for personalized cancer immunotherapies targeting T cell epitopes. Mass spectrometry (MS) profiling of eluted HLA ligands provides unbiased, high-throughput measurements of HLA-associated peptides resulting from in vivo cellular processing, and can therefore be a highly valuable training set for building predictive models of HLA binding. In addition, gene expression profiles measured by RNA-seq in a specific cell type could significantly improve the positive predictive value (PPV) of epitope presentation prediction. Although a large amount of high-quality MS data on HLA-bound peptides has been generated in the last few years, few of these datasets provide matching RNA-seq data, which makes incorporating gene expression into epitope prediction difficult. Here, we aim to develop a publicly available prediction tool incorporating both sources of information, and demonstrate its superior performance over existing methods.
Methods: We obtained public HLA peptidome datasets with matching RNA-seq data for twelve cell lines derived from multiple tissues. We used these MS HLA ligand data to build position-specific scoring matrices (PSSMs) for five HLA-I alleles across these cell lines. We then used logistic regression to model the relationship between three features (PSSM score, gene expression, and peptide length distribution) and whether a peptide is presented in each of the twelve cell lines, and compared the feature weights among the resulting models.
Results: We found that the feature weights across different HLA-I alleles and cell lines were close to each other, suggesting that there is a universal relationship between PSSM score and gene expression across different cell lines that could be applied to epitope presentation prediction for multiple alleles in diverse tissues. When we replaced the cell-line-specific weights with universal weights summarized from all the cell lines, the logistic regression model's predictive power for each cell line dropped only slightly and still substantially outperformed predictions based on PSSM scores alone. Based on this finding, we applied the universal feature weights to more than 180,000 unique HLA ligands collected from public HLA peptidomics datasets, and present an Epitope Presentation Integrated prediCtion (EPIC) model for 66 HLA alleles. EPIC was substantially better than other popular methods, including MixMHCpred, NetMHCpan (v4.0), and MHCflurry, when e
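Illustration: the abstract describes scoring a peptide with a PSSM built from eluted ligands and then combining that score with gene expression in a logistic model. The following is only a minimal sketch of that general idea and is not the EPIC implementation. The function names, the uniform amino acid background, the pseudocount, the toy peptide sequences, and the weight values are all assumptions made for illustration, and the peptide length feature mentioned in the abstract is left out for brevity.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(ligands, pseudocount=1.0):
    """Build a log-odds PSSM from equal-length peptides.

    Assumes a uniform background frequency (1/20 per residue), which is a
    simplification chosen for this sketch.
    """
    length = len(ligands[0])
    background = 1.0 / len(AMINO_ACIDS)
    total = len(ligands) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for pos in range(length):
        counts = Counter(peptide[pos] for peptide in ligands)
        column = {
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        }
        pssm.append(column)
    return pssm


def pssm_score(pssm, peptide):
    """Sum the per-position log-odds scores for one peptide."""
    return sum(column[aa] for column, aa in zip(pssm, peptide))


def presentation_probability(score, log_expression, weights):
    """Logistic combination of PSSM score and log-scaled gene expression."""
    z = (weights["bias"]
         + weights["pssm"] * score
         + weights["expr"] * log_expression)
    return 1.0 / (1.0 + math.exp(-z))


# Toy 9-mer sequences invented for this example, not real training data.
toy_ligands = ["ALDKATVLL", "KLDETGNSL", "YLDPAQRGV"]
toy_pssm = build_pssm(toy_ligands)

# Hypothetical weights; in the abstract these are fitted by logistic regression.
toy_weights = {"bias": -2.0, "pssm": 0.5, "expr": 1.0}

candidate_score = pssm_score(toy_pssm, "KLDETGNSL")
probability = presentation_probability(candidate_score, 3.0, toy_weights)
```

In this sketch a single set of weights is reused for every peptide, which mirrors the "universal weights" idea in the Results: only the PSSM is allele-specific, while the mapping from score and expression to a presentation probability is shared.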
ISSN:0008-5472
EISSN:1538-7445
DOI:10.1158/1538-7445.AM2019-3383