Machine Learning-Based Gene Prioritization Identifies Novel Candidate Risk Genes for Inflammatory Bowel Disease

Bibliographic Details
Published in: Inflammatory Bowel Diseases, 2017-09, Vol. 23 (9), p. 1516-1523
Main Authors: Isakov, Ofer; Dotan, Iris; Ben-Shachar, Shay
Format: Article
Language: English
Description
Summary: The inflammatory bowel diseases (IBDs) are chronic inflammatory disorders, associated with genetic, immunologic, and environmental factors. Although hundreds of genes are implicated in IBD etiology, it is likely that additional genes play a role in the disease process. We developed a machine learning-based gene prioritization method to identify novel IBD-risk genes. Known IBD genes were collected from genome-wide association studies and annotated with expression and pathway information. Using these genes, a model was trained to identify IBD-risk genes. A comprehensive list of 16,390 genes was then scored and classified. Immune and inflammatory responses, as well as pathways such as cell adhesion, cytokine-cytokine receptor interaction, and sulfur metabolism were identified to be related to IBD. Scores predicted for IBD genes were significantly higher than those for non-IBD genes (P < 10). There was a significant association between the score and having an IBD publication (P < 10). Overall, 347 genes had a high prediction score (>0.8). A literature review of the genes, excluding those used to train the model, identified 67 genes without any publication concerning IBD. These genes represent novel candidate IBD-risk genes, which can be targeted in future studies. Our method successfully differentiated IBD-risk genes from non-IBD genes by using information from expression data and a multitude of gene annotations. Crucial features were defined, and we were able to detect novel candidate risk genes for IBD. These findings may help detect new IBD-risk genes and improve the understanding of IBD pathogenesis.
ISSN: 1078-0998, 1536-4844
DOI: 10.1097/MIB.0000000000001222
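
The summary describes a general workflow: train a model on known IBD genes, score a full gene list, and flag genes scoring above 0.8 that were not used for training. The record does not name the learning algorithm or the exact features, so the following is only a minimal sketch of that workflow under stated assumptions. It uses a plain logistic regression, two synthetic numeric features standing in for expression and pathway annotations, and hypothetical gene names (`GENE000`, ...); it also assumes that genes not in the known set serve as negatives. None of these choices comes from the article.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logistic(X, y, epochs=300, lr=0.5):
    """Fit logistic-regression weights by batch gradient descent.

    Assumption: logistic regression is a stand-in; the record does not
    state which classifier the authors used.
    """
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def score_genes(features, w, b):
    """Return a score in (0, 1) for every gene in the feature table."""
    return {
        gene: sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
        for gene, x in features.items()
    }


def make_toy_data(n_genes=200, n_known=40, seed=0):
    """Build synthetic features; 'known' genes get shifted feature values.

    Hypothetical data only: real inputs would be expression and pathway
    annotations for each gene.
    """
    rng = random.Random(seed)
    features = {}
    known = set()
    for i in range(n_genes):
        gene = f"GENE{i:03d}"
        is_known = i < n_known
        shift = 1.5 if is_known else 0.0
        features[gene] = [rng.gauss(shift, 1.0), rng.gauss(shift, 1.0)]
        if is_known:
            known.add(gene)
    return features, known


features, known = make_toy_data()
genes = sorted(features)
X = [features[g] for g in genes]
y = [1 if g in known else 0 for g in genes]  # assumption: all other genes are negatives

w, b = train_logistic(X, y)
scores = score_genes(features, w, b)

THRESHOLD = 0.8  # the cutoff quoted in the summary
# Candidate genes: high score, excluding genes used as positive training labels.
candidates = sorted(g for g, s in scores.items() if s > THRESHOLD and g not in known)
```

In this sketch, `candidates` plays the role of the high-scoring genes that the summary says were then checked against the literature; that review step is manual and is not modeled here.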