High performance computing approach for DNA motif discovery

Bibliographic Details
Published in: CSI Transactions on ICT, 2019, Vol. 7 (4), p. 295-297
Main Author: Shrimankar, Deepti D.
Format: Article
Language: English
Description
Summary: Unraveling the mechanisms that regulate gene expression is a major challenge in biology. An important task in this challenge is to identify regulatory elements, especially the binding sites in deoxyribonucleic acid (DNA) for transcription factors. These binding sites are short DNA segments called motifs: short, recurring patterns in DNA sequences that are presumed to have a biological function. Motif discovery has been one of the most widely studied problems in bioinformatics ever since genomic sequences became available. Recent advances in genome sequence availability and in high-throughput gene expression analysis technologies have allowed for the development of computational methods for motif discovery. As a result, a large number of motif-finding algorithms have been implemented and applied to various motif models over the past decade. Because regulatory elements are frequently short and variable, identifying them with computational algorithms is difficult, and detecting them in a large set of regulatory regions remains a challenging problem in computational genomics. Significant advances have nevertheless been made in computational methods for modeling and detecting DNA regulatory elements. However, the methods that extract this biologically meaningful information suffer from high computational requirements. High performance computing appears to be a magic bullet for this challenge. Designing a parallel algorithm that detects regulatory elements using correlation with gene expression data, and implementing it with Open MPI and OpenMP, can lead to significant runtime savings on a distributed system. Solving computationally intensive problems on a high performance computing architecture can significantly reduce the run time of the solution when proper task distribution, a suitable scheduling strategy and suitable parallel computing paradigms are used. Deploying more cluster computers can bridge the speed gap between architectures and results in fewer concurrent jobs that can be allocated to the system.
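The idea of distributing a correlation-based motif search across workers can be illustrated with a minimal sketch. It is not the article's implementation: the summary names Open MPI and OpenMP, whereas this sketch substitutes Python's standard-library process pool, and it assumes (hypothetically) that candidates are fixed-length k-mers scored by the Pearson correlation between their per-sequence occurrence counts and an expression value per sequence. All function names and the round-robin task split are illustrative choices.

```python
from concurrent.futures import ProcessPoolExecutor
from itertools import product
from math import sqrt


def count_occurrences(kmer, sequence):
    """Count (possibly overlapping) occurrences of kmer in sequence."""
    k = len(kmer)
    return sum(1 for i in range(len(sequence) - k + 1) if sequence[i:i + k] == kmer)


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def score_chunk(task):
    """Score one chunk of candidate k-mers (the unit of work given to a worker)."""
    kmers, sequences, expression = task
    return [
        (kmer, pearson([count_occurrences(kmer, s) for s in sequences], expression))
        for kmer in kmers
    ]


def split_evenly(items, parts):
    """Round-robin split so that each worker receives a similar share of candidates."""
    return [items[i::parts] for i in range(parts)]


def rank_kmers(sequences, expression, k=3, workers=1):
    """Rank all DNA k-mers by correlation with expression, highest first."""
    candidates = ["".join(p) for p in product("ACGT", repeat=k)]
    tasks = [(chunk, sequences, expression)
             for chunk in split_evenly(candidates, workers) if chunk]
    if workers == 1:
        # Serial path: same work, no process pool.
        results = [score_chunk(task) for task in tasks]
    else:
        # Parallel path: each chunk is scored in a separate process.
        with ProcessPoolExecutor(max_workers=workers) as pool:
            results = list(pool.map(score_chunk, tasks))
    scored = [pair for chunk in results for pair in chunk]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Calling `rank_kmers(sequences, expression, k=3, workers=4)` scores the 64 possible 3-mers in four chunks. On platforms that spawn worker processes, the parallel path should be invoked from under an `if __name__ == "__main__":` guard. Because each candidate is scored independently, the work divides cleanly, which is the kind of task distribution the summary refers to.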
ISSN: 2277-9078, 2277-9086
DOI: 10.1007/s40012-019-00235-w