A modularized MapReduce framework to support RNA secondary structure prediction and analysis workflows
Main Authors: | , , , , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | |
Summary: | Ribonucleic acid (RNA) molecules play important roles in many biological processes, including gene expression and regulation. Their secondary structures are crucial for RNA functionality, and the prediction of these structures is widely studied. Previous research shows that cutting long sequences into shorter chunks, predicting the secondary structures of the chunks independently using thermodynamic methods, and reconstructing the entire secondary structure from the predicted chunk structures tends to yield better accuracy than predicting the secondary structure from the entire RNA sequence as a whole. The chunking, prediction, and reconstruction processes can use different methods and parameters, some of which produce more accurate predictions than others. The RNA sequence can be cut into chunks using different cutting methods and chunk lengths. Several prediction methods, with different degrees of accuracy and computing requirements, can be used. The reconstruction of the chunk predictions into a structure for the entire sequence can rely on simply gluing the parts together or on more sophisticated merging algorithms. To allow scientists to perform a systematic analysis of the impact of these methods and parameters, we propose a modularized framework using MapReduce. The framework enables scientists to automatically and efficiently explore large parametric spaces of chunking, prediction, reconstruction, and analysis methods. This paper shows how the MapReduce framework allows scientists to gain insights into different chunking strategies easily, accurately, and efficiently. |
---|---|
DOI: | 10.1109/BIBMW.2012.6470251 |
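
The chunk, predict, and reconstruct workflow described in the summary can be illustrated with a minimal sketch. This is not the framework from the paper: it runs the map and reduce steps in plain Python rather than on a MapReduce cluster, and every name in it (`chunk_fixed`, `predict_placeholder`, `map_phase`, `reduce_phase`, `explore`) is hypothetical. The predictor is a stand-in that marks every base as unpaired, where a real workflow would call a thermodynamic prediction tool. Reconstruction uses the simple "gluing" option mentioned in the summary.

```python
from functools import reduce
from itertools import product


def chunk_fixed(seq, length):
    """Cut seq into consecutive, non-overlapping chunks of at most `length` bases.

    Returns (offset, chunk) pairs so the chunks can be reordered later.
    """
    return [(i, seq[i:i + length]) for i in range(0, len(seq), length)]


def predict_placeholder(chunk):
    """Stand-in predictor (an assumption for illustration only).

    Returns a dot-bracket string with every base unpaired; a real workflow
    would invoke a thermodynamic prediction method here.
    """
    return "." * len(chunk)


def map_phase(chunks, predictor):
    """Map step: predict the structure of each chunk independently."""
    return [(offset, predictor(chunk)) for offset, chunk in chunks]


def reduce_phase(predictions):
    """Reduce step: glue chunk predictions together in sequence order."""
    ordered = sorted(predictions)
    return reduce(lambda acc, item: acc + item[1], ordered, "")


def explore(seq, chunk_lengths, predictors):
    """Run the workflow for every (chunk length, predictor) combination."""
    results = {}
    for length, (name, predictor) in product(chunk_lengths, predictors.items()):
        chunks = chunk_fixed(seq, length)
        results[(length, name)] = reduce_phase(map_phase(chunks, predictor))
    return results


if __name__ == "__main__":
    sequence = "GGGAAAUCCCGGGAAAUCCC"  # made-up example sequence
    outcome = explore(sequence, [5, 10], {"placeholder": predict_placeholder})
    for key, structure in outcome.items():
        print(key, structure)
```

Keeping the chunker, predictor, and reducer as separate functions mirrors the modular idea in the summary: each stage can be swapped for another method, and `explore` iterates over the resulting parameter combinations.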