
Regular expression based pattern extraction from a cell - Specific gene expression data

Bibliographic Details
Published in:Informatics in medicine unlocked 2019, Vol.17, p.100269, Article 100269
Main Authors: Subramanian, Suja; Thomas, Tessamma
Format: Article
Language:English
Description
Summary:Cancer cells are formed when active genes stop functioning properly. Timely activation of a gene is governed by the combined effort of multiple Transcription Factors (TFs). TFs are proteins that bind to DNA in a sequence-specific manner. It is difficult to trace the targets and roles of TFs in the gene regulation process: the same element acts differently in different places, much as the same word has different meanings in different contexts. This approach treats the cell line as a language, in which the genes and TFs are the symbols or letters. Different combinations of symbols form sequences with repetitive patterns, and identifying and analysing such frequently occurring patterns gives better insight into the cell. This work mainly aims to identify such patterns in the cell line using a regular expression technique. The patterns generated in this work can be used as features for identifying the effect of regulatory elements in a genomic region. To improve readability, the identity of each character present in a pattern is documented in a text file. Acute Myeloid Leukaemia (AML) data from the GEO repository, together with narrow-peak binding data for two related TFs, calibrated in the K562 cell line, from the ENCODE consortium, are taken as a case study.
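The idea summarised above, encoding regulatory elements as letters and searching for recurring patterns with regular expressions, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The legend, the one-letter encoding, the function names and the thresholds are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical legend: one letter per element type (illustrative only,
# not the encoding used in the article).
LEGEND = {
    "G": "expressed gene",
    "A": "binding site of a first TF",
    "B": "binding site of a second TF",
}

def frequent_patterns(encoded, min_len=2, max_len=3, min_count=2):
    """Count overlapping substrings of each length and keep the frequent ones."""
    counts = Counter()
    for k in range(min_len, max_len + 1):
        # A lookahead with a capture group yields overlapping matches.
        pattern = re.compile(rf"(?=([A-Z]{{{k}}}))")
        counts.update(m.group(1) for m in pattern.finditer(encoded))
    return {p: c for p, c in counts.items() if c >= min_count}

def describe(pattern):
    """Spell out the identity of each character in a pattern."""
    return [LEGEND.get(ch, "unknown") for ch in pattern]

if __name__ == "__main__":
    encoded = "ABGABGAB"  # toy symbol sequence, not real data
    for pat, n in sorted(frequent_patterns(encoded).items()):
        print(pat, n, describe(pat))
```

The `describe` helper plays the role of the text file mentioned in the summary, which records what each character in a pattern stands for.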
ISSN:2352-9148
DOI:10.1016/j.imu.2019.100269