Towards a molecules production from DNA sequences based on clustering by 3D cellular automata approach and n-grams technique

Bibliographic Details
Main Authors: Kabli, Fatima; Hamou, Reda Mohamed; Amine, Abdelmalek
Format: Conference Proceeding
Language: English
Description
Summary: Knowledge extraction from genomic data is an important activity for biologists. To extract the underlying biological knowledge, we rely on the generic framework of knowledge extraction from data. In this paper, we first transform DNA sequences into texts, which are indexed using TF-IDF and the n-grams approach. Second, we group similar DNA sequences by clustering, applying a bio-inspired method, 3D cellular automata. We then analyze the clustering results by translating each DNA sequence into an amino acid sequence according to the standard genetic code. We conclude that the clusters help the biologist select DNA sequences that can produce a kind of medicament (molecule) and its various derivatives (low concentration).
ISSN:2161-5330
DOI:10.1109/AICCSA.2015.7507249
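
Illustration: the summary names two steps that can be sketched concretely, indexing DNA "texts" by n-grams with TF-IDF and translating DNA into amino acids with the standard genetic code. The following is a minimal sketch under stated assumptions, not the authors' implementation. The function names, the n-gram length of 3, and the TF-IDF variant (relative frequency times log(N/df)) are hypothetical choices, since the record does not specify them. The 3D cellular automata clustering step is not sketched because the record gives no details of it.

```python
import math
from collections import Counter
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered over T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def ngrams(seq, n=3):
    """Return the overlapping character n-grams of a sequence (n=3 is an assumption)."""
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]


def tfidf(sequences, n=3):
    """Return one {n-gram: weight} dict per sequence.

    Assumed weighting: tf = relative frequency in the sequence,
    idf = log(N / df); other TF-IDF variants are equally plausible.
    """
    docs = [Counter(ngrams(s, n)) for s in sequences]
    df = Counter(g for d in docs for g in d)  # number of sequences containing each n-gram
    total = len(docs)
    return [
        {g: (c / sum(d.values())) * math.log(total / df[g]) for g, c in d.items()}
        for d in docs
    ]


def translate(seq):
    """Translate DNA into amino acids, frame 0; '*' marks stop, 'X' an unknown codon."""
    seq = seq.upper()
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))
```

For example, `translate("ATGGCCTAA")` yields `"MA*"` (methionine, alanine, stop), and `tfidf(["ACGT", "ACGA"])` assigns weight 0 to the shared trigram `ACG`, because an n-gram present in every sequence carries no discriminating information under this weighting.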