Clustering Formal Concepts to Discover Biologically Relevant Knowledge from Gene Expression Data
Published in: | In silico biology 2007, Vol.7 (4-5), p.467-483 |
---|---|
Main Authors: | , , , , , |
Format: | Article |
Language: | English |
Summary: | The production of high-throughput gene expression data has generated
a crucial need for bioinformatics tools to generate biologically interesting
hypotheses. Whereas many tools are available for extracting global patterns,
less attention has been focused on local pattern discovery. We propose here an
original way to discover knowledge from gene expression data by means of the
so-called formal concepts which hold in derived Boolean gene expression
datasets. We first encoded the over-expression properties of genes in human
cells using human SAGE data. This yielded a Boolean matrix from which
we extracted the complete collection of formal concepts, i.e., all the largest
sets of over-expressed genes associated with a largest set of biological
situations in which their over-expression is observed. Complete collections of
such patterns tend to be huge. Since their interpretation is a time-consuming
task, we propose a new method to rapidly visualize clusters of formal concepts.
This designates a reasonable number of Quasi-Synexpression-Groups (QSGs) for
further analysis. We illustrate the value of our approach using human SAGE
data, interpreting one of the extracted QSGs. Assessing its biological
relevance leads to the formulation of both previously proposed and new
biological hypotheses. |
---|---|
ISSN: | 1386-6338 1434-3207 |
DOI: | 10.3233/ISI-2007-00321 |
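
The summary defines a formal concept as a largest set of over-expressed genes paired with a largest set of biological situations in which they are all over-expressed. The following is a minimal sketch of that definition on a toy Boolean context, not the extraction method used in the article. The gene names, situation names, and the function `formal_concepts` are hypothetical, and the brute-force enumeration over subsets of situations is only practical for very small inputs.

```python
from itertools import combinations


def formal_concepts(context):
    """Enumerate all formal concepts of a small Boolean context.

    `context` maps each situation name to the set of genes that are
    over-expressed in it. Each concept is a (genes, situations) pair
    of frozensets.
    """
    situations = sorted(context)
    all_genes = set().union(*context.values())
    concepts = set()
    for size in range(len(situations) + 1):
        for subset in combinations(situations, size):
            # Genes over-expressed in every situation of the subset.
            genes = set(all_genes)
            for s in subset:
                genes &= context[s]
            # Every situation in which all of those genes are over-expressed.
            closed = frozenset(s for s in situations if genes <= context[s])
            concepts.add((frozenset(genes), closed))
    return concepts


# Hypothetical toy data: three situations, three genes.
context = {
    "s1": {"g1", "g2"},
    "s2": {"g1", "g2", "g3"},
    "s3": {"g2", "g3"},
}

concepts = formal_concepts(context)
for genes, sits in sorted(concepts, key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(genes), "<->", sorted(sits))
```

Even on this tiny context several concepts overlap heavily, which hints at why complete collections from real expression data become large and benefit from a clustering step before interpretation.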