Assessment of Hierarchical Clustering Methodologies for Proteomic Data Mining

Bibliographic Details
Published in: Journal of Proteome Research 2007-01, Vol. 6 (1), p. 358-366
Main Authors: Meunier, Bruno; Dumas, Emilie; Piec, Isabelle; Béchet, Daniel; Hébraud, Michel; Hocquette, Jean-François
Format: Article
Language: English
Description
Summary: Hierarchical clustering methodology is a powerful data mining approach for a first exploration of proteomic data. It enables samples or proteins to be grouped blindly according to their expression profiles. Nevertheless, the clustering results depend on parameters such as data preprocessing, the between-profile similarity measure, and the dendrogram construction procedure. We assessed several clustering strategies by calculating the F-measure, a widely used quality metric. Applying Pearson correlation and Ward's aggregation method to a log-transformed matrix is among the best clustering strategies, at least with the data sets we studied. This study was carried out using PermutMatrix, freely available software derived from transcriptomics.
Keywords: proteomics • bioinformatics • data mining • hierarchical clustering • 2-D PAGE
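The strategy named in the summary (log transformation, Pearson correlation, Ward's aggregation, scoring by F-measure) can be sketched in Python with NumPy and SciPy. This is an illustrative sketch only, not the PermutMatrix implementation. The function names `cluster_profiles` and `f_measure`, the toy data, and the choice of a base-2 logarithm are all assumptions. SciPy's Ward linkage is formally defined for Euclidean distances, so applying it to the Pearson distance (1 - r) is an approximation adopted here. The F-measure shown is the common class-weighted form computed on a flat cut of the tree, which may differ from the exact definition used in the article.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist


def cluster_profiles(matrix, n_clusters):
    """Cluster the rows of a strictly positive abundance matrix.

    Steps: log2 transform, Pearson distance (1 - r), Ward linkage,
    then cut the dendrogram into n_clusters flat clusters.
    """
    logged = np.log2(matrix)                      # assumes all values > 0
    dist = pdist(logged, metric="correlation")    # condensed 1 - Pearson r
    tree = linkage(dist, method="ward")           # Ward update on 1 - r (approximation)
    return fcluster(tree, t=n_clusters, criterion="maxclust")


def f_measure(true_labels, cluster_labels):
    """Class-weighted F-measure: each known class is matched to its best cluster."""
    true_labels = np.asarray(true_labels)
    cluster_labels = np.asarray(cluster_labels)
    total = 0.0
    for cls in np.unique(true_labels):
        in_class = true_labels == cls
        best = 0.0
        for clu in np.unique(cluster_labels):
            in_cluster = cluster_labels == clu
            overlap = np.sum(in_class & in_cluster)
            if overlap == 0:
                continue
            precision = overlap / in_cluster.sum()
            recall = overlap / in_class.sum()
            best = max(best, 2 * precision * recall / (precision + recall))
        total += in_class.sum() / true_labels.size * best
    return total


# Hypothetical toy data: three rising and three falling expression profiles,
# each with its own overall scale and a little multiplicative noise.
rng = np.random.default_rng(0)
rising = np.array([1.0, 2.0, 4.0, 8.0, 16.0, 32.0])
patterns = np.vstack([rising] * 3 + [rising[::-1]] * 3)
scales = np.array([1.0, 5.0, 20.0, 2.0, 10.0, 40.0])[:, None]
toy_matrix = patterns * scales * np.exp(rng.normal(0.0, 0.05, patterns.shape))
truth = [0, 0, 0, 1, 1, 1]

labels = cluster_profiles(toy_matrix, n_clusters=2)
score = f_measure(truth, labels)
```

Because Pearson distance ignores the overall level of each profile, rows that differ only by a scale factor end up close together once logged, which is why the toy profiles are grouped by shape rather than by abundance.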
ISSN: 1535-3893, 1535-3907
DOI: 10.1021/pr060343h