Analyzing the similarity of protein domains by clustering Molecular Surface Maps
Published in: | Computers & Graphics 2021-10, Vol.99, p.114-127, Article 114 |
---|---|
Format: | Article |
Language: | English |
Summary: | • We present an image-based hierarchical clustering of functionally similar proteins. • We compare three methods for obtaining the similarity of Molecular Surface Maps. • We test different combinations of molecular properties for functional similarity. • We present a multi-view visualization application for exploring the clustering. • We evaluate our method on data consisting of complete proteins and protein domains.
Many biochemical and biomedical applications, such as protein engineering or drug design, are concerned with finding functionally similar proteins; however, this remains a challenging task. We present a new image-based approach for identifying and visually comparing proteins with similar function that builds on the hierarchical clustering of Molecular Surface Maps. Such maps are two-dimensional representations of complex molecular surfaces and can be used to visualize the topology and different physico-chemical properties of proteins. Our method is based on the idea that visually similar maps imply a similarity in the function of the mapped proteins. To determine map similarity, we compute descriptive feature vectors using image moments, color moments, or a Convolutional Neural Network and use them for a hierarchical clustering of the maps. We demonstrate the feasibility of our approach using two data sets: an ensemble of hand-selected proteins with known similarities, used for verification, and an ensemble of ketolase enzymes, whose individual domains we analyzed using our method. Our method is integrated into an interactive visualization application that allows users to explore and analyze the results. It visualizes the hierarchical clustering and offers linked views that provide details for a comparative data analysis. |
---|---|
ISSN: | 0097-8493, 1873-7684 |
DOI: | 10.1016/j.cag.2021.06.007 |
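
The summary describes computing a feature vector per Molecular Surface Map and then clustering the maps hierarchically. The following is a minimal sketch of that idea for the color-moment variant only, not the authors' implementation. It assumes the common definition of color moments (mean, standard deviation, and cube root of the third central moment per RGB channel), Euclidean distance, and average linkage; none of these details are stated in this record. The function names `color_moments` and `average_linkage` and the toy pixel data are hypothetical.

```python
import math
from itertools import combinations


def color_moments(pixels):
    """Return 9 features: mean, std, and skewness for each RGB channel.

    `pixels` is a flat list of (r, g, b) tuples. Skewness here is the
    signed cube root of the third central moment (an assumed definition).
    """
    n = len(pixels)
    features = []
    for channel in range(3):
        values = [p[channel] for p in pixels]
        mean = sum(values) / n
        variance = sum((v - mean) ** 2 for v in values) / n
        third = sum((v - mean) ** 3 for v in values) / n
        skew = math.copysign(abs(third) ** (1 / 3), third)
        features += [mean, math.sqrt(variance), skew]
    return features


def average_linkage(features):
    """Agglomerative clustering with average linkage (assumed criterion).

    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    where each cluster is a tuple of indices into `features`.
    """
    def dist(a, b):
        # Mean pairwise Euclidean distance between members of a and b.
        total = sum(math.dist(features[i], features[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    clusters = [(i,) for i in range(len(features))]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: dist(*pair))
        merges.append((a, b, dist(a, b)))
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(tuple(sorted(a + b)))
    return merges


# Toy stand-ins for three maps: two reddish ones and one bluish one.
maps = [
    [(200, 20, 20), (220, 30, 10), (210, 25, 15), (190, 15, 25)],
    [(205, 22, 18), (215, 28, 12), (208, 24, 16), (195, 18, 22)],
    [(20, 30, 200), (10, 25, 220), (15, 20, 210), (25, 35, 190)],
]
features = [color_moments(m) for m in maps]
merges = average_linkage(features)
for a, b, d in merges:
    print(f"merge {a} + {b} at distance {d:.2f}")
```

In this toy example the two reddish maps are merged first and the bluish map joins last, which mirrors the premise that visually similar maps end up close together in the hierarchy. For real images, an array library and an established clustering routine would be the more practical choice; the pure-Python version is used here only to keep the sketch self-contained.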