Clustering Large Attributed Graphs: A Balance between Structural and Attribute Similarities

Social networks, sensor networks, biological networks, and many other information networks can be modeled as a large graph. Graph vertices represent entities, and graph edges represent their relationships or interactions. In many large graphs, there is usually one or more attributes associated with...

Bibliographic Details
Published in:	ACM transactions on knowledge discovery from data 2011-02, Vol.5 (2), p.1-33
Main Authors:	Cheng, Hong; Zhou, Yang; Yu, Jeffrey Xu
Format:	Article
Language:	English

cites	cdi_FETCH-LOGICAL-c195t-ebbab5fef4b4c380188b55fe10429d7ab5152094384010835e37d6abe3ce1f33
container_end_page	33
container_issue	2
container_start_page	1
container_title	ACM transactions on knowledge discovery from data
container_volume	5
creator	Cheng, Hong; Zhou, Yang; Yu, Jeffrey Xu
description	Social networks, sensor networks, biological networks, and many other information networks can be modeled as a large graph. Graph vertices represent entities, and graph edges represent their relationships or interactions. In many large graphs, there are usually one or more attributes associated with every graph vertex to describe its properties. In many application domains, graph clustering techniques are very useful for detecting densely connected groups in a large graph as well as for understanding and visualizing a large graph. The goal of graph clustering is to partition vertices in a large graph into different clusters based on various criteria such as vertex connectivity or neighborhood similarity. Many existing graph clustering methods mainly focus on the topological structure for clustering, but largely ignore the vertex properties, which are often heterogeneous. In this article, we propose a novel graph clustering algorithm, SA-Cluster, which achieves a good balance between structural and attribute similarities through a unified distance measure. Our method partitions a large graph associated with attributes into k clusters so that each cluster contains a densely connected subgraph with homogeneous attribute values. An effective method is proposed to automatically learn the degree of contributions of structural similarity and attribute similarity. Theoretical analysis is provided to show that SA-Cluster converges quickly through iterative cluster refinement. Some optimization techniques on matrix computation are proposed to further improve the efficiency of SA-Cluster on large graphs. Extensive experimental results demonstrate the effectiveness of SA-Cluster through comparisons with the state-of-the-art graph clustering and summarization methods.
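The idea of a unified distance that blends structural and attribute similarity can be illustrated with a minimal sketch. This is not the SA-Cluster algorithm: the abstract does not specify the distance measure, so the Jaccard-based structural distance, the 0/1 attribute distance, the fixed weight `alpha`, the toy graph, and all function names below are assumptions chosen for illustration. The article describes learning the contribution of each similarity automatically, which this sketch does not attempt.

```python
# Toy attributed graph (hypothetical data): adjacency sets plus one
# categorical attribute per vertex.
EDGES = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}
ATTRS = {"a": "db", "b": "db", "c": "db", "d": "ml", "e": "ml"}


def structural_distance(u, v):
    """1 - Jaccard similarity of closed neighborhoods (an assumed stand-in)."""
    nu = EDGES[u] | {u}
    nv = EDGES[v] | {v}
    return 1.0 - len(nu & nv) / len(nu | nv)


def attribute_distance(u, v):
    """0 if both vertices carry the same attribute value, otherwise 1."""
    return 0.0 if ATTRS[u] == ATTRS[v] else 1.0


def unified_distance(u, v, alpha=0.5):
    """Weighted blend of the two distances; alpha is a hypothetical fixed weight."""
    return alpha * structural_distance(u, v) + (1.0 - alpha) * attribute_distance(u, v)


def assign_to_medoids(medoids, alpha=0.5):
    """Assign every vertex to its nearest medoid under the unified distance."""
    return {
        v: min(medoids, key=lambda m: unified_distance(v, m, alpha))
        for v in EDGES
    }


if __name__ == "__main__":
    # One assignment step with k = 2 hand-picked medoids.
    print(assign_to_medoids(["a", "e"]))
```

With `alpha` close to 1 the assignment follows connectivity alone, and with `alpha` close to 0 it follows attribute values alone; a balance between the two is what the abstract describes as the goal.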
doi_str_mv	10.1145/1921632.1921638
format	article
fulltext	fulltext
identifier	ISSN: 1556-4681
ispartof	ACM transactions on knowledge discovery from data, 2011-02, Vol.5 (2), p.1-33
issn	1556-4681 1556-472X
language	eng
recordid	cdi_crossref_primary_10_1145_1921632_1921638
source	Association for Computing Machinery:Jisc Collections:ACM OPEN Journals 2023-2025 (reading list)
title	Clustering Large Attributed Graphs: A Balance between Structural and Attribute Similarities
url	http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-08T03%3A08%3A37IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Clustering%20Large%20Attributed%20Graphs:%20A%20Balance%20between%20Structural%20and%20Attribute%20Similarities&rft.jtitle=ACM%20transactions%20on%20knowledge%20discovery%20from%20data&rft.au=Cheng,%20Hong&rft.date=2011-02&rft.volume=5&rft.issue=2&rft.spage=1&rft.epage=33&rft.pages=1-33&rft.issn=1556-4681&rft.eissn=1556-472X&rft_id=info:doi/10.1145/1921632.1921638&rft_dat=%3Ccrossref%3E10_1145_1921632_1921638%3C/crossref%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c195t-ebbab5fef4b4c380188b55fe10429d7ab5152094384010835e37d6abe3ce1f33%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true