Loading…
Community Structure in Large Networks: Natural Cluster Sizes and the Absence of Large Well-Defined Clusters
A large body of work has been devoted to defining and identifying clusters or communities in social and information networks, i.e., in graphs in which the nodes represent underlying social entities and the edges represent some sort of interaction between pairs of nodes. Most such research begins wit...
Saved in:
Published in: | Internet mathematics 2009-01, Vol.6 (1), p.29-123 |
---|---|
Main Authors: | , , , |
Format: | Article |
Language: | English |
Citations: | Items that cite this one |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | A large body of work has been devoted to defining and identifying clusters
or communities in social and information networks, i.e., in graphs in which the nodes
represent underlying social entities and the edges represent some sort of interaction
between pairs of nodes. Most such research begins with the premise that a community
or a cluster should be thought of as a set of nodes that has more and/or better
connections between its members than to the remainder of the network. In this paper,
we explore from a novel perspective several questions related to identifying meaningful
communities in large social and information networks, and we come to several striking
conclusions.
Rather than defining a procedure to extract sets of nodes from a graph and then
attempting to interpret these sets as "real" communities, we employ approximation
algorithms for the graph-partitioning problem to characterize as a function of size
the statistical and structural properties of partitions of graphs that could plausibly
be interpreted as communities. In particular, we define the network community profile
plot, which characterizes the "best" possible community-according to the conductance
measure-over a wide range of size scales. We study over one hundred large real-world
networks, ranging from traditional and online social networks, to technological and
information networks and web graphs, and ranging in size from thousands up to tens
of millions of nodes.
Our results suggest a significantly more refined picture of community structure in
large networks than has been appreciated previously. Our observations agree with
previous work on small networks, but we show that large networks have a very different
structure. In particular, we observe tight communities that are barely connected to the
rest of the network at very small size scales (up to ≈ 100 nodes); and communities
of size scale beyond ≈ 100 nodes gradually "blend into" the expander-like core of the
network and thus become less "community-like," with a roughly inverse relationship
between community size and optimal community quality. This observation agrees well
with the so-called Dunbar number, which gives a limit to the size of a well-functioning
community.
However, this behavior is not explained, even at a qualitative level, by any of the
commonly used network-generation models. Moreover, it is exactly the opposite of
what one would expect based on intuition from expander graphs, low-dimensional or
manifold-like graph |
---|---|
ISSN: | 1542-7951 1944-9488 |
DOI: | 10.1080/15427951.2009.10129177 |