Behavior based record linkage

Bibliographic Details
Published in: Proceedings of the VLDB Endowment, 2010-09, Vol. 3 (1-2), p. 439-448
Main Authors: Yakout, Mohamed; Elmagarmid, Ahmed K.; Elmeleegy, Hazem; Ouzzani, Mourad; Qi, Alan
Format: Article
Language: English
Description
Summary: In this paper, we present a new record linkage approach that uses entity behavior to decide whether potentially different entities are in fact the same. An entity's behavior is extracted from a transaction log that records the actions of this entity with respect to a given data source. The core of our approach is a technique that merges the behavior of two potentially matching entities and computes the gain in recognizing behavior patterns as their matching score. The idea is that if the merged behavior is well recognized, then the two original behaviors most likely belong to the same entity, because the behavior becomes more complete after the merge. We present the necessary algorithms to model entities' behavior and compute a matching score for them. To improve the computational efficiency of our approach, we precede the actual matching phase with a fast candidate generation step that uses a "quick and dirty" matching method. Extensive experiments on real data show that our approach can significantly enhance record linkage quality while remaining practical for large transaction logs.
ISSN: 2150-8097
DOI: 10.14778/1920841.1920899
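
Illustration: the merge-and-gain idea in the summary can be sketched in a few lines of Python. This is only a toy under stated assumptions, not the paper's algorithm. It assumes each entity's behavior is a list of action timestamps (in days) taken from the transaction log, and it stands in for "how well a behavior pattern is recognized" with a hypothetical regularity score based on how evenly the actions are spaced. The names regularity and matching_score are made up for this sketch.

```python
from statistics import mean, pstdev


def regularity(timestamps):
    """Hypothetical pattern score in [0, 1]; 1.0 means perfectly even gaps.

    Assumption: evenly spaced actions count as a well-recognized behavior.
    This is a stand-in for the paper's own behavior model.
    """
    ts = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ts, ts[1:])]
    if len(gaps) < 2 or mean(gaps) == 0:
        return 0.0  # too little evidence to call anything a pattern
    variation = pstdev(gaps) / mean(gaps)  # coefficient of variation
    return 1.0 / (1.0 + variation)


def matching_score(log_a, log_b):
    """Gain in pattern score obtained by merging two behaviors.

    A positive value means the merged behavior looks more complete
    than either original behavior on its own.
    """
    merged = list(log_a) + list(log_b)
    return regularity(merged) - max(regularity(log_a), regularity(log_b))


# One weekly shopper whose visits were logged under two identifiers.
log_a = [0, 7, 28, 35]
log_b = [14, 21, 42]
# An unrelated entity with no weekly rhythm.
log_c = [3, 4, 19, 30]

print(matching_score(log_a, log_b))  # positive: the merge restores a weekly pattern
print(matching_score(log_a, log_c))  # negative: the merge blurs the pattern
```

In this toy setting, merging the two halves of the weekly shopper yields evenly spaced visits, so the score rises, while merging with the unrelated entity lowers it. The candidate generation step mentioned in the summary is not modeled here.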