From temporal to spatial networks: on inferring missing coordinates of debit card transactions

Bibliographic Details
Published in: Procedia Computer Science 2020, Vol. 178, p. 172-181
Main Authors: Stavinova, Elizaveta; Shikov, Egor; Vaganov, Danila
Format: Article
Language: English
Description
Summary: Inferring missing locations from transaction data is a problem that now arises in many areas, from tracing a COVID-19 carrier to analysing a bank client’s movements in order to build a personalized offer. Most existing approaches operate on data consisting of GPS coordinates and aim to impute the missing data so that it is as close to the real data as possible. By using data on transactions made in the surrounding territory, we bring a new perspective to this problem: this information can be used to infer a person’s movements from an already generated digital trace, without deploying expensive sensors or cameras. In this study, we propose a unified framework for estimating the coordinates of commercial establishments from a set of transactions. Our approach is based on constructing a graph whose vertices are stores, with an edge between two stores if consecutive transactions were made at them. We then infer the distances between the places by applying machine learning techniques to the distribution of time differences between the transactions. Finally, we optimize the coordinates of the points using the obtained distances and the points with known coordinates. In this paper, we describe the data preparation process and the details of training the model that infers distances between points with unknown locations (only 4% of the 46000 locations are known). The RMSE of the inferred distances, computed on the known locations, is ≈ 530 meters, and the evaluation of the model gave an R² coefficient of 0.51. We then present the details of inferring coordinates for the points with missing coordinates. The median error of the inferred locations is ≈ 250 meters.
ISSN:1877-0509
DOI:10.1016/j.procs.2020.11.019
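
The pipeline outlined in the summary (store graph from consecutive transactions, then coordinate optimization against known points) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `build_store_graph` and `fit_coordinates`, the tuple layout of a transaction, and the use of plain gradient descent on squared distance error are all assumptions made for illustration. The machine learning step that maps time differences to distances is not reproduced here, so the pairwise distances are simply taken as given.

```python
import math
from collections import defaultdict


def build_store_graph(transactions):
    """Group time gaps by store pair (hypothetical data layout).

    transactions: iterable of (client_id, timestamp_seconds, store_id).
    Returns a dict mapping an undirected store pair to the list of time
    gaps between consecutive transactions of the same client.
    """
    by_client = defaultdict(list)
    for client, ts, store in transactions:
        by_client[client].append((ts, store))

    edges = defaultdict(list)
    for visits in by_client.values():
        visits.sort()  # chronological order per client
        for (t0, s0), (t1, s1) in zip(visits, visits[1:]):
            if s0 != s1:  # skip repeated purchases at the same store
                edges[tuple(sorted((s0, s1)))].append(t1 - t0)
    return dict(edges)


def fit_coordinates(distances, known, initial, steps=2000, lr=0.05):
    """Place unknown stores so pairwise distances match the targets.

    distances: dict {(store_a, store_b): target_distance}.
    known:     dict {store: (x, y)} of fixed anchor points.
    initial:   dict {store: (x, y)} starting guesses for unknown stores.
    Minimizes the sum of (||p_a - p_b|| - d_ab)^2 by gradient descent
    (an assumed optimizer, chosen only to keep the sketch dependency-free).
    """
    pos = {**known, **initial}
    for _ in range(steps):
        grad = {s: [0.0, 0.0] for s in initial}
        for (a, b), target in distances.items():
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            dist = math.hypot(dx, dy) or 1e-9  # avoid division by zero
            coef = 2.0 * (dist - target) / dist
            if a in grad:
                grad[a][0] += coef * dx
                grad[a][1] += coef * dy
            if b in grad:
                grad[b][0] -= coef * dx
                grad[b][1] -= coef * dy
        for s, (gx, gy) in grad.items():
            pos[s] = (pos[s][0] - lr * gx, pos[s][1] - lr * gy)
    return {s: pos[s] for s in initial}
```

In this sketch only the unknown stores move, while the known ones act as fixed anchors; this mirrors the summary's description of optimizing coordinates "using obtained distances and points with known coordinates". A production version would more likely work in projected geographic coordinates and use a library optimizer.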