
Big data clustering via random sketching and validation

Bibliographic Details
Main Authors: Traganitis, Panagiotis A., Slavakis, Konstantinos, Giannakis, Georgios B.
Format: Conference Proceeding
Language: English
cited_by
cites
container_end_page 1050
container_issue
container_start_page 1046
container_title
container_volume
creator Traganitis, Panagiotis A.
Slavakis, Konstantinos
Giannakis, Georgios B.
description As the volume and dimensionality of data increase, developing new, efficient processing tools has become a necessity. The present paper introduces a novel dimensionality reduction approach for fast and efficient clustering of high-dimensional data. The new methods extend random sampling and consensus (RANSAC) arguments, originally developed for robust regression tasks in computer vision, to the dimensionality reduction problem. The advocated random sketching and validation K-means (SkeVa K-means) and Divergence SkeVa algorithms can achieve high performance, with the latter affording a lower computational footprint than the former. Extensive numerical tests on synthetic and real datasets highlight the potential of the proposed algorithms and demonstrate their competitive performance relative to state-of-the-art random projection alternatives.
doi_str_mv 10.1109/ACSSC.2014.7094614
format conference_proceeding
fullrecord <record><control><sourceid>ieee_CHZPO</sourceid><recordid>TN_cdi_ieee_primary_7094614</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>7094614</ieee_id><sourcerecordid>7094614</sourcerecordid><originalsourceid>FETCH-LOGICAL-i208t-aa6395a6a635a223e486c311aeb9e66ba1f341b031403a0ff6e5ea37b9188fd53</originalsourceid><addsrcrecordid>eNotj81Kw0AUhUdBsNa-gG7mBRLvnTu_yxq0CgUX1XW5SSZ1NE0liQXf3ohdfZzDx4EjxA1CjgjhbllsNkWuAHXuIGiL-kwsgvOoXQheBQPnYqaMs5kioEtxNQwfAAqUVzPh7tNO1jyyrNrvYYx96nbymFj23NWHvRw-41i9_5VTlkdu0ySnQ3ctLhpuh7g4cS7eHh9ei6ds_bJ6LpbrLCnwY8ZsKRi2EwwrRVF7WxEixzJEa0vGhjSWQKiBGJrGRhOZXBnQ-6Y2NBe3_7spxrj96tOe-5_t6Sf9AgtjRkU</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Big data clustering via random sketching and validation</title><source>IEEE Xplore All Conference Series</source><creator>Traganitis, Panagiotis A. ; Slavakis, Konstantinos ; Giannakis, Georgios B.</creator><creatorcontrib>Traganitis, Panagiotis A. ; Slavakis, Konstantinos ; Giannakis, Georgios B.</creatorcontrib><description>As the number and dimensionality of data increases, development of new efficient processing tools has become a necessity. The present paper introduces a novel dimensionality reduction approach for fast and efficient clustering of high-dimensional data. The new methods extend random sampling and consensus (RANSAC) arguments, originally developed for robust regression tasks in computer vision, to the dimensionality reduction problem. The advocated random sketching and validation K-means (SkeVa K-means) and Divergence SkeVa algorithms can achieve high performance, with the latter being able to afford lower computational footprint than the former. 
Extensive numerical tests on synthetic and real datasets highlight the potential of the proposed algorithms, and demonstrate their competitive performance relative to state-of-the-art random projection alternatives.</description><identifier>EISSN: 2576-2303</identifier><identifier>EISBN: 9781479982950</identifier><identifier>EISBN: 9781479982974</identifier><identifier>EISBN: 1479982954</identifier><identifier>EISBN: 1479982970</identifier><identifier>DOI: 10.1109/ACSSC.2014.7094614</identifier><language>eng</language><publisher>IEEE</publisher><subject>Accuracy ; big data ; Clustering ; Clustering algorithms ; Complexity theory ; Computer vision ; Data models ; feature selection ; high-dimensional data ; K-means ; Kernel ; random sampling and consensus ; random sketching and validation ; Robustness</subject><ispartof>2014 48th Asilomar Conference on Signals, Systems and Computers, 2014, p.1046-1050</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/7094614$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,27925,54555,54932</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/7094614$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Traganitis, Panagiotis A.</creatorcontrib><creatorcontrib>Slavakis, Konstantinos</creatorcontrib><creatorcontrib>Giannakis, Georgios B.</creatorcontrib><title>Big data clustering via random sketching and validation</title><title>2014 48th Asilomar Conference on Signals, Systems and Computers</title><addtitle>ACSSC</addtitle><description>As the number and dimensionality of data increases, development of new efficient processing tools has become a necessity. 
The present paper introduces a novel dimensionality reduction approach for fast and efficient clustering of high-dimensional data. The new methods extend random sampling and consensus (RANSAC) arguments, originally developed for robust regression tasks in computer vision, to the dimensionality reduction problem. The advocated random sketching and validation K-means (SkeVa K-means) and Divergence SkeVa algorithms can achieve high performance, with the latter being able to afford lower computational footprint than the former. Extensive numerical tests on synthetic and real datasets highlight the potential of the proposed algorithms, and demonstrate their competitive performance relative to state-of-the-art random projection alternatives.</description><subject>Accuracy</subject><subject>big data</subject><subject>Clustering</subject><subject>Clustering algorithms</subject><subject>Complexity theory</subject><subject>Computer vision</subject><subject>Data models</subject><subject>feature selection</subject><subject>high-dimensional data</subject><subject>K-means</subject><subject>Kernel</subject><subject>random sampling and consensus</subject><subject>random sketching and validation</subject><subject>Robustness</subject><issn>2576-2303</issn><isbn>9781479982950</isbn><isbn>9781479982974</isbn><isbn>1479982954</isbn><isbn>1479982970</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2014</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><recordid>eNotj81Kw0AUhUdBsNa-gG7mBRLvnTu_yxq0CgUX1XW5SSZ1NE0liQXf3ohdfZzDx4EjxA1CjgjhbllsNkWuAHXuIGiL-kwsgvOoXQheBQPnYqaMs5kioEtxNQwfAAqUVzPh7tNO1jyyrNrvYYx96nbymFj23NWHvRw-41i9_5VTlkdu0ySnQ3ctLhpuh7g4cS7eHh9ei6ds_bJ6LpbrLCnwY8ZsKRi2EwwrRVF7WxEixzJEa0vGhjSWQKiBGJrGRhOZXBnQ-6Y2NBe3_7spxrj96tOe-5_t6Sf9AgtjRkU</recordid><startdate>20141101</startdate><enddate>20141101</enddate><creator>Traganitis, Panagiotis A.</creator><creator>Slavakis, 
Konstantinos</creator><creator>Giannakis, Georgios B.</creator><general>IEEE</general><scope>6IE</scope><scope>6IH</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIO</scope></search><sort><creationdate>20141101</creationdate><title>Big data clustering via random sketching and validation</title><author>Traganitis, Panagiotis A. ; Slavakis, Konstantinos ; Giannakis, Georgios B.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i208t-aa6395a6a635a223e486c311aeb9e66ba1f341b031403a0ff6e5ea37b9188fd53</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2014</creationdate><topic>Accuracy</topic><topic>big data</topic><topic>Clustering</topic><topic>Clustering algorithms</topic><topic>Complexity theory</topic><topic>Computer vision</topic><topic>Data models</topic><topic>feature selection</topic><topic>high-dimensional data</topic><topic>K-means</topic><topic>Kernel</topic><topic>random sampling and consensus</topic><topic>random sketching and validation</topic><topic>Robustness</topic><toplevel>online_resources</toplevel><creatorcontrib>Traganitis, Panagiotis A.</creatorcontrib><creatorcontrib>Slavakis, Konstantinos</creatorcontrib><creatorcontrib>Giannakis, Georgios B.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan (POP) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library Online</collection><collection>IEEE Proceedings Order Plans (POP) 1998-present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Traganitis, Panagiotis A.</au><au>Slavakis, Konstantinos</au><au>Giannakis, Georgios B.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Big data clustering via random 
sketching and validation</atitle><btitle>2014 48th Asilomar Conference on Signals, Systems and Computers</btitle><stitle>ACSSC</stitle><date>2014-11-01</date><risdate>2014</risdate><spage>1046</spage><epage>1050</epage><pages>1046-1050</pages><eissn>2576-2303</eissn><eisbn>9781479982950</eisbn><eisbn>9781479982974</eisbn><eisbn>1479982954</eisbn><eisbn>1479982970</eisbn><abstract>As the number and dimensionality of data increases, development of new efficient processing tools has become a necessity. The present paper introduces a novel dimensionality reduction approach for fast and efficient clustering of high-dimensional data. The new methods extend random sampling and consensus (RANSAC) arguments, originally developed for robust regression tasks in computer vision, to the dimensionality reduction problem. The advocated random sketching and validation K-means (SkeVa K-means) and Divergence SkeVa algorithms can achieve high performance, with the latter being able to afford lower computational footprint than the former. Extensive numerical tests on synthetic and real datasets highlight the potential of the proposed algorithms, and demonstrate their competitive performance relative to state-of-the-art random projection alternatives.</abstract><pub>IEEE</pub><doi>10.1109/ACSSC.2014.7094614</doi><tpages>5</tpages></addata></record>
fulltext fulltext_linktorsrc
identifier EISSN: 2576-2303
ispartof 2014 48th Asilomar Conference on Signals, Systems and Computers, 2014, p.1046-1050
issn 2576-2303
language eng
recordid cdi_ieee_primary_7094614
source IEEE Xplore All Conference Series
subjects Accuracy
big data
Clustering
Clustering algorithms
Complexity theory
Computer vision
Data models
feature selection
high-dimensional data
K-means
Kernel
random sampling and consensus
random sketching and validation
Robustness
title Big data clustering via random sketching and validation
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-21T05%3A21%3A52IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_CHZPO&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Big%20data%20clustering%20via%20random%20sketching%20and%20validation&rft.btitle=2014%2048th%20Asilomar%20Conference%20on%20Signals,%20Systems%20and%20Computers&rft.au=Traganitis,%20Panagiotis%20A.&rft.date=2014-11-01&rft.spage=1046&rft.epage=1050&rft.pages=1046-1050&rft.eissn=2576-2303&rft_id=info:doi/10.1109/ACSSC.2014.7094614&rft.eisbn=9781479982950&rft.eisbn_list=9781479982974&rft.eisbn_list=1479982954&rft.eisbn_list=1479982970&rft_dat=%3Cieee_CHZPO%3E7094614%3C/ieee_CHZPO%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i208t-aa6395a6a635a223e486c311aeb9e66ba1f341b031403a0ff6e5ea37b9188fd53%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=7094614&rfr_iscdi=true