Fault Tolerance through Invariant Checking for the Lanczos Eigensolver

The Lanczos eigensolver is a popular iterative method for approximating a few maximal eigenvalues of a real symmetric matrix, particularly if the matrix is large and sparse. In recent years, graphics processing units (GPUs) have become a popular platform for scientific computing applications, many o...

Bibliographic Details
Main Authors: Loh, Felix, Saluja, Kewal K., Ramanathan, Parameswaran
Format: Conference Proceeding
Language: English
cited_by
cites
container_end_page 18
container_issue
container_start_page 13
container_title
container_volume
creator Loh, Felix
Saluja, Kewal K.
Ramanathan, Parameswaran
description The Lanczos eigensolver is a popular iterative method for approximating a few maximal eigenvalues of a real symmetric matrix, particularly if the matrix is large and sparse. In recent years, graphics processing units (GPUs) have become a popular platform for scientific computing applications, many of which are based on linear algebra, and are increasingly being used as the main computational units in supercomputers. This trend is expected to continue as the number of computations required by scientific applications reaches the petascale and exascale range. In this paper, we introduce an efficient error-checking mechanism for the Lanczos eigensolver. To the best of our knowledge, we are the first to introduce such a scheme for the Lanczos method. We evaluate our fault-tolerant scheme using an open-source sparse eigensolver on a GPU platform, with and without the injection of faults. We use sparse matrices from real applications, and show that our fault-tolerant method has good error coverage and low overhead.
doi_str_mv 10.1109/VLSID49098.2020.00020
format conference_proceeding
fullrecord <record><control><sourceid>ieee_CHZPO</sourceid><recordid>TN_cdi_ieee_primary_9105557</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9105557</ieee_id><sourcerecordid>9105557</sourcerecordid><originalsourceid>FETCH-LOGICAL-i203t-fa159acc3e1cc983153d0c422bb639d61a0def71f0881af1c431703308692c4a3</originalsourceid><addsrcrecordid>eNotjsFKxDAURaMgOI7zBSLkB1rfS5omWUqdaqHgwtHtkEmTNlpbSTsD-vUWdHPP4h4ul5BbhBQR9N1b_VI9ZBq0ShkwSAGWPCMbLRVKplBIwPycrBhXkOSa8UtyNU3vi6YEyBUpS3PsZ7obexfNYB2duzge245Ww8nEYIaZFp2zH2FoqR_jUjtaL-LPONFtaN0wjf3JxWty4U0_uc0_1-S13O6Kp6R-fqyK-zoJDPiceINCG2u5Q2u14ih4AzZj7HDIuW5yNNA4L9GDUmg82oyjBM5BLddtZvia3PztBufc_iuGTxO_9xpBCCH5L-6cTFc</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Fault Tolerance through Invariant Checking for the Lanczos Eigensolver</title><source>IEEE Xplore All Conference Series</source><creator>Loh, Felix ; Saluja, Kewal K. ; Ramanathan, Parameswaran</creator><creatorcontrib>Loh, Felix ; Saluja, Kewal K. ; Ramanathan, Parameswaran</creatorcontrib><description>The Lanczos eigensolver is a popular iterative method for approximating a few maximal eigenvalues of a real symmetric matrix, particularly if the matrix is large and sparse. In recent years, graphics processing units (GPUs) have become a popular platform for scientific computing applications, many of which are based on linear algebra, and are increasingly being used as the main computational units in supercomputers. This trend is expected to continue as the number of computations required by scientific applications reach petascale and exascale range. In this paper, we introduce an efficient error checking mechanism for the Lanczos eigensolver. To the best of our knowledge, we are the first to introduce such a scheme for the Lanczos method. 
We evaluate our fault tolerant scheme using an open-source sparse eigensolver on a GPU platform, with and without the injection of faults. We use sparse matrices from real applications, and show that our fault tolerant method has good error coverage and low overhead.</description><identifier>EISSN: 2380-6923</identifier><identifier>EISBN: 9781728157016</identifier><identifier>EISBN: 1728157013</identifier><identifier>DOI: 10.1109/VLSID49098.2020.00020</identifier><language>eng</language><publisher>IEEE</publisher><subject>Fault tolerance ; Fault tolerant systems ; Graphics processing units ; Scientific computing ; Symmetric matrices ; Very large scale integration</subject><ispartof>2020 33rd International Conference on VLSI Design and 2020 19th International Conference on Embedded Systems (VLSID), 2020, p.13-18</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9105557$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,27925,54555,54932</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/9105557$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Loh, Felix</creatorcontrib><creatorcontrib>Saluja, Kewal K.</creatorcontrib><creatorcontrib>Ramanathan, Parameswaran</creatorcontrib><title>Fault Tolerance through Invariant Checking for the Lanczos Eigensolver</title><title>2020 33rd International Conference on VLSI Design and 2020 19th International Conference on Embedded Systems (VLSID)</title><addtitle>VLSID</addtitle><description>The Lanczos eigensolver is a popular iterative method for approximating a few maximal eigenvalues of a real symmetric matrix, particularly if the matrix is large and sparse. 
In recent years, graphics processing units (GPUs) have become a popular platform for scientific computing applications, many of which are based on linear algebra, and are increasingly being used as the main computational units in supercomputers. This trend is expected to continue as the number of computations required by scientific applications reach petascale and exascale range. In this paper, we introduce an efficient error checking mechanism for the Lanczos eigensolver. To the best of our knowledge, we are the first to introduce such a scheme for the Lanczos method. We evaluate our fault tolerant scheme using an open-source sparse eigensolver on a GPU platform, with and without the injection of faults. We use sparse matrices from real applications, and show that our fault tolerant method has good error coverage and low overhead.</description><subject>Fault tolerance</subject><subject>Fault tolerant systems</subject><subject>Graphics processing units</subject><subject>Scientific computing</subject><subject>Symmetric matrices</subject><subject>Very large scale integration</subject><issn>2380-6923</issn><isbn>9781728157016</isbn><isbn>1728157013</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2020</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><recordid>eNotjsFKxDAURaMgOI7zBSLkB1rfS5omWUqdaqHgwtHtkEmTNlpbSTsD-vUWdHPP4h4ul5BbhBQR9N1b_VI9ZBq0ShkwSAGWPCMbLRVKplBIwPycrBhXkOSa8UtyNU3vi6YEyBUpS3PsZ7obexfNYB2duzge245Ww8nEYIaZFp2zH2FoqR_jUjtaL-LPONFtaN0wjf3JxWty4U0_uc0_1-S13O6Kp6R-fqyK-zoJDPiceINCG2u5Q2u14ih4AzZj7HDIuW5yNNA4L9GDUmg82oyjBM5BLddtZvia3PztBufc_iuGTxO_9xpBCCH5L-6cTFc</recordid><startdate>202001</startdate><enddate>202001</enddate><creator>Loh, Felix</creator><creator>Saluja, Kewal K.</creator><creator>Ramanathan, 
Parameswaran</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>202001</creationdate><title>Fault Tolerance through Invariant Checking for the Lanczos Eigensolver</title><author>Loh, Felix ; Saluja, Kewal K. ; Ramanathan, Parameswaran</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i203t-fa159acc3e1cc983153d0c422bb639d61a0def71f0881af1c431703308692c4a3</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Fault tolerance</topic><topic>Fault tolerant systems</topic><topic>Graphics processing units</topic><topic>Scientific computing</topic><topic>Symmetric matrices</topic><topic>Very large scale integration</topic><toplevel>online_resources</toplevel><creatorcontrib>Loh, Felix</creatorcontrib><creatorcontrib>Saluja, Kewal K.</creatorcontrib><creatorcontrib>Ramanathan, Parameswaran</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Xplore</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Loh, Felix</au><au>Saluja, Kewal K.</au><au>Ramanathan, Parameswaran</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Fault Tolerance through Invariant Checking for the Lanczos Eigensolver</atitle><btitle>2020 33rd International Conference on VLSI Design and 2020 19th International Conference on Embedded Systems 
(VLSID)</btitle><stitle>VLSID</stitle><date>2020-01</date><risdate>2020</risdate><spage>13</spage><epage>18</epage><pages>13-18</pages><eissn>2380-6923</eissn><eisbn>9781728157016</eisbn><eisbn>1728157013</eisbn><abstract>The Lanczos eigensolver is a popular iterative method for approximating a few maximal eigenvalues of a real symmetric matrix, particularly if the matrix is large and sparse. In recent years, graphics processing units (GPUs) have become a popular platform for scientific computing applications, many of which are based on linear algebra, and are increasingly being used as the main computational units in supercomputers. This trend is expected to continue as the number of computations required by scientific applications reach petascale and exascale range. In this paper, we introduce an efficient error checking mechanism for the Lanczos eigensolver. To the best of our knowledge, we are the first to introduce such a scheme for the Lanczos method. We evaluate our fault tolerant scheme using an open-source sparse eigensolver on a GPU platform, with and without the injection of faults. We use sparse matrices from real applications, and show that our fault tolerant method has good error coverage and low overhead.</abstract><pub>IEEE</pub><doi>10.1109/VLSID49098.2020.00020</doi><tpages>6</tpages></addata></record>
fulltext fulltext_linktorsrc
identifier EISSN: 2380-6923
ispartof 2020 33rd International Conference on VLSI Design and 2020 19th International Conference on Embedded Systems (VLSID), 2020, p.13-18
issn 2380-6923
language eng
recordid cdi_ieee_primary_9105557
source IEEE Xplore All Conference Series
subjects Fault tolerance
Fault tolerant systems
Graphics processing units
Scientific computing
Symmetric matrices
Very large scale integration
title Fault Tolerance through Invariant Checking for the Lanczos Eigensolver
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T00%3A26%3A42IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_CHZPO&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Fault%20Tolerance%20through%20Invariant%20Checking%20for%20the%20Lanczos%20Eigensolver&rft.btitle=2020%2033rd%20International%20Conference%20on%20VLSI%20Design%20and%202020%2019th%20International%20Conference%20on%20Embedded%20Systems%20(VLSID)&rft.au=Loh,%20Felix&rft.date=2020-01&rft.spage=13&rft.epage=18&rft.pages=13-18&rft.eissn=2380-6923&rft_id=info:doi/10.1109/VLSID49098.2020.00020&rft.eisbn=9781728157016&rft.eisbn_list=1728157013&rft_dat=%3Cieee_CHZPO%3E9105557%3C/ieee_CHZPO%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i203t-fa159acc3e1cc983153d0c422bb639d61a0def71f0881af1c431703308692c4a3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=9105557&rfr_iscdi=true