Auto-tuning TRSM with an asynchronous task assignment model on multicore, multi-GPU and coprocessor systems

Bibliographic Details
Main Authors: Pinto, Clicia; Barreto, Marcos; Boratto, Murilo
Format: Conference Proceeding
Language: English
Description
Summary: The increasing need for computing power today justifies the continuous search for techniques that decrease the time needed to solve common computational problems. To take advantage of new hybrid parallel architectures composed of multithreaded and multiprocessor hardware, our current efforts involve the design and validation of highly parallel algorithms that efficiently exploit the characteristics of such architectures. In this paper, we propose an automatic tuning methodology to easily exploit multicore, multi-GPU and coprocessor systems. We present an optimization of an algorithm for solving triangular systems (TRSM), based on block decomposition and asynchronous task assignment, and discuss some results.
ISSN: 2161-5330
DOI: 10.1109/AICCSA.2016.7945637
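
The summary mentions a TRSM solver based on block decomposition. As a rough illustration of that general idea only, the sketch below solves L X = B for a lower-triangular L, one diagonal block at a time, in plain Python. It is not the authors' implementation: the function names `solve_lower_block` and `blocked_trsm`, the `block` parameter, and the row-major list layout are all assumptions made for this example, and no tuning, GPU, or coprocessor code is shown.

```python
def solve_lower_block(L, X, r0, r1):
    """Forward substitution on the diagonal block L[r0:r1][r0:r1], in place on X."""
    for i in range(r0, r1):
        for c in range(len(X[0])):
            # Contributions from earlier blocks were already subtracted
            # by the trailing updates, so only rows r0..i-1 remain.
            s = X[i][c] - sum(L[i][k] * X[k][c] for k in range(r0, i))
            X[i][c] = s / L[i][i]


def blocked_trsm(L, B, block=2):
    """Solve L X = B for lower-triangular L (hypothetical helper, illustration only)."""
    n = len(L)
    X = [row[:] for row in B]  # work on a copy of the right-hand side
    for r0 in range(0, n, block):
        r1 = min(r0 + block, n)
        # Step 1: solve the small triangular system on the diagonal block.
        solve_lower_block(L, X, r0, r1)
        # Step 2: trailing update of every row below the block.
        # Each row update is independent of the others, so in a task-based
        # design these could be handed to separate workers (assumption:
        # this is where asynchronous task assignment would fit).
        for i in range(r1, n):
            for c in range(len(X[0])):
                X[i][c] -= sum(L[i][k] * X[k][c] for k in range(r0, r1))
    return X
```

For example, calling `blocked_trsm(L, B, block=2)` on a 3x3 lower-triangular `L` processes rows 0 and 1 as the first block, updates row 2, and then solves row 2 as the final block. The block size is the kind of parameter an auto-tuning procedure might search over, although the record does not say which parameters the paper tunes.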