Loading…

Tensorox: Accelerating GPU Applications via Neural Approximation on Unused Tensor Cores

Driven by the demands of deep learning, many hardware accelerators, including GPUs, have begun to include specialized tensor processing units to accelerate matrix operations. However, general-purpose GPU applications that have little or no large dense matrix operations cannot benefit from these tens...

Full description

Saved in:
Bibliographic Details
Published in:IEEE transactions on parallel and distributed systems 2022-02, Vol.33 (2), p.429-443
Main Authors: Ho, Nhut-Minh, Wong, Weng-Fai
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Driven by the demands of deep learning, many hardware accelerators, including GPUs, have begun to include specialized tensor processing units to accelerate matrix operations. However, general-purpose GPU applications that have little or no large dense matrix operations cannot benefit from these tensor units. This article proposes Tensorox, a framework that exploits the half-precision tensor cores available on recent GPUs for approximable, non deep learning applications. In essence, a shallow neural network is trained based on the input-output mapping of the function to be approximated. The key innovation in our implementation is the use of the small and dimension-restricted tensor operations in Nvidia GPUs to run multiple instances of the approximation neural network in parallel. With the proper scaling and training methods, our approximation yielded an overall accuracy that is higher than naïvely running the original programs with half-precision. Furthermore, Tensorox allows for the runtime adjustment of the degree of approximation. For the 10 benchmarks we tested, we achieved speedups from 2× to 112× compared to the original in single precision floating point, while maintaining the error caused by the approximation to below 10 percent in most applications.
ISSN:1045-9219
1558-2183
DOI:10.1109/TPDS.2021.3093239