HPTMT Parallel Operators for High Performance Data Science & Data Engineering

Bibliographic Details
Published in: arXiv.org 2021-08
Main Authors: Abeykoon, Vibhatha; Kamburugamuve, Supun; Widanage, Chathura; Perera, Niranda; Uyar, Ahmet; Thejaka, Amila Kanewala; von Laszewski, Gregor; Fox, Geoffrey
Format: Article
Language: English
Description
Summary: Data-intensive applications are becoming commonplace in all science disciplines. They comprise a rich set of sub-domains such as data engineering, deep learning, and machine learning. These applications are built around efficient data abstractions and operators that suit the applications of different domains. Often, the lack of a clear definition of data structures and operators in the field has led to implementations that do not work well together. The HPTMT architecture that we recently proposed identifies a set of data structures, operators, and an execution model for creating rich data applications that link all aspects of data engineering and data science together efficiently. This paper elaborates on and illustrates this architecture using an end-to-end application in which deep learning and data engineering parts work together.
ISSN: 2331-8422