Enabling heterogeneous ray‐tracing acceleration in edge/cloud architectures

Bibliographic Details
Published in: Concurrency and Computation 2021-06, Vol. 33 (11), p. n/a
Main Authors: Sampaio, Adrianno A., Sena, Alexandre C., Nery, Alexandre S.
Format: Article
Language:English
Description
Summary: The ray‐tracing algorithm is very costly in terms of time complexity, and while many techniques have been conceived over the years to accelerate its execution, one stands out: exploiting the parallelism of ray‐triangle intersection operations. In this sense, field‐programmable gate arrays (FPGAs) have plenty of resources to run specialized accelerators that execute multiple operations in parallel. Moreover, modern FPGAs embed multiprocessor systems‐on‐chip based on the ARM architecture, which can be used simultaneously with the FPGA programmable logic to further accelerate application execution. In this work, we present and analyze a reconfigurable accelerator for ray‐tracing specialized in computing ray‐triangle intersections at the network edge of a heterogeneous cloud computing environment. The accelerator is specified using Xilinx high‐level synthesis and is implemented in a Xilinx Zynq FPGA (XC7Z020‐1CLG400C). We also present an execution model that enables the exploitation of the available computing elements of the heterogeneous system: ARM Cortex‐A53, FPGA programmable logic, and cloud machines. Experimental performance and synthesis results show that the heterogeneous system can efficiently render a simplified version of the Stanford Bunny model when using the hardware accelerator with up to six instances of a ray‐triangle intersection unit together with the other computing resources.
ISSN: 1532-0626, 1532-0634
DOI: 10.1002/cpe.5822