Acceleration of Multi-Body Molecular Dynamics With Customized Parallel Dataflow

Bibliographic Details
Published in: IEEE Transactions on Parallel and Distributed Systems, 2024-12, Vol. 35 (12), p. 2297-2314
Main Authors: Deng, Quan; Liu, Qiang; Yuan, Ming; Duan, Xiaohui; Gan, Lin; Yang, Jinzhe; Zhao, Wenlai; Zhang, Zhenxiang; Wu, Guiming; Luk, Wayne; Fu, Haohuan; Yang, Guangwen
Format: Article
Language: English
Description
Summary: FPGAs are drawing increasing attention for molecular dynamics (MD) problems and have already been applied to two-body potentials and to force fields composed of such potentials, achieving performance competitive with traditional platforms such as CPUs and GPUs. However, as far as we know, FPGA solutions for more complex, real-world MD problems, such as multi-body potentials, are rarely seen. This work explores the prospects of state-of-the-art FPGAs in accelerating multi-body potentials. We design an FPGA-based accelerator with a customized parallel dataflow that covers multi-body potential computation, motion update, and internode communication. Major contributions include: (1) parallelization applied at different levels of the accelerator; (2) an optimized dataflow that mixes atom-level and cell-level pipelines to achieve high throughput; (3) a mixed-precision method that uses different precision at different stages of the simulation; and (4) a communication-efficient method for internode communication. Experiments show that our single-node accelerator is over 2.7× faster than an 8-core CPU design, achieving 20.501 ns/day on a 55,296-atom system for the Tersoff simulation. In terms of power efficiency, our accelerator is 28.9× higher than the I7-11700 and 4.8× higher than the RTX 3090 when running the same test case.
ISSN: 1045-9219, 1558-2183
DOI: 10.1109/TPDS.2024.3420441