Benchmarking of GPU-based pulsar processing pipeline of 40-m Thai national radio telescope
Published in: Journal of Physics: Conference Series, 2019-11, Vol. 1380 (1), p. 012160
Main Authors:
Format: Article
Language: English
Summary: In recent years, Graphics Processing Units (GPUs) have been widely used in several astronomical applications. The 40-m Thai National Radio Telescope (TNRT) is under construction in Chiang Mai, Thailand. We conducted benchmarking of pulsar processing software to evaluate the capabilities of a computer with a Xeon E5-2630 CPU and a GTX 1080 Ti GPU. The pulsar software DSPSR was used to simulate raw baseband data, coherently de-disperse the data, and generate a folded time-frequency-domain pulse profile. We experimented with combinations of bandwidth, sub-band size, a range of dispersion measure (DM) values, and parallel instances of DSPSR jobs. The results show that the processing time increases with higher bandwidth and DM, as expected. However, the processing time decreases with smaller sub-band sizes: at the same total bandwidth, processing with 1.5625 MHz/channel is faster than with 3.125, 6.25 and 12.5 MHz/channel by approximately 10, 25 and 50 percent, respectively. This indicates that DSPSR performs best when the channel resolution is high; however, further investigation is needed to determine the optimal channel resolution. We also considered parallel processing: one, two and four identical scripts were executed simultaneously, and we found that a single job is six times faster than four simultaneous jobs. In principle, parallel computing is expected to be more efficient, so this can be explored further to identify the actual bottleneck in the pipeline and hardware.
ISSN: 1742-6588, 1742-6596
DOI: 10.1088/1742-6596/1380/1/012160
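
The summary above outlines the benchmarking procedure: DSPSR runs that coherently de-disperse and fold simulated baseband data while the channel width, DM, and number of simultaneous jobs are varied. Below is a minimal, hedged sketch of such a benchmarking driver. The input file name, ephemeris file, bandwidth, and DM grid are illustrative assumptions, not values from the paper, and the dspsr flag spellings (`-F nchan:D`, `-D`, `-E`, `-O`) follow the DSPSR documentation as commonly described; verify them against `dspsr --help` on your installation.

```python
#!/usr/bin/env python3
"""Hedged sketch of a DSPSR benchmarking driver along the lines of the summary above."""
import itertools
import subprocess
import time

BASEBAND_FILE = "fake_baseband.dada"   # assumed name of the simulated raw baseband file
EPHEMERIS = "pulsar.par"               # assumed pulsar ephemeris file
TOTAL_BW_MHZ = 100.0                   # assumed total bandwidth of the test data

# Channel widths from the summary (MHz/channel) and an illustrative DM grid.
CHANNEL_WIDTHS = [1.5625, 3.125, 6.25, 12.5]
DM_VALUES = [10.0, 100.0, 500.0]

def dspsr_command(nchan: int, dm: float, tag: str) -> list[str]:
    """Build a dspsr invocation for coherent de-dispersion into nchan channels.

    -F nchan:D requests an nchan-channel coherent filterbank, -D sets the DM,
    -E gives the ephemeris, -O sets the output file stem (flag semantics assumed).
    """
    return ["dspsr", "-F", f"{nchan}:D", "-D", str(dm),
            "-E", EPHEMERIS, "-O", f"fold_{tag}", BASEBAND_FILE]

def run_serial_grid() -> None:
    """Time one dspsr job per (channel width, DM) combination."""
    for width, dm in itertools.product(CHANNEL_WIDTHS, DM_VALUES):
        nchan = int(round(TOTAL_BW_MHZ / width))
        t0 = time.perf_counter()
        subprocess.run(dspsr_command(nchan, dm, f"{nchan}ch_dm{dm}"), check=True)
        print(f"{width:>7} MHz/chan  DM={dm:<6} -> {time.perf_counter() - t0:.1f} s")

def run_parallel(n_jobs: int, nchan: int = 64, dm: float = 100.0) -> float:
    """Launch n_jobs identical dspsr jobs simultaneously and return the wall time."""
    t0 = time.perf_counter()
    procs = [subprocess.Popen(dspsr_command(nchan, dm, f"par{i}")) for i in range(n_jobs)]
    for p in procs:
        p.wait()
    return time.perf_counter() - t0

if __name__ == "__main__":
    run_serial_grid()
    for n in (1, 2, 4):                # the summary compares 1, 2 and 4 simultaneous jobs
        print(f"{n} simultaneous job(s): {run_parallel(n):.1f} s wall time")
```

Comparing the wall times printed by `run_parallel` for 1, 2, and 4 jobs reproduces the kind of measurement behind the summary's observation that a single job finished roughly six times faster than four simultaneous jobs.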