Toward Energy-Efficient Stochastic Circuits Using Parallel Sobol Sequences

Bibliographic Details
Published in: IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2018-07, Vol. 26 (7), p. 1326-1339
Main Authors: Liu, Siting; Han, Jie
Format: Article
Language: English
Description
Summary: Stochastic computing (SC) often requires long stochastic sequences and, thus, a long latency to achieve accurate computation. The long latency leads to inferior performance and low energy efficiency compared with most conventional binary designs. In this paper, a type of low-discrepancy sequence, the Sobol sequence, is considered for use in SC. Compared to the use of pseudorandom sequences generated by linear feedback shift registers (LFSRs), the use of Sobol sequences improves the accuracy of stochastic computation with a reduced sequence length. An inherent feature of Sobol sequence generators enables the parallel implementation of random number generators with improved performance and hardware efficiency. In particular, the underlying theory is formulated and a circuit design is proposed for an arbitrary level of parallelization in a power of 2. In addition, different strategies are implemented for parallelizing combinational and sequential stochastic circuits. The hardware efficiency of the parallel stochastic circuits is measured by energy per operation (EPO), throughput per area (TPA), and runtime. At a similar accuracy, the 8× parallel stochastic circuits using Sobol sequences consume approximately 1% of the EPO of the conventional LFSR-based nonparallelized circuits. Meanwhile, an average 70-fold (up to 89-fold) improvement in TPA and less than 1% of the runtime are achieved. A sorting network is implemented for a median filter (MF) as an application. For a similar image processing quality, a higher energy efficiency is obtained for an 8× parallelized stochastic MF compared with its binary counterpart.
ISSN: 1063-8210, 1557-9999
DOI: 10.1109/TVLSI.2018.2812214
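
The summary's central claim, that Sobol sequences reach a given accuracy with shorter bit streams than pseudorandom sequences, can be illustrated with a small software sketch of stochastic multiplication (an AND gate applied to two bit streams). This is not the circuit from the paper: the 16-bit resolution, the function names, the test values 0.5 and 0.75, and the use of Python's `random` module as a stand-in for an LFSR are all assumptions made for illustration. Only the first two Sobol dimensions are generated, using the standard Gray-code construction.

```python
import random

BITS = 16  # number resolution; an assumption for this sketch


def direction_numbers(dim):
    """Direction numbers for Sobol dimension 0 or 1, as BITS-bit integers."""
    if dim == 0:
        m = [1] * BITS                     # dimension 0: van der Corput, m_k = 1
    else:
        m = [1]
        for _ in range(BITS - 1):
            m.append((2 * m[-1]) ^ m[-1])  # recurrence for polynomial x + 1
    return [m[k] << (BITS - 1 - k) for k in range(BITS)]


def sobol(dim, n):
    """First n Sobol integers in [0, 2**BITS) for one dimension (n < 2**BITS)."""
    v = direction_numbers(dim)
    x, out = 0, []
    for i in range(n):
        out.append(x)
        c = 0
        while (i >> c) & 1:  # index of the lowest zero bit of i
            c += 1
        x ^= v[c]            # Gray-code update: one XOR per new number
    return out


def sc_multiply(a, b, seq_a, seq_b):
    """Estimate a*b by ANDing two comparator-generated bit streams."""
    ta, tb = int(a * 2**BITS), int(b * 2**BITS)
    ones = sum((ra < ta) and (rb < tb) for ra, rb in zip(seq_a, seq_b))
    return ones / len(seq_a)


N = 256          # bit-stream length
a, b = 0.5, 0.75

sobol_est = sc_multiply(a, b, sobol(0, N), sobol(1, N))

rng = random.Random(0)  # software pseudorandom source standing in for an LFSR
rand_a = [rng.getrandbits(BITS) for _ in range(N)]
rand_b = [rng.getrandbits(BITS) for _ in range(N)]
rand_est = sc_multiply(a, b, rand_a, rand_b)

print("exact :", a * b)
print("Sobol :", sobol_est, "error", abs(sobol_est - a * b))
print("random:", rand_est, "error", abs(rand_est - a * b))
```

For these particular inputs the Sobol estimate is exact at 256 bits, because the first 256 two-dimensional Sobol points fill every elementary cell of area 1/256 exactly once; the pseudorandom estimate generally carries a sampling error that shrinks only with longer streams. The one-XOR-per-number update in `sobol` is also the kind of structure that lends itself to generating several numbers per cycle, although the parallel generator described in the paper is not reproduced here.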