The BlackParrot BedRock Cache Coherence System

Bibliographic Details
Published in: arXiv.org 2022-11
Main Authors: Wyse, Mark; Petrisko, Daniel; Gilani, Farzam; Yuan-Mao Chueh; Gao, Paul; Jung, Dai Cheol; Muralitharan, Sripathi; Shashank Vijaya Ranga; Oskin, Mark; Taylor, Michael
Format: Article
Language: English
Description
Summary: This paper presents BP-BedRock, the open-source cache coherence protocol and system implemented within the BlackParrot 64-bit RISC-V multicore processor. BP-BedRock implements the BedRock directory-based MOESIF cache coherence protocol and includes two different open-source coherence protocol engines, one FSM-based and the other microcode-programmable. Both coherence engines support coherent uncacheable access to cacheable memory and L1-based atomic read-modify-write operations. Fitted within the BlackParrot multicore, BP-BedRock has been silicon validated in a GlobalFoundries 12 nm FinFET process and FPGA validated with both coherence engines in 8-core configurations, booting Linux and running off-the-shelf benchmarks. After describing BP-BedRock and the design of the two coherence engines, we study their performance by analyzing processing occupancy and running the Splash-3 benchmarks on the 8-core FPGA implementations. Careful design and coherence-specific ISA extensions enable the programmable controller to achieve performance within 1% of the fixed-function FSM controller on average (2.3% worst-case), as demonstrated on our FPGA test system. Analysis shows that the programmable coherence engine increases die area by only 4% in an ASIC process and increases logic utilization by only 6.3% on FPGA, with one additional block RAM added per core.
ISSN: 2331-8422