
2.3 A 220GOPS 96-Core Processor with 6 Chiplets 3D-Stacked on an Active Interposer Offering 0.6ns/mm Latency, 3Tb/s/mm² Inter-Chiplet Interconnects and 156mW/mm² @ 82%-Peak-Efficiency DC-DC Converters

Bibliographic Details
Main Authors: Vivet, Pascal, Guthmuller, Eric, Thonnart, Yvain, Pillonnet, Gael, Moritz, Guillaume, Miro-Panades, Ivan, Fuguet, Cesar, Durupt, Jean, Bernard, Christian, Varreau, Didier, Pontes, Julian, Thuries, Sebastien, Coriat, David, Harrand, Michel, Dutoit, Denis, Lattard, Didier, Arnaud, Lucile, Charbonnier, Jean, Coudrain, Perceval, Garnier, Arnaud, Berger, Frederic, Gueugnot, Alain, Greiner, Alain, Meunier, Quentin, Farcy, Alexis, Arriordaz, Alexandre, Cheramy, Severine, Clermidy, Fabien
Format: Conference Proceeding
Language: English
Description
Summary: In the context of high-performance computing and big-data applications, the quest for performance requires modular, scalable, energy-efficient, low-cost manycore systems. Partitioning the system into multiple chiplets 3D-stacked onto large-scale interposers (organic substrate [1], 2.5D passive interposer [2] or silicon bridge [3]) leads to large modular architectures and cost reductions in advanced technologies through the Known Good Die (KGD) strategy and yield management. However, these approaches lack flexible, efficient long-distance communications, smooth integration of heterogeneous chiplets, and easy integration of less-scalable analog functions, such as power management [4] and system IOs. To tackle these issues, this paper presents an active interposer integrating: i) a Switched Capacitor Voltage Regulator (SCVR) for on-chip power management; ii) flexible system interconnect topologies between all chiplets for scalable cache coherency support; iii) energy-efficient 3D-plugs for dense inter-layer communication; iv) a memory-IO controller and PHY for socket communication. The chip (Fig. 2.3.7) integrates 96 cores in 6 chiplets in 28nm FDSOI CMOS, 3D-stacked in a face-to-face configuration using 20µm-pitch micro-bumps (µ-bumps) onto a 200mm² active interposer with 40µm-pitch Through Silicon Via (TSV) middle in a 65nm technology node. Even though complex functions are integrated, active-interposer yield is high thanks to the mature 65nm node and a reduced complexity (0.08 transistors/µm²), with 30% of the interposer area devoted to an SCVR variability-tolerant capacitor scheme.
ISSN: 2376-8606
DOI: 10.1109/ISSCC19947.2020.9062927
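
Note: the figures quoted in the Summary imply a few derived quantities that can be checked with simple arithmetic. The sketch below is illustrative only and is not taken from the paper. It assumes the 96 cores are split evenly across the 6 chiplets and that the 0.08 transistors/µm² figure is an average over the whole 200mm² interposer. The function and constant names are hypothetical.

```python
# Back-of-envelope estimates derived only from numbers quoted in the Summary.
# Assumptions (not stated in the record): cores are evenly distributed over
# the chiplets, and the transistor density is an average over the full
# interposer area.

CORES = 96
CHIPLETS = 6
INTERPOSER_AREA_MM2 = 200.0
TRANSISTOR_DENSITY_PER_UM2 = 0.08
SCVR_AREA_FRACTION = 0.30

UM2_PER_MM2 = 1_000_000  # 1 mm = 1000 µm, so 1 mm² = 10^6 µm²


def interposer_estimates():
    """Return derived figures for the chip described in the Summary."""
    cores_per_chiplet = CORES // CHIPLETS
    interposer_transistors = (
        TRANSISTOR_DENSITY_PER_UM2 * INTERPOSER_AREA_MM2 * UM2_PER_MM2
    )
    scvr_area_mm2 = SCVR_AREA_FRACTION * INTERPOSER_AREA_MM2
    return {
        "cores_per_chiplet": cores_per_chiplet,
        "interposer_transistors": interposer_transistors,
        "scvr_area_mm2": scvr_area_mm2,
    }


if __name__ == "__main__":
    for name, value in interposer_estimates().items():
        print(f"{name}: {value:g}")
```

Under these assumptions the estimates come to 16 cores per chiplet, roughly 16 million transistors on the interposer, and about 60mm² of interposer area for the SCVR capacitor scheme.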