Cost effective data center servers

Bibliographic Details
Main Authors: Rui Hou, Tao Jiang, Liuhang Zhang, Pengfei Qi, Jianbo Dong, Haibin Wang, Xiongli Gu, Shujie Zhang
Format: Conference Proceeding
Language: English
Description
Summary: The exploding growth of digitized information has led to the rapid growth of data centers, both in number and in size. The cluster has been the dominant system architecture in most data centers. However, increasingly diversified data center applications have requirements beyond what the cluster architecture can deliver. For instance, cloud computing requires flexible sharing of all data center resources, big data applications often need large memory capacity, and a few applications can use GPGPUs effectively. Existing systems might be extended to a certain degree to meet those needs, but such extensions would often be prohibitively expensive. This paper presents our attempt to design a system from commodity products that meets the varying needs of many emerging data center applications in a cost-effective way. We create a system by connecting multiple nodes through a PCIe switch and then extend the software stack to support resource sharing among these nodes. In particular, a node can directly use the memory, NIC, and GPGPU of other nodes through the PCIe switch with little or no involvement from those nodes. We built a prototype as our evaluation platform. Our evaluation results indicate that these resources can be shared effectively in many cases. Using remote memory as a block device, our prototype achieves on average 5 times the bandwidth, 11 times the IOPS, and 1/12 the latency of a system connected by 10GigE on the Orion benchmark. Using a remote GPGPU via the PCIe switch achieves an average 60x speedup over the case without a GPGPU, and the performance loss is acceptable (its average execution time is 1/3 that of a local GPGPU) on micro-benchmarks from the GPU Computing SDK. Using a remote NIC via the PCIe switch achieves on average 95% of the bandwidth and 1.4 times the latency of a local NIC in httperf testing. While our prototype system offers multiple benefits, it is not perfect and leaves ample room for further optimization and extension. We hope the results presented in this paper will encourage more researchers to join us in designing highly efficient and cost-effective servers.
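As a rough illustration of the remote-memory idea described above (not the authors' code, which the record does not include; the PCI device path and the BAR-window setup are assumptions), the C sketch below shows the standard Linux technique of mmap()ing a PCIe BAR exposed through a switch so that another node's memory is reached with ordinary loads and stores, without involving the remote node's CPU:

```c
/*
 * Illustrative sketch only: the device path below is hypothetical,
 * and we assume the PCIe switch exposes a window onto a remote
 * node's memory as a BAR on this node.
 */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define WINDOW_SIZE (4UL << 20) /* map a 4 MiB window of the BAR */

int main(void)
{
    /* Hypothetical BAR; the real bus/device/function is platform-specific. */
    const char *bar = "/sys/bus/pci/devices/0000:03:00.0/resource0";

    int fd = open(bar, O_RDWR | O_SYNC);
    if (fd < 0) { perror("open"); return 1; }

    volatile uint32_t *remote =
        mmap(NULL, WINDOW_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (remote == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

    /* Ordinary store and load now travel over the PCIe fabric;
     * the remote node's CPU is not involved. */
    remote[0] = 0xdeadbeef;
    printf("read back: 0x%08x\n", remote[0]);

    munmap((void *)remote, WINDOW_SIZE);
    close(fd);
    return 0;
}
```

A block-device or NIC-sharing layer, as evaluated in the paper, would sit on top of this kind of mapping rather than exposing it to applications directly.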
ISSN: 1530-0897, 2378-203X
DOI: 10.1109/HPCA.2013.6522317