ƒuncX: Federated Function as a Service for Science

Bibliographic Details
Published in: IEEE Transactions on Parallel and Distributed Systems, 2022-12, Vol. 33 (12), p. 4948-4963
Main Authors: Li, Zhuozhao, Chard, Ryan, Babuji, Yadu, Galewsky, Ben, Skluzacek, Tyler J., Nagaitsev, Kirill, Woodard, Anna, Blaiszik, Ben, Bryan, Josh, Katz, Daniel S., Foster, Ian, Chard, Kyle
Format: Article
Language: English
Description
Summary: ƒuncX is a distributed function as a service (FaaS) platform that enables flexible, scalable, and high-performance remote function execution. Unlike centralized FaaS systems, ƒuncX decouples the cloud-hosted management functionality from the edge-hosted execution functionality. ƒuncX's endpoint software can be deployed, by users or administrators, on arbitrary laptops, clouds, clusters, and supercomputers, in effect turning them into function serving systems. ƒuncX's cloud-hosted service provides a single location for registering, sharing, and managing both functions and endpoints. It allows for transparent, secure, and reliable function execution across the federated ecosystem of endpoints, enabling users to route functions to endpoints based on specific needs. ƒuncX uses containers (e.g., Docker, Singularity, and Shifter) to provide common execution environments across endpoints. ƒuncX implements various container management strategies to execute functions with high performance and efficiency on diverse ƒuncX endpoints. ƒuncX also integrates with an in-memory data store and Globus for managing data that may span endpoints. We motivate the need for ƒuncX, present our prototype design and implementation, and demonstrate, via experiments on two supercomputers, that ƒuncX can scale to more than 130,000 concurrent workers. We show that ƒuncX's container warming-aware routing algorithm can reduce the completion time for 3,000 functions by up to 61% compared to a randomized algorithm, and that the in-memory data store can speed up data transfers by up to 3x compared to a shared file system.
ISSN: 1045-9219
DOI: 10.1109/TPDS.2022.3208767