Abstract
Exascale computers offer transformative capabilities to combine data-driven and learning-based approaches with traditional simulation applications to accelerate scientific discovery and insight. However, these combinations and integrations are difficult to achieve because heterogeneous software components must be coordinated and deployed on diverse and massive platforms. We present the ExaWorks project, which addresses many of these challenges. We developed a workflows Software Development Toolkit (SDK): a curated collection of workflow technologies that can be composed and can interoperate through a common interface, engineered following current best practices and designed specifically to work on HPC platforms. ExaWorks also developed PSI/J, a job management abstraction API that simplifies the construction of software components and applications that are portable across HPC schedulers. The PSI/J API is a minimal interface for submitting jobs and monitoring their execution state across multiple commonly used HPC schedulers. We also describe several leading and innovative workflow examples in which ExaWorks tools were put to use on DOE leadership platforms. Furthermore, we discuss how our project is working with the workflow community, large computing facilities, and HPC platform vendors to address the requirements of workflows sustainably at the exascale.