UPC++ Version 1.0

NEWS:

A UPC++ Support Slack is now available! (email us if you need a workspace invite)

December 15, 2023: A new UPC++ 2023.9.0 release is now available for download!

  • Notably adds experimental accelerated memory kinds support for Intel GPUs with HPE Slingshot-11

Latest Stable Downloads:

  • UPC++ Implementation 2023.9.0 (tar.gz)
    • Contains everything you need to start using UPC++ on supported platforms
    • Note: Usage information for public installs of UPC++ at certain computing centers is available online.
    • Includes all of the documentation
    • See README.md, ChangeLog.md, and INSTALL.md
  • UPC++ Programmer's Guide (online)
    • A gentle introduction to UPC++ with examples and descriptions.
    • Also available in PDF format
  • UPC++ Specification (PDF)
    • Formal specification of the UPC++ library interface.
  • UPC++ Extras (Repo)
    • Optional extensions, including a dist_array class template for scalable distributed arrays
    • Extended example codes and tutorial materials

Training Materials

  • Learning to use the library

Publications

  • Includes UPC++ project publications and citation information for the documentation.

Events

  • Upcoming and past training events for UPC++, and an archive of prior releases.

User Testimonials

  • See what real users have to say about UPC++!

Overview

UPC++ is a C++ library that supports Partitioned Global Address Space (PGAS) programming, and is designed to interoperate smoothly and efficiently with MPI, OpenMP, C++/POSIX threads, CUDA, ROCm/HIP, oneAPI and other HPC frameworks. It leverages GASNet-EX to deliver low-overhead, fine-grained communication, including Remote Memory Access (RMA) and Remote Procedure Call (RPC).

Design Philosophy

UPC++ exposes a PGAS memory model, including one-sided communication (RMA and RPC). However, it departs from the approaches taken by some predecessors such as UPC. These changes reflect a design philosophy that encourages the UPC++ programmer to directly express what can be implemented efficiently (i.e., without a need for parallel compiler analysis).

  1. Most operations are non-blocking, and the powerful synchronization mechanisms encourage applications to design for aggressive asynchrony.

  2. All communication is explicit: there is no implicit data motion.

  3. UPC++ encourages the use of scalable data structures and avoids non-scalable library features.

What Features Comprise UPC++?

  • RMA. UPC++ provides asynchronous one-sided communication (Remote Memory Access, a.k.a. Put and Get) for movement of data among processes.

  • RPC. UPC++ provides asynchronous Remote Procedure Call for running code (including C++ lambdas) on other processes.

  • Futures, promises and continuations. Futures are central to handling asynchronous operation of RMA and RPC. UPC++ uses a continuation-based model to express task dependencies.

  • Global pointers and memory kinds. UPC++ provides uniform interfaces for RMA transfers among host and device memories, including acceleration of GPU memory transfers via RDMA offload on compatible hardware. Future releases will continue to refine this capability.

  • Remote atomics use an abstraction that enables efficient offload where hardware support is available.

  • Distributed objects. UPC++ enables construction of a scalable distributed object from any C++ object type, with one instance on each rank of a team. RPC can be used to access remote instances.

  • Serialization. UPC++ introduces several complementary mechanisms for efficiently passing large and/or complicated data arguments to RPCs.

  • Non-contiguous RMA. UPC++ provides functions for non-contiguous RMA data transfers to/from arrays in shared memory, for example to efficiently copy or transpose sections of N-dimensional dense arrays.

  • Teams represent ordered sets of processes and play a role in collective communication. Currently we support barrier, broadcast and reductions, including abstractions to enable offload of reductions supported in hardware.

  • Progress guarantees. Because UPC++ has no internal service threads, the library makes progress only when a core enters an active UPC++ call. However, the "persona" concept makes writing progress threads simple.

A comparison to the feature set of UPC++ v0.1 is also available.

Notable applications/kernels/frameworks using UPC++:

Other related software:

  • upcxx-extras: UPC++ extra examples and optional extensions
  • upcxx-utils: Set of utilities layered over UPC++, authored by the HipMer group
  • Berkeley UPC: Supports hybrid UPC/UPC++ applications
  • GASNet-EX: The portable, high-performance communication runtime used by UPC++
  • MRG8: An efficient, high-period PRNG with skip-ahead, designed for exascale HPC

Acknowledgments:

UPC++ is developed and maintained by staff in the CLaSS Group at Lawrence Berkeley National Laboratory (LBNL), funded primarily by the Advanced Scientific Computing Research (ASCR) program of the U.S. Department of Energy's Office of Science.

Contact Info
