U.S. Department of Energy

Pacific Northwest National Laboratory

Chapel for the Cray XMT

The Chapel compiler, developed by Cray researchers as part of the DARPA High Productivity Computing Systems (HPCS) program, uses source-to-source compilation: it translates a user’s Chapel program into standard C code with calls to runtime libraries that implement the necessary parallelism and communication. This lets the compiler portably target architectures as diverse as multicore desktops, commodity clusters, and Cray supercomputers, as well as systems developed by other vendors.

As part of PNNL’s Center for Adaptive Supercomputing Software-Multithreaded Architectures (CASS-MT), the Cray research team is modifying the open-source Chapel compiler so that the C code it generates can be parallelized automatically by the standard XMT C compiler when that compiler serves as the back end. This will permit standard data-parallel constructs in Chapel to transparently make effective use of the thousands of hardware thread contexts supported by the Cray XMT. The team will then implement XMT-specific performance optimizations, with the goal of making Chapel’s performance competitive with user-written native XMT C.

In addition to improving the Chapel compiler to make more effective use of the Cray XMT, this project focuses on extending the language and compiler so that a single Chapel program can execute in parallel across several distinct architectures. One example is a Chapel program that executes on both the compute nodes and the service nodes of the Cray XMT. A second is a single program that executes on a Cray XMT in combination with distinct external systems such as a desktop computer, a Cray CX1000, and/or a Cray XE6. To this end, a new locality feature, the realm, has been added to the Chapel implementation to represent distinct target architectures. This permits Chapel programmers to specify the node types that should be used for each sub-computation within their program.
As an example, a user could specify that a large unstructured graph computation should execute on the realm representing a Cray XMT’s compute nodes, while a distinct part of the program requiring dense linear algebra executes simultaneously on the XMT’s service nodes or on the compute nodes of a more traditional external system.

To improve the mapping of Chapel programs to the Cray XMT, the Chapel team has been modifying the compiler and its standard modules to generate C loops and XMT-specific pragmas that expose the parallelism required to make effective use of the machine. In addition, the team has been implementing new scalar optimizations to reduce memory traffic and deliver performance competitive with user-written C for the Cray XMT.

To support the realm concept, the Chapel compiler and runtime libraries have been improved to better support distinct target architectures, potentially with different native data sizes and formats.
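
The loop-generation strategy described above can be illustrated with a minimal sketch of the kind of C a source-to-source compiler might emit for a data-parallel Chapel statement. This is not actual Chapel compiler output: the function name `scaled_add`, its parameters, and the choice of `#pragma mta assert parallel` are assumptions made for illustration. The pragma belongs to the `mta` pragma family accepted by the XMT C compiler, but whether the Chapel compiler emits this particular pragma is not stated here. The sketch also shows one simple scalar optimization, reading a loop-invariant value into a local once instead of on every iteration.

```c
#include <stddef.h>

/* Hypothetical shape of generated C for a data-parallel Chapel loop such as
 *   forall i in 0..#n do a[i] = scale * b[i] + c[i];
 * All names are illustrative, not real Chapel compiler output. */
void scaled_add(double *restrict a, const double *restrict b,
                const double *restrict c, const double *scale_ptr, size_t n)
{
    /* Scalar optimization: read the loop-invariant scale from memory once,
     * so the loop body performs only the loads and stores it needs. */
    const double scale = *scale_ptr;

    /* Assumed pragma: tells the XMT C compiler that the iterations may run
     * in parallel. Other C compilers ignore pragmas they do not recognize. */
#pragma mta assert parallel
    for (size_t i = 0; i < n; i++) {
        a[i] = scale * b[i] + c[i];
    }
}
```

Because the pragma is only a hint to the back-end compiler, the same generated file still builds and runs serially with an ordinary C compiler, which is consistent with the portability goal of the source-to-source approach.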

Website: http://cass-mt.pnl.gov/research/default.aspx

Article Title: Chapel for the Cray XMT

Article Added: 2010/09/09

Category(s): Software

Last Update: 13 July 2011 | Pacific Northwest National Laboratory