Abstract Data and Communication Library (ADCL)

The Problem

The steadily increasing complexity of clustered high-performance computing environments makes resource-intensive tuning necessary to exploit the capabilities of a given hardware and software environment. Certain software components can be tuned for a given platform before the application is executed, an approach taken by several projects such as ATLAS or ATCC. However, several factors that influence application performance can only be determined while the application itself is running, since they depend on process placement, resource utilization (e.g. multiple parallel jobs running simultaneously) and application characteristics (e.g. message sizes that depend on the input data).

The Library

To overcome the limitations of statically tuned software, we have recently introduced the Abstract Data and Communication Library (ADCL). ADCL has three main goals: first, to define higher-level abstractions for frequently occurring application-level communication patterns; second, to provide a large number of implementations for each communication pattern; and third, to choose at runtime the implementation that gives the best performance.

    The ADCL API

    The ADCL API offers high-level interfaces for application-level collective operations. These interfaces are required so that the library can switch the implementation of the corresponding collective operation without any modification to the application itself. The main objects within the ADCL API are:

    • ADCL_Topology: provides a description of the process topology and neighborhood relations within the application.
    • ADCL_Vector: specifies the data structures to be used during the communication. The user can, for example, register a data structure such as a vector or a matrix with ADCL, detailing how many dimensions the object has, the extent of each dimension, the number of halo cells, the basic datatype of the object, and the pointer to the data array of the object.
    • ADCL_Function: each ADCL function corresponds to an actual implementation of a particular communication pattern.
    • ADCL_Fnctset: a collection of ADCL functions providing the same functionality. ADCL provides pre-defined function sets, such as the one for neighborhood communication (ADCL_FNCTSET_NEIGHBORHOOD). Users can, however, also register their own functions in order to utilize the ADCL runtime selection logic.
    • ADCL_Attribute: abstraction for a particular characteristic of a function/implementation. Each attribute is represented by the set of possible values for this characteristic.
    • ADCL_Attrset: a collection of ADCL attributes.
    • ADCL_Request: combines a process topology, a function-set and a vector object. The application can initiate a communication by starting a particular ADCL request.

    A sample code

    This is a simple example of ADCL code that uses a 2-D neighborhood communication on a 2-D process topology.

    double vector[...][...];
    ADCL_Vector vec;
    ADCL_Topology topo;
    ADCL_Request request;

    /* Generate a 2-D process topology */
    MPI_Cart_create (MPI_COMM_WORLD, 2, cart_dims, periods, 0, &cart_comm);
    ADCL_Topology_create (cart_comm, &topo );

    /* Register a 2D vector with ADCL */
    ADCL_Vector_register (ndims, vec_dims, NUM_HALO_CELLS, MPI_DOUBLE, vector, &vec);

    /* Combine description of data structure and process topology */
    ADCL_Request_create (vec, topo, ADCL_FNCTSET_NEIGHBORHOOD, &request );

    /* Main application loop */
    for (i=0; i<NIT; i++ ) {
        ...
        /* Initiate neighborhood communication */
        ADCL_Request_start (request );
        ...
    }

    The concept

    ADCL selects the fastest of the available implementations for a given communication pattern during the regular execution of the application. During the first iterations, ADCL calls each implementation multiple times in order to determine the fastest one in a given function set. Each process keeps track of the execution times of all implementations in a data array attached to the corresponding ADCL_Request. After all implementations have been tested the required number of times, the measurements are analyzed, for the most part locally on each process.
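
    The test phase described above can be sketched in plain C. This is only an illustration under stated assumptions, not ADCL code: the names selection_t, selection_init, selection_next and selection_record are hypothetical, the implementations are tested one after another, and the winner is simply the implementation with the smallest mean time. The source does not specify the test order or the statistic that ADCL uses.

```c
#include <assert.h>

#define MAX_IMPL 8   /* hypothetical upper bound on candidate implementations */

/* Hypothetical timing record, standing in for the data array that the text
 * describes as attached to a request. This is not an ADCL structure. */
typedef struct {
    int    nimpl;            /* number of candidate implementations */
    int    ntests;           /* required measurements per implementation */
    int    count[MAX_IMPL];  /* measurements recorded so far */
    double sum[MAX_IMPL];    /* accumulated execution time in seconds */
    int    winner;           /* -1 while the test phase is still running */
} selection_t;

void selection_init(selection_t *s, int nimpl, int ntests)
{
    assert(nimpl > 0 && nimpl <= MAX_IMPL && ntests > 0);
    s->nimpl = nimpl;
    s->ntests = ntests;
    s->winner = -1;
    for (int i = 0; i < nimpl; i++) {
        s->count[i] = 0;
        s->sum[i] = 0.0;
    }
}

/* Return the index of the implementation to run in the next iteration:
 * the winner once it is known, otherwise the first implementation that
 * still needs measurements. */
int selection_next(const selection_t *s)
{
    if (s->winner >= 0)
        return s->winner;
    for (int i = 0; i < s->nimpl; i++)
        if (s->count[i] < s->ntests)
            return i;
    return 0;  /* fallback; not reached when selection_record is used */
}

/* Record one measured execution time. When every implementation has been
 * measured ntests times, pick the one with the smallest mean time. */
void selection_record(selection_t *s, int impl, double seconds)
{
    assert(impl >= 0 && impl < s->nimpl);
    if (s->winner >= 0)
        return;                       /* decision already made */
    s->sum[impl] += seconds;
    s->count[impl]++;

    for (int i = 0; i < s->nimpl; i++)
        if (s->count[i] < s->ntests)
            return;                   /* test phase not finished yet */

    int best = 0;
    for (int i = 1; i < s->nimpl; i++)
        if (s->sum[i] / s->count[i] < s->sum[best] / s->count[best])
            best = i;
    s->winner = best;
}
```

    In an application loop, each iteration would call selection_next, run and time the chosen implementation (for example with MPI_Wtime), and pass the elapsed time to selection_record. Once the winner field is set, every later iteration runs that implementation.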

    ADCL currently incorporates two separate runtime selection algorithms. A brute-force search strategy evaluates all available implementations before choosing the one that leads to the best performance. While this approach is guaranteed to find the best-performing implementation on a given platform, it has the drawback that testing all implementations can take a significant amount of time. In order to speed up the selection logic, an alternative runtime heuristic based on attributes characterizing an implementation has been developed. The heuristic assumes that the fastest implementation for a given problem size in a given execution environment is also the implementation that has 'optimal' values for its attributes. The algorithm therefore tries to determine the optimal value for each attribute used to characterize an implementation. Once the optimal value for an attribute has been found, the library removes all implementations that do not have this value for the corresponding attribute and thus shrinks the list of available implementations.
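
    The attribute-based pruning can be illustrated with a small C sketch. It is an assumption-laden illustration rather than the algorithm used inside ADCL: impl_t, differs_only_in and select_by_attributes are hypothetical names, the time field stands in for a runtime measurement, and for each attribute the sketch only measures implementations that differ from a reference implementation in that one attribute.

```c
#include <assert.h>

#define NATTR 2   /* hypothetical number of attributes per implementation */

/* Hypothetical description of one implementation; not an ADCL structure. */
typedef struct {
    int    attr[NATTR];  /* value of each attribute */
    double time;         /* time a measurement would return (stand-in) */
    int    active;       /* 1 while the implementation is still a candidate */
    int    measured;     /* 1 once the implementation has been measured */
} impl_t;

/* True if x and ref have the same value for every attribute except a. */
static int differs_only_in(const impl_t *x, const impl_t *ref, int a)
{
    for (int b = 0; b < NATTR; b++)
        if (b != a && x->attr[b] != ref->attr[b])
            return 0;
    return 1;
}

/* Determine the best value of one attribute at a time and drop every
 * implementation that does not have it. Returns the index of the first
 * remaining implementation, or -1 if there is none. *nmeasured receives
 * the number of implementations that had to be measured. */
int select_by_attributes(impl_t *impl, int n, int *nmeasured)
{
    *nmeasured = 0;
    for (int a = 0; a < NATTR; a++) {
        /* Use the first remaining candidate as the reference. */
        int ref = -1;
        for (int i = 0; i < n; i++)
            if (impl[i].active) { ref = i; break; }
        if (ref < 0)
            return -1;

        /* Measure the candidates that vary only attribute a. */
        int best = ref;
        for (int i = 0; i < n; i++) {
            if (!impl[i].active || !differs_only_in(&impl[i], &impl[ref], a))
                continue;
            if (!impl[i].measured) {
                impl[i].measured = 1;
                (*nmeasured)++;
            }
            if (impl[i].time < impl[best].time)
                best = i;
        }

        /* Shrink the candidate list to the best value of attribute a. */
        int best_value = impl[best].attr[a];
        for (int i = 0; i < n; i++)
            if (impl[i].active && impl[i].attr[a] != best_value)
                impl[i].active = 0;
    }
    for (int i = 0; i < n; i++)
        if (impl[i].active)
            return i;
    return -1;
}
```

    With two attributes of two values each, the sketch measures three of the four implementations. The saving over the brute-force search grows with the number of attributes and values, but the result is only as good as the assumption that the attributes can be optimized independently of each other.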

    Relevant Publications

    • ADCL Homepage
    • Saber Feki and Edgar Gabriel, 'Incorporating Historic Knowledge into a Communication Library for Self-Optimizing High Performance Computing Applications', accepted for publication at the Second IEEE International Conference on Self-Adaptive and Self-Organizing Systems, Venice, Italy, October 20-24, 2008.
    • Mohamad Chaarawi, Jeff Squyres, Edgar Gabriel, Saber Feki, 'A Tool for Optimizing Runtime Parameters of Open MPI', accepted for publication at the EuroPVM/MPI conference, September 7-10, 2008, Dublin, Ireland.
    • Edgar Gabriel, Saber Feki, Katharina Benkert, and Michael M. Resch, 'Towards Performance and Portability through Runtime Adaption for High Performance Computing Applications', accepted for publication at the International Supercomputing Conference, June 17-20, 2008, Dresden, Germany.
    • Edgar Gabriel, Saber Feki, Katharina Benkert, and Mohamad Chaarawi, 'The Abstract Data and Communication Library', accepted for publication in 'Journal of Algorithms and Computational Technology', special issue on Computation Science for Medicine, Energy and Environment Applications, to appear 2008.
    • Katharina Benkert, Edgar Gabriel, and Michael M. Resch, 'Outlier Detection in Performance Data of Parallel Applications', in Proceedings of the 9th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing, held in conjunction with IPDPS 2008, Miami, FL, USA, March 2008.
    • Edgar Gabriel and Shuo Huang, 'Runtime Optimization of Application Level Communication Patterns', 12th International Workshop on High-Level Parallel Programming Models and Supportive Environments, held in conjunction with IPDPS 2007, Long Beach, CA, USA, March 26, 2007.