Upon encountering a global operation, the compiler generates a routine for that operation that implements both the data exchange and the reduction operator. The routine is derived from a template that encodes a communication algorithm for the target architecture. The "operator space holder" in the template is textually replaced with the actual text of the operator used in the global operation.
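By way of illustration only, the substitution step can be sketched with a C preprocessor macro standing in for the template. The macro name `DEFINE_HYPERCUBE_ALLREDUCE`, its parameters `NAME`, `TYPE` and `OP`, and the in-memory array that simulates the values held by the processors are all assumptions made for this sketch; they are not taken from the described compiler, which would perform real inter-processor communication.

```c
#include <assert.h>

/*
 * Hypothetical template. OP plays the role of the operator space holder;
 * NAME and TYPE are illustrative parameters. node_val[rank] simulates the
 * value held by processor `rank`; no real communication takes place.
 * The exchange pattern is a hypercube (recursive doubling) all-reduce,
 * which assumes a power-of-two processor count and an operator that is
 * associative and commutative.
 */
#define DEFINE_HYPERCUBE_ALLREDUCE(NAME, TYPE, OP)                        \
    static void NAME(TYPE *node_val, int nprocs)                          \
    {                                                                     \
        assert(nprocs > 0 && (nprocs & (nprocs - 1)) == 0);               \
        for (int bit = 1; bit < nprocs; bit <<= 1) {                      \
            for (int rank = 0; rank < nprocs; ++rank) {                   \
                int partner = rank ^ bit;                                 \
                if (rank < partner) {                                     \
                    TYPE combined = node_val[rank] OP node_val[partner];  \
                    node_val[rank] = combined;                            \
                    node_val[partner] = combined;                         \
                }                                                         \
            }                                                             \
        }                                                                 \
    }

/* Two instances produced by substituting the operator text. */
DEFINE_HYPERCUBE_ALLREDUCE(allreduce_sum_int, int, +)
DEFINE_HYPERCUBE_ALLREDUCE(allreduce_prod_double, double, *)
```

Calling `allreduce_sum_int` on an array of four simulated processor values leaves the same total in every element. Only infix operators fit this particular sketch; a user-defined function would require a template that substitutes a call expression instead.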
For a base communication algorithm there are a number of cases, e.g.,
- Base algorithm (e.g., hypercube, divide & conquer, variants) (~4 cases)
- Full or partial groups (2 cases)
- Vector or scalar (2 cases)
- User-defined function (x cases)
- Data types (7 cases)
These variants can result in thirty or more instances.
The final, desired case is extracted from the template by pattern substitution and conditional compilation. This procedure is outlined below.
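As a rough illustration of the conditional-compilation part of this extraction, a single template source may carry every variant and let preprocessor symbols select one. The symbols `GLOBAL_OP`, `GLOBAL_TYPE` and `GLOBAL_VECTOR` and the function `combine` are hypothetical names chosen for this sketch, and the defaults shown are assumptions rather than values from the described system.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical selection symbols; a build step would normally set them,
 * e.g. on the compiler command line. Defaults apply when none are given. */
#ifndef GLOBAL_OP
#define GLOBAL_OP +            /* stands in for the operator space holder */
#endif
#ifndef GLOBAL_TYPE
#define GLOBAL_TYPE double     /* one of the supported data types */
#endif
#ifndef GLOBAL_VECTOR
#define GLOBAL_VECTOR 1        /* 1 = vector case, 0 = scalar case */
#endif

#if GLOBAL_VECTOR
/* Vector case: combine the partner's contribution element by element. */
static void combine(GLOBAL_TYPE *mine, const GLOBAL_TYPE *theirs, size_t len)
{
    assert(mine != NULL && theirs != NULL);
    for (size_t i = 0; i < len; ++i)
        mine[i] = mine[i] GLOBAL_OP theirs[i];
}
#else
/* Scalar case: only a single value is combined; len is ignored. */
static void combine(GLOBAL_TYPE *mine, const GLOBAL_TYPE *theirs, size_t len)
{
    assert(mine != NULL && theirs != NULL);
    (void)len;
    mine[0] = mine[0] GLOBAL_OP theirs[0];
}
#endif
```

Only the branch selected by `GLOBAL_VECTOR` reaches the compiler, so each generated routine contains a single case rather than runtime tests for every variant.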