Loop interchange is a performance optimization technique that is used to improve the loop's memory access pattern and potentially enable vectorization. Loop interchange if applied correctly can yield a huge performance improvement.
Loop interchange as an optimization technique is applicable to loop nests. A loop nest consists of two or more nested loops. After loop interchange, a loop that is once outer has now become inner, and the loop that was once inner has become outer.
To illustrate loop interchange, consider the following example:
for (int i = 0; i < n; i++) {
for (int j = 0; j < n; j++) {
b[i] += c[j][i];
}
}
If we analyze the memory access pattern
for the arrays b
and c
in the context of the innermost loop, we can
determine that the access to array b
is constant, whereas access to the
matrix c
is strided. By doing loop interchange, we get the following loop
nest:
for (int j = 0; j < n; j++) {
for (int i = 0; i < n; i++) {
b[i] += c[j][i];
}
}
Notice that the loop over j
, that was originally the inner loop, is now the
outer loop. Similarly, the loop over i
that was originally an outer loop is
now the inner loop.
The memory access pattern for b[i]
is now sequential (originally it was
constant) and the memory access pattern for c[j][i]
is now sequential too
(originally it was strided). The memory access pattern allows this loop to be
vectorized.
Loop interchange is possible if the following conditions are fulfilled:
-
The loops are perfectly nested: all the statements are inside the innermost loop. If this is not the case, often it is possible to make them perfectly nested, see Perfect-loop nesting on how to do it.
-
The value of the inner loop's iterator variable should be independent of the value of the outer's loop iterator variable.
-
Sometimes loop interchange is not possible in the presence of loop-carried dependencies.