OPEN CATALOG

# Perfect-loop nesting

In the context of program optimizations, perfect loop nesting happens when all the statements in a loop nest of two or more loops are in the innermost loop.

Consider following example:

``````for (int i = 0; i < n; i++) {
for (int j = 0; i < n; i++) {
c[i] += a[j] + b[j][i];
}
}
``````

In the above example of two nested loops, all the statements are in the innermost loop. We say that the loops are perfectly nested. Contrast this with the following example:

``````for (int i = 0; i < n; i++) {
c[i] = 0.0;
for (int j = 0; i < n; i++) {
c[i] += a[j] + b[j][i];
}
}
``````

In this example, the loops are not perfectly nested because of the statement `c[i] = 0.0` on line 2.
Perfect-loop nesting is very important for loop optimizations, since it enables other optimization techniques like loop interchange or loop tiling.

## Converting imperfectly-nested loops to perfectly nested#

As already mentioned, perfect loop nesting is a prerequisite for loop interchange. Luckily, many times imperfectly nested loops can be converted to perfectly nested. Here are three examples:

### Example 1: Isolating statements that prevent perfect-loop nesting into a separate loop #

``````for (int i = 0; i < n; i++) {
c[i] = 0.0;
for (int j = 0; i < n; i++) {
c[i] += a[j] + b[j][i];
}
}
``````

This loop can be made perfectly nested by extracting the statement `c[i] = 0.0` on line 2 into a separate loop, like this:

``````for (int i = 0; i < n; i++) {
c[i] = 0.0;
}

for (int i = 0; i < n; i++) {
for (int j = 0; i < n; i++) {
c[i] += a[j] + b[j][i];
}
}
``````

The loop nest on lines 5-10 is perfectly nested and can profit from other optimizations.

### Example 2: Promoting temporary scalars to vectors to enable perfect loop nesting #

A second example of imperfectly nested loops:

``````for (int i = 0; i < n; i++) {
double sum = 0.0;
double dot = 0.0;
for (int j = 0; i < n; i++) {
sum += a[j] + b[j][i];
dot += a[j] * b[i][j];
}
c[i] = dot / sum;
}
``````

In this example, three statements on lines 2, 3 and 8 prevent perfect loop nesting. Conversion to perfect loop nesting is possible by promoting scalar variables `sum` and `dot` to arrays and splitting the loop into three smaller loops:

``````double* sum_arr = malloc(sizeof(double) * n);
double* dot_arr = malloc(sizeof(double) * n);

for (int i = 0; i < n; i++) {
sum_arr[i] = 0.0;
dot_arr[i] = 0.0;
}

for (int i = 0; i < n; i++) {
for (int j = 0; i < n; i++) {
sum_arr[i] += a[j] + b[j][i];
dot_arr[i] += a[j] * b[i][j];
}
}

for (int i = 0; i < n; i++) {
c[i] = dot_arr[i] / sum_arr[i];
}
free(sum_arr);
free(dot_arr);
``````

The loop nest on lines 9-14 is now perfectly nested and can be optimized further.

### Example 3: #

The last example looks like this:

``````for (int i = 0; i < n; i++) {
double* a_ptr = a + n * i;
double* b_ptr = b + i;

for (int j = 0; j < n; j++) {
*a_ptr = *b_ptr;
a_ptr++;
b_ptr+=n;
}
}
``````

This example uses pointer-based notation to access the elements of the array. By converting the index-based notation, we can eliminate the statements on lines 2 and 3 preventing perfect loop nesting:

``````for (int i = 0; i < n; i++) {
for (int j = 0; j < n; j++) {
a[j + i * n] = b[i + j * n];
}
}
``````

By moving to the index based notations, we made the loops perfectly nested with the potential to enable other optimizations.

Resources