This term is part of the glossary of the open catalog of best practice rules for performance that is automatically detected and reported by Codee.
In the context of program optimizations, perfect loop nesting happens when all the statements in a loop nest of two or more loops are in the innermost loop.
Consider following example:
for (int i = 0; i < n; i++) {
for (int j = 0; i < n; i++) {
c[i] += a[j] + b[j][i];
}
}
In the above example of two nested loops, all the statements are in the innermost loop. We say that the loops are perfectly nested. Contrast this with the following example:
for (int i = 0; i < n; i++) {
c[i] = 0.0;
for (int j = 0; i < n; i++) {
c[i] += a[j] + b[j][i];
}
}
In this example, the loops are not perfectly nested because of the statement c[i] = 0.0
on line 2.
Perfect-loop nesting is very important for loop optimizations, since it enables other optimization techniques like loop interchange or loop tiling.
Converting imperfectly-nested loops to perfectly nested
As already mentioned, perfect loop nesting is a prerequisite for loop interchange. Luckily, many times imperfectly nested loops can be converted to perfectly nested. Here are three examples:
Example 1: Isolating statements that prevent perfect-loop nesting into a separate loop
for (int i = 0; i < n; i++) {
c[i] = 0.0;
for (int j = 0; i < n; i++) {
c[i] += a[j] + b[j][i];
}
}
This loop can be made perfectly nested by extracting the statement c[i] = 0.0
on line 2 into a separate loop, like this:
for (int i = 0; i < n; i++) {
c[i] = 0.0;
}
for (int i = 0; i < n; i++) {
for (int j = 0; i < n; i++) {
c[i] += a[j] + b[j][i];
}
}
The loop nest on lines 5-10 is perfectly nested and can profit from other optimizations.
Example 2: Promoting temporary scalars to vectors to enable perfect loop nesting
A second example of imperfectly nested loops:
for (int i = 0; i < n; i++) {
double sum = 0.0;
double dot = 0.0;
for (int j = 0; i < n; i++) {
sum += a[j] + b[j][i];
dot += a[j] * b[i][j];
}
c[i] = dot / sum;
}
In this example, three statements on lines 2, 3 and 8 prevent perfect loop nesting. Conversion to perfect loop nesting is possible by promoting scalar variables sum
and dot
to arrays and splitting the loop into three smaller loops:
double* sum_arr = malloc(sizeof(double) * n);
double* dot_arr = malloc(sizeof(double) * n);
for (int i = 0; i < n; i++) {
sum_arr[i] = 0.0;
dot_arr[i] = 0.0;
}
for (int i = 0; i < n; i++) {
for (int j = 0; i < n; i++) {
sum_arr[i] += a[j] + b[j][i];
dot_arr[i] += a[j] * b[i][j];
}
}
for (int i = 0; i < n; i++) {
c[i] = dot_arr[i] / sum_arr[i];
}
free(sum_arr);
free(dot_arr);
The loop nest on lines 9-14 is now perfectly nested and can be optimized further.
Example 3:
The last example looks like this:
for (int i = 0; i < n; i++) {
double* a_ptr = a + n * i;
double* b_ptr = b + i;
for (int j = 0; j < n; j++) {
*a_ptr = *b_ptr;
a_ptr++;
b_ptr+=n;
}
}
This example uses pointer-based notation to access the elements of the array. By converting the index-based notation, we can eliminate the statements on lines 2 and 3 preventing perfect loop nesting:
for (int i = 0; i < n; i++) {
for (int j = 0; j < n; j++) {
a[j + i * n] = b[i + j * n];
}
}
By moving to the index based notations, we made the loops perfectly nested with the potential to enable other optimizations.
Resources

Building performance into the code from day one with Codee