This recommendation is part of the open catalog of performance best-practice rules that Codee automatically detects and reports.
Issue
The loop contains independent blocks, some of which can be vectorized through loop fission.
Relevance
Loops often contain independent computations, some of which can be vectorized while others cannot. Extracting the vectorizable computations into a separate loop allows the compiler to vectorize that loop, with the corresponding performance increase.
Common code patterns that prevent vectorization:
- Loop-carried dependencies – when the data needed in the current iteration of the loop depends on the data from the previous iteration of the loop.
- Conditional statements in the loop body that depend on the data – the vectorization paradigm requires that the same operation is applied to all the data in the vector. Conditional statements in the loop body break this requirement – depending on the data value, the loop is performing a different operation.
- Non-sequential memory accesses – these do not inhibit vectorization per se, but they often make it unprofitable, in which case the compiler may decline to vectorize the loop.
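The three patterns above can be sketched as small C loops. This is only an illustrative sketch: the function names (`prefix_sum`, `clamp_or_double`, `gather`) are hypothetical and not part of the catalog, and whether a particular compiler vectorizes any of these loops depends on the compiler, its flags, and the target hardware.

```c
#include <stddef.h>

/* Loop-carried dependency: iteration i reads the value that
   iteration i - 1 has just written. */
void prefix_sum(int *a, size_t n) {
    for (size_t i = 1; i < n; i++) {
        a[i] += a[i - 1];
    }
}

/* Data-dependent conditional: which operation is applied
   depends on the value stored in a[i]. */
void clamp_or_double(int *a, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (a[i] < 0) {
            a[i] = 0;
        } else {
            a[i] *= 2;
        }
    }
}

/* Non-sequential memory access: elements of a are read through
   an index array, so consecutive iterations may touch distant
   memory locations. */
void gather(int *out, const int *a, const size_t *idx, size_t n) {
    for (size_t i = 0; i < n; i++) {
        out[i] = a[idx[i]];
    }
}
```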
Actions
Extract vectorizable computations to a new loop isolated from non-vectorizable computations.
Code example
The second loop in the following example exhibits a forall and a sparse reduction compute pattern for A and B, respectively. The sparse reduction will inhibit loop vectorization.
void example() {
  int A[1000], B[1000], C[1000];
  for (int i = 0; i < 1000; i++) {
    A[i] = B[i] = C[i] = i;
  }
  for (int i = 0; i < 1000; i++) {
    A[i] += i;     // forall pattern on A
    B[C[i]] += i;  // sparse reduction on B
  }
}
After splitting the loop into two, the loop containing the forall pattern can be vectorized:
void example() {
  int A[1000], B[1000], C[1000];
  for (int i = 0; i < 1000; i++) {
    A[i] = B[i] = C[i] = i;
  }
  // forall pattern on A: vectorizable
  for (int i = 0; i < 1000; i++) {
    A[i] += i;
  }
  // sparse reduction on B: kept in its own loop
  for (int i = 0; i < 1000; i++) {
    B[C[i]] += i;
  }
}
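Loop fission is only valid when the separated statements do not depend on each other, as is the case here because A and B are distinct arrays. The following sketch checks that the fused and split versions leave A and B in the same state. The function names and the index pattern used to fill C are assumptions made for illustration; they do not come from the example above.

```c
#include <string.h>

enum { N = 1000 };

/* Original form: forall update of A and sparse reduction on B
   in a single loop. restrict states that the arrays do not overlap. */
static void update_fused(int *restrict A, int *restrict B,
                         const int *restrict C) {
    for (int i = 0; i < N; i++) {
        A[i] += i;
        B[C[i]] += i;
    }
}

/* After fission: each pattern has its own loop. */
static void update_split(int *restrict A, int *restrict B,
                         const int *restrict C) {
    for (int i = 0; i < N; i++) {
        A[i] += i;
    }
    for (int i = 0; i < N; i++) {
        B[C[i]] += i;
    }
}

/* Returns 1 when both versions produce identical A and B. */
int fission_preserves_results(void) {
    int A1[N], B1[N], A2[N], B2[N], C[N];
    for (int i = 0; i < N; i++) {
        A1[i] = A2[i] = B1[i] = B2[i] = i;
        /* Hypothetical index pattern with repeated indices, so several
           iterations update the same element of B. */
        C[i] = (i * 7) % 100;
    }
    update_fused(A1, B1, C);
    update_split(A2, B2, C);
    return memcmp(A1, A2, sizeof A1) == 0 &&
           memcmp(B1, B2, sizeof B1) == 0;
}
```

To see which loops the compiler actually vectorized, a vectorization report can help, for example `-fopt-info-vec` with GCC or `-Rpass=loop-vectorize` with Clang, with optimization enabled (such as `-O3`). The output depends on the compiler version and target.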
