Best practices for performance

Column-major access pattern

Accesses to a two-dimensional array follow a column-major pattern when the code iterates over the columns sequentially and, within each column, visits every row in turn.

for (j = 0; j < COLS; ++j) {
  for (i = 0; i < ROWS; ++i) {
    A[i][j] = …
  }
}

Note that in the pseudocode the outer loop drives the second dimension of the array (the column index j), while the inner loop drives the first dimension (the row index i).
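
The pattern can be made concrete with a minimal C sketch. The array sizes (ROWS = 3, COLS = 4) and the function names fill_column_major and inner_loop_stride are hypothetical, chosen only for illustration. The sketch records the order in which each element is visited and, assuming C's row-major storage of multidimensional arrays, reports how many elements apart two consecutive inner-loop accesses are in memory.

```c
#include <stddef.h>

/* Illustrative sizes only; they are not taken from the original text. */
#define ROWS 3
#define COLS 4

/* Traverse A in column-major order: the outer loop walks the columns (j),
 * the inner loop walks the rows (i). Each element receives the sequence
 * number of its visit, so A[i][j] ends up holding j * ROWS + i. */
void fill_column_major(int A[ROWS][COLS])
{
    int visit = 0;
    for (size_t j = 0; j < COLS; ++j) {
        for (size_t i = 0; i < ROWS; ++i) {
            A[i][j] = visit++;
        }
    }
}

/* C lays out A row by row, so A[i][j] and A[i + 1][j] are one full row
 * apart. This returns that distance in elements, which equals COLS. */
size_t inner_loop_stride(int A[ROWS][COLS])
{
    return sizeof(A[0]) / sizeof(A[0][0]);
}
```

With these sizes, consecutive inner-loop iterations touch elements that are 4 ints apart rather than adjacent ones, which is why this traversal order is relevant to performance in a language that stores arrays row by row.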