This article addresses single-core performance optimizations that make efficient use of the memory available in the hardware. We use Codee to pinpoint opportunities in the code to benefit from loop interchange, which enables sequential memory accesses and favors vectorization. Why is Loop Interchange important? Loop interchange is a performance optimization technique that is used […]
loop interchange
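To make the idea in the excerpt above concrete, here is a minimal sketch of loop interchange in C. It is not taken from the article and is not Codee output; the function names `sum_strided` and `sum_contiguous` and the flat row-major array layout are assumptions chosen for illustration. Both functions compute the same sum, but the second swaps the two loops so the inner loop visits consecutive memory addresses.

```c
#include <stddef.h>

/* Hypothetical example: `a` is a rows x cols matrix stored row-major
 * in a flat array, so element (i, j) lives at a[i * cols + j]. */

/* Before interchange: the inner loop walks down a column, so each
 * access jumps `cols` elements ahead in memory (strided access). */
double sum_strided(const double *a, size_t rows, size_t cols) {
    double total = 0.0;
    for (size_t j = 0; j < cols; ++j) {
        for (size_t i = 0; i < rows; ++i) {
            total += a[i * cols + j];
        }
    }
    return total;
}

/* After interchange: the inner loop walks along a row, so accesses
 * are sequential in memory, which is friendlier to the cache and
 * easier for a compiler to vectorize. */
double sum_contiguous(const double *a, size_t rows, size_t cols) {
    double total = 0.0;
    for (size_t i = 0; i < rows; ++i) {
        for (size_t j = 0; j < cols; ++j) {
            total += a[i * cols + j];
        }
    }
    return total;
}
```

The interchange is legal here because the iterations are independent apart from the accumulation into `total`. With floating-point data, the changed summation order can alter rounding in the last bits, so the two versions are only guaranteed to match exactly when every partial sum is exactly representable.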
Case Study: How we made the Canny edge detector algorithm run faster? (part 1)
The Canny edge detector is a famous algorithm in image processing. One of our customers asked about its performance, so our performance engineers took a look. As always when dealing with unfamiliar codebases, we didn’t know what to expect. Some algorithms have a lot of room for performance improvements, others don’t. […]