08/03/2021 By Ivica Bogosavljević

Case Study: How we made the Canny edge detector run faster (part 2)

In the previous post we talked about the performance improvement we obtained for the Canny edge detection algorithm. Most of the changes we made there focused on hot loops: making memory accesses sequential and making the loops longer, which in turn allowed the compiler to autovectorize them. This made our program run significantly faster because we used the hardware resources at our disposal more efficiently.

Most modern processors, including embedded processors, have more than one CPU core. Our next performance improvement is therefore to take advantage of the additional CPU cores.

In this post we use OpenMP to achieve parallelization. OpenMP is a standard that allows porting single-threaded code bases to multicore systems using compiler pragmas and special library calls. The good thing about OpenMP is that all major compilers support it and that it is simple to port existing single-threaded code to a multi-threaded environment. If the compiler does not support OpenMP, it ignores the pragmas and you are still left with well-built, single-threaded code.

As a starting point we are using the optimized single-threaded Canny source code which we created in the previous post 1. Ideally you want to start with a serial source code that is well optimized for a single processor; parallelizing an unoptimized code base will also yield performance improvements, but it can happen that a well-optimized single-threaded version is as fast as an unoptimized multithreaded version. And multithreading comes with downsides: increased energy consumption and overhead that becomes significant when the input size is small.

Profiling

As is usually the case in performance analysis, we start off with profiling. We want to see which functions have the longest runtime and focus our work there. Here is the flamegraph of the Canny algorithm for a really large image (15360 x 8640 pixels).

The functions that dominate the profile are non_max_supp, gaussian_smooth and apply_hysteresis. Functions derivative_x_y and magnitude_x_y take only a few percent of the runtime, but if we manage to speed up the other functions, they too may become significant in the profile.

A note: in the following tests we are measuring only the runtime of the Canny algorithm. The main part of the program is the algorithm itself, but there are a few other operations, like reading and writing the image and parsing arguments, that also take time; we are not measuring these non-algorithmic parts.

Analysis

Function non_max_supp

The function that takes the largest amount of time is non_max_supp. If we look at the source code (link to the source code), we see that it processes the image pixel by pixel, traversing the image row-wise. This is a good pattern because it uses the available memory bandwidth optimally. The function contains a loop with many nested if conditions. These are difficult to parallelize using SIMD 2; however, splitting the workload between several CPU cores is perfectly suitable for parallelizing complex control flows.

The only thing we need to ensure is that the iterations of the outer loop are completely independent of one another 3. This was simple to check because each iteration writes to a different element of the output array resultptr, which is the only output array in the program.

Originally, the loop header looked like this (click here for the link to the full code of the loop):

   for (rowcount = 1, magrowptr = mag + ncols + 1,
        gxrowptr = gradx + ncols + 1, gyrowptr = grady + ncols + 1,
        resultrowptr = result + ncols + 1;
         rowcount < nrows - 2; rowcount++, magrowptr += ncols,
        gyrowptr += ncols, gxrowptr += ncols, resultrowptr += ncols) {
        for (colcount = 1, magptr = magrowptr, gxptr = gxrowptr,
            gyptr = gyrowptr, resultptr = resultrowptr;
             colcount < ncols - 2;
             colcount++, magptr++, gxptr++, gyptr++, resultptr++) { 
             m00 = *magptr;
             ...
             if (...) {
                  if (...){
                      if (...) {
                         ... 
                    }  else {
                       ...
                    }
                }  else  {
                    ...
                }
            } else  {
               ...
            }

           *resultptr = ...

    }

This is a very complicated loop header and it cannot be parallelized using OpenMP, since OpenMP requires loops to satisfy the canonical loop form. The essence of the canonical loop form is that the loop trip count can be determined before the loop starts (more details in section 2.6 of the OpenMP specification). To achieve this, a loop needs to:

  • Be a for loop (while and do-while loops are not supported).
  • Use a single loop index variable.
  • Have an initialization that is a simple assignment of a start value to the loop index variable (e.g. i = 0 or i = m).
  • Have a simple arithmetic loop condition in which no variable except the loop index changes inside the loop (e.g. i < N, where N does not change inside the loop).
  • Have an increment known at compile time (i++ or i += 2).

The code inside the loop body dereferences induction pointers to access data (*magptr, *gxptr, *resultptr, lines 10 and 26). What we want instead is to access data with array-based notation (mag[index], gradx[index], etc.). This transformation allows us to verify easily that the iterations are independent, and it also makes it easier for the compiler to optimize our code.


In our case, it was possible to transform the two loops to OpenMP canonical form. Here is what the header looks like:

for (rowcount = 1; rowcount < nrows - 2; rowcount++) {
        int row = ncols * rowcount; 
        for (colcount = 1; colcount < ncols - 2; colcount++) {
            int index = row + colcount;
            m00 = mag[index];
            if(...) {
                  ….
            }
            else {
                …
            }
            …
            result[index] = ...
        }
    }

Both loops are now simple loops that comply with the OpenMP canonical form. We introduced two new variables row and index to calculate the offset in the input and output arrays. For example, instead of accessing array mag using *magptr, where magptr is an induction pointer variable whose access pattern is not easily understood, we are accessing using mag[index] (line 5).

To distribute the loop to multiple CPU cores, we use OpenMP pragmas. Here is the source code of the previous loops with the OpenMP pragma. Note that no modification was made to the loop body; the only thing we did was add the #pragma.

 #pragma omp parallel for shared(result, mag, gradx, grady, ncols, nrows) private(rowcount, colcount) default(none)
    for (rowcount = 1; rowcount < nrows - 2; rowcount++) {
        int row = ncols * rowcount; 
        for (colcount = 1; colcount < ncols - 2; colcount++) {
            int index = row + colcount;
            m00 = mag[index];
            if(...) {
                  ….
            }
            else {
                …
            }
            …
            result[index] = ...
        }
    }

Note the #pragma … on line 1. This pragma will tell the compiler to create a version of the loop that can be distributed to multiple CPU cores. Here is the meaning of the keywords in the pragma:

  • omp – marks the pragma as OpenMP pragma.
  • parallel – creates a parallel region. Inside the region, the same code will be executed by several threads.
  • for – distributes the for loop among several threads that will run concurrently on different CPU cores. It also splits the iteration space according to the number of threads and assigns each part of the space to a thread.
  • shared(...) – variables listed inside the parenthesis are shared among the threads. All read-only variables (mag, gradx, grady, ncols, nrows) should be shared to prevent unnecessary copying. Output array result is also shared because each thread writes to a different part of the array.
  • private(...) – variables inside the parentheses are private to each thread, i.e. each thread has its own copy. The loop index variable should be private because each thread will get a range of values for the loop index variable to work on (e.g. thread 0 will get rowcount from 0 to 199, thread 1 will get rowcount from 200 to 399, etc). Temporary variables used inside the loop body are also private; you can optionally move their declaration inside the loop body because all the variables declared there are automatically private.
  • default(none) – force the programmer to mark all the variables used inside the loop with a data scoping attribute. This is a good practice to avoid bugs.

The full source code of the function after the optimization is available here. The loop runtime went down from 1.67 seconds to 0.25 seconds in an environment with eight threads. That’s a speedup of about 6.7. The overall Canny algorithm runtime went down from 3.73 seconds to 2.3 seconds.

Function gaussian_smooth

Function gaussian_smooth was the second function with the longest runtime that was showing up in the performance profile.

The source code of the function gaussian_smooth is available here; the relevant part is given below. We show the OpenMP version directly: we didn't change anything in the source code itself, we only added OpenMP pragmas to enable parallelization:

 #pragma omp parallel shared(rows,cols,center,image,kernel,tempim,smoothedim) private(r,c,cc,rr) default(none)
    {
        float* dot_arr = calloc(cols, sizeof(float));
        float* sum_arr = calloc(cols, sizeof(float));

        #pragma omp for
        for (r = 0; r < rows; r++) {
            memset(dot_arr, 0, cols * sizeof(float));
            memset(sum_arr, 0, cols * sizeof(float));
            for (cc = (-center); cc <= center; cc++) {
                for (c = MAX(0, -cc); c < MIN(cols, cols - cc); c++) {
                    dot_arr[c] +=
                        (float)image[r * cols + (c + cc)] * kernel[center + cc];
                    sum_arr[c] += kernel[center + cc];
                }
            }

            for (c = 0; c < cols; c++) {
                tempim[r * cols + c] = dot_arr[c] / sum_arr[c];
            }
        }

        /****************************************************************************
         * Blur in the y - direction.
         ****************************************************************************/
        #pragma omp single
        if (VERBOSE)
            printf("   Bluring the image in the Y-direction.\n");
        
        #pragma omp for
        for (r = 0; r < rows; r++) {
            memset(dot_arr, 0, cols * sizeof(float));
            memset(sum_arr, 0, cols * sizeof(float));
            for (rr = (-center); rr <= center; rr++) {
                if (((r + rr) >= 0) && ((r + rr) < rows)) {
                    for (c = 0; c < cols; c++) {
                        dot_arr[c] +=
                            tempim[(r + rr) * cols + c] * kernel[center + rr];
                        sum_arr[c] += kernel[center + rr];
                    }
                }
            }
            for (c = 0; c < cols; c++) {
                (*smoothedim)[r * cols + c] =
                    (short int)(dot_arr[c] * BOOSTBLURFACTOR / sum_arr[c] + 0.5);
            }
        }
        free(dot_arr);
        free(sum_arr);
    }

The function gaussian_smooth consists of two loop nests (lines 7-21 and lines 31-47). We used OpenMP pragmas to annotate the source code to allow the distribution to multiple CPU cores.

Pragma #pragma omp parallel at line 1 creates a parallel region. We use shared and private clauses to mark data as private or shared, according to the principles we already explained. Inside the parallel region, each line of code is executed by several threads. So, each thread will allocate a buffer using calloc at lines 8 and 9.

The loops at lines 7 and 31 comply with the OpenMP canonical form, so to parallelize them we only added #pragma omp for. Note line 26, where we added #pragma omp single: we want only one thread to perform the printf, and this pragma makes sure that only one thread executes the command.

So, how did it help reduce the runtime? The function runtime went down from 703 ms to 118 ms on a system with eight threads. The overall Canny algorithm runtime fell from 2.3 seconds to 1.7 seconds.

Function apply_hysteresis

There are several loops inside the function apply_hysteresis, but two of them have a significant performance impact. 

The first loop

Here is the source code of the first loop inside apply_hysteresis; we omitted the rest of the code for brevity:

void apply_hysteresis(short int *mag, unsigned char *nms, int rows, int cols,
    float tlow, float thigh, unsigned char *edge) {
…
    for (r = 0, pos = 0; r<rows; r++){
        for (c = 0; c<cols; c++, pos++){
            if (edge[pos] == POSSIBLE_EDGE) hist[mag[pos]]++;
        }
    }
…
}

This loop (we call it the histogram computation loop) iterates over the array edge and increments elements of the hist array. Unfortunately, which element of the hist array gets incremented is known only at runtime.

The expression hist[mag[pos]] is the spot where we can expect many data cache misses and a huge slowdown; distributing this code to multiple CPU cores can bring a large performance increase. This kind of code can be distributed to multiple CPU cores according to the following scheme:

  • Each thread gets its own private copy of the hist array.
  • We distribute the workload to several threads. Each thread gets a part of the iteration space (e.g. thread 0 gets iterations 0-9999, thread 1 gets iteration 10000-19999, etc).
  • Threads perform the operation on their own private copy of the hist array.
  • At the end, we sum up all the private hist arrays to make the final hist array.

In Codee’s terminology, this is called sparse reduction. OpenMP version 4.5 and later supports sparse reduction with a simple syntax. We had to rewrite the loop so it conforms to the OpenMP canonical loop form. Here is the source code:

   #pragma omp parallel for private(r,c, pos) shared(edge,mag,rows,cols) reduction(+:hist) default(none)
    for (r = 0; r<rows; r++){
        pos = r * cols;
        for (c = 0; c<cols; c++, pos++){
            if (edge[pos] == POSSIBLE_EDGE) hist[mag[pos]]++;
        }
    }

We got rid of the pos variable from the initialization clause of the loop over r (line 2) to make it conform to the OpenMP canonical form. We marked variables as private or shared, depending on the way they are used.

Note the reduction(+:hist) clause. It specifies that each thread should have a local copy of the hist array. Each thread performs the operations on its own copy; when they all are done, the end array hist is created by summing all the local arrays element-wise.


As far as performance is concerned, the Canny algorithm runtime fell from 1.7 seconds to 1.45 seconds on a system with eight threads. The loop itself originally took 330 ms; after distribution to multiple threads it took 57 ms.

The second loop

The source code of the second significant loop in apply_hysteresis function looks like this:

  for (r = 0, pos = 0; r<rows; r++){
        for (c = 0; c<cols; c++, pos++){
            if ((edge[pos] == POSSIBLE_EDGE) && (mag[pos] >= highthreshold)){
                edge[pos] = EDGE;
                follow_edges((edge + pos), (mag + pos), lowthreshold, cols);
            }
        }
    }
Inside the loop nest there is a call to the follow_edges function. To tell whether we can use OpenMP pragmas to distribute the loop to several CPU cores, the iterations of the loop over r need to be independent of one another. To find out if this is the case, we need to look inside the follow_edges function.

void follow_edges(unsigned char *edgemapptr, short *edgemagptr, short lowval,
    int cols)
{
    short *tempmagptr;
    unsigned char *tempmapptr;
    int i;
    float thethresh;
    int x[8] = { 1, 1, 0, -1, -1, -1, 0, 1 },
        y[8] = { 0, 1, 1, 1, 0, -1, -1, -1 };

    for (i = 0; i<8; i++){
        tempmapptr = edgemapptr - y[i] * cols + x[i];
        tempmagptr = edgemagptr - y[i] * cols + x[i];

        if ((*tempmapptr == POSSIBLE_EDGE) && (*tempmagptr > lowval)){
            *tempmapptr = (unsigned char)EDGE;
            follow_edges(tempmapptr, tempmagptr, lowval, cols);
        }
    }
}

This code takes a pixel and iterates through the eight surrounding pixels. If a neighbor's value is POSSIBLE_EDGE, it changes that value to EDGE and then recursively calls the same function on the new pixel.

The iterations are not independent: depending on how they are distributed, we might get a different result. This kind of code cannot be efficiently parallelized without altering the structure of the algorithm, so it remains as it is, unparallelized.

Function magnitude_x_y

Here is the source code of function magnitude_x_y:

void magnitude_x_y(short int *delta_x, short int *delta_y, int rows, int cols,
    short int **magnitude)
{
    int r, c, pos, sq1, sq2;

    /****************************************************************************
    * Allocate an image to store the magnitude of the gradient.
    ****************************************************************************/
    if ((*magnitude = (short *)calloc(rows*cols, sizeof(short))) == NULL){
        fprintf(stderr, "Error allocating the magnitude image.\n");
        exit(1);
    }

    for (r = 0, pos = 0; r<rows; r++){
        for (c = 0; c<cols; c++, pos++){
            sq1 = (int)delta_x[pos] * (int)delta_x[pos];
            sq2 = (int)delta_y[pos] * (int)delta_y[pos];
            (*magnitude)[pos] = (short)(0.5 + sqrt((float)sq1 + (float)sq2));
        }
    }

}

The loop nest at lines 14-20 is computation heavy, but the iterations are independent of one another and can be easily distributed to several CPU cores. To do this, we need to convert the loop over r (line 14) to OpenMP canonical form by getting rid of the induction variable pos. The result looks like this:

  #pragma omp parallel for private(r,c,pos,sq1,sq2) shared(rows,cols,delta_x,delta_y,magnitude) default(none)
    for (r = 0; r<rows; r++){
        pos = r * cols;
        for (c = 0; c<cols; c++, pos++){
            sq1 = (int)delta_x[pos] * (int)delta_x[pos];
            sq2 = (int)delta_y[pos] * (int)delta_y[pos];
            (*magnitude)[pos] = (short)(0.5 + sqrt((float)sq1 + (float)sq2));
        }
    }

The runtime of the loop went down from 264 ms to 40 ms. The Canny algorithm runtime went from 1.45 seconds to 1.2 seconds.

Function derivative_x_y

The source code of function derivative_x_y looks like this (irrelevant parts omitted for brevity):

void derrivative_x_y(short int *smoothedim, int rows, int cols,
    short int **delta_x, short int **delta_y)
{
    int r, c, pos;
    ...
    /****************************************************************************
    * Compute the x-derivative. Adjust the derivative at the borders to avoid
    * losing pixels.
    ****************************************************************************/
    if (CANNY_LIB_VERBOSE) printf("   Computing the X-direction derivative.\n");
    for (r = 0; r<rows; r++){
        pos = r * cols;
        (*delta_x)[pos] = smoothedim[pos + 1] - smoothedim[pos];
        pos++;
        for (c = 1; c<(cols - 1); c++, pos++){
            (*delta_x)[pos] = smoothedim[pos + 1] - smoothedim[pos - 1];
        }
        (*delta_x)[pos] = smoothedim[pos] - smoothedim[pos - 1];
    }

    /****************************************************************************
    * Compute the y-derivative. Adjust the derivative at the borders to avoid
    * losing pixels.
    ****************************************************************************/
    if (CANNY_LIB_VERBOSE) printf("   Computing the Y-direction derivative.\n");
    for (c = 0; c < cols; c++) {
        pos = c;
        (*delta_y)[pos] = smoothedim[pos + cols] - smoothedim[pos];
        pos = rows * (rows - 1) + c;
        (*delta_y)[pos] = smoothedim[pos] - smoothedim[pos - cols];
    }
    for (r = 1; r < (rows - 1); r++) {
        for (c = 0; c < cols; c++) {
            pos = r * cols + c;
            (*delta_y)[pos] = smoothedim[pos + cols] - smoothedim[pos - cols];
        }
    }
}

The three loops presented above can be easily parallelized, because, for each loop, each iteration writes to a different element of the output arrays delta_x or delta_y. We can distribute them in one go using OpenMP directives. Here are the same loops, but decorated with OpenMP pragmas.

  #pragma omp parallel shared(rows,cols,delta_x,delta_y,smoothedim) private(r,c,pos) default(none)
    {
        #pragma omp for
        for (r = 0; r<rows; r++){
            pos = r * cols;
            (*delta_x)[pos] = smoothedim[pos + 1] - smoothedim[pos];
            pos++;
            for (c = 1; c<(cols - 1); c++, pos++){
                (*delta_x)[pos] = smoothedim[pos + 1] - smoothedim[pos - 1];
            }
            (*delta_x)[pos] = smoothedim[pos] - smoothedim[pos - 1];
        }

        /****************************************************************************
        * Compute the y-derivative. Adjust the derivative at the borders to avoid
        * losing pixels.
        ****************************************************************************/
        #pragma omp single
        if (CANNY_LIB_VERBOSE) printf("   Computing the Y-direction derivative.\n");
        #pragma omp for
        for (c = 0; c < cols; c++) {
            pos = c;
            (*delta_y)[pos] = smoothedim[pos + cols] - smoothedim[pos];
            pos = rows * (rows - 1) + c;
            (*delta_y)[pos] = smoothedim[pos] - smoothedim[pos - cols];
        }
        #pragma omp for
        for (r = 1; r < (rows - 1); r++) {
            for (c = 0; c < cols; c++) {
                pos = r * cols + c;
                (*delta_y)[pos] = smoothedim[pos + cols] - smoothedim[pos - cols];
            }
        }
    }

We’ve already seen a similar OpenMP pragma pattern in function gaussian_smooth, so we won’t go over it again. 

The function runtime went down from 156 ms to 52 ms and the Canny algorithm runtime went down from 1.2 seconds to 1.1 seconds.

The end profile

The end profile looks like this:

We've managed to speed up most of the functions. The ones that now dominate the profile cannot be parallelized further: apply_hysteresis and write_pgm_image. Function apply_hysteresis is partially unparallelizable; the other functions are I/O bound, and speeding those up is not possible with parallelization.

Conclusion

At the beginning of the post, the Canny algorithm took 3.7 seconds to complete. After the work was done, the time was reduced to 1.1 seconds: a speedup factor of about 3.4 on a system with eight threads. Notice that our program was not running on all eight threads the whole time.

If we add up all the loop runtimes before parallelization, we get 3.128 seconds. After parallelization the same loops took 0.517 seconds, a speedup factor of around 6, which is a much better number. The non-parallelizable part of the code takes 0.61 seconds, which means that unparallelizable functions now dominate the performance profile. This is a clear demonstration of Amdahl's law: after speeding up the parallelizable part as much as possible, the non-parallelizable part becomes the limiting factor. Even with an endless number of CPUs and zero parallelization overhead, our program would not run faster than 0.61 seconds.

When we started with Canny in the previous post, the whole program (the Canny algorithm plus reading/writing images) took 12.9 seconds. The optimized serial version of Canny which we used as a starting point in this post took 4 seconds, and the multithreaded version took 1.5 seconds. That's an overall speedup of more than eight times!

About OpenMP: OpenMP is a lightweight approach to distributing workloads across several CPU cores, supported by all major compilers. It is especially useful for legacy codebases since the porting effort is very low, but it is also useful for newly developed code. It shares the drawbacks of all other multithreading frameworks: a new class of bugs related to race conditions and unprotected access to shared variables, and overhead that matters for small input sizes.

In upcoming posts we will apply Codee to this same problem: we will check its autoparallelization capabilities and present a few ways it can help you catch bugs during development. We also plan to compare different types of loops based on their arithmetic intensity, memory access pattern, etc., and see how the multithreading speedup depends on those parameters.

🔗 Read part 1 of ‘How we made the Canny edge detector run faster‘

Additional information

The source code of both the original version and the optimized version is available in our repository. You can easily see the changes we made by comparing the source file canny-original.c and canny-optimized.c.

All the tests were executed on an AMD Ryzen 7 4800H CPU (8 cores, 16 threads) with 16 GB of RAM, using the GCC 9.3 compiler on Ubuntu 20.04. We disabled processor frequency scaling to decrease runtime variance. The reported runtimes are averages over 36 runs.


1 The version of the Canny algorithm we used as a starting point here differs slightly from the version obtained as a result of optimization in the previous post. More on this later.

2 In the SIMD world, there is no such thing as conditional execution. It is simulated by executing the commands in both the true branch and the false branch, and then using bitmasking to select the correct value depending on the condition.

3 Independent iterations mean that the input values used in the current iteration do not depend on output values produced in previous iterations. This allows us to iterate the loop in any order and the result will be the same. In the case of multithreading, we can split the iteration space 0..N into subranges 0..N1, .., Nm..N and assign each subrange to a dedicated thread.

cookielawinfo-checkbox-non-necessary011 monthsThis cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Non Necessary".
cookielawinfo-checkbox-non-necessary01 hourThis cookie is set by GDPR Cookie Consent plugin. The purpose of this cookie is to check whether or not the user has given their consent to the usage of cookies under the category 'Non-Necessary'.
DSID11 hourTo note specific user identity. Contains hashed/encrypted unique ID.
JSESSIONID1Used by sites written in JSP. General purpose platform session cookies that are used to maintain users' state across page requests.
PHPSESSID0This cookie is native to PHP applications. The cookie is used to store and identify a users' unique session ID for the purpose of managing user session on the website. The cookie is a session cookies and is deleted when all the browser windows are closed.
pmpro_visit0The cookie is set by the Paid Membership Pro plugin. The cookie is used to manage user memberships.
viewed_cookie_policy011 monthsThe cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.
viewed_cookie_policy01 hourThe cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.
Analytics

Analytical cookies are used to understand how visitors interact with the website. These cookies help provide information on metrics the number of visitors, bounce rate, traffic source, etc.

CookieTypeDurationDescription
__gads01 yearThis cookie is set by Google and stored under the name dounleclick.com. This cookie is used to track how many times users see a particular advert which helps in measuring the success of the campaign and calculate the revenue generated by the campaign. These cookies can only be read from the domain that it is set on so it will not track any data while browsing through another sites.
__stid01 yearThe cookie is set by ShareThis. The cookie is used for site analytics to determine the pages visited, the amount of time spent, etc.
_ga02 yearsThis cookie is installed by Google Analytics. The cookie is used to calculate visitor, session, camapign data and keep track of site usage for the site's analytics report. The cookies store information anonymously and assigns a randoly generated number to identify unique visitors.
_gat_gtag_UA_25587466_801 minuteGoogle uses this cookie to distinguish users.
_gat_gtag_UA_84471197_1601 minuteGoogle uses this cookie to distinguish users.
_gid01 dayThis cookie is installed by Google Analytics. The cookie is used to store information of how visitors use a website and helps in creating an analytics report of how the wbsite is doing. The data collected including the number visitors, the source where they have come from, and the pages viisted in an anonymous form.
ad-id17 monthsProvided by amazon-adsystem.com for tracking user actions on other websites to provide targeted content
demdex05 monthsThis cookie is set under the domain demdex.net and is used by Adobe Audience Manager to help identify a unique visitor across domains.
GPS030 minutesThis cookie is set by Youtube and registers a unique ID for tracking users based on their geographical location
pardot0The cookie is set when the visitor is logged in as a Pardot user.
tk_lr01 yearThis cookie is set by JetPack plugin on sites using WooCommerce. This is a referral cookie used for analyzing referrer behavior for Jetpack
tk_or05 yearsThis cookie is set by JetPack plugin on sites using WooCommerce. This is a referral cookie used for analyzing referrer behavior for Jetpack
tk_r3d03 daysThe cookie is installed by JetPack. Used for the internal metrics fo user activities to improve user experience
Advertisement

Advertisement cookies are used to provide visitors with relevant ads and marketing campaigns. These cookies track visitors across websites and collect information to provide customized ads.

CookieTypeDurationDescription
_fbp02 monthsThis cookie is set by Facebook to deliver advertisement when they are on Facebook or a digital platform powered by Facebook advertising after visiting this website.
_kuid_05 monthsThe cookie is set by Krux Digital under the domain krxd.net. The cookie stores a unique ID to identify a returning user for the purpose of targeted advertising.
ad-privacy15 yearsProvided by amazon-adsystem.com for tracking user actions on other websites to provide targeted content to the users.
ATN12 yearsThe cookie is set by atdmt.com. The cookies stores data about the user behavior on multiple websites. The data is then used to serve relevant advertisements to the users on the website.
dpm05 monthsThe cookie is set by demdex.net. This cookie assigns a unique ID to each visiting user that allows third-party advertisers target that users with relevant ads.
everest_g_v201 yearThe cookie is set under eversttech.net domain. The purpose of the cookie is to map clicks to other events on the client's website.
fr12 monthsThe cookie is set by Facebook to show relevant advertisments to the users and measure and improve the advertisements. The cookie also tracks the behavior of the user across the web on sites that have Facebook pixel or Facebook social plugin.
IDE12 yearsUsed by Google DoubleClick and stores information about how the user uses the website and any other advertisement before visiting the website. This is used to present users with ads that are relevant to them according to the user profile.
khaos01 yearThis cookie is set by rubiconproject.com. The cookie is used to store user data in an anonymous form such as the IP address, geographical location, websites visited, and the ads clicked. The purpose of the cookie is to tailor the ads displayed to the users based on the users movement on other site in the same ad network.
ljt_reader01 yearThis is a Lijit Advertising Platform cookie. The cookie is used for recognizing the browser or device when users return to their site or one of their partner's site.
mako_uid01 yearThis cookie is set under the domain ps.eyeota.net. The cookies is used to collect data about the users' visit to the website such as the pages visited. The data is used to create a users' profile in terms of their interest and demographic. This data is used for targeted advertising and marketing.
mc01 yearThis cookie is associated with Quantserve to track anonymously how a user interact with the website.
NID16 monthsThis cookie is used to a profile based on user's interest and display personalized ads to the users.
p201 weekThe cookies is set by ownerIQ for the purpose of providing relevant advertisement.
personalization_id02 yearsThis cookie is set by twitter.com. It is used integrate the sharing features of this social media. It also stores information about how the user uses the website for tracking and targeting.
PUBMDCID02 monthsThis cookie is set by pubmatic.com. The cookie stores an ID that is used to display ads on the users' browser.
si05 yearsThe cookies is set by ownerIQ for the purpose of providing relevant advertisement.
TDCPM01 yearThe cookie is set by CloudFare service to store a unique ID to identify a returning users device which then is used for targeted advertising.
TDID01 yearThe cookie is set by CloudFare service to store a unique ID to identify a returning users device which then is used for targeted advertising.
test_cookie015 minutesThis cookie is set by doubleclick.net. The purpose of the cookie is to determine if the users' browser supports cookies.
tuuid01 yearThis cookie is set by .bidswitch.net. The cookies stores a unique ID for the purpose of the determining what adverts the users have seen if you have visited any of the advertisers website. The information is used for determining when and how often users will see a certain banner.
tuuid_lu01 yearThis cookie is set by .bidswitch.net. The cookies stores a unique ID for the purpose of the determining what adverts the users have seen if you have visited any of the advertisers website. The information is used for determining when and how often users will see a certain banner.
uid01 monthThis cookie is used to measure the number and behavior of the visitors to the website anonymously. The data includes the number of visits, average duration of the visit on the website, pages visited, etc. for the purpose of better understanding user preferences for targeted advertisments.
uuid01 monthTo optimize ad relevance by collecting visitor data from multiple websites such as what pages have been loaded.
VISITOR_INFO1_LIVE15 monthsThis cookie is set by Youtube. Used to track the information of the embedded YouTube videos on a website.
Technical

Functional cookies help to perform certain functionalities like sharing the content of the website on social media platforms, collect feedbacks, and other third-party features.

CookieTypeDurationDescription
__utma02 yearsThis cookie is set by Google Analytics and is used to distinguish users and sessions. The cookie is created when the JavaScript library executes and there are no existing __utma cookies. The cookie is updated every time data is sent to Google Analytics.
__utmb030 minutesThe cookie is set by Google Analytics. The cookie is used to determine new sessions/visits. The cookie is created when the JavaScript library executes and there are no existing __utma cookies. The cookie is updated every time data is sent to Google Analytics.
__utmc0The cookie is set by Google Analytics and is deleted when the user closes the browser. The cookie is not used by ga.js. The cookie is used to enable interoperability with urchin.js which is an older version of Google analytics and used in conjunction with the __utmb cookie to determine new sessions/visits.
__utmt010 minutesThe cookie is set by Google Analytics and is used to throttle the request rate.
__utmz06 monthsThis cookie is set by Google analytics and is used to store the traffic source or campaign through which the visitor reached your site.
_gat01 minuteThis cookies is installed by Google Universal Analytics to throttle the request rate to limit the colllection of data on high traffic sites.
_gat_UA-5144860-201 minuteThis is a pattern type cookie set by Google Analytics, where the pattern element on the name contains the unique identity number of the account or website it relates to. It appears to be a variation of the _gat cookie which is used to limit the amount of data recorded by Google on high traffic volume websites.
AMP_TOKEN01 hourThis cookie is set by Google Analytics - This cookie contains a token that can be used to retrieve a Client ID from AMP Client ID service. Other possible values indicate opt-out, inflight request or an error retrieving a Client ID from AMP Client ID service.
audit01 yearThis cookie is set by Rubicon Project and is used for recording cookie consent data.
bcookie02 yearsThis cookie is set by linkedIn. The purpose of the cookie is to enable LinkedIn functionalities on the page.
lang0This cookie is used to store the language preferences of a user to serve up content in that stored language the next time user visit the website.
lidc01 dayThis cookie is set by LinkedIn and used for routing.
mailchimp_landing_site04 weeksThe cookie is set by the email marketing service MailChimp.
na_id01 yearThis cookie is set by Addthis.com to enable sharing of links on social media platforms like Facebook and Twitter
ouid01 yearThe cookie is set by Addthis which enables the content of the website to be shared across different networking and social sharing websites.
pid01 yearHelps users identify the users and lets the users use twitter related features from the webpage they are visiting.
PugT01 monthThis cookie is set by pubmatic.com. The purpose of the cookie is to check when the cookies were last updated on the browser in order to limit the number of calls to the server-side cookie store.
sid0This cookie is very common and is used for session state management.
test_cookie011 monthsThis cookie is set by doubleclick.net. The purpose of the cookie is to determine if the users' browser supports cookies.
YSC1This cookies is set by Youtube and is used to track the views of embedded videos.
Save & Accept
X