I would like to describe an error that is easy to make if you are not very familiar with OpenMP. It stems from a wrong assumption about how the atomic directive works. The atomic directive is usually faster than a critical section because some atomic operations can be replaced directly with processor instructions, which makes it convenient for updating a variable with the result of an expression. Keep in mind, however, that atomic has no effect on the function calls used in that expression.
Consider the following example:
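#include <cstddef>  // ptrdiff_t

class Example
{
public:
    unsigned m_value;
    Example() : m_value(0) {}
    unsigned GetValue()
    {
        return ++m_value;  // unsynchronized update of shared state
    }
    unsigned GetSum()
    {
        unsigned sum = 0;
        #pragma omp parallel for
        for (ptrdiff_t i = 0; i < 100; i++)
        {
            // atomic covers only the update of sum, not the call to GetValue()
            #pragma omp atomic
            sum += GetValue();
        }
        return sum;
    }
};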
This example contains a race condition, so the value it returns can vary from run to run. If you try the example and the result always appears correct, you can change the function GetValue as shown below to make the error easier to reproduce:
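// Sleep is the Windows API function declared in <windows.h>;
// Sleep(0) gives up the rest of the thread's time slice.
unsigned GetValue()
{
    Sleep(0);
    m_value++;
    Sleep(0);
    return m_value;
}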
In the code, the atomic directive protects the update of the "sum" variable. It does not affect the call of the function GetValue(), though. The calls are made from parallel threads, so the "++m_value" operation inside GetValue is executed without any synchronization, and this leads to errors.
Keep in mind that functions used in expressions to which the atomic directive is applied must be thread-safe. The atomic directive deals with operations of the following forms only:
x binop= expr
x++
++x
x--
--x
Here x is a scalar variable, expr is an expression of scalar type that does not reference x, and binop is one of the non-overloaded operators +, *, -, /, &, ^, |, <<, or >>. In all other cases you must not use the atomic directive.
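The permitted forms can be illustrated with a minimal sketch. The Stats structure and the CountMultiplesOfThree function are hypothetical names introduced only for this illustration. Each atomic directive applies to a single statement, and the right-hand side contains no function calls with side effects.

```cpp
#include <cstddef>

// Hypothetical helper type for this illustration only.
struct Stats
{
    unsigned count;  // how many multiples of 3 were found
    unsigned total;  // their sum
};

// Hypothetical function: counts and sums the multiples of 3 in [0, n).
Stats CountMultiplesOfThree(std::ptrdiff_t n)
{
    unsigned count = 0;
    unsigned total = 0;
    #pragma omp parallel for
    for (std::ptrdiff_t i = 0; i < n; i++)
    {
        if (i % 3 == 0)
        {
            // Form "x++": x is the shared scalar count.
            #pragma omp atomic
            count++;

            // Form "x binop= expr": expr does not reference total
            // and has no side effects on shared data.
            #pragma omp atomic
            total += static_cast<unsigned>(i);
        }
    }
    return Stats{count, total};
}
```

Because the expression on the right-hand side only reads the loop variable, which is private to each thread, there is nothing left unprotected here.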
In the example above, the atomic directive protects the "sum +=" operation but not the call of the function GetValue. To correct the error, use a critical section or another mechanism to protect the m_value variable.
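One possible correction is sketched below. It is only one of several options, and the critical section name example_value is an arbitrary name chosen for this illustration. The increment of m_value and the read of its new value are placed inside the same critical section, and the result of GetValue() is stored in a local variable before the atomic update of sum.

```cpp
#include <cstddef>

class Example
{
public:
    unsigned m_value;
    Example() : m_value(0) {}

    unsigned GetValue()
    {
        unsigned result;
        // The increment and the read of the new value happen under one lock,
        // so every caller gets a distinct value.
        // "example_value" is an arbitrary name chosen for this sketch.
        #pragma omp critical (example_value)
        {
            result = ++m_value;
        }
        return result;
    }

    unsigned GetSum()
    {
        unsigned sum = 0;
        #pragma omp parallel for
        for (std::ptrdiff_t i = 0; i < 100; i++)
        {
            // The call is made outside the atomic statement and relies on
            // GetValue() being thread-safe on its own.
            const unsigned value = GetValue();

            // atomic now protects a plain "x += expr" update.
            #pragma omp atomic
            sum += value;
        }
        return sum;
    }
};
```

With this change, a freshly constructed object returns 5050 from GetSum(), the sum of the values 1 to 100, regardless of how the iterations are distributed among threads. Note that a named critical section is global to the program, so all Example objects share the same lock; if that becomes a bottleneck, a per-object lock is an alternative.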