Parallel notes N1 - OpenMP technology

Feb 03 2010

In the next few posts we will discuss how to use multi-core processors in practice. Whatever is said about multi-core hardware, programs still have to be "taught" to use several cores efficiently. This first post announces the upcoming issues and gives the first introductory note.


Parallel programming technologies

We should point out right away that there are quite a few parallel programming technologies. They differ not so much in the programming languages they support as in the architectural approach to building parallel systems.

For example, some technologies build parallel solutions on top of several computers (of the same type or of different types), while others target a single machine with several processor cores.

Systems built on several computers belong to the class of distributed computing systems. Such solutions have been in use for a long time, are well understood by industry experts, and are covered by extensive literature. The most telling example of a distributed computing technology is MPI (Message Passing Interface). MPI is the most popular standard for the data exchange interface in parallel programming, and it has been implemented for many computer platforms. MPI gives the programmer a single mechanism for interaction between the branches of a parallel application, regardless of the computer architecture (single-processor or multi-processor, with shared or separate memory) and of the relative location of the branches (on one processor or on different ones).

Since MPI is intended for systems with separate memory, it is not a very good idea to use it to organize parallel processing in a system with shared memory: the result is redundant and overly complicated. That is why solutions like OpenMP began to develop. Still, nothing prevents you from building MPI solutions for a single computer.

Parallel programming systems for a single machine, however, began to develop relatively recently. These are not brand-new solutions, of course, but it is the arrival (more exactly, the approaching arrival) of multi-core systems on desktops that makes programmers consider such technologies as OpenMP, Intel Threading Building Blocks, Microsoft Parallel Extensions and some others.

It is very important that a parallel programming technology lets you parallelize a program gradually. Of course, an ideal parallel program would be parallel from the start, and preferably written in a functional language where the question of parallelization does not arise at all... But programmers live and work in the real world, where they have 10 Mbytes of code in C++ at best, or even in C, instead of the trendy multifunctional F#. And they have to parallelize this code gradually. In this case OpenMP, for instance, is a very good choice. It allows you to find the fragments of the application that need parallelization most and make them parallel first.

In practice it looks like this. With the help of a profiling tool, the programmer searches for the bottlenecks, that is, the slowest parts of the program. Why use a tool at all? Because you will not find the bottlenecks in an unfamiliar 10-Mbyte project unless you are a telepath. These bottlenecks are then made parallel with OpenMP. After that you may look for other bottlenecks, and so on until you reach the performance you need. Development of the parallel version may be paused while you release intermediate products, and you may return to it whenever you need. This, in particular, is why OpenMP became rather popular.
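As a minimal sketch of this gradual approach, suppose a profiler has pointed at a single loop as the bottleneck. The function `dot_product` and its data are hypothetical and are not taken from any real project. The only change needed to parallelize it is one directive in front of the loop; the rest of the program stays as it was.

```cpp
#include <cassert>
#include <vector>

// Hypothetical hot spot: a dot product that a profiler has flagged as slow.
double dot_product(const std::vector<double>& a, const std::vector<double>& b)
{
    assert(a.size() == b.size());
    const double* pa = a.data();
    const double* pb = b.data();
    const long n = static_cast<long>(a.size());
    double sum = 0.0;

    // The single line added for OpenMP. Loop iterations are divided among
    // the threads; each thread accumulates into a private copy of sum, and
    // the copies are added together when the loop ends.
    #pragma omp parallel for reduction(+ : sum)
    for (long i = 0; i < n; ++i) {
        sum += pa[i] * pb[i];
    }
    return sum;
}
```

If the compiler's OpenMP option is switched off (for example, building with GCC without `-fopenmp`), the directive is ignored and the function runs serially with the same result for this input, which is what makes step-by-step parallelization practical.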

Well, what is OpenMP?

OpenMP (Open Multi-Processing) is a set of compiler directives, library routines and environment variables for programming multithreaded applications on multi-processor systems with shared memory (SMP systems).

The first OpenMP standard was developed in 1997 as an API for writing easily portable multithreaded applications. At first it was based on the Fortran language, and later C and C++ were included.

The OpenMP interface has become one of the most popular parallel programming technologies. OpenMP is used successfully both for programming supercomputer systems with many processors and in desktop user systems or, for example, on the Xbox 360.

The OpenMP specification is developed by several large hardware and software vendors whose work is coordinated by the non-profit organization "OpenMP Architecture Review Board" (ARB).

OpenMP uses the "fork-join" model of parallel execution. An OpenMP program begins as a single thread of execution called the master thread. When this thread encounters a parallel construct, it creates a new team of threads that consists of itself and some additional threads, and it becomes the master thread of that team. All members of the team (including the master thread) execute the code inside the parallel construct. At the end of the parallel construct there is an implicit barrier. After the parallel construct, the user code is again executed only by the master thread. A parallel region may contain other parallel regions, in which each thread of the outer region becomes the master thread of its own team. Nested regions may in turn contain regions of deeper nesting levels.
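The fork-join model can be sketched as follows. The function name `run_parallel_region` is made up for this illustration, and the stub in the `#else` branch is an assumption added only so that the sketch also builds when OpenMP support is switched off. Each team member records its thread number, so after the region the vector holds one entry per thread, and the entry 0 belongs to the master thread.

```cpp
#include <vector>

#ifdef _OPENMP
#include <omp.h>
#else
// Stub for builds without OpenMP: the only thread is thread 0.
inline int omp_get_thread_num() { return 0; }
#endif

// Returns the thread numbers of all team members that executed the region.
std::vector<int> run_parallel_region()
{
    std::vector<int> ids;  // shared by the whole team

    // Fork: the master thread creates a team and becomes thread 0 in it.
    #pragma omp parallel
    {
        const int id = omp_get_thread_num();

        // Only one thread at a time may modify the shared vector.
        #pragma omp critical
        ids.push_back(id);
    }
    // Join: implicit barrier at the end of the region. From here on,
    // only the master thread continues.

    return ids;
}
```

The order of the entries is not defined, because the threads reach the critical section in an arbitrary order.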

The number of threads in a team may be controlled in several ways. One of them is the environment variable OMP_NUM_THREADS. Another is to call the function omp_set_num_threads(). A third is to use the num_threads clause with the parallel directive.
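Here is a minimal sketch of these three mechanisms. The function names `team_size_default`, `team_size_with_clause` and `team_size_after_call` are invented for the illustration, and the stubs in the `#else` branch are an assumption that lets the sketch build without OpenMP support. Note that the requested number is an upper bound: the runtime may create a smaller team.

```cpp
#ifdef _OPENMP
#include <omp.h>
#else
// Stubs for builds without OpenMP: a team always has exactly one thread.
inline void omp_set_num_threads(int) {}
inline int omp_get_num_threads() { return 1; }
#endif

// Way 1: no explicit request. The team size comes from OMP_NUM_THREADS
// if it is set in the environment, otherwise from the runtime's default.
int team_size_default()
{
    int size = 0;
    #pragma omp parallel
    {
        #pragma omp single   // one thread reports the team size
        size = omp_get_num_threads();
    }
    return size;
}

// Way 2: the num_threads clause affects this one parallel region only.
int team_size_with_clause(int requested)
{
    (void)requested;  // avoids an unused-parameter warning without OpenMP
    int size = 0;
    #pragma omp parallel num_threads(requested)
    {
        #pragma omp single
        size = omp_get_num_threads();
    }
    return size;
}

// Way 3: the library call changes the setting for the parallel regions
// that follow, and takes precedence over OMP_NUM_THREADS.
int team_size_after_call(int requested)
{
    omp_set_num_threads(requested);
    return team_size_default();
}
```

With a typical OpenMP build, running the program as `OMP_NUM_THREADS=4 ./app` (a hypothetical executable name) would make `team_size_default()` report a team of at most 4 threads until `omp_set_num_threads()` is called.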

Announcement of the coming notes on parallel programming

With this note we begin a short series of publications devoted to OpenMP technology and the toolkit for parallel software development. In the next posts you will learn:

  • what tools you need to develop parallel programs;
  • how to create a parallel program from scratch;
  • how to add parallel execution to an existing program with the help of OpenMP technology;
  • what kinds of problems you are likely to encounter when developing OpenMP applications and how to diagnose them;
  • how to optimize parallel programs.

Stay tuned for the next lesson...
