Universität Bonn: Autonomous
Intelligent Systems Institute for Computer Science VI: Autonomous Intelligent Systems  

Hannes Schulz (Staff member and PhD student)

Content

Latest Blog Posts

Hannes Schulz

Easy Parallelization with C++0X lambda functions, Thread Building Blocks

25 Mar 2011 Tagged With   c++ , benchmark

Lambda functors in C++0X

The relatively new gcc-4.5 release supports lambda expressions, which – in contrast to boost.lambda, boost.bind and the like – provide easy capturing of variables in context in the lambda expression. This finally makes the STL algorithms usable, such as

struct add_val{
  add_val(double d):val(d){}
  void operator()(const double& d){
    d+=val;
  }
  double val;
};


int main(){
  std::vector<double> v;
  // ... fill vector

  double inc = 3.0;

  // the old way of doing things
  add_val av(inc);
  std::for_each(v.begin(),v.end(),av);

  // using a boost.lambda function
  std::for_each(v.begin(),v.end(), _1+=inc);

  // using a c++0x lambda function
  std::for_each(v.begin(),v.end(), [=](double& d){d+=inc;});
}

The "old way" primarily has inconveniences for the programmer:

  • We need to define a struct outside the scope, even for tiny functionality.
  • We need to explicitly capture variables from the scope of the surrounding code (here: inc) in the struct.

Boost.Lambda tried to resolve this problem, in an elegant way I believe. However, there are still shortcomings of this approach:

  • Variables have unintuitive names (_1, _2)
  • The code in the lambda expression is not really a block of code. It is an expression, where parts may be evaluated surprisingly.
  • Also, since this is an expression, conditionals and loops must be expressed in an awkward (that is, non-C++) way using if_, for_ and so on.

The new C++0X lambda syntax gets rid of all these problems. While the syntax looks a bit strange at first, it is much more readable than boost.lambda constructs. The block of code is not out of scope, all names of the surrounding code can be used as copy ([=]) or reference ([&]).

You might wonder, why we do not just use boost.foreach and get away with writing

#include <boost/foreach.hpp>
#define foreach BOOST_FOREACH
// ...as before...
foreach(double& d, v){
  d+=inc;
}

// in c++0x, not yet implemented, this will be
for(double& d : v){
  d+=inc;
}

… which is an idiom quite well-known in other languages. The fun part is, that we cannot change what for does, but we can change the implementation of for_each. This is one of the things that the Intel (R) Threading Building Blocks library does.

Parallelizing your for-loop by changing two lines

A nice trick now is to change your (side-effect-free) for-loops like this:

#include <tbb/parallel_for_each.h>

// before
foreach(double& d, v){
  d+=inc;
}

// after
tbb::parallel_for_each(v.begin(),v.end(),[=](double& d){
  d+=inc;
});

… and everything in this loop is automatically run in parallel.

Similar things can be done with OpenMP:

int end = v.size();
#pragma omp parallel for
for(int i=0;i<end;i++){
   v[i]+=inc;
}

but this is already much more intrusive. Furthermore, the loop index must be an int and the last index must be known. While variables may be passed as copy (using private (inc)), these variables are private to the thread, not to the "wrapped" function. Of course, OpenMP gives you much more fine-grained control over parallelization as well.

Enjoyed this post? Tell others about it! DeliciousDelicious | twitterTwitter | redditReddit |
blog comments powered by Disqus

Universität Bonn, Institute for Computer Science, Departments: I, II, III, IV, V, VI