Functional programming using bind

Since the introduction of the STL (Standard Template Library) the use of functors has been a prevalent part of writing C++. Many of the STL algorithms accept or require a functor. For example, std::transform requires a function object that, given the value at the current position, returns a new value that is written to the corresponding position in the output range (which may be the input range itself).
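As a rough sketch of that idea, the snippet below passes a tiny functor to std::transform. The names add_one and increment_all are made up purely for illustration; they are not part of the STL.

```cpp
#include <algorithm>
#include <vector>

// A hypothetical functor (the name is illustrative): given the value at the
// current position, it returns the value that should replace it.
struct add_one
{
   int operator () (int value) const
   {
      return value + 1;
   }
};

// Applies add_one to every element, writing each result back in place,
// and returns the modified copy of the vector.
std::vector<int> increment_all(std::vector<int> v)
{
   std::transform(v.begin(), v.end(), v.begin(), add_one());
   return v;
}
```

Calling increment_all on a vector holding 1, 2 and 3 should give back 2, 3 and 4. Swapping add_one for a different functor changes the behaviour without touching the algorithm.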

This is really nice as it separates the mechanics of the algorithm from the “smarts”. In other words, you don’t need to know or care how std::transform works, you just need to know what it does and how the functor it uses works. Equally, because std::transform relies on a functor for its “smarts” it can be used over and over for different purposes. This is the very epitome of generic code!

The general term for implementing reusable packets of functionality is “functional” programming. It’s a very simple paradigm but it’s also very powerful. By treating functions as objects you can create functions that perform little units of work and then pass them around, preloaded with behaviour, to be used elsewhere in the code.

The place the function is used doesn’t need to know what the functor does or how it works, just that it provides the correct interface. In the case of a functor, the interface spec is its prototype: what arguments it expects and what, if anything, it will return. Using function objects in this way is an example of the “Command” design pattern.

Another nice thing with functional programming is you can do what is called “function composition”. Basically, this means you can create a complicated functor out of the composition of smaller and less complicated functors. It’s a little bit like building functionality out of Lego.
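To make the Lego analogy a little more concrete, here is a minimal sketch of composition written as an ordinary functor. The names composed, increment and twice are assumptions for the sake of the example, and it is limited to int(int) functions to keep it short.

```cpp
#include <functional>

// Two small, hypothetical building blocks.
int increment(int x) { return x + 1; }
int twice(int x) { return x * 2; }

// A functor built out of two other functions: calling it runs the inner
// function first and then feeds the result to the outer function.
struct composed
{
   composed(std::function<int(int)> outer, std::function<int(int)> inner)
      : outer_(outer)
      , inner_(inner)
   {
   }

   int operator () (int x) const
   {
      return outer_(inner_(x));
   }

   private:
      std::function<int(int)> outer_;
      std::function<int(int)> inner_;
};
```

With these definitions, composed(twice, increment) behaves like twice(increment(x)), whereas composed(increment, twice) behaves like increment(twice(x)). The order of the bricks matters.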

However, do not confuse this with normal OOP (Object Oriented Programming). The former is just about binding together function objects, whereas the latter is about creating objects that represent either abstract or concrete concepts. They have properties and interface methods to allow manipulation of those properties.

In functional programming the “objects” are functions; little packets of functionality. They do not have “properties” or “methods” and they do not represent concepts, abstract or otherwise. The closest you get to an interface is the function prototype (i.e. the type and number of arguments the function expects when it is called). The only thing an OOP object and a function object have in common is that they can both be passed around and used as “first-class” types.

What’s slightly confusing in C++ is that function objects are implemented using classes that have an implementation of the function operator; allowing the class to be called like a function. This is a syntax thing of the language and doesn’t mean function objects and OOP objects should be thought of as the same thing, at least not semantically.

This is an example of how a simple function object is constructed in C++. Again, yes it is implemented in terms of a class but do not think of it as a class type object. This is just a C++ syntax thing; the semantics of a function object are very different from a class object:

#include <string>
#include <iostream>
#include <functional>

// remember a struct is just a class whose members are, by default public!
struct hello_message
{
   // function operator of type std::string(std::string const &) const
   std::string operator () (
         std::string const & name
      ) const
   {
      return "hello, " + name;
   }
};

// this expects a functor to execute
void execute_functor(
   std::function<std::string (std::string)> const & functor
   )
{
   auto && msg = functor("evilrix");
   std::cout << msg << std::endl;
}

int main()
{
   // create the function object
   auto && hellomsg = hello_message();

   // pass to the executor
   execute_functor(hellomsg);
}

Actually, I lied slightly when I said that functors can't have properties. Of course, because a functor is just implemented as a class it can have class members (both member functions and variables). This means that the functor can actually be pre-loaded with data upon construction.

Below is the same functor as above but instead of passing the name when the function is called, it is passed when the function is constructed. This would allow it to be passed to another function and executed later. The other function doesn't need to know what data the function object is preloaded with, it will just call it and output the message:


#include <string>
#include <iostream>
#include <functional>

// remember a struct is just a class whose members are, by default public!
struct hello_message
{
   // now we need a constructor to "pre-load" the functor with data
   hello_message (std::string const & name)
      : name_(name)
   {
   }

   // function operator of type std::string() const
   std::string operator () () const
   {
      return "hello, " + name_;
   }

   private:
      std::string name_;
};

// this expects a functor to execute, it has no idea what it will output!
void execute_functor(std::function<std::string()> const & functor)
{
   auto && msg = functor();
   std::cout << msg << std::endl;
}

int main()
{
   // create the function object
   auto && hellomsg = hello_message("evilrix");

   // pass to the executor
   execute_functor(hellomsg);
}

An example of where one might want to do something like this is if you were building a generic named dispatcher mechanism. For example, to allow a scripting language to invoke C++ functions. A simple dispatcher might be a std::map that stores, as its key, the name of the function and, as its value, a function object.

To ensure consistency when invoking the dispatcher methods, you might want to ensure that each function in the map always performs certain pre-condition checks and handles invalid post-conditions. Rather than imposing this as a restriction on each dispatch function, you can implement a functor object that calls the dispatch function, by <a href="http://en.wikipedia.org/wiki/Proxy_pattern">proxy</a>, and performs any pre or post condition checks.

For example, you might want to make sure that pointers aren't null before using them or you might want to take the return status of a function and convert it into an exception that represents the error condition. What you really want is a generic "executor" functor that has no direct coupling to the dispatch functions.

This is quite simple to do using functional programming. We just create our executor function and have it accept, as an argument, another function object. You can then bind the dispatch function to the generic executor to create a new executor type specific to the dispatch function.

But hold on, I hear you ask...

<ol>
<li><p>Firstly, this means you are creating a new function object for each dispatch function so what do you save? You're having to write a new functor for each dispatch function.</p></li>
<li><p>Secondly, what on earth do you mean "bind the dispatch function to the executor"?</p></li>
</ol>

If you didn't ask these questions I'm going to assume you did anyway, just for the sake of having the opportunity to expand on both points 🙂

Okay, yes we do need to create a new and unique object for each dispatch function but not in the way you might think. We don't actually need to write any code to do this; we can use a special C++11 function, called <a href="http://www.cplusplus.com/reference/functional/bind/">std::bind</a>, to get the compiler to do that for us. This not only saves us from writing code, it also spares us the possibility of introducing additional defects.

So, let's look at this bind and see what it is and what it can do for us...

In programming parlance, to bind means to attach one object to another or to create a co-dependency between them. If you are programming using classes you make use of binding all the time and don't even realise it. When you call a member function on a class you are using "<a href="http://en.wikipedia.org/wiki/Name_binding">name binding</a>" to bind that function to the class instance.

If the function is virtual C++ uses a slight variation, called "<a href="http://en.wikipedia.org/wiki/Late_binding">late binding</a>" to ensure the function behaves polymorphically. The difference is that the former is resolved at <a href="http://en.wikipedia.org/wiki/Compile_time">compile</a> time whilst the latter at <a href="http://en.wikipedia.org/wiki/Run_time_%28program_lifecycle_phase%29">runtime</a>.

In both early and late binding, all that's happening is the correct "this" pointer (either the static or dynamic type) is being bound to the function's first argument.

"Eh? But my member functions don't have such an argument", I can hear you whisper. Strap yourself in whilst I reveal the best kept secret in C++... your compiler lies to you. It lies to you all the time. The code you write is NOT the code it generates when it compiles your source!

Have you ever stopped to think how a class member function works? How does it know which instance of the class is being operated on? How does it know which instance it should use when you access member functions or member data? Simple, each class instance has its own member function... right? Wrong!!!

In fact, class member functions are (at least in a semantic sense) just the same as free standing functions. They have no special affinity to the class instance context or, frankly, even the class they are defined within. There is only one of each, but there is something slightly special about them though; they all have a hidden first argument, which is the "this" pointer.

Normally, the "this" pointer is a pointer to the class in which the function is defined. With virtual functions, the function that actually gets called is chosen using the dynamic type of the object rather than the static type of the expression used to call it; inside whichever function is chosen, "this" still points to the class that defines that particular override.

When you call a class member such as:


myClass.foo(1);

This is really nothing more than syntactic sugar for:

foo(&myClass, 1);

Although your class's prototype, as you've defined it, looks like this:

void MyClass::foo(int);

The compiler effectively sees it as this (conceptually; this is not code you could write yourself):

void MyClass::foo(MyClass * this, int);

It adds a hidden first argument, which is a pointer to the class instance that the function is to operate on. So when you call a class member you are, in actual fact, binding that class instance to that class member function. In other words, you don't see it but the address of the class instance is passed into the function via a pointer as the first argument.
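The standard library lets you make that hidden argument visible. The sketch below is only an illustration (MyClass, its total member and call_foo_three_ways are invented for the example), but std::mem_fn and std::bind are real C++11 facilities, and both treat the instance as the first argument of the member function.

```cpp
#include <functional>

using namespace std::placeholders; // for _1

// A hypothetical class used only for illustration.
struct MyClass
{
   int total = 0;

   void foo(int n) { total += n; }
};

// Calls foo three different ways on the same instance and returns the total.
int call_foo_three_ways()
{
   MyClass myClass;

   // 1. the normal member call syntax
   myClass.foo(1);

   // 2. std::mem_fn wraps the member function in a callable whose first
   //    argument is the instance, making the hidden argument explicit
   auto foo_fn = std::mem_fn(&MyClass::foo);
   foo_fn(&myClass, 1);

   // 3. std::bind fixes the instance as that first argument up front
   auto bound_foo = std::bind(&MyClass::foo, &myClass, _1);
   bound_foo(1);

   return myClass.total;
}
```

All three calls operate on the same instance, so the function should return 3. The only difference between them is who supplies the pointer to the instance, and when.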

OK, but what does this have to do with our generic dispatch executor?

Okay, we have a generic executor and, as its first argument, it expects to receive a functor to execute. It might also expect other arguments and that is just fine. What we do is bind the dispatch function to the first argument of the executor function so that, when the executor is invoked, the dispatch function is always passed implicitly as that first argument. Don't worry if that sounds a little confusing as there is a nice simple example coming up!

To achieve this bit of magic we'll employ the help of a new feature in C++11 called std::bind. This clever function, which was originally (and still is) part of Boost, allows you to compose compound function objects from existing functions by binding values (which can include other function objects) to the arguments of the function. You don't need to write any code for this, the compiler does it all for you!

The result of this composition is a new function object that need only be called with those values that can't be or shouldn't be resolved until the point of evaluation. This is very similar to the example above, where we changed the hello_message functor to bind the name of the user at the point it was constructed rather than needing to pass it at the point it is called.

Before we move on with our dispatch engine, let's have a very quick look at a simple use of bind. Let's say we have a function called multiply that takes two int arguments and returns the product of these two arguments.

int multiply (int x, int y) { return x * y; }

Now, let's suppose that we always want to multiply a number by a factor of 10. We could do this by always passing in 10 as the first argument, but we want to create a new generic unit of functionality that we can then pass to std::transform to multiply all the numbers in a vector by 10.

Clearly, we have a problem here because std::transform doesn't know that this function needs two arguments and that the first always has to be 10. No, std::transform expects the functor to take only one argument and to return that argument "transformed" into the new value. We could write a new wrapper function, and that would work:

// wrapper that takes a single argument and always multiplies it by 10
int multiplyby10 (int y)
{
   return multiply(10, y);
}

void somefunc(std::vector<int> & v)
{
   std::transform(
      v.begin(),
      v.end(),
      v.begin(),
      multiplyby10
      );
}

Of course, that works but it's not very elegant is it? I mean, it now means we have to create a new function just for, what could very well be, a localised requirement. There has to be a better solution... and there is! The solution is to use std::bind.

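#include <algorithm>
#include <functional>
#include <vector>

using namespace std::placeholders; // for _1

void somefunc(std::vector<int> & v)
{
   std::transform(
      v.begin(),
      v.end(),
      v.begin(),
      std::bind(multiply, 10, _1) // calls multiply(10, element)
      );
}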

What we're doing here is creating a new function object, which only exists for the duration of that one statement, that will call multiply, passing 10 and the functor's own first argument. The result is a functor that std::transform can call like this:

// where N is any integer value
int result = new_functor(N);
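The bound functor does not have to be a temporary. As a small sketch (make_multiply_by_10 is a made-up helper name), it can be stored and called later just like any other function object:

```cpp
#include <functional>

using namespace std::placeholders; // for _1

int multiply(int x, int y) { return x * y; }

// Returns a functor equivalent to "multiply by 10". std::function is used
// here only to give the otherwise unspecified bind result a nameable type.
std::function<int(int)> make_multiply_by_10()
{
   return std::bind(multiply, 10, _1);
}
```

Calling the returned functor with 5 should give 50, because the 10 was fixed when the functor was created and only the remaining argument is supplied at the call site.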

Notice the weird _1 (underscore one) as the 3rd argument to bind? It's called a placeholder and it is just that: a placeholder for the argument that the functor still needs. It tells bind that whatever value is passed as the new functor's first argument should be forwarded as the 2nd argument to multiply.

We know it's the 2nd argument because, if you look at the bind call, you can think of the function name as the 0th argument, the 10 as the 1st argument and the _1 as the 2nd argument. If multiply took more arguments, you could also use _2 ... _N as placeholders.
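Placeholders describe positions in the new functor's argument list, so they can also be used to reorder arguments. The sketch below uses a hypothetical subtract function, chosen because the order of its arguments changes the result.

```cpp
#include <functional>

using namespace std::placeholders; // for _1, _2

// A hypothetical function where argument order matters.
int subtract(int x, int y) { return x - y; }

// Same order: the functor's 1st argument becomes x and its 2nd becomes y.
int same_order(int a, int b)
{
   return std::bind(subtract, _1, _2)(a, b);
}

// Swapped: the functor's 2nd argument becomes x and its 1st becomes y.
int swapped_order(int a, int b)
{
   return std::bind(subtract, _2, _1)(a, b);
}
```

Given 10 and 3, same_order computes 10 - 3 whereas swapped_order computes 3 - 10. The position of each placeholder in the bind call decides which parameter of subtract receives which argument.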

So, back to our dispatcher. We need to bind the dispatch function to the executor functor. Now that we understand bind a little better, let's see how we can do that:

/**
 * @brief Simple dispatch mechanism using std::bind
 */

#include <functional>
#include <string>
#include <map>
#include <iostream>
#include <stdexcept> // for std::runtime_error

using namespace std::placeholders; // for _1 .. _N

// Our dispatch map, that maps names to functions with a prototype of void(int)
using dispatch_map = std::map<
   std::string,
   std::function<void(int)>
   >;

// A test dispatch function
bool dispatch_function(int testval)
{
   // just for testing, return true if > 0, else false
   return testval > 0;
}

// Generic dispatch function executor
void executor(std::function<bool(int)> dispfunc, int testval)
{
   // pre-condition: testval must NOT be zero!
   if (0 == testval)
   {
      throw std::runtime_error(
         "dispatch function error: testval cannot be zero");
   }

   // execute dispatch function
   auto ok = dispfunc(testval);

   // post-condition: dispfunc must return true, else error!
   if (!ok)
   {
      throw std::runtime_error(
         "dispatch function error: " + std::to_string(testval));
   }

   // If we get here we've passed all pre and post conditions!
   std::cout << "OK: " + std::to_string(testval) << std::endl;
}

// Dispatcher (normally a class but to keep things simple it's a function)
void dispatch(dispatch_map & dispmap, std::string func_name, int testval)
{
   try
   {
      // find the requested function in the map
      auto itr_func = dispmap.find(func_name);

      // if we didn't find it that's bad..!
      if (itr_func == dispmap.end())
      {
         throw std::runtime_error("unknown dispatch function");
      }

      // Get a reference to the function we are going to execute
      auto & func = itr_func->second;

      // execute the function
      func(testval);
   }
   catch(std::exception const & e)
   {
      // gah, there be dragons!!!
      std::cerr << e.what() << std::endl;
   }
}

int main()
{
   // create a dispatch map
   auto && dispmap = dispatch_map
   {
      // register a dispatch function bound to the executor
      { "my_disp_func", std::bind(executor, dispatch_function, _1) }
   };

   // fire the dispatcher functions
   dispatch(dispmap, "my_disp_func", 1); // will not output an error
   dispatch(dispmap, "my_disp_func", 0); // will output an error
   dispatch(dispmap, "my_disp_func", -1); // will output an error
}

As you can see, we're creating a new dispatch function by binding the real one to the executor and using that to initialise the dispatch map. The executor takes care of all error handling for us. Clearly, this is a very simplified example and is not meant to demonstrate the ideal way to implement a dispatcher. It's a bare bones example so you can focus on how the bind mechanism works to create compound function objects.

The bind mechanism is a great tool if you just want to create simple composite functors but it's a little inflexible. For example, if you want to use the output of bind as the input to another bind object, so as to create a <a href="http://www.boost.org/doc/libs/1_55_0/libs/bind/bind.html#nested_binds">nested</a> bind, you'll run into problems. This is because, unlike Boost bind, std::bind doesn't provide a mechanism to protect against early evaluation of the bind object.

When you pass the output of one bind call as an argument to another std::bind call, the nested bind object will be evaluated (run and the result obtained) when the outer bind object is called, and that result is what gets passed to the outer bind's target function. Sometimes this is exactly what you want but other times you want the nested bind object itself to be passed through, unevaluated, so that the target function can evaluate it later. You can do this using "boost::protect" with Boost bind, but not with std::bind.

I'm aware that this might sound a little like <a href="http://en.wikipedia.org/wiki/Gibberish">gobbledygook</a>, but a little example should make it very clear how the nested bind situation works.

Consider:


#include <boost/bind.hpp>

using namespace boost;

bind(f, bind(g, _1))(x);

Is the same as:

f(g(x));

Whereas:

#include <boost/bind.hpp>
#include <boost/bind/protect.hpp> // not found in C++11

using namespace boost;

bind(f, protect(bind(g, _1)))(x);

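Is the same as passing the unevaluated bind object straight to f (note that x is not consumed by the nested bind):

f(bind(g, _1));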

So, essentially, boost::protect actually "protects" the nested bind from being evaluated before calling f, so the nested bind object, rather than the result of calling the nested bind object (which would be the result of g(x)), is passed to function 'f'.
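For comparison, the eager behaviour of a nested std::bind can be seen in a minimal sketch using only the standard library. The bodies of f and g below are placeholders invented for the example; any functions with compatible signatures would do.

```cpp
#include <functional>

using namespace std::placeholders; // for _1

// Hypothetical functions standing in for f and g.
int g(int x) { return x + 1; }
int f(int y) { return y * 2; }

// The nested bind is evaluated first and its result is handed to f,
// so this computes f(g(x)).
int nested(int x)
{
   return std::bind(f, std::bind(g, _1))(x);
}
```

Here nested(3) should give the same answer as f(g(3)). There is no way to ask std::bind itself to hand f the inner bind object instead, which is the gap that boost::protect fills.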

So, hold on a minute, you can't do this in C++11? Why did they remove this useful feature?

Yes, I agree, at face value it is slightly frustrating as lazy evaluation of nested bind objects is very handy; however, do not despair... C++11 has a different trick up its sleeve, as we'll see in my next article where we will take a look at lambda expressions.

  • Zbynek Novotny

    Thanks for this info, Ricky. I have dabbled with pre-C++11 function/member function binding methods but those seemed pretty heavy-handed. It’s good that they brought Boost’s bind functionality to the standard. I think I’m going to have to find myself some time to absorb all the new features in C++11. Some of them are pretty neat. Thanks again!