C++11 introduced a standardized memory model. What does it mean? And how is it going to affect C++ programming

cc++11language-lawyermemory-modelmultithreading

C++11 introduced a standardized memory model, but what exactly does that mean? And how is it going to affect C++ programming?

This article (by Gavin Clarke who quotes Herb Sutter) says that,

The memory model means that C++ code
now has a standardized library to call
regardless of who made the compiler
and on what platform it's running.
There's a standard way to control how
different threads talk to the
processor's memory.

"When you are talking about splitting
[code] across different cores that's
in the standard, we are talking about
the memory model. We are going to
optimize it without breaking the
following assumptions people are going
to make in the code," Sutter said.

Well, I can memorize this and similar paragraphs available online (as I've had my own memory model since birth :P) and can even post as an answer to questions asked by others, but to be honest, I don't exactly understand this.

C++ programmers used to develop multi-threaded applications even before, so how does it matter if it's POSIX threads, or Windows threads, or C++11 threads? What are the benefits? I want to understand the low-level details.

I also get this feeling that the C++11 memory model is somehow related to C++11 multi-threading support, as I often see these two together. If it is, how exactly? Why should they be related?

As I don't know how the internals of multi-threading work, and what memory model means in general, please help me understand these concepts. 🙂

Best Answer

First, you have to learn to think like a Language Lawyer.

The C++ specification does not make reference to any particular compiler, operating system, or CPU. It makes reference to an abstract machine that is a generalization of actual systems. In the Language Lawyer world, the job of the programmer is to write code for the abstract machine; the job of the compiler is to actualize that code on a concrete machine. By coding rigidly to the spec, you can be certain that your code will compile and run without modification on any system with a compliant C++ compiler, whether today or 50 years from now.

The abstract machine in the C++98/C++03 specification is fundamentally single-threaded. So it is not possible to write multi-threaded C++ code that is "fully portable" with respect to the spec. The spec does not even say anything about the atomicity of memory loads and stores or the order in which loads and stores might happen, never mind things like mutexes.

Of course, you can write multi-threaded code in practice for particular concrete systems – like pthreads or Windows. But there is no standard way to write multi-threaded code for C++98/C++03.

The abstract machine in C++11 is multi-threaded by design. It also has a well-defined memory model; that is, it says what the compiler may and may not do when it comes to accessing memory.

Consider the following example, where a pair of global variables are accessed concurrently by two threads:

           Global
           int x, y;

Thread 1            Thread 2
x = 17;             cout << y << " ";
y = 37;             cout << x << endl;

What might Thread 2 output?

Under C++98/C++03, this is not even Undefined Behavior; the question itself is meaningless because the standard does not contemplate anything called a "thread".

Under C++11, the result is Undefined Behavior, because loads and stores need not be atomic in general. Which may not seem like much of an improvement... And by itself, it's not.

But with C++11, you can write this:

           Global
           atomic<int> x, y;

Thread 1                 Thread 2
x.store(17);             cout << y.load() << " ";
y.store(37);             cout << x.load() << endl;

Now things get much more interesting. First of all, the behavior here is defined. Thread 2 could now print 0 0 (if it runs before Thread 1), 37 17 (if it runs after Thread 1), or 0 17 (if it runs after Thread 1 assigns to x but before it assigns to y).

What it cannot print is 37 0, because the default mode for atomic loads/stores in C++11 is to enforce sequential consistency. This just means all loads and stores must be "as if" they happened in the order you wrote them within each thread, while operations among threads can be interleaved however the system likes. So the default behavior of atomics provides both atomicity and ordering for loads and stores.

Now, on a modern CPU, ensuring sequential consistency can be expensive. In particular, the compiler is likely to emit full-blown memory barriers between every access here. But if your algorithm can tolerate out-of-order loads and stores; i.e., if it requires atomicity but not ordering; i.e., if it can tolerate 37 0 as output from this program, then you can write this:

           Global
           atomic<int> x, y;

Thread 1                            Thread 2
x.store(17,memory_order_relaxed);   cout << y.load(memory_order_relaxed) << " ";
y.store(37,memory_order_relaxed);   cout << x.load(memory_order_relaxed) << endl;

The more modern the CPU, the more likely this is to be faster than the previous example.

Finally, if you just need to keep particular loads and stores in order, you can write:

           Global
           atomic<int> x, y;

Thread 1                            Thread 2
x.store(17,memory_order_release);   cout << y.load(memory_order_acquire) << " ";
y.store(37,memory_order_release);   cout << x.load(memory_order_acquire) << endl;

This takes us back to the ordered loads and stores – so 37 0 is no longer a possible output – but it does so with minimal overhead. (In this trivial example, the result is the same as full-blown sequential consistency; in a larger program, it would not be.)

Of course, if the only outputs you want to see are 0 0 or 37 17, you can just wrap a mutex around the original code. But if you have read this far, I bet you already know how that works, and this answer is already longer than I intended :-).

So, bottom line. Mutexes are great, and C++11 standardizes them. But sometimes for performance reasons you want lower-level primitives (e.g., the classic double-checked locking pattern). The new standard provides high-level gadgets like mutexes and condition variables, and it also provides low-level gadgets like atomic types and the various flavors of memory barrier. So now you can write sophisticated, high-performance concurrent routines entirely within the language specified by the standard, and you can be certain your code will compile and run unchanged on both today's systems and tomorrow's.

Although to be frank, unless you are an expert and working on some serious low-level code, you should probably stick to mutexes and condition variables. That's what I intend to do.

For more on this stuff, see this blog post.

Multi-threading is possible in php

Yes you can do multi-threading in PHP with pthreads

From the PHP documentation:

pthreads is an object-orientated API that provides all of the tools needed for multi-threading in PHP. PHP applications can create, read, write, execute and synchronize with Threads, Workers and Threaded objects.

Warning: The pthreads extension cannot be used in a web server environment. Threading in PHP should therefore remain to CLI-based applications only.

Simple Test

#!/usr/bin/php
<?php
class AsyncOperation extends Thread {

    public function __construct($arg) {
        $this->arg = $arg;
    }

    public function run() {
        if ($this->arg) {
            $sleep = mt_rand(1, 10);
            printf('%s: %s  -start -sleeps %d' . "\n", date("g:i:sa"), $this->arg, $sleep);
            sleep($sleep);
            printf('%s: %s  -finish' . "\n", date("g:i:sa"), $this->arg);
        }
    }
}

// Create a array
$stack = array();

//Initiate Multiple Thread
foreach ( range("A", "D") as $i ) {
    $stack[] = new AsyncOperation($i);
}

// Start The Threads
foreach ( $stack as $t ) {
    $t->start();
}

?>

First Run

12:00:06pm:     A  -start -sleeps 5
12:00:06pm:     B  -start -sleeps 3
12:00:06pm:     C  -start -sleeps 10
12:00:06pm:     D  -start -sleeps 2
12:00:08pm:     D  -finish
12:00:09pm:     B  -finish
12:00:11pm:     A  -finish
12:00:16pm:     C  -finish

Second Run

12:01:36pm:     A  -start -sleeps 6
12:01:36pm:     B  -start -sleeps 1
12:01:36pm:     C  -start -sleeps 2
12:01:36pm:     D  -start -sleeps 1
12:01:37pm:     B  -finish
12:01:37pm:     D  -finish
12:01:38pm:     C  -finish
12:01:42pm:     A  -finish

Real World Example

error_reporting(E_ALL);
class AsyncWebRequest extends Thread {
    public $url;
    public $data;

    public function __construct($url) {
        $this->url = $url;
    }

    public function run() {
        if (($url = $this->url)) {
            /*
             * If a large amount of data is being requested, you might want to
             * fsockopen and read using usleep in between reads
             */
            $this->data = file_get_contents($url);
        } else
            printf("Thread #%lu was not provided a URL\n", $this->getThreadId());
    }
}

$t = microtime(true);
$g = new AsyncWebRequest(sprintf("http://www.google.com/?q=%s", rand() * 10));
/* starting synchronization */
if ($g->start()) {
    printf("Request took %f seconds to start ", microtime(true) - $t);
    while ( $g->isRunning() ) {
        echo ".";
        usleep(100);
    }
    if ($g->join()) {
        printf(" and %f seconds to finish receiving %d bytes\n", microtime(true) - $t, strlen($g->data));
    } else
        printf(" and %f seconds to finish, request failed\n", microtime(true) - $t);
}

C++ – What does the explicit keyword mean

The compiler is allowed to make one implicit conversion to resolve the parameters to a function. What this means is that the compiler can use constructors callable with a single parameter to convert from one type to another in order to get the right type for a parameter.

Here's an example class with a constructor that can be used for implicit conversions:

class Foo
{
public:
  // single parameter constructor, can be used as an implicit conversion
  Foo (int foo) : m_foo (foo) 
  {
  }

  int GetFoo () { return m_foo; }

private:
  int m_foo;
};

Here's a simple function that takes a Foo object:

void DoBar (Foo foo)
{
  int i = foo.GetFoo ();
}

and here's where the DoBar function is called:

int main ()
{
  DoBar (42);
}

The argument is not a Foo object, but an int. However, there exists a constructor for Foo that takes an int so this constructor can be used to convert the parameter to the correct type.

The compiler is allowed to do this once for each parameter.

Prefixing the explicit keyword to the constructor prevents the compiler from using that constructor for implicit conversions. Adding it to the above class will create a compiler error at the function call DoBar (42). It is now necessary to call for conversion explicitly with DoBar (Foo (42))

The reason you might want to do this is to avoid accidental construction that can hide bugs.
Contrived example:

You have a MyString class with a constructor that constructs a string of the given size. You have a function print(const MyString&) (as well as an overload print (char *string)), and you call print(3) (when you actually intended to call print("3")). You expect it to print "3", but it prints an empty string of length 3 instead.

Best Answer

Related Solutions

Php – How to one use multi threading in PHP applications

Multi-threading is possible in php

C++ – What does the explicit keyword mean

Related Topic