Go Concurrency – Understanding Goroutines and Scheduling

Tags: c, concurrency, go, operating systems, scheduling

Background:

pthreads follow pre-emptive scheduling, whereas C++ fibers follow cooperative scheduling.

With pthreads, the current execution path may be interrupted or preempted at any time. This means that data integrity is a big issue for threads: a thread may be stopped in the middle of updating a chunk of data, leaving that data in an inconsistent or incomplete state. It also means that the operating system can take advantage of multiple CPUs and CPU cores by running more than one thread at the same time, leaving it up to the developer to guard data access.

Using C,

int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                          void *(*start_routine) (void *), void *arg);

Using threads, an application may exhibit concurrency.

Properties of concurrency:

1) Multiple actors

2) A shared resource

3) Rules for access (atomic/conditional synchronization); see the sketch below
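For illustration only, here is a minimal Go sketch of these three properties (the producer/consumer roles are made up here): two actors share a slice, a mutex makes updates atomic, and a condition variable provides conditional synchronization.

package main

import (
  "fmt"
  "sync"
)

var (
  mu    sync.Mutex
  ready = sync.NewCond(&mu)
  data  []int // shared resource
)

func main() {
  // Actor 1: producer updates the shared slice under the lock (atomic access).
  go func() {
    mu.Lock()
    data = append(data, 42)
    mu.Unlock()
    ready.Signal() // conditional synchronization: wake the waiting consumer
  }()

  // Actor 2: consumer waits until the condition "data is non-empty" holds.
  mu.Lock()
  for len(data) == 0 {
    ready.Wait()
  }
  fmt.Println(data[0])
  mu.Unlock()
}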


With C++ fibers, the current execution path is only interrupted when the fiber yields execution. This means that fibers always start and stop at well-defined points, so data integrity is much less of an issue. Also, because fibers are managed in user space, expensive context switches and CPU state changes are not needed, making the switch from one fiber to the next extremely efficient. On the other hand, since no two fibers can run at exactly the same time, using fibers alone will not take advantage of multiple CPUs or multiple CPU cores.

In Win32, a fiber is a sort of user-managed thread. A fiber has its own stack, its own instruction pointer, etc., but fibers are not scheduled by the OS: you have to call SwitchToFiber explicitly. Threads, by contrast, are pre-emptively scheduled by the operating system.

So, roughly speaking, a fiber is a thread that is managed at the application/runtime level rather than being a true OS thread.

Using C,

#include <windows.h>

void __stdcall MyScheduler(void *param){
  /* runs only when another fiber explicitly switches to this one */
}

/* CreateFiber returns an LPVOID handle to the new (not yet running) fiber */
LPVOID FiberScheduler = CreateFiber(0, MyScheduler, NULL);

Why C++ fibers?

OS threads give us everything we want, but for a heavy performance penalty: switching between threads involves jumping back and forth from user to kernel mode, possibly even across address space boundaries. These are expensive operations partly because they cause TLB flushes, cache misses and CPU pipelining havoc: that’s also why traps and syscalls can be orders of magnitude slower than regular procedure calls.

In addition, the kernel schedules threads (i.e. assigns their continuation to a CPU core) using a general-purpose scheduling algorithm, which might take into account all kinds of threads, from those serving a single transaction to those playing an entire video.

Fibers, because they are scheduled at the application layer, can use a scheduler that is more appropriate for their use-case. As most fibers are used to serve transactions, they are usually active for very short periods of time and block very often. Their behavior is often to be awakened by IO or another fiber, run a short processing cycle, and then transfer control to another fiber (using a queue or another synchronization mechanism). Such behavior is best served by a scheduler employing an algorithm called "work-stealing"; when fibers behave this way, work-stealing ensures minimal cache misses when switching between fibers.


Fibers alone do not exploit the power of multiple cores, because as far as the OS is concerned, it is still scheduling a single-threaded process.

In Go, we start a goroutine using the go keyword:

func main() {
  go f() // f runs as a new goroutine
  f()    // a normal call: f runs in the main goroutine
}

Question:

1) Is a goroutine (f) a fiber that is non-preemptively scheduled by the Go runtime, in user space?

2) If yes, do concurrency issues arise in the Go environment?

3) Does Go provide an API for OS-level threads?

Best Answer

Question #1: Not really

Goroutines are a bit weird. They are somewhat similar to fibers, but also somewhat similar to threads.

  • They might be preempted.
  • They might be concurrent.
  • They might share resources.
  • They often block on a queue (channel).
  • They have their own stacks.
  • They are not directly scheduled by the OS, but by the golang runtime.

The Go runtime schedules your goroutines onto a pool of OS threads; at most GOMAXPROCS of those threads execute Go code at the same time (by default, GOMAXPROCS is the number of CPUs). On any given thread, a goroutine runs until it completes, blocks (on a channel, a lock, or a system call), or is preempted by the runtime.
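As a rough illustration (the printed values depend on your machine), the runtime package exposes these numbers directly:

package main

import (
  "fmt"
  "runtime"
)

func main() {
  // Maximum number of OS threads executing Go code simultaneously.
  // Passing 0 queries the current value without changing it.
  fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))
  // Number of logical CPUs visible to the process.
  fmt.Println("NumCPU:", runtime.NumCPU())
  // Number of goroutines that currently exist.
  fmt.Println("Goroutines:", runtime.NumGoroutine())
}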

That means you can think of goroutines as fibers shared among threads.

This means that you can think of goroutines that don't access global state much like fibers. But goroutines that do access global state need to be treated like threads.
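For example, here is a sketch of a global counter shared by many goroutines (the names counter and increment are made up for illustration); exactly as with threads, the shared variable must be guarded, here with a sync.Mutex, and a sync.WaitGroup waits for all goroutines to finish:

package main

import (
  "fmt"
  "sync"
)

var (
  mu      sync.Mutex
  counter int // global state shared by all goroutines
)

func increment(wg *sync.WaitGroup) {
  defer wg.Done()
  mu.Lock() // guard the global state, just as you would with threads
  counter++
  mu.Unlock()
}

func main() {
  var wg sync.WaitGroup
  for i := 0; i < 1000; i++ {
    wg.Add(1)
    go increment(&wg)
  }
  wg.Wait()
  fmt.Println(counter) // always 1000; without the mutex it could be lower
}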

Question #2: Yes

You need to be mindful when accessing global state!

However, the default communication mechanism, the channel, synchronizes access to shared resources, which greatly eases concurrent programming in Go.
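A minimal sketch of that style (the squaring workload is just an illustration): each goroutine sends its result over a channel, and the channel itself does the synchronization, so no explicit lock is needed.

package main

import "fmt"

func main() {
  results := make(chan int)

  // Start three goroutines; each sends one value into the channel.
  for i := 1; i <= 3; i++ {
    go func(n int) {
      results <- n * n
    }(i)
  }

  // Receiving from the channel blocks until a value is available,
  // so the results are collected safely without a mutex.
  sum := 0
  for i := 0; i < 3; i++ {
    sum += <-results
  }
  fmt.Println(sum) // 1 + 4 + 9 = 14
}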

Question #3: Not in the standard library

If you really want to, you could start threads by writing a library in C for Go that gives you access to the underlying OS thread functions (like pthread_create).

However, I strongly doubt you could use goroutines and channels on threads created in this way, as the golang runtime scheduler has no knowledge of them.

This might also cause problems with calls to other libraries (like the standard library!) that assume access to goroutines and channels.

In conclusion, I don't think directly creating and managing threads in Go is a good idea.