Does it ever make sense to use more concurrent processes than processor cores?

concurrency, cpu, go, golang, multithreading

I've got a process in Go. Here's an example counting lines in text, though the question is meant to be far more general than this particular example:

func lineCount(s string) int {
    count := 0
    for _, c := range s {
        if c == '\n' {
            count++
        }
    }
    return count
}
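For concreteness, a quick check of this sequential version (the input string here is just an illustration):

```go
package main

import "fmt"

// Same sequential counter as above, reproduced so this snippet
// runs on its own.
func lineCount(s string) int {
    count := 0
    for _, c := range s {
        if c == '\n' {
            count++
        }
    }
    return count
}

func main() {
    fmt.Println(lineCount("one\ntwo\nthree\n")) // prints 3
}
```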

Alright, not bad, but it's too slow, so let's make it concurrent:

func newLine(r rune, c chan<- struct{}, wg *sync.WaitGroup) {
    defer wg.Done()
    if r == '\n' {
        c <- struct{}{}
    }
}

func sumLines(c <-chan struct{}, result chan<- int) {
    count := 0
    for range c {
        count++
    }
    result <- count
}

func lineCount(s string) int {
    c := make(chan struct{})
    var wg sync.WaitGroup
    for _, r := range s {
        wg.Add(1)
        go newLine(r, c, &wg)
    }
    result := make(chan int)
    go sumLines(c, result)
    wg.Wait()
    close(c)
    return <-result
}
    

Better, because now we're using all our cores, but let's be honest, one goroutine per letter is probably overkill, and we're likely adding a lot of overhead between the horrendous number of goroutines and the locking/unlocking of the wait group. Let's do better:

func newLine(s string, c chan<- int, wg *sync.WaitGroup) {
    defer wg.Done()
    count := 0
    for _, r := range s {
        if r == '\n' {
            count++
        }
    }
    c <- count
}

func sumLines(c <-chan int, result chan<- int) {
    count := 0
    for miniCount := range c {
        count += miniCount
    }
    result <- count
}

func lineCount(s string) int {
    c := make(chan int)
    var wg sync.WaitGroup
    chunk := len(s) / MAGIC_NUMBER
    for i := 0; i < MAGIC_NUMBER; i++ {
        start := i * chunk
        end := start + chunk
        if i == MAGIC_NUMBER-1 {
            end = len(s) // the last chunk picks up any remainder
        }
        wg.Add(1)
        go newLine(s[start:end], c, &wg)
    }
    result := make(chan int)
    go sumLines(c, result)
    wg.Wait()
    close(c)
    return <-result
}

So now we're dividing up our string evenly (except the last part) into goroutines. I've got 8 cores, so do I ever have a reason to set MAGIC_NUMBER to greater than 8? Again, while I'm writing this question with the example of counting lines in text, the question is really directed at any situation where the problem can be sliced and diced any number of ways, and it's really up to the programmer to decide how many slices to go for.

Best Answer

The canonical time when you use far, far more processes than cores is when your processes aren't CPU bound. If your processes are I/O bound (either disk or more likely network), then you can absolutely and sensibly have a huge number of processes per core, because the processes are sleeping most of the time anyway. Unsurprisingly enough, this is how any modern web server works.
