Java – Why does this Java code not utilize all CPU cores

cpujavamulticorescalabilityscaling

The attached simple Java code should load all available cpu core when starting it with the right parameters. So for instance, you start it with

java VMTest 8 int 0

and it will start 8 threads that do nothing else than looping and adding 2 to an integer. Something that runs in registers and not even allocates new memory.

The problem we are facing now is, that we do not get a 24 core machine loaded (AMD 2 sockets with 12 cores each), when running this simple program (with 24 threads of course). Similar things happen with 2 programs each 12 threads or smaller machines.

So our suspicion is that the JVM (Sun JDK 6u20 on Linux x64) does not scale well.

Did anyone see similar things or has the ability to run it and report whether or not it runs well on his/her machine (>= 8 cores only please)? Ideas?

I tried that on Amazon EC2 with 8 cores too, but the virtual machine seems to run different from a real box, so the loading behaves totally strange.

package com.test;

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class VMTest
{
    public class IntTask implements Runnable 
    {
        @Override
        public void run()
        {
            int i = 0;

            while (true)
            {
                i = i + 2;
            }
        }
    }
    public class StringTask implements Runnable 
    {
        @Override
        public void run()
        {
            int i = 0;

            String s;
            while (true)
            {
                i++;
                s = "s" + Integer.valueOf(i);
            }
        }
    }
    public class ArrayTask implements Runnable 
    {
        private final int size; 
        public ArrayTask(int size)
        {
            this.size = size;
        }
        @Override
        public void run()
        {
            int i = 0;

            String[] s;
            while (true)
            {
                i++;
                s = new String[size];
            }
        }
    }

    public void doIt(String[] args) throws InterruptedException
    {
        final String command = args[1].trim();

        ExecutorService executor = Executors.newFixedThreadPool(Integer.valueOf(args[0]));
        for (int i = 0; i < Integer.valueOf(args[0]); i++)
        {
            Runnable runnable = null;
            if (command.equalsIgnoreCase("int"))
            {
                runnable = new IntTask();
            }
            else if (command.equalsIgnoreCase("string"))
            {
                runnable = new StringTask();
            }
            Future<?> submit = executor.submit(runnable);
        }
        executor.awaitTermination(1, TimeUnit.HOURS);
    }

    public static void main(String[] args) throws InterruptedException
    {
        if (args.length < 3)
        {
            System.err.println("Usage: VMTest threadCount taskDef size");
            System.err.println("threadCount: Number 1..n");
            System.err.println("taskDef: int string array");
            System.err.println("size: size of memory allocation for array, ");
            System.exit(-1);
        }

        new VMTest().doIt(args);
    }
}

Best Answer

I don't see anything wrong with your code.

However, unfortunately, you can't specify the processor affinity in Java. So, this is actually left up to the OS, not the JVM. It's all about how your OS handles threads.

You could split your Java threads into separate processes and wrap them up in native code, to put one process per core. This does, of course, complicate communication, as it will be inter-process rather than inter-thread. Anyway, this is how popular grid computing applications like boink work.

Otherwise, you're at the mercy of the OS to schedule the threads.