What multithreading package for Lua "just works" as shipped?

Tags: lua, multithreading, parallel-processing

Coding in Lua, I have a triply nested loop that goes through 6000 iterations. All 6000 iterations are independent and can easily be parallelized. What threads package for Lua compiles out of the box and gets decent parallel speedups on four or more cores?

Here's what I know so far:

  • luaproc comes from the core Lua team, but the software bundle on LuaForge is old, and the mailing list has reports of it segfaulting. Also, it's not obvious to me how to use the scalar message-passing model to get results ultimately into a parent thread.

  • Lua Lanes makes interesting claims but seems to be a heavyweight, complex solution. Many messages on the mailing list report trouble getting Lua Lanes to build or work. I myself have had trouble getting the underlying LuaRocks distribution mechanism to work.

  • LuaThread requires explicit locking and requires that communication between threads be mediated by global variables that are protected by locks. I could imagine worse, but I'd be happier with a higher level of abstraction.

  • Concurrent Lua provides an attractive message-passing model similar to Erlang, but it says that processes do not share memory. It is not clear whether spawn actually works with any Lua function or whether there are restrictions.

  • Russ Cox proposed an occasional threading model that works only for C threads. Not useful for me.

I will upvote all answers that report on actual experience with these or any other multithreading package, or any answer that provides new information.


For reference, here is the loop I would like to parallelize:

for tid, tests in pairs(tests) do
  local results = { }
  matrix[tid] = results
  for i, test in pairs(tests) do
    if test.valid then
      results[i] = { }
      local results = results[i]
      for sid, bin in pairs(binaries) do
        local outcome, witness = run_test(test, bin)
        results[sid] = { outcome = outcome, witness = witness }
      end
    end
  end
end

The run_test function is passed in as an argument, so a package can be useful to me only if it can run arbitrary functions in parallel. My goal is enough parallelism to get 100% CPU utilization on 6 to 8 cores.
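One way to make the independence of the iterations explicit is to flatten the nested loop into a list of self-contained jobs, each tagged with the indices needed to file its result. The following is only a sketch in plain Lua: collect_jobs, store_result, and run_all are hypothetical helper names, and the dispatch loop in run_all is sequential, standing in for whatever worker pool a threading package would provide. It assumes tests, binaries, and run_test have the same shapes as in the loop above.

```lua
-- Hypothetical helper: turn the triply nested loop into a flat job list.
-- Each job carries its indices (tid, i, sid) so results can be filed later.
local function collect_jobs(tests, binaries)
  local jobs = {}
  for tid, group in pairs(tests) do
    for i, test in pairs(group) do
      if test.valid then
        for sid, bin in pairs(binaries) do
          jobs[#jobs + 1] = { tid = tid, i = i, sid = sid, test = test, bin = bin }
        end
      end
    end
  end
  return jobs
end

-- Hypothetical helper: file one result under matrix[tid][i][sid].
local function store_result(matrix, job, outcome, witness)
  matrix[job.tid] = matrix[job.tid] or {}
  local row = matrix[job.tid]
  row[job.i] = row[job.i] or {}
  row[job.i][job.sid] = { outcome = outcome, witness = witness }
end

-- Sequential stand-in for a worker pool: a threading package would hand
-- each job to a worker and call store_result in the parent as results arrive.
local function run_all(tests, binaries, run_test)
  local matrix = {}
  for tid in pairs(tests) do
    matrix[tid] = {}          -- every tid gets a row, as in the original loop
  end
  for _, job in ipairs(collect_jobs(tests, binaries)) do
    local outcome, witness = run_test(job.test, job.bin)
    store_result(matrix, job, outcome, witness)
  end
  return matrix
end
```

Because each job carries only its indices plus the test and binary, a message-passing package would need to send back just (tid, i, sid, outcome, witness) per job, provided those values can be transmitted by that package.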

Best Answer

Norman wrote concerning luaproc:

"it's not obvious to me how to use the scalar message-passing model to get results ultimately into a parent thread"

I had the same problem with a use case I was dealing with. I liked luaproc for its simple, lightweight implementation, but my use case had C code calling Lua, which triggered a coroutine that needed to send and receive messages to interact with other luaproc threads.

To get the functionality I needed, I had to add features to luaproc that allow sending and receiving messages from the parent thread, or from any other thread not run by the luaproc scheduler. My changes also allow luaproc send/receive to be used from coroutines created inside Lua states that were themselves created by luaproc.newproc().

I also added a luaproc.addproc() function to the API. Any Lua state running in a context not controlled by the luaproc scheduler calls it to register itself with luaproc for sending and receiving messages.

I am considering posting the source as a new GitHub project, or contacting the developers to see whether they would like to pull my additions. Suggestions on how I should make it available to others are welcome.
