With multiprocessing and Process, I can use Value and Array by passing them through the args parameter.
With multiprocessing and Pool, how can I use Value and Array?
There is nothing in the docs on how to do this.
    from multiprocessing import Process, Value, Array

    def f(n, a):
        n.value = 3.1415927
        for i in range(len(a)):
            a[i] = -a[i]

    if __name__ == '__main__':
        num = Value('d', 0.0)
        arr = Array('i', range(10))
        p = Process(target=f, args=(num, arr))
        p.start()
        p.join()
        print(num.value)
        print(arr[:])
I am trying to use Value and Array within the code snippet below.
    import multiprocessing
    from multiprocessing import Value, Array

    def do_calc(data):
        # access num or
        # work to update arr
        newdata = data * 2
        return newdata

    def start_process():
        print('Starting', multiprocessing.current_process().name)

    if __name__ == '__main__':
        num = Value('d', 0.0)
        arr = Array('i', range(10))
        inputs = list(range(10))
        print('Input :', inputs)
        pool_size = multiprocessing.cpu_count() * 2
        pool = multiprocessing.Pool(processes=pool_size, initializer=start_process)
        pool_outputs = pool.map(do_calc, inputs)
        pool.close()  # no more tasks
        pool.join()   # wrap up current tasks
        print('Pool :', pool_outputs)
Best Answer
I never knew "the reason" for this, but multiprocessing (mp) uses different pickler/unpickler mechanisms for functions passed to most Pool methods. As a consequence, objects created by things like mp.Value, mp.Array, mp.Lock, ..., can't be passed as arguments to such methods, although they can be passed as arguments to mp.Process and to the optional initializer function of mp.Pool(). Because of the latter, you can hand the shared objects to every worker through the initializer, which can store them in module globals for the task function to use.
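The initializer route can be sketched roughly as follows. This is a minimal illustration rather than the answer's original code: the names init_worker, scale, and run_with_initializer are hypothetical, and the pool size of 2 is an arbitrary choice. It assumes only the documented initializer and initargs parameters of mp.Pool().

```python
import multiprocessing as mp

# Hypothetical names: init_worker, scale and run_with_initializer are
# illustrative only and do not come from the original answer.

def init_worker(shared_arr, shared_val):
    # Runs once in every worker process. Stash the shared objects in
    # module globals so that task functions can reach them later.
    global arr, val
    arr = shared_arr
    val = shared_val

def scale(i):
    # Each task writes only to its own slot of the shared array.
    arr[i] = val.value * i

def run_with_initializer(n):
    shared_arr = mp.Array('i', n)   # n zero-initialised C ints
    shared_val = mp.Value('i', 3)
    # The shared objects travel via initargs, not via map() arguments.
    with mp.Pool(processes=2, initializer=init_worker,
                 initargs=(shared_arr, shared_val)) as pool:
        pool.map(scale, range(n))
    return shared_arr[:]

if __name__ == '__main__':
    print(run_with_initializer(5))  # [0, 3, 6, 9, 12]
```

The task function never receives the Array or Value as an argument. It only reads the globals that init_worker set up in its own process, which is why this pattern does not depend on how the worker processes were started.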
That's the only way I know of to get this to work across platforms.
On Linux-y platforms (where mp creates new processes via fork()), you can instead create your mp.Array and mp.Value (etc) objects as module globals any time before you do mp.Pool(). Processes created by fork() inherit whatever is in the module global address space at the time mp.Pool() executes.
But that doesn't work at all on platforms (read "Windows") that don't support fork().
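A rough sketch of the fork-only variant, under the assumption that the "fork" start method is available. It is requested explicitly through mp.get_context('fork') so the example does not depend on the platform default. The names counter, squares, record_square, and run_with_fork are hypothetical.

```python
import multiprocessing as mp

# Assumption: the "fork" start method exists on this platform.
# get_context('fork') raises ValueError where it does not (e.g. Windows).
ctx = mp.get_context('fork')

# Module globals created before the pool, so forked workers inherit them.
# All names below are hypothetical and chosen for illustration.
counter = ctx.Value('i', 0)
squares = ctx.Array('i', 5)

def record_square(i):
    squares[i] = i * i
    with counter.get_lock():    # += on a Value is not atomic by itself
        counter.value += 1

def run_with_fork():
    counter.value = 0           # reset so repeated runs start clean
    with ctx.Pool(processes=2) as pool:
        pool.map(record_square, range(len(squares)))
    return counter.value, squares[:]

if __name__ == '__main__':
    print(run_with_fork())      # (5, [0, 1, 4, 9, 16])
```

No initializer is needed here because the workers are forked after counter and squares already exist. Under a start method that launches a fresh interpreter, each worker would re-create its own separate copies of these globals, so updates made in the workers would not be visible to the parent.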