I'm trying to instantiate objects using imported modules. To make these imports process-safe (since I'm on Windows), I'm placing the import statements inside the if __name__ == '__main__': block.
My files look somewhat like this:
main.py
# main.py
from multiprocessing import Process

# target func for new process
def init_child(foo_obj, bar_obj):
    pass

if __name__ == "__main__":
    # protect imports from child process
    from foo import getFoo
    from bar import getBar

    # get new objects
    foo_obj = getFoo()
    bar_obj = getBar()

    # start new process
    child_p = Process(target=init_child, args=(foo_obj, bar_obj))
    child_p.start()

    # wait for process to join
    child_p.join()
foo.py
# foo.py
import os

print('Foo Loaded by PID: ' + str(os.getpid()))

class Foo:
    def __init__(self):
        pass

def getFoo():
    # returning new instance of class
    return Foo()
bar.py
# bar.py
import os

print('Bar Loaded by PID: ' + str(os.getpid()))

class Bar:
    def __init__(self):
        pass

def getBar():
    # not returning a new instance
    return 'bar'
output
Foo Loaded by PID: 58760
Bar Loaded by PID: 58760
Foo Loaded by PID: 29376
The output I get indicates that the foo module was loaded twice. I understand that the interpreter executes the main module again (since Windows does not support the fork system call), but what's odd is that the import sits inside the __main__ block, which should not run in the child process. This could become an issue when sharing objects, like Queues imported from a dedicated module. Any ideas what might cause this?
Thanks!
When you use the multiprocessing module on Windows, new processes are created with the spawn start method: a fresh Python interpreter starts and imports your main module. Your if __name__ == "__main__": guard does keep the protected imports from re-running during that re-import. What actually imports foo in the child is a second mechanism: the arguments passed to Process are pickled in the parent and unpickled in the child, and reconstructing an instance of foo.Foo requires the child to import the foo module. That is why 'Foo Loaded' is printed twice, while bar is not loaded again, since getBar() returns a plain string that carries no reference to the bar module.
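You can confirm this by inspecting the pickle payload itself. A minimal sketch, assuming it is run from the same directory as foo.py:

import pickle
from foo import getFoo

foo_obj = getFoo()
payload = pickle.dumps(foo_obj)

# the payload stores the defining module and class by name ('foo', 'Foo');
# unpickling in the spawned child therefore has to import foo first
print(b'foo' in payload, b'Foo' in payload)  # expected: True True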
To structure this, you can use the multiprocessing.Pool class to create a pool of worker processes and run your init_child function in a worker via the apply_async() method. Be aware, though, that the worker still receives foo_obj by pickling, so the foo module will be imported once per worker process either way; that import is expected and harmless as long as module-level side effects stay behind the __main__ guard or inside functions.
Additionally, make sure the target function and any classes you pass across the process boundary are defined at module top level (not inside the __main__ block), so that pickle can locate them by module and name in the child process.
from multiprocessing import Pool

def init_child(foo_obj, bar_obj):
    pass

if __name__ == "__main__":
    from foo import getFoo
    from bar import getBar

    foo_obj = getFoo()
    bar_obj = getBar()

    with Pool() as pool:
        pool.apply_async(init_child, (foo_obj, bar_obj))
        pool.close()
        pool.join()
Also, note that when using Pool as a context manager you should still call pool.close() and pool.join() inside the with block if you want to wait for the submitted work: the context manager calls terminate() on exit, which would otherwise stop the workers before init_child has a chance to run.
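One further note: apply_async() returns an AsyncResult, and exceptions raised in the worker are silently dropped unless you retrieve the result. Here is a sketch of the same example with that added (the 30-second timeout is just illustrative):

from multiprocessing import Pool

def init_child(foo_obj, bar_obj):
    pass

if __name__ == "__main__":
    from foo import getFoo
    from bar import getBar

    foo_obj = getFoo()
    bar_obj = getBar()

    with Pool() as pool:
        # keep the AsyncResult so worker errors are not lost
        result = pool.apply_async(init_child, (foo_obj, bar_obj))
        result.get(timeout=30)  # waits for the task and re-raises worker exceptions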