Running Python tasks in parallel
There are many options for running Python tasks in parallel. This post gives a brief description of some of them.
threading and multiprocessing are part of the standard library. threading is used for thread-based parallelism, while multiprocessing is used for process-based parallelism. If your tasks involve Python objects that hold the Global Interpreter Lock (GIL), then threading does not provide much parallelism, and you should opt for multiprocessing. On the other hand, if your tasks involve NumPy objects that release the GIL, then threading is a better choice. In general, threading has less overhead than multiprocessing.
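As a minimal sketch (using a toy squaring function of my own), thread-based parallelism with threading looks like this. Note that because the pure-Python work below holds the GIL, the threads run concurrently but not truly in parallel; I/O or NumPy calls that release the GIL would:

```python
import threading

def work(i, results):
    # Pure-Python computation holds the GIL, so these threads interleave
    # rather than run simultaneously on multiple cores.
    results[i] = i * i

results = [None] * 4
threads = [threading.Thread(target=work, args=(i, results)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)  # [0, 1, 4, 9]
```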
concurrent.futures is also part of the standard library. It provides a high-level interface for asynchronous parallelism (using Future objects). You can easily choose between a thread pool or a process pool by using ThreadPoolExecutor or ProcessPoolExecutor. If you are running user-level code, I believe concurrent.futures is usually the best option. But if you are writing more low-level code, I believe you still want to choose between threading and multiprocessing. Note that you can still use a pool of threads from multiprocessing by using multiprocessing.pool.ThreadPool.
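To make the Future-based interface concrete, here is a small sketch (with a toy square function of my own) that submits tasks to a ThreadPoolExecutor and collects results as they finish:

```python
import concurrent.futures

def square(x):
    return x * x

results = {}
with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
    # submit() returns a Future immediately; as_completed() yields each
    # Future as soon as its result is ready, in completion order.
    futures = {executor.submit(square, i): i for i in range(10)}
    for fut in concurrent.futures.as_completed(futures):
        results[futures[fut]] = fut.result()

print(results[3])  # 9
```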
dask is a powerful library that helps with the common pains of dealing with large data and parallel computing (e.g. delayed evaluation, lazy loading, array chunking, distributed computing). There are two kinds of schedulers: the single-machine scheduler (default) and the more advanced dask.distributed. The advanced dask.distributed provides asynchronous parallelism similar to concurrent.futures and can be used on a cluster. But if you are just running code on a single machine, the default scheduler should suffice (it requires zero setup). You can set the scheduler to “threads”, “processes” or “single-threaded”. The Best Practices pages (1, 2, 3) contain some very useful examples.
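A minimal sketch of the default single-machine scheduler, assuming dask is installed and using a toy square function:

```python
import dask

@dask.delayed
def square(x):
    # dask.delayed makes the call lazy; nothing runs until compute()
    return x * x

tasks = [square(i) for i in range(10)]
# Pick the scheduler explicitly: "threads", "processes" or "single-threaded"
result = dask.compute(*tasks, scheduler="threads")
```

dask.compute returns a tuple with one entry per delayed task.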
joblib is a lightweight library that also provides lazy evaluation and parallel computing. For process-based parallelism, it uses the alternative serialization library cloudpickle, which allows you to serialize more things than the standard pickle.
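A minimal joblib sketch, assuming joblib is installed and using a toy square function; the Parallel/delayed pair is the core of its API:

```python
from joblib import Parallel, delayed

def square(x):
    return x * x

# prefer="threads" asks for the thread-based backend; drop it to let
# joblib pick its process-based default (loky).
result = Parallel(n_jobs=2, prefer="threads")(
    delayed(square)(i) for i in range(10)
)
```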
I want to mention that Keras and TensorFlow also have built-in parallelism. So, if you are dealing with ML stuff, you should be able to use what is offered by Keras/TensorFlow (e.g. keras.Model.fit). However, I think the documentation is kind of difficult to go through.
Simple code snippets:
```python
def fun(x):
    return x * x

# No parallelism
result = map(fun, range(10))

# Using concurrent.futures
import concurrent.futures
with concurrent.futures.ProcessPoolExecutor(max_workers=4) as executor:
    result = executor.map(fun, range(10))

# Using multiprocessing
import multiprocessing
with multiprocessing.Pool(processes=4) as pool:
    result = pool.imap(fun, range(10))

# Using threads from multiprocessing
import multiprocessing.pool
with multiprocessing.pool.ThreadPool(processes=4) as pool:
    result = pool.imap(fun, range(10))
```