joblib.Parallel is overwriting variables

Joblib is a nice Python library for “lightweight pipelining”, which includes, for example, “easy simple parallel computing”. scikit-learn uses it heavily to speed up, among other things, its machine learning algorithms. However, it has some quirks.

One of them is that joblib.Parallel (used for easily parallelizing for-loops) overwrites certain variables defined in the outer scope. These variables include, for example, args, parser, exitcode, and spawn (see pierreglaser’s response here). This can be rather unexpected and cause confusion. There is a bug report asking to fix or document this behavior. See the following code for an example:
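The original snippet is not preserved in this copy of the post. What follows is a minimal reconstruction, not the author's code: it assumes a module-level variable named args set to 5 and a hypothetical helper show_args that simply returns it. Whether the overwrite actually shows up depends on the installed joblib version, so no output is given here.

```python
from joblib import Parallel, delayed

# Module-level variable whose name collides with one of the names
# reported as overwritten (args, parser, exitcode, spawn).
args = 5


def show_args(i):
    # Hypothetical helper: reads the global `args`; the index `i` is ignored.
    return args


# Call the helper twice through joblib.Parallel.
result = Parallel(n_jobs=2)(delayed(show_args)(i) for i in range(2))
print(result)
```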

In the previous code snippet you would expect to get a list with two “5”s: [5, 5]. Instead, what you get is a list of two Namespace objects defined by the joblib.Parallel “context”.
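One way to sidestep the problem, sketched here as a suggestion rather than an official fix, is to stop relying on the global lookup and pass the value to the function explicitly. The helper name show_value and its parameter value are made up for this illustration. Because args is evaluated in the calling process and sent along as an ordinary argument, the function no longer looks up a global named args at all.

```python
from joblib import Parallel, delayed

# Same colliding name as before, defined at module level.
args = 5


def show_value(i, value):
    # Hypothetical helper: returns the argument it was given
    # instead of reading a global variable.
    return value


# `args` is read here, in the calling process, and passed explicitly.
result = Parallel(n_jobs=2)(delayed(show_value)(i, args) for i in range(2))
print(result)  # [5, 5]
```

Renaming the variable to something that does not collide with the affected names would presumably also help, but that depends on knowing the full list of affected names.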

For reference, this behavior was observed with Python 3, joblib 0.13.1, and Ubuntu 18.04 (kernel Linux 4.15.0-43-generic x86_64).
