Simple PyTorchBenchmark script gives CUDA forked subprocess error

I am trying to run a simple benchmark script, but it fails with a CUDA error, which then triggers a second error:

Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
Traceback (most recent call last):
  File "/home/cbarkhof/code-thesis/Experimentation/Benchmarking/benchmarking-models-claartje.py", line 23, in <module>
    benchmark.run()
  File "/home/cbarkhof/.local/lib/python3.6/site-packages/transformers/benchmark/benchmark_utils.py", line 674, in run
    memory, inference_summary = self.inference_memory(model_name, batch_size, sequence_length)
ValueError: too many values to unpack (expected 2)

My script is simply:

from transformers import PyTorchBenchmark, PyTorchBenchmarkArguments

benchmark_args = PyTorchBenchmarkArguments(models=["bert-base-uncased"],
                                           batch_sizes=[8],
                                           sequence_lengths=[8, 32, 128, 512],
                                           save_to_csv=True,
                                           log_filename='log',
                                           env_info_csv_file='env_info')

benchmark = PyTorchBenchmark(benchmark_args)
benchmark.run()
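For context, my understanding of the first message is that an already-initialized CUDA context cannot survive a fork; the "spawn" start method starts a fresh interpreter instead of copying the parent process. A minimal sketch of the difference using only the stdlib multiprocessing module (no CUDA involved, just to illustrate the start methods the error message refers to):

```python
import multiprocessing as mp

def worker(q):
    # In a spawned child this runs in a brand-new interpreter,
    # so no CUDA (or other) state is inherited from the parent.
    q.put("ok")

def run_spawn():
    # "spawn" avoids the fork; this is what the CUDA error asks for.
    # With the default "fork" start method on Linux, the child would
    # inherit the parent's (broken, if CUDA was initialized) state.
    ctx = mp.get_context("spawn")
    q = ctx.Queue()
    p = ctx.Process(target=worker, args=(q,))
    p.start()
    result = q.get()
    p.join()
    return result

if __name__ == "__main__":
    print(run_spawn())
```

But since I never create any processes myself, I don't see where I would apply this in my script.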

I am not aware of doing any multiprocessing myself, so why is this happening?

If anyone can point me to why this might be happening, please let me know :). Cheers!