I am trying to load a batch from a replay buffer asynchronously with PyTorch while optimizing the model parameters, and thereby hide the batch-loading latency. The program I run is as follows:
```python
batch_load = 0.0
optimize_time = 0.0
for _ in range(100):
    begin = time.time()
    batch = sample_batch()
    batch_load += time.time() - begin

    begin = time.time()
    optimize(batch)
    optimize_time += time.time() - begin
```
When running this script, `batch_load` takes about 0.001 seconds and `optimize_time` about 0.009 seconds. To hide the latency of the batch loading (it doesn't take long in this program, but it takes more time in another program that I would actually like to optimize), I thought I could use Python's `concurrent.futures` module to acquire a future from `sample_batch` and load the next batch while `optimize` is running. That program instead looks as follows:
```python
batch_load = 0.0
optimize_time = 0.0
with concurrent.futures.ProcessPoolExecutor(max_workers=12) as executor:
    batch = sample_batch()  # first batch is loaded synchronously
    for _ in range(100):
        begin = time.time()
        future = executor.submit(sample_batch)
        batch_load += time.time() - begin

        begin = time.time()
        optimize(batch)
        optimize_time += time.time() - begin

        batch = future.result()
```
This turned out to be a pretty bad idea: the data loading time increases to about 0.085 seconds and the optimization time to about 0.13 seconds.
Can somebody kindly educate me on why the second program is so much slower than the first? Furthermore, does somebody have any ideas on how to hide data loading latency? I appreciate any answers and suggestions very much!
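For reference, here is a minimal, self-contained sketch of the prefetching pattern I have in mind. The `time.sleep` calls are placeholders standing in for my real `sample_batch` and `optimize`, and a `ThreadPoolExecutor` is used here instead of a process pool only so the snippet runs as-is without pickling concerns:

```python
import concurrent.futures
import time


def sample_batch():
    # Placeholder for the real replay-buffer sampling.
    time.sleep(0.001)
    return [0.0] * 32


def optimize(batch):
    # Placeholder for the real optimization step.
    time.sleep(0.009)


def run_async(n_steps=20):
    """Prefetch the next batch in a worker while optimizing on the current one."""
    batch_load = 0.0
    optimize_time = 0.0
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as executor:
        batch = sample_batch()  # first batch is loaded synchronously
        for _ in range(n_steps):
            begin = time.time()
            future = executor.submit(sample_batch)  # start loading the next batch
            batch_load += time.time() - begin

            begin = time.time()
            optimize(batch)  # overlaps with the background load
            optimize_time += time.time() - begin

            batch = future.result()  # wait for the prefetched batch
    return batch_load, optimize_time
```

With the placeholders above, the measured `batch_load` only accounts for submitting the work, so it should come out much smaller than `optimize_time`.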