Parallelization of multiple cost functions

Hi

I have an RNN and a set of time series data organized in n sequences as (y_1, u_1), (y_2, u_2), …, (y_n, u_n), where y_i is a vector of output data and u_i is a matrix in which each column is a signal and each row is a time sample of all signals.

The cost function to be minimized has the form of a sum:

min f_1(y_1, u_1) + f_2(y_2, u_2) + f_3(y_3, u_3) + … + f_n(y_n, u_n)

where each cost function f_i is different for each sequence i.

Does anyone have experience with this, or can you point me to where to start looking? I would like each term f_i(y_i, u_i) of the cost function to be evaluated in parallel using multiple CPUs/GPUs.

Hey @daner, if you are using a single machine with multiple GPUs, you can try scatter + parallel_apply + gather from torch.nn.parallel. The implementation of DataParallel can serve as an example. [link]
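For the CPU side of the question, the same scatter / apply / gather pattern can be sketched with only the Python standard library. This is a rough illustration, not the PyTorch API: total_cost, squared_error and absolute_error are hypothetical names, and the toy "model" (each output predicted as the sum of that row's signals) is an assumption made purely so the example runs. Threads only give a real speedup when the per-term work releases the GIL, as PyTorch and NumPy kernels typically do; for pure-Python cost functions, swapping in ProcessPoolExecutor would be the usual alternative.

```python
from concurrent.futures import ThreadPoolExecutor


def total_cost(cost_fns, sequences, max_workers=None):
    """Evaluate f_i(y_i, u_i) for every sequence concurrently and sum the terms."""
    if len(cost_fns) != len(sequences):
        raise ValueError("need exactly one cost function per sequence")
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # "scatter" + "apply": submit one task per term f_i(y_i, u_i)
        futures = [pool.submit(f, y, u) for f, (y, u) in zip(cost_fns, sequences)]
        # "gather": collect the per-sequence costs in their original order
        terms = [fut.result() for fut in futures]
    return sum(terms), terms


# Hypothetical cost functions, one per sequence (assumed for illustration).
def squared_error(y, u):
    # Toy model: predict each output as the sum of that row's signals.
    return sum((yi - sum(row)) ** 2 for yi, row in zip(y, u))


def absolute_error(y, u):
    return sum(abs(yi - sum(row)) for yi, row in zip(y, u))


sequences = [
    ([1.0, 2.0], [[0.5, 0.5], [1.0, 0.0]]),  # (y_1, u_1): 2 time samples, 2 signals
    ([3.0], [[1.0, 1.0]]),                   # (y_2, u_2): 1 time sample, 2 signals
]
total, terms = total_cost([squared_error, absolute_error], sequences)
print(total, terms)
```

Because the results are gathered in submission order, terms[i] always corresponds to f_(i+1), regardless of which task finishes first.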