Is there a way to train multiple independent neural networks in parallel, on CPU only, each with its own randomized hyperparameter search?

Take a trivial network like the one below (this is just an example):

```
import torch
import torch.nn as nn


class NeuralNet(nn.Module):
    def __init__(self):
        super(NeuralNet, self).__init__()
        self.Linear = nn.Linear(1, 1, bias=True)

    def forward(self, x):
        out = self.Linear(x)
        return out
```

I then run a randomized search over its hyperparameters:

```
from skorch import NeuralNetRegressor
from skorch.callbacks import EarlyStopping
from sklearn.model_selection import RandomizedSearchCV

# Stop once the validation loss has not improved by 1e-4 for 5 epochs.
es = EarlyStopping(
    monitor='valid_loss',
    patience=5,
    threshold=1e-4,
    threshold_mode='abs',
    lower_is_better=True,
)

model = NeuralNetRegressor(
    module=NeuralNet,
    criterion=nn.MSELoss,
    optimizer=torch.optim.SGD,
    optimizer__momentum=0.95,
    optimizer__weight_decay=0.2,
    max_epochs=200,
    batch_size=32,
    lr=1e-4,
    callbacks=[es],
    verbose=0,  # silence the per-epoch training log
)

# Candidate values sampled by the randomized search.
params_grid = {
    'lr': [1e-5, 1e-4],
    'optimizer__momentum': [0.9, 0.95, 0.99],
    'optimizer__weight_decay': [0.1, 0.9, 1.5],
}

gs = RandomizedSearchCV(model, params_grid, refit=True, cv=3,
                        n_jobs=5, n_iter=10, random_state=1234,
                        scoring='neg_mean_squared_error')
# x_train and y_train are the training data, defined elsewhere.
gs.fit(x_train, y_train)
```

Can I run the above for, say, three different models at the same time, where all models are completely independent? That is, can I run three independent searches for three independent models concurrently if I have around 80 cores and each search uses 20 of them?

My approach was to define a function that performs the search and then execute that function in parallel, but it did not seem to speed up any of my calculations. How would you approach a problem like this?
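For illustration, here is a minimal sketch of what a function-per-search setup of this kind could look like. It is not the exact code used here. The names `split_cores`, `run_search`, and `run_all` are hypothetical, and the body of `run_search` is a stand-in that only samples candidates from the parameter grid, so the sketch runs without skorch or any data. In a real version, that body would build the model and call `RandomizedSearchCV(..., n_jobs=n_jobs).fit(...)`.

```python
import concurrent.futures as cf
import random

PARAMS_GRID = {
    "lr": [1e-5, 1e-4],
    "optimizer__momentum": [0.9, 0.95, 0.99],
    "optimizer__weight_decay": [0.1, 0.9, 1.5],
}


def split_cores(total_cores, n_searches):
    """Equal core budget per search, never less than 1."""
    return max(1, total_cores // n_searches)


def run_search(name, n_jobs, seed, n_iter=10):
    """Hypothetical stand-in for one independent randomized search.

    Assumption: the real body would fit RandomizedSearchCV with
    n_jobs=n_jobs. Here we only sample candidates and rank them with a
    placeholder score, which is NOT a real validation loss.
    """
    rng = random.Random(seed)
    candidates = [
        {key: rng.choice(values) for key, values in PARAMS_GRID.items()}
        for _ in range(n_iter)
    ]
    best = min(candidates, key=lambda p: p["lr"] * p["optimizer__weight_decay"])
    return name, n_jobs, best


def run_all(model_names, total_cores, executor_cls=cf.ProcessPoolExecutor):
    """Launch one search per model, each in its own worker."""
    n_jobs = split_cores(total_cores, len(model_names))
    with executor_cls(max_workers=len(model_names)) as pool:
        futures = [
            pool.submit(run_search, name, n_jobs, seed)
            for seed, name in enumerate(model_names)
        ]
        # Collect results in submission order.
        return [future.result() for future in futures]
```

Called under an `if __name__ == "__main__":` guard as `run_all(["model_a", "model_b", "model_c"], total_cores=80)`, this gives each search a budget of `80 // 3 = 26` workers. One thing worth checking in a setup like this is the total thread count: if every worker also lets PyTorch use several intra-op threads (the setting controlled by `torch.set_num_threads`), the processes may compete for the same cores, which could hide any speed-up.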

I hope this is not a stupid question!