What is the best way to perform hyperparameter search in PyTorch? Are there frameworks that can ease this process?

@kevinzakka has implemented hypersearch.

There are still some TODOs, so alternatively you could have a look at skorch, which lets you use scikit-learn's grid search and random search.

An example:

```
import torch


class Net(torch.nn.Module):
    def __init__(self):
        '''
        A feed-forward neural network.

        Sizes come from the globals D_in, H and D_out (assumed to be defined elsewhere):
            n_feature (D_in): number of features in your data
            n_hidden (H): number of neurons in the hidden layer
            n_output (D_out): number of neurons in the output layer
        '''
        super(Net, self).__init__()
        self.hidden = torch.nn.Linear(D_in, H, bias=True)  # hidden layer
        self.predict = torch.nn.Linear(H, D_out, bias=True)  # output layer
        self.n_feature, self.n_hidden, self.n_output = D_in, H, D_out
        # Initialise the biases once here; doing this in forward() would
        # reset them to 1 on every pass.
        torch.nn.init.constant_(self.hidden.bias, 1)
        torch.nn.init.constant_(self.predict.bias, 1)

    def forward(self, x, **kwargs):
        '''
        Arguments:
            x: features to predict from
        '''
        x = torch.sigmoid(self.hidden(x))  # activation function for hidden layer
        x = torch.sigmoid(self.predict(x))  # sigmoid output, so predictions lie in (0, 1)
        return x
```

```
from skorch import NeuralNetRegressor

net = NeuralNetRegressor(
    Net,
    max_epochs=100,
    lr=0.001,
    verbose=1,
)
```

```
X_trf = X
y_trf = y.reshape(-1, 1)
print(X_trf.shape,y_trf.shape)
```

```
from sklearn.model_selection import GridSearchCV

params = {
    'lr': [0.001, 0.005, 0.01, 0.05, 0.1, 0.2, 0.3],
    'max_epochs': list(range(500, 5500, 500)),
}
gs = GridSearchCV(net, params, refit=False, scoring='r2', verbose=1, cv=10)
gs.fit(X_trf, y_trf)
```

Hi ptrblck,

I hope you are doing well, and sorry to take your time. I want to tune the following hyperparameters: the number of CNN layers (2 or 3), the number of filters per CNN layer, the number of FC layers (2 or 3), the number of neurons ([100:10:100]), the batch size {100, 200}, the learning rate {10^-4, 10^-5}, and dropout {0.3, 0.5, 0.7}.

What would you suggest? Is there a function for this in PyTorch, or is it better to run a grid search over all combinations?
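To get a feel for how large that grid is, here is a minimal sketch using only the Python standard library. The dictionary keys (`n_conv_layers`, `n_fc_layers`, and so on) are made-up names, and the filter counts and the neuron range are left out because the post does not give usable values for them.

```python
import itertools
import random

# Search space taken from the post above; the key names are hypothetical.
search_space = {
    "n_conv_layers": [2, 3],
    "n_fc_layers": [2, 3],
    "batch_size": [100, 200],
    "lr": [1e-4, 1e-5],
    "dropout": [0.3, 0.5, 0.7],
}


def full_grid(space):
    """Return every combination in the search space as a list of dicts."""
    keys = list(space)
    return [dict(zip(keys, values)) for values in itertools.product(*space.values())]


def random_subset(space, n_trials, seed=0):
    """Return n_trials distinct combinations drawn at random from the full grid."""
    grid = full_grid(space)
    return random.Random(seed).sample(grid, min(n_trials, len(grid)))


grid = full_grid(search_space)
trials = random_subset(search_space, n_trials=10)
print(len(grid), "combinations in the full grid")
print(len(trials), "configurations in the random subset")
```

Each dict could then be used to build and train one model. The grid grows multiplicatively with every option added, which is the usual argument for sampling a random subset when a single training run is expensive.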

I am a bit skeptical of methods like grid and random search. It is nice to try them, but I think experience is key in hyperparameter fine-tuning. These methods are not that good when your training takes a week and you do not have a server with hundreds of GPUs.

For example, switching to an optimizer that converges faster is a cheaper and better way to improve your training. Also, take the batch size: a batch size of 32 in a CNN will tend to perform better than a batch size of 4 or 8 (at least on the dataset I am working on).

Experience plays a big role I guess.

In my opinion, you are 75% right. In the case of something like a CNN, you can scale down your model procedurally so it takes much less time to train, THEN do hyperparameter tuning. This paper found that running a grid search on the scaled-down model to obtain the best accuracy possible, THEN scaling up the complexity of the model, led to superior accuracy. It probably would not work for all cases, but it is definitely a good application for grid searches.
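One way to "scale down procedurally" is a width multiplier on the channel counts. The sketch below is only an illustration of that idea: `make_cnn`, `width_mult`, and the base sizes of 32 and 64 channels are assumptions of mine, not something taken from the paper mentioned above.

```python
import torch
from torch import nn


def make_cnn(width_mult=1.0, n_classes=10):
    """Build a small CNN whose channel counts are scaled by width_mult."""
    c1 = max(1, int(32 * width_mult))
    c2 = max(1, int(64 * width_mult))
    return nn.Sequential(
        nn.Conv2d(3, c1, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.MaxPool2d(2),
        nn.Conv2d(c1, c2, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.AdaptiveAvgPool2d(1),  # global average pooling, so any input size works
        nn.Flatten(),
        nn.Linear(c2, n_classes),
    )


def count_params(model):
    """Total number of parameters in the model."""
    return sum(p.numel() for p in model.parameters())


small = make_cnn(width_mult=0.25)  # cheap model to run the grid search on
full = make_cnn(width_mult=1.0)    # model to train once with the best settings
print(count_params(small), "parameters in the scaled-down model")
print(count_params(full), "parameters in the full-size model")
```

The search would then run on `small`, and the best settings would be reused for `full`. Whether the best hyperparameters actually transfer between sizes depends on the model and dataset.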

Hello,

I know it has been a long time since this post.

I have spent days trying to find better hyperparameters for my model. I tried many solutions, but with no good results.

I am working with the solution proposed here, applied to a BERT classification model developed with PyTorch Lightning. I couldn’t work out what to pass to the `fit` method. I tried a tensor dataloader, but it doesn’t work, so I tried the output of the tokenizer like this:

```
input_ids_train = encoded_data_train['input_ids'] # Xs of my models
labels_train = torch.tensor(label.values) # the labels of my data
```

I got a `TypeError: forward() missing 1 required positional argument: 'attention_mask'` and a `FitFailedWarning` saying that I need to reshape my data.

Could you please tell me how to prepare my data (input_ids, attention_mask and labels) so that it works with the `fit` method?

Hello,

How can we perform GridSearchCV with pretrained models in pytorch?

Thanks

I will check it out. Thank you very much

Hi @ptrblck

I tried applying GridSearchCV to optimize hyperparameters, but after calling `search.fit(train_ds, y=None)`, `search.best_score_` shows `nan` instead of the best score.

What am I doing wrong in applying the grid search?

My code is below.

Thank you

```
import os

from sklearn.model_selection import GridSearchCV
from skorch import NeuralNetClassifier
from torch import nn, optim
from torchvision import datasets, models, transforms

data_dir = '/content/drive/MyDrive/raphcatr'

train_transforms = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406],
                         [0.229, 0.224, 0.225])
])

val_transforms = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406],
                         [0.229, 0.224, 0.225])
])

train_ds = datasets.ImageFolder(
    os.path.join(data_dir, 'train'), train_transforms)
val_ds = datasets.ImageFolder(
    os.path.join(data_dir, 'val'), val_transforms)


class PretrainedModel(nn.Module):
    def __init__(self, output_features):
        super().__init__()
        model = models.resnet18(pretrained=True)
        num_ftrs = model.fc.in_features
        # Replace the final layer with dropout followed by a new linear head.
        classifier = nn.Sequential(nn.Dropout(p=0.5),
                                   nn.Linear(num_ftrs, output_features))
        model.fc = classifier
        self.model = model

    def forward(self, x):
        return self.model(x)


params = {
    'lr': [0.01, 0.02],
    'max_epochs': [10, 20],
}

# freezer, lrscheduler and checkpoint are callbacks defined elsewhere (not shown).
net = NeuralNetClassifier(
    PretrainedModel,
    criterion=nn.CrossEntropyLoss,
    lr=0.01,
    batch_size=16,
    max_epochs=10,
    optimizer=optim.SGD,
    optimizer__momentum=0.9,
    iterator_train__shuffle=True,
    callbacks=[freezer, lrscheduler, checkpoint],
    device='cuda'  # comment to train on cpu
)

gs = GridSearchCV(net, params, refit=False, cv=3, scoring='accuracy', verbose=2)
gs.fit(train_ds, y=None)
```

Could you check whether you can see all the scores, and whether some of the runs created a NaN loss due to a bad hyperparameter set?

If so, you might want to select the highest score which is not a NaN. I’m not familiar with the internal implementation of `skorch`’s grid search, but I would assume that invalid runs (yielding NaNs) would be removed from the best score.
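A minimal sketch of that check, using only the standard library. The `cv_results` dict here is a hand-written stand-in with made-up scores; after a real search, `gs.cv_results_` is a dict that contains (among others) the keys `params` and `mean_test_score`, which is all this helper relies on.

```python
import math

# Hypothetical stand-in for gs.cv_results_; the scores are invented for illustration.
cv_results = {
    "params": [
        {"lr": 0.01, "max_epochs": 10},
        {"lr": 0.01, "max_epochs": 20},
        {"lr": 0.02, "max_epochs": 10},
        {"lr": 0.02, "max_epochs": 20},
    ],
    "mean_test_score": [0.81, float("nan"), 0.84, float("nan")],
}


def best_finite_run(results):
    """Return (params, score) of the best run whose mean score is not NaN, or None."""
    scored = [
        (params, score)
        for params, score in zip(results["params"], results["mean_test_score"])
        if not math.isnan(score)
    ]
    if not scored:
        return None  # every run failed
    return max(scored, key=lambda item: item[1])


# Print every run so that failed (NaN) configurations are visible.
for params, score in zip(cv_results["params"], cv_results["mean_test_score"]):
    print(params, score)

best = best_finite_run(cv_results)
print("best finite run:", best)
```

If every score comes back as NaN, the fits themselves are probably failing rather than diverging. `GridSearchCV` accepts `error_score='raise'`, which should surface the underlying exception instead of recording NaN.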