What is the best way to perform hyperparameter search in PyTorch? Are there frameworks that can ease this process?
@kevinzakka has implemented hypersearch.
There are still some TODOs there, so alternatively you could have a look at skorch, which lets you use scikit-learn's grid search / random search.
An example:
import torch

class Net(torch.nn.Module):
    def __init__(self, n_feature=1, n_hidden=10, n_output=1):
        '''
        A feedforward neural network.
        Arguments:
            n_feature: number of features in your data
            n_hidden: number of neurons in the hidden layer
            n_output: number of neurons in the output layer (default=1)
        '''
        super().__init__()
        # skorch can override these defaults via module__n_feature etc.
        self.hidden = torch.nn.Linear(n_feature, n_hidden, bias=True)   # hidden layer
        self.predict = torch.nn.Linear(n_hidden, n_output, bias=True)   # output layer
        # initialize the biases once here; doing this in forward() would
        # reset them on every call
        torch.nn.init.constant_(self.hidden.bias, 1)
        torch.nn.init.constant_(self.predict.bias, 1)

    def forward(self, x, **kwargs):
        '''
        Arguments:
            x: features to predict from
        '''
        x = torch.sigmoid(self.hidden(x))   # activation function for hidden layer
        x = self.predict(x)                 # linear output
        return x
from skorch import NeuralNetRegressor

net = NeuralNetRegressor(
    Net,
    max_epochs=100,
    lr=0.001,
    verbose=1,
)

# skorch expects float32 data; X and y are assumed to be numpy arrays here
X_trf = X
y_trf = y.reshape(-1, 1)   # skorch regressors need a 2D target
print(X_trf.shape, y_trf.shape)
from sklearn.model_selection import GridSearchCV

params = {
    'lr': [0.001, 0.005, 0.01, 0.05, 0.1, 0.2, 0.3],
    'max_epochs': list(range(500, 5500, 500)),
}
gs = GridSearchCV(net, params, refit=False, scoring='r2', verbose=1, cv=10)
gs.fit(X_trf, y_trf)
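Since this grid already spans 70 combinations at 10 folds each, a random search over the same space is often cheaper. A minimal sketch using scikit-learn's RandomizedSearchCV with the same net (the n_iter value is an arbitrary choice):

from sklearn.model_selection import RandomizedSearchCV

# samples n_iter parameter settings from params instead of trying them all
rs = RandomizedSearchCV(net, params, n_iter=10, refit=False,
                        scoring='r2', verbose=1, cv=10)
rs.fit(X_trf, y_trf)
print(rs.best_score_, rs.best_params_)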
Hi @ptrblck,
I hope you are doing well. Sorry to take your time. I want to do hyperparameter tuning for the CNN layers (2 or 3 layers), the number of filters per CNN layer, the FC layers (2 or 3 layers), the number of neurons ([100:10:100]), the batch size {100, 200}, the learning rate {1e-4, 1e-5}, and the dropout rate {0.3, 0.5, 0.7}.
What would you suggest? Is there a function in PyTorch for this, or is it better to do a grid search over all combinations?
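For reference, part of this search space can be expressed through skorch, which forwards entries prefixed with module__ to the wrapped module's constructor. A sketch, assuming a hypothetical MyCNN module that takes a dropout argument:

from sklearn.model_selection import GridSearchCV
from skorch import NeuralNetClassifier

net = NeuralNetClassifier(MyCNN, max_epochs=20)   # MyCNN is a stand-in
params = {
    'lr': [1e-4, 1e-5],
    'batch_size': [100, 200],
    'module__dropout': [0.3, 0.5, 0.7],   # forwarded as MyCNN(dropout=...)
}
gs = GridSearchCV(net, params, scoring='accuracy', cv=3, verbose=1)
# gs.fit(X, y) with your training arrays

Architectural choices such as the number of layers can also be exposed as constructor arguments and searched the same way.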
I am a bit skeptical of methods like grid and random search. They are worth trying, but I think experience is key in hyperparameter fine-tuning. These methods are not that good when your training takes a week and you do not have a server with hundreds of GPUs.
For example, picking a better optimizer that converges faster is a cheaper and more effective way to improve your training. Take batch size, for instance: a batch size of 32 in a CNN will tend to perform better than a batch size of 4 or 8 (at least on the dataset I am working on).
Experience plays a big role, I guess.
In my opinion, you are 75% right. In the case of something like a CNN, you can scale the model down procedurally so it takes much less time to train, then do the hyperparameter tuning. This paper found that running a grid search to obtain the best accuracy possible and then scaling up the complexity of the model led to superior accuracy. It probably would not work in all cases, but it is definitely a good application for grid searches.
Hello,
I know it has been a long time since this post.
I have spent days trying to find better hyperparameters for my model. I tried many solutions, but with no good results.
I am working with your proposed solution here, with a BERT classification model developed under PyTorch Lightning. I could not figure out what to pass to the fit method. I tried a tensor DataLoader, but it did not work, so I tried the tokenizer output like this:
input_ids_train = encoded_data_train['input_ids']   # the Xs of my model
labels_train = torch.tensor(label.values)           # the labels of my data
I got a TypeError: forward() missing 1 required positional argument: 'attention_mask' and a FitFailedWarning saying that I need to reshape my data.
Could you please tell me how I can prepare my data (input_ids, attention_mask, and labels) so they fit the fit method?
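One possible approach (a sketch, not an answer from the thread): skorch accepts a dict as X, and each key is passed to the module's forward() as a keyword argument, so the attention mask can travel alongside the input ids. This assumes the wrapped module's forward takes input_ids and attention_mask:

X_train = {
    'input_ids': encoded_data_train['input_ids'],
    'attention_mask': encoded_data_train['attention_mask'],
}
y_train = torch.tensor(label.values)
# each dict key becomes a keyword argument of forward()
gs.fit(X_train, y_train)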
Hello,
How can we perform GridSearchCV with pretrained models in PyTorch?
Thanks
skorch is compatible with these scikit-learn methods, so you might want to check it out.
I will check it out. Thank you very much
Hi @ptrblck,
I attempted to apply GridSearchCV to optimize hyperparameters, but after search.fit(train_ds, y=None) I checked search.best_score_ to see the best score, and it shows nan.
What am I doing wrong when applying the grid search?
My code follows.
Thank you
import os
from torchvision import datasets, transforms

data_dir = '/content/drive/MyDrive/raphcatr'

train_transforms = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406],
                         [0.229, 0.224, 0.225])
])
val_transforms = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406],
                         [0.229, 0.224, 0.225])
])

train_ds = datasets.ImageFolder(
    os.path.join(data_dir, 'train'), train_transforms)
val_ds = datasets.ImageFolder(
    os.path.join(data_dir, 'val'), val_transforms)
import torch.nn as nn
from torchvision import models

class PretrainedModel(nn.Module):
    def __init__(self, output_features):
        super().__init__()
        model = models.resnet18(pretrained=True)
        num_ftrs = model.fc.in_features
        # replace the final fully connected layer with a dropout + linear head
        model.fc = nn.Sequential(nn.Dropout(p=0.5),
                                 nn.Linear(num_ftrs, output_features))
        self.model = model

    def forward(self, x):
        return self.model(x)
import torch.optim as optim
from skorch import NeuralNetClassifier
from sklearn.model_selection import GridSearchCV

params = {
    'lr': [0.01, 0.02],
    'max_epochs': [10, 20],
}

net = NeuralNetClassifier(
    PretrainedModel,
    criterion=nn.CrossEntropyLoss,
    lr=0.01,
    batch_size=16,
    max_epochs=10,
    optimizer=optim.SGD,
    optimizer__momentum=0.9,
    iterator_train__shuffle=True,
    callbacks=[freezer, lrscheduler, checkpoint],  # defined elsewhere in my script
    device='cuda',  # comment out to train on the CPU
)

gs = GridSearchCV(net, params, refit=False, cv=3, scoring='accuracy', verbose=2)
gs.fit(train_ds, y=None)
Could you check if you can see all scores and whether some of the runs created a NaN loss due to a bad hyperparameter set?
If so, you might want to select the highest score which is not a NaN. I'm not familiar with the internal implementation of skorch's grid search, but I would assume that invalid runs (yielding NaNs) should be removed from the best score.
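For reference, a minimal sketch of how the individual runs could be inspected through scikit-learn's cv_results_ to find the best non-NaN score:

import numpy as np

scores = gs.cv_results_['mean_test_score']
print(scores)                                 # NaN entries mark failed runs
best = np.nanargmax(scores)                   # index of the best non-NaN score
print(gs.cv_results_['params'][best], scores[best])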
You could try Ray Tune + PyTorch Lightning. For me, that is better than hand-rolled tuning or skorch.
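To illustrate, a rough sketch using the classic Ray Tune API (ray < 2.0); train_model and evaluate are hypothetical stand-ins for the actual Lightning training and validation code:

from ray import tune

def trainable(config):
    # hypothetical helpers: build/train a model, then report a validation score
    model = train_model(lr=config['lr'], batch_size=config['batch_size'])
    tune.report(accuracy=evaluate(model))

analysis = tune.run(
    trainable,
    config={
        'lr': tune.loguniform(1e-5, 1e-4),
        'batch_size': tune.choice([100, 200]),
    },
    num_samples=10,
)
print(analysis.get_best_config(metric='accuracy', mode='max'))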