How do you use skorch for a grid search when your data does not fit into memory? I have a dataloader that returns mini-batches, and I want to run a grid search to find the best hyperparameters for my model.
As you may know, you can use datasets or even your own dataloader with skorch, but this is a bit problematic in conjunction with GridSearchCV or other parameter searches, since they expect indexable inputs, which torch datasets are not. For this reason skorch provides SliceDataset. You can use it as follows:
from sklearn.model_selection import GridSearchCV
from skorch.helper import SliceDataset

gs = GridSearchCV(mySkorchNet, params, ...)
ds = MyCustomDataset()
X_sl = SliceDataset(ds, idx=0)  # yields the inputs (first element of each sample)
y_sl = SliceDataset(ds, idx=1)  # yields the targets (second element)
gs.fit(X_sl, y_sl)
What SliceDataset does is emulate indexing operations so that GridSearchCV can compute the train/validation split and slice the data properly without loading all of it into memory; in other words, it is basically a torch dataset you can slice.
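To make this concrete, here is a minimal end-to-end sketch of the same pattern. The MyCustomDataset, MyModule, and hyperparameter grid below are invented for illustration only, and the skorch/sklearn settings are just one reasonable configuration; adapt them to your own data and model.

import numpy as np
import torch.nn as nn
from torch.utils.data import Dataset
from sklearn.model_selection import GridSearchCV
from skorch import NeuralNetClassifier
from skorch.helper import SliceDataset

class MyCustomDataset(Dataset):
    """Toy stand-in for a dataset that loads samples lazily."""
    def __init__(self, n_samples=1000, n_features=20):
        self.X = np.random.rand(n_samples, n_features).astype(np.float32)
        self.y = np.random.randint(0, 2, size=n_samples).astype(np.int64)

    def __len__(self):
        return len(self.X)

    def __getitem__(self, i):
        # Return (input, target); SliceDataset picks one of the two via `idx`.
        return self.X[i], self.y[i]

class MyModule(nn.Module):
    def __init__(self, num_units=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(20, num_units),
            nn.ReLU(),
            nn.Linear(num_units, 2),
        )

    def forward(self, X):
        return self.net(X)  # raw logits, paired with CrossEntropyLoss below

net = NeuralNetClassifier(
    MyModule,
    criterion=nn.CrossEntropyLoss,
    max_epochs=5,
    lr=0.1,
    train_split=False,  # let GridSearchCV handle the validation split
    verbose=0,
)

ds = MyCustomDataset()
X_sl = SliceDataset(ds, idx=0)  # behaves like X for sklearn
y_sl = SliceDataset(ds, idx=1)  # behaves like y for sklearn

params = {
    "lr": [0.05, 0.1],
    "module__num_units": [10, 20],
}
gs = GridSearchCV(net, params, cv=3, scoring="accuracy", refit=False)
gs.fit(X_sl, y_sl)
print(gs.best_params_, gs.best_score_)

Note that the skorch net itself never sees the whole dataset at once; GridSearchCV only slices the SliceDataset wrappers, and batches are loaded lazily during training.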
Thanks, I managed to make it work using SliceDataset.
Hey, how do you perform the grid search? Please guide me.
Hi @Thabang_Lukhetho, I tried following the solution provided above but it didn’t work for me. Can I ask how you managed to make it work using SliceDataset?
Can you post the error message or problem you are getting?
Hi @Thabang_Lukhetho, many thanks for your reply.
I have actually posted my question in the PyTorch Forums but haven’t found a solution yet. You can see my code along with the error message here: How to use PyTorch's DataLoader together with skorch's GridSearchCV
@Muhammad_Izaz, have a look at this tutorial; I also followed it,