Save torch model in Databricks Cluster

Hi. I’m using a Databricks CPU cluster to train my NN in PyTorch. The problem is that when I try to save my model, I can only save it to the cluster’s local storage (which is inaccessible), and when I restart the cluster I lose my saved model.
How could I save it to a local path (Windows)? If that is not possible, could I save the file in a table-like format and then convert it back into a PyTorch model?

Thanks 🙂

I’m not sure I understand this sentence correctly.
Are you able to access some folders on this machine?
If so, you should be able to store the state_dict there (similar to your script files).
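For reference, the usual state_dict round trip looks like this (a minimal sketch; MyModel stands in for your actual model class):

import torch

# Save only the parameters/buffers, not the whole module object.
torch.save(model.state_dict(), "checkpoint.pt")

# Later: rebuild the same architecture, then restore the weights.
model = MyModel()  # hypothetical: your model class
model.load_state_dict(torch.load("checkpoint.pt"))
model.eval()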

I can access some folders on the cluster, but the model cannot be saved to those paths. It seems like I’m saving the .pt file into an inaccessible path.

Where are you storing the script files? Could you save the state_dict to the same location?

Any path I try other than the bare filename ('filename.pt') returns this error:

FileNotFoundError: [Errno 2] No such file or directory:

Are you getting this error from torch.save(model.state_dict(), 'filename.pt')?

This gets saved correctly and can be loaded, but it disappears when the cluster is restarted:

torch.save(model.state_dict(), 'filename.pt')

And this returns the error mentioned before:

torch.save(model.state_dict(), 'dbfs:/existingfilepath/filename.pt')
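(Side note, as an assumption worth checking: torch.save writes through ordinary Python file I/O, which does not understand the dbfs:/ URI scheme, hence the FileNotFoundError. Databricks typically also exposes DBFS through a FUSE mount at /dbfs/, so a POSIX-style path may work instead:)

torch.save(model.state_dict(), '/dbfs/existingfilepath/filename.pt')  # /dbfs/ FUSE path; assumes the directory exists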

So the files disappear after a restart?
This seems to be a Databricks issue and I’m unfortunately not familiar with their cluster.
Could you add some persistent user folders where you could store data, or maybe mount another drive to the machine?
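If writing straight to the /dbfs/ FUSE mount fails on your runtime, a common workaround (a sketch, assuming a Databricks notebook where the dbutils object is available) is to save to the driver’s local disk first and then copy the file to DBFS, which survives restarts:

import torch

# 1) Save to the driver's ephemeral local disk (always writable).
local_path = '/tmp/filename.pt'
torch.save(model.state_dict(), local_path)

# 2) Copy to DBFS so the checkpoint persists across cluster restarts.
#    dbutils is the utilities object available in Databricks notebooks.
dbutils.fs.cp(f'file:{local_path}', 'dbfs:/existingfilepath/filename.pt')

To load it later, copy the file back the other way (or read it directly from the /dbfs/ path) before calling torch.load.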