Error on torch.load() (PytorchStreamReader failed)

Pritesh_Gohil · September 3, 2020, 7:58pm

Hi,
I was trying to load a PyTorch model but ran into an unexpected error. I did not change the folder structure, and I am still getting this error.

>>> torch.load("outputs/test_validation_loss_logging/model_001650.pth", map_location="cpu")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/pritesh/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/serialization.py", line 586, in load
    with _open_zipfile_reader(f) as opened_zipfile:
  File "/home/pritesh/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/serialization.py", line 246, in __init__
    super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: [enforce fail at inline_container.cc:143] . PytorchStreamReader failed reading zip archive: failed finding central directory

Linux OS
Python3.7
torch version 1.5.0

Check how the model is saved here


ptrblck · September 7, 2020, 7:26am

Were you saving the complete model via torch.save(model, path) instead of the state_dict?
If that’s the case, did you change any files or folder structure?

Pritesh_Gohil · September 9, 2020, 10:30am

I saved the model using torch.save(model, path), and the folder and file structure has not changed.

Pritesh_Gohil · September 9, 2020, 2:13pm

Ok, I’m able to load the model. The problem was with the saved weight file. It wasn’t saved properly and the weight file size was smaller (only 90 MB instead of 200 MB).
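A truncated file like this can be spotted before calling torch.load. The sketch below assumes the checkpoint was written in the zip-based format that the traceback's _open_zipfile_reader points to. The names checkpoint_status and ZIP_MAGIC are hypothetical helpers for illustration, not part of PyTorch.

```python
import zipfile

# Local file header signature found at the start of a zip archive.
ZIP_MAGIC = b"PK\x03\x04"


def checkpoint_status(path):
    """Classify a checkpoint file without importing torch.

    Returns "ok" if the zip central directory is found, "truncated" if the
    file starts like a zip archive but the central directory is missing,
    and "not_zip" otherwise (for example a legacy, non-zip checkpoint).
    """
    with open(path, "rb") as f:
        starts_like_zip = f.read(4) == ZIP_MAGIC
    # is_zipfile looks for the end-of-central-directory record, which is
    # written last, so an interrupted save usually fails this check.
    if zipfile.is_zipfile(path):
        return "ok"
    return "truncated" if starts_like_zip else "not_zip"
```

A "truncated" result matches the "failed finding central directory" message. An "ok" result only means the archive was closed properly; it does not prove that the tensors inside are intact.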

Tupakula_Mallikarjun · January 15, 2021, 5:24am

Hi @ptrblck @Pritesh_Gohil. I am also facing the same problem. I have a downloaded model. How can I load it?

ptrblck · January 15, 2021, 9:00am

Could you explain your use case and the issue you are seeing a bit more?
Usually you could create the model instance and load the downloaded state_dict via model.load_state_dict.
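A minimal sketch of that workflow, assuming a toy architecture: TinyNet is a hypothetical stand-in for whatever model class matches the downloaded weights, and an in-memory buffer stands in for the downloaded file.

```python
import io

import torch
from torch import nn


class TinyNet(nn.Module):
    """Hypothetical model; the real class must match the saved state_dict."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)


# Save only the parameters (the state_dict), not the whole model object.
model = TinyNet()
buffer = io.BytesIO()  # stands in for a file path such as "model.pth"
torch.save(model.state_dict(), buffer)
buffer.seek(0)

# Loading: create the model instance first, then load the state_dict into it.
restored = TinyNet()
state_dict = torch.load(buffer, map_location="cpu")
restored.load_state_dict(state_dict)
restored.eval()  # switch to inference mode before evaluating
```

With a real download, the buffer would be replaced by the path to the file. If that file is incomplete, torch.load fails before load_state_dict is ever reached.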

Tupakula_Mallikarjun · January 16, 2021, 5:25am

Hey @ptrblck. I solved the error. Now I am getting this error: All bounding boxes should have positive height and width. Found invalid box [380.0525207519531, 247.53013610839844, 380.0525207519531, 249.6278533935547] for the target at index 3. Do you have any idea about this?

ptrblck · January 16, 2021, 8:22am

I’m not sure how the bounding box coordinates are encoded, but if they are stored as [x0, y0, x1, y1], the width would be zero, since x1 - x0 = 0.
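Assuming the [x0, y0, x1, y1] encoding, a quick check like the following can list the offending targets before training. The helper name degenerate_boxes is hypothetical, and the first box is made up for contrast; the second is the one from the error message.

```python
def degenerate_boxes(boxes):
    """Return indices of [x0, y0, x1, y1] boxes with non-positive width or height."""
    bad = []
    for i, (x0, y0, x1, y1) in enumerate(boxes):
        if x1 - x0 <= 0 or y1 - y0 <= 0:
            bad.append(i)
    return bad


boxes = [
    [10.0, 20.0, 50.0, 80.0],  # made-up valid box
    # Box from the error message: x0 == x1, so the width is zero.
    [380.0525207519531, 247.53013610839844, 380.0525207519531, 249.6278533935547],
]
print(degenerate_boxes(boxes))  # prints [1]
```

Whether to drop such boxes or fix the annotations depends on how they were produced.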

HaukurPall · January 19, 2021, 10:20am

I am also having this issue.

Have not changed the directory structure / code.
I am able to load the model in certain cases, see below.

I use torch.save to distribute my model more easily to users. I attach some vocabulary mappings and tokenizers to the Modules so I don’t have to load them separately.
When training models which have smaller vocabulary mappings I am able to load the model. When using larger vocabularies (for a release) I am unable to load the model and I get this error. I am using PyTorch version 1.7.1 and Python 3.8.6.

HaukurPall · January 21, 2021, 10:08am

I was able to resolve my issue.

I noticed that the files for the larger models were smaller than expected. Then I checked the training logs and saw that the run had crashed with an OOM error while saving the model.
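One way to keep a crash during saving from leaving a half-written file at the final path is to write to a temporary file and rename it only on success. This is a sketch of that general pattern, not something from this thread. The helper atomic_write is hypothetical, and the torch.save call in the comment relies on torch.save accepting a file-like object.

```python
import os
import tempfile


def atomic_write(path, write_fn):
    """Call write_fn(file_obj) on a temp file, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the target directory so the rename stays
    # on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            write_fn(f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, path)  # replaces any existing file in one step
    except BaseException:
        # Any previous file at `path` is untouched; remove the partial temp file.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


# Possible use with PyTorch (assumes `model` exists):
# atomic_write("model.pth", lambda f: torch.save(model.state_dict(), f))
```

If the process is killed outright, a stray .tmp file may remain, but the file at the final path is either the previous complete checkpoint or absent.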

eslam_fouda · February 10, 2021, 10:33am

shouldn’t one use model.load_weights(opt.weights_path)?

Udith_Haputhanthri · March 28, 2021, 6:37pm

Hi, I solved this error by creating a new .pth file from the content of the existing .pth file, as explained here. Even though the newly created .pth file has exactly the same content as the previous one, it could be loaded without errors.

NickYi1990 · May 11, 2021, 3:12am

lol, same for me, thanks!

Jianquan_Zhao · October 18, 2021, 9:13am

I agree with you.
There are two ways to save and load a model.
The first is to save the whole model, in which case it must be loaded with the same folder structure in place. The second is to save only its weights (the state_dict), in which case we must instantiate the model and then load the state_dict into it.

aktaseren · November 4, 2021, 10:46am

Hi @Pritesh_Gohil, @HaukurPall, sorry for tagging you both here. It seems that you solved the issue I am also having. Some of you said that the weight file wasn’t saved properly. I am sure that my weight file was saved properly; its size is around 300 MB.

I tried both saving options below, with both file extensions:

        # torch.save(model.state_dict(), '/home/aktaseren/people-torch/pidxx.pt')
        # torch.save(model, '/home/aktaseren/people-torch/pidxx.pth')

However, I am not able to load the file produced by either of these. I get the same error you got. My main goal is to convert a PyTorch model to ONNX in order to use it with OpenCV.

I would appreciate it if any of you could explain in detail how you solved this.

G123 · January 5, 2022, 8:33pm

Hi! New member here! I am following this code: How to Train YOLO v5 on a Custom Dataset | Paperspace Blog
but when I tried this command: python train.py --img 640 --cfg yolov5s.yaml --hyp hyp.scratch.yaml --batch 32 --epochs 100 --data road_sign_data.yaml --weights yolov5s.pt --workers 24 --name yolo_road_det

I got this error:

  File "/home/UbuntuUser/.local/lib/python3.8/site-packages/torch/serialization.py", line 242, in __init__
    super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: [enforce fail at inline_container.cc:145] . PytorchStreamReader failed reading zip archive: failed finding central directory

Any idea, what is wrong and how to fix it?

G123 · January 5, 2022, 10:02pm

How do you save the weight? Where?

enterthevoidf22 · January 25, 2022, 9:28am

I am getting the same error when trying to load a simple torch tensor (.pt). It didn’t happen in the past and only started after I ran a preprocessing step on all the tensors in my on-disk dataset.

My code is simple and goes like this:

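import os
import numpy as np
import torch

def preprocess(scan, mask):
    # Apply the mask, then cast back to the scan's original dtype.
    masked = scan * torch.from_numpy(mask)
    return masked.to(dtype=scan.dtype)

mask = np.load(path_to_mask)  # the preprocessing is a simple mask application
for name in os.listdir(path_to_all_pt_files):
    # os.listdir returns bare file names, so join them with the directory.
    path = os.path.join(path_to_all_pt_files, name)
    scan = torch.load(path)
    scan = preprocess(scan, mask)
    torch.save(scan, path)  # overwrites the input file in place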

What could be the cause?

lzq1477960451 · February 4, 2022, 9:20am

Thank you. I had the same problem as you, and it has been solved. The last time the parameters were saved, the process was forcibly interrupted, which left the .pth file incomplete.

desmond-rn · August 8, 2022, 5:15pm

It could be a problem with the PyTorch version and saving mechanism.
I had the same problem and solved it by passing the kwarg _use_new_zipfile_serialization=False when saving the model.
More details here.
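A small sketch of what that kwarg changes, using in-memory buffers instead of files. Note that _use_new_zipfile_serialization is an underscore-prefixed, private argument of torch.save, so its availability in every PyTorch version is an assumption here. It switches the file format only and would not repair a save that was interrupted.

```python
import io
import zipfile

import torch

tensor = torch.arange(6, dtype=torch.float32).reshape(2, 3)

# Default in PyTorch >= 1.6: the checkpoint is written as a zip archive.
zip_buffer = io.BytesIO()
torch.save(tensor, zip_buffer)

# With the kwarg from the post, the legacy (non-zip) format is written instead.
legacy_buffer = io.BytesIO()
torch.save(tensor, legacy_buffer, _use_new_zipfile_serialization=False)

# Compare the two containers with the standard library's zip check.
print("default is zip:", zipfile.is_zipfile(zip_buffer))
print("legacy is zip:", zipfile.is_zipfile(legacy_buffer))

# The legacy file is read back through the usual torch.load call.
legacy_buffer.seek(0)
restored = torch.load(legacy_buffer, map_location="cpu")
```

Because the legacy format has no zip central directory, loading it does not go through the zip reader that raises the error in this thread.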