MNIST server down

skynet10100 · March 11, 2021, 9:25am

Hello everyone,
can someone confirm that the server for downloading the MNIST dataset is down? I cannot access the dataset through the dataloader. The following message is printed:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/opt/conda/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/app/task_listener.py", line 67, in do_work
    response = eval_fitness(arch,args)
  File "/app/fitness_evaluator.py", line 101, in eval_fitness
    train_dataloader, valid_dataloader = get_train_valid_dataloaders(args,input,kfold,0)
  File "/app/utils/dataloader_provider.py", line 354, in get_train_valid_dataloaders
    train_ds = INPUT_FUNCTOR[key](root=path.join(".",input), test=False, train=True,download=True,transform=None,kfolds=k, current_fold=current_fold,key=key)
  File "/app/utils/dataloader_provider.py", line 37, in __init__
    super().__init__(root, train=(not test), transform=transform, target_transform=target_transform, download=download)
  File "/opt/conda/lib/python3.8/site-packages/torchvision/datasets/mnist.py", line 79, in __init__
    self.download()
  File "/opt/conda/lib/python3.8/site-packages/torchvision/datasets/mnist.py", line 146, in download
    download_and_extract_archive(url, download_root=self.raw_folder, filename=filename, md5=md5)
  File "/opt/conda/lib/python3.8/site-packages/torchvision/datasets/utils.py", line 256, in download_and_extract_archive
    download_url(url, download_root, filename, md5)
  File "/opt/conda/lib/python3.8/site-packages/torchvision/datasets/utils.py", line 84, in download_url
    raise e
  File "/opt/conda/lib/python3.8/site-packages/torchvision/datasets/utils.py", line 70, in download_url
    urllib.request.urlretrieve(
  File "/opt/conda/lib/python3.8/urllib/request.py", line 247, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/opt/conda/lib/python3.8/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/opt/conda/lib/python3.8/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/opt/conda/lib/python3.8/urllib/request.py", line 640, in http_response
    response = self.parent.error(
  File "/opt/conda/lib/python3.8/urllib/request.py", line 569, in error
    return self._call_chain(*args)
  File "/opt/conda/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/opt/conda/lib/python3.8/urllib/request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 503: Service Unavailable
0it [00:00, ?it/s]

Downloading from my browser does not work either.

Is anyone else facing this problem? Is there any information on when the server will be up again?

Best Regards

ptrblck · March 11, 2021, 9:39am

The server does indeed seem to be down.
CC @seemethere

EDIT: it also seems that downloading (some of) the MNIST files works after a few retries.
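
Since a few retries sometimes get through, the retrying can be automated. This is only a minimal sketch: `retry` is a hypothetical helper, not part of torchvision, and the attempt count and delay are arbitrary assumptions. It relies on the fact that urllib's `HTTPError` (the exception in the traceback above) is a subclass of `OSError`.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(
    action: Callable[[], T],
    attempts: int = 5,
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `action` until it succeeds or `attempts` calls have failed."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except OSError:  # urllib.error.HTTPError and URLError derive from OSError
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(delay * attempt)  # wait a little longer after each failure


# Hypothetical usage (requires torchvision and network access):
# from torchvision.datasets import MNIST
# dataset = retry(lambda: MNIST(root=".", download=True))
```

The `sleep` parameter exists only so the waiting can be replaced in tests; in normal use the default is fine.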

Jinhyun_Park · March 11, 2021, 4:35pm

I have also been having the same problem since last night. @skynet10100

Does anyone know when it will be fixed?

ptrblck · March 12, 2021, 7:43am

The issue is being tracked here.

Shravya_Kuldeep · March 17, 2021, 3:41am

It seems the issue isn’t fixed yet, even though the GitHub issue is closed.

crcrpar · March 17, 2021, 3:48am

Do you mean you can’t download MNIST with the master branch of torchvision?

Shravya_Kuldeep · March 17, 2021, 9:03am

I am able to download it now.

LuoXin-s · March 17, 2021, 10:26am

I still cannot download it.

crcrpar · March 17, 2021, 10:43am

If your torchvision version is 0.9.0, the current stable release, being unable to download MNIST is (unfortunately) expected. With a nightly build, it is not expected.

skynet10100 · March 17, 2021, 11:52am

@crcrpar indeed, it seems to be fixed in the nightly version by an alternative download location. Thanks a lot for the hint!

Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Failed to download (trying next):
HTTP Error 503: Service Unavailable

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz to ./MNIST/raw/train-images-idx3-ubyte.gz
9913344it [00:01, 5985275.67it/s]
Extracting ./MNIST/raw/train-images-idx3-ubyte.gz to ./MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Failed to download (trying next):
HTTP Error 503: Service Unavailable

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-labels-idx1-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-labels-idx1-ubyte.gz to ./MNIST/raw/train-labels-idx1-ubyte.gz
29696it [00:00, 182824.66it/s]
Extracting ./MNIST/raw/train-labels-idx1-ubyte.gz to ./MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Failed to download (trying next):
HTTP Error 503: Service Unavailable

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-images-idx3-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-images-idx3-ubyte.gz to ./MNIST/raw/t10k-images-idx3-ubyte.gz
1649664it [00:01, 1648913.12it/s]
Extracting ./MNIST/raw/t10k-images-idx3-ubyte.gz to ./MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Failed to download (trying next):
HTTP Error 503: Service Unavailable

Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-labels-idx1-ubyte.gz
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-labels-idx1-ubyte.gz to ./MNIST/raw/t10k-labels-idx1-ubyte.gz
5120it [00:00, 2691419.54it/s]
Extracting ./MNIST/raw/t10k-labels-idx1-ubyte.gz to ./MNIST/raw

Processing...
/home/klos/anaconda3/lib/python3.7/site-packages/torchvision/datasets/mnist.py:502: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at  /pytorch/torch/csrc/utils/tensor_numpy.cpp:179.)
  return torch.from_numpy(parsed.astype(m[2], copy=False)).view(*s)
Done!
Dataset MNIST
    Number of datapoints: 60000
    Root location: .
    Split: Train
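
The fallback visible in the log above (try the original server, then the S3 mirror) can be sketched roughly as follows. This is not torchvision's actual code: `download_with_fallback` and its `fetch` parameter are hypothetical names, and only the two base URLs are taken from the log.

```python
import urllib.request
from typing import Callable, Sequence

# Base URLs as they appear in the log output above.
MIRRORS = (
    "http://yann.lecun.com/exdb/mnist/",
    "https://ossci-datasets.s3.amazonaws.com/mnist/",
)


def download_with_fallback(
    filename: str,
    destination: str,
    mirrors: Sequence[str] = MIRRORS,
    fetch: Callable[[str, str], object] = urllib.request.urlretrieve,
) -> str:
    """Try each mirror in order and return the URL that worked."""
    errors = []
    for mirror in mirrors:
        url = mirror + filename
        try:
            fetch(url, destination)  # saves the file at `destination`
            return url
        except OSError as error:  # HTTPError/URLError derive from OSError
            print(f"Failed to download {url} (trying next): {error}")
            errors.append(error)
    raise RuntimeError(f"Could not download {filename} from any mirror: {errors}")


# Hypothetical usage (requires network access):
# download_with_fallback("train-images-idx3-ubyte.gz", "train-images-idx3-ubyte.gz")
```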

Aryan_Jha · March 18, 2021, 3:13am

Hi @crcrpar, can you suggest a workaround (if there is one) for downloading the MNIST dataset with torchvision 0.9.0? It would really help.

Thanks in advance.

crcrpar · March 18, 2021, 3:52am

One (cumbersome) workaround would be:

1. Manually download the MNIST files from Yann LeCun's homepage.
2. Manually copy the processing code from the torchvision source and use it to write the results to training.pt and test.pt.
3. Disable the download option of MNIST (the torchvision dataset class) and point it at the directory that contains the files created above.

If training.pt and test.pt are not in the data directory, torchvision v0.9 tries to download the MNIST files from Yann LeCun's server, which fails with high probability (you can check this behavior in vision/mnist.py at 240792d446cfa65141a76b4bac0b0ecbf15aacc0 · pytorch/vision · GitHub).
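
To give an idea of what the processing step has to parse, here is a standard-library sketch of reading one of the gzip-compressed IDX files. It is not torchvision's reader (which returns tensors); `read_idx` is a hypothetical helper that returns the shape and the raw bytes, and it assumes unsigned-byte data, which is what the four MNIST files contain.

```python
import gzip
import struct
from typing import Tuple


def read_idx(path: str) -> Tuple[Tuple[int, ...], bytes]:
    """Read a gzip-compressed IDX file of unsigned bytes.

    Layout: two zero bytes, one type byte (0x08 = unsigned byte),
    one byte with the number of dimensions, then one big-endian
    32-bit size per dimension, then the data itself.
    """
    with gzip.open(path, "rb") as f:
        zero, type_code, ndim = struct.unpack(">HBB", f.read(4))
        if zero != 0 or type_code != 0x08:
            raise ValueError(f"{path} is not an unsigned-byte IDX file")
        shape = struct.unpack(">" + "I" * ndim, f.read(4 * ndim))
        data = f.read()

    expected = 1
    for size in shape:
        expected *= size
    if len(data) != expected:
        raise ValueError(f"expected {expected} bytes of data, found {len(data)}")
    return shape, data


# Hypothetical usage, assuming the file was downloaded manually:
# shape, pixels = read_idx("train-images-idx3-ubyte.gz")
```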

skynet10100 · March 18, 2021, 9:10am

If the download from Yann LeCun's website is not available, you can try the alternative download links:

train-images-idx3-ubyte.gz
train-labels-idx1-ubyte.gz
t10k-images-idx3-ubyte.gz
t10k-labels-idx1-ubyte.gz

tjrym · March 18, 2021, 10:32am

1. Manually download the MNIST files from Yann LeCun's homepage.
2. Replace the MNIST class with the code below (class MNIST_local). It works for me.

import os
from typing import Callable, Optional

import torch
from torchvision.datasets import MNIST
from torchvision.datasets.mnist import read_image_file, read_label_file
from torchvision.datasets.utils import extract_archive

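def load_and_extract_archive(
    source_root: str,
    target_root: str,
    filename: Optional[str] = None,
    md5: Optional[str] = None,
    remove_finished: bool = False,
) -> None:
    """Extract a local archive instead of downloading it.

    `md5` is accepted so the call mirrors torchvision's
    download_and_extract_archive, but it is not checked here.
    """
    archive = os.path.join(source_root, filename)
    print("Extracting {} to {}".format(archive, target_root))
    extract_archive(archive, target_root, remove_finished)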



class MNIST_local(MNIST):
    """MNIST dataset built from manually downloaded archives.

    folder:
        directory that contains the files below:
            train-images-idx3-ubyte.gz
            train-labels-idx1-ubyte.gz
            t10k-images-idx3-ubyte.gz
            t10k-labels-idx1-ubyte.gz

    root:
        the same as MNIST.root
    """
    def __init__(
            self,
            folder: str,
            root:str,
            train: bool = True,
            transform: Optional[Callable] = None,
            target_transform: Optional[Callable] = None,
    ) -> None:

        # Deliberately skip MNIST.__init__ and call its parent's __init__ directly.
        super(MNIST, self).__init__(root, transform=transform,
                                    target_transform=target_transform)
        self.train = train  # training set or test set

        self.load(folder)

        if not self._check_exists():
            raise RuntimeError('Dataset not found.' +
                               ' Check that `folder` contains the four .gz files.')

        if self.train:
            data_file = self.training_file
        else:
            data_file = self.test_file
        self.data, self.targets = torch.load(os.path.join(self.processed_folder, data_file))

    @property
    def raw_folder(self) -> str:
        return os.path.join(self.root, MNIST.__name__, 'raw')

    @property
    def processed_folder(self) -> str:
        return os.path.join(self.root, MNIST.__name__, 'processed')

    def load(self, folder):
        # Nothing to do if the processed files already exist.
        if self._check_exists():
            return
        os.makedirs(self.raw_folder, exist_ok=True)
        os.makedirs(self.processed_folder, exist_ok=True)

        for url, md5 in self.resources:
            filename = url.rpartition('/')[2]
            # Extract from the local folder instead of downloading from `url`.
            load_and_extract_archive(source_root=folder, target_root=self.raw_folder, filename=filename, md5=md5)

        # process and save as torch files
        print('Processing...')

        training_set = (
            read_image_file(os.path.join(self.raw_folder, 'train-images-idx3-ubyte')),
            read_label_file(os.path.join(self.raw_folder, 'train-labels-idx1-ubyte'))
        )
        test_set = (
            read_image_file(os.path.join(self.raw_folder, 't10k-images-idx3-ubyte')),
            read_label_file(os.path.join(self.raw_folder, 't10k-labels-idx1-ubyte'))
        )
        with open(os.path.join(self.processed_folder, self.training_file), 'wb') as f:
            torch.save(training_set, f)
        with open(os.path.join(self.processed_folder, self.test_file), 'wb') as f:
            torch.save(test_set, f)

        print('Done!')

Rayhanul_Rumel · March 25, 2021, 1:31pm

I downloaded all the files shared by @skynet10100 and followed these steps:

Uploaded them to my Google Drive, inside the “MNIST” folder.
Mounted the drive:

import os
import sys
from google.colab import drive

drive.mount('/content/gdrive/')
data_file_location = "/content/gdrive/MyDrive/MNIST/"
os.chdir(data_file_location)

Declared the path/source for MNIST:

mnist_data = torchvision.datasets.MNIST('/content/gdrive/MyDrive/MNIST/')

Created the dataloader (the train loader in particular):

dl = torch.utils.data.DataLoader(mnist_data, batch_size=16, shuffle=False)

The only thing I was not able to do is use a transform. If I am not mistaken, torch.utils.data.DataLoader does not accept a transform argument.

If anyone knows how to do this, please let me know. It would be a great help.
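
For what it is worth, the transform belongs to the dataset rather than to the DataLoader: the MNIST constructor takes a `transform` argument (it appears in the `__init__` signature of the MNIST_local code earlier in this thread), so something like `torchvision.datasets.MNIST(root, transform=torchvision.transforms.ToTensor())` should cover this case. The sketch below only illustrates the idea with a hypothetical wrapper, `TransformedDataset`, for a dataset that was already created without a transform.

```python
from typing import Any, Callable, Sequence, Tuple


class TransformedDataset:
    """Map-style dataset wrapper that applies `transform` to each input.

    `base` is assumed to yield (image, target) pairs when indexed.
    """

    def __init__(self, base: Sequence[Tuple[Any, Any]], transform: Callable[[Any], Any]) -> None:
        self.base = base
        self.transform = transform

    def __len__(self) -> int:
        return len(self.base)

    def __getitem__(self, index: int) -> Tuple[Any, Any]:
        image, target = self.base[index]
        # The transform runs per sample, at the moment the sample is read.
        return self.transform(image), target


# Hypothetical usage with the objects from the post above:
# wrapped = TransformedDataset(mnist_data, torchvision.transforms.ToTensor())
# dl = torch.utils.data.DataLoader(wrapped, batch_size=16, shuffle=False)
```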

alierkan · March 25, 2021, 2:16pm

Edit mnist.py under …/torchvision/datasets and replace "http://yann.lecun.com/exdb/" with "https://ossci-datasets.s3.amazonaws.com/" in the resources list of class MNIST. torchvision then downloads the datasets.
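
The same substitution can be tried without editing the installed file. This is a sketch under assumptions: `swap_mirror` is a hypothetical helper, and it expects `resources` to be a list of (url, md5) pairs, as the `for url, md5 in self.resources` loop earlier in this thread suggests for torchvision 0.9.0. Other releases may store the resources differently.

```python
from typing import List, Tuple

OLD_PREFIX = "http://yann.lecun.com/exdb/mnist/"
NEW_PREFIX = "https://ossci-datasets.s3.amazonaws.com/mnist/"


def swap_mirror(
    resources: List[Tuple[str, str]],
    old: str = OLD_PREFIX,
    new: str = NEW_PREFIX,
) -> List[Tuple[str, str]]:
    """Return a copy of `resources` with `old` URL prefixes replaced by `new`."""
    swapped = []
    for url, md5 in resources:
        if url.startswith(old):
            url = new + url[len(old):]
        swapped.append((url, md5))  # checksums are left untouched
    return swapped


# Hypothetical usage, before constructing the dataset:
# from torchvision.datasets import MNIST
# MNIST.resources = swap_mirror(MNIST.resources)
# dataset = MNIST(root=".", download=True)
```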

Novin_Nouri · March 7, 2024, 5:28pm

You can download the files from this link
and apply the change manually, following this link on GitHub.