Error in Transformer tutorial building vocab

I’m following the Language Modeling with nn.Transformer and TorchText tutorial (PyTorch Tutorials 1.12.1+cu102 documentation) and am running into an error in the vocab-building portion:

```python
from torchtext.datasets import WikiText2
from torchtext.data.utils import get_tokenizer
from torchtext.vocab import build_vocab_from_iterator

train_iter = WikiText2(split='train')  # datapipe that downloads/caches the dataset
tokenizer = get_tokenizer('basic_english')
vocab = build_vocab_from_iterator(map(tokenizer, train_iter), specials=['<unk>'])
```
```
Exception: OnDiskCache Exception: C:\Users\XXXXX/.cache\torch\text\datasets\WikiText2\wikitext-2-v1.zip expected to be written by different process, but file is not ready in 300 seconds.
This exception is thrown by __iter__ of MapperIterDataPipe(datapipe=UnBatcherIterDataPipe, fn=functools.partial(<function _wait_promise_fn at 0x0000016F75AFF2E0>, 300), input_col=None, output_col=None)
```

I thought it might be a network issue since I’m on a work network, but I was able to download the archive manually from https://s3.amazonaws.com/research.metamind.io/wikitext/wikitext-2-v1.zip.
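
One workaround sketch, assuming the error is caused by a stale lock/promise file left behind by an earlier failed download: wipe the cache entry named in the error message and drop the manually downloaded zip where the datapipe expects it. The cache path is taken from the error above and the Downloads location is an assumption; adjust both for your machine.

```python
import os
import shutil

# Assumed paths, based on the error message above; adjust as needed.
cache_dir = os.path.join(os.path.expanduser("~"), ".cache", "torch", "text",
                         "datasets", "WikiText2")
downloaded_zip = os.path.join(os.path.expanduser("~"), "Downloads",
                              "wikitext-2-v1.zip")  # the manually downloaded file

# Remove the cache entry so no stale promise/lock file from the failed
# download keeps _wait_promise_fn waiting until it times out.
shutil.rmtree(cache_dir, ignore_errors=True)
os.makedirs(cache_dir, exist_ok=True)

# Place the manually downloaded archive where the datapipe expects it,
# so it can skip the (blocked) download and extract the local copy.
shutil.copy(downloaded_zip, os.path.join(cache_dir, "wikitext-2-v1.zip"))
```

With the zip in place (and its checksum matching), the on-disk cache should treat the download as already complete, and the snippet above should run against the local copy instead of hitting the network.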


Hi, were you able to solve this problem? If so, how did you do it?