Note: the torchtext.vocab.build_vocab_from_iterator() call in the Google Colab notebook above is what loads this dataset. (Sorry for not being more specific when describing the problem.)
If you do not want to update the Multi30k URL, you can download the file from the URL above and put the tar.gz file in the torch cache directory. On my machine, that directory is /root/.cache/torch/text/datasets/Multi30k. Copy the tar.gz file into that directory and run the code. PyTorch will uncompress the archive and produce the train.de and train.en files.
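The manual copy step described above could be scripted roughly like this. This is only a sketch: `place_archive_in_cache` is a hypothetical helper name, the default directory is the one reported for my machine, and your torchtext version may use a different cache location.

```python
import os
import shutil

# Cache directory reported in the comment above; an assumption for any other
# machine or torchtext version.
DEFAULT_CACHE_DIR = "/root/.cache/torch/text/datasets/Multi30k"


def place_archive_in_cache(archive_path, cache_dir=DEFAULT_CACHE_DIR):
    """Copy a manually downloaded tar.gz into the dataset cache directory.

    Hypothetical helper: it only copies the file. Extraction is left to the
    library when the dataset is loaded afterwards.
    """
    # Create the cache directory if it does not exist yet.
    os.makedirs(cache_dir, exist_ok=True)
    dest = os.path.join(cache_dir, os.path.basename(archive_path))
    # copy2 keeps the file's timestamps along with its contents.
    shutil.copy2(archive_path, dest)
    return dest
```

Call it with the path of the downloaded tar.gz, then rerun the notebook cell that loads the dataset.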
On Google Colab, when I try to use the archive from the URL above, I get this error in the block with the build_vocabulary() function:
RuntimeError: The computed hash 6d1ca1dba99e2c5dd54cae1226ff11c2551e6ce63527ebb072a1f70f72a5cd36 of /root/.torchtext/cache/Multi30k/mmt16_task1_test.tar.gz does not match the expectedhash 0681be16a532912288a91ddd573594fbdd57c0fbb81486eff7c55247e35326c2. Delete the file manually and retry.
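To check a cached archive against the hash in the message before deleting it, something like the following may help. It is a sketch that assumes the hash is SHA-256 (the 64 hex characters suggest this, but the message does not say so); `sha256_of_file` and `remove_if_hash_mismatch` are hypothetical helper names, not part of torchtext. Note that the path in the error (/root/.torchtext/cache/Multi30k) differs from the cache directory mentioned in the earlier comment, so check which one your setup actually uses.

```python
import hashlib
import os


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large archives are not loaded at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def remove_if_hash_mismatch(path, expected_hash):
    """Delete the file if its hash differs from expected_hash.

    Returns True if the file was deleted, False if the hashes matched.
    Assumes expected_hash is a SHA-256 hex string.
    """
    if sha256_of_file(path) == expected_hash.lower():
        return False
    os.remove(path)
    return True
```

For the error above, `path` would be the mmt16_task1_test.tar.gz file named in the message and `expected_hash` the second hash it prints.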