Problem with Tutorial: "SPEECH COMMAND CLASSIFICATION WITH TORCHAUDIO"

Hi, I am brand new to PyTorch and am working through the tutorials. I am currently on this one: "Speech Command Classification with torchaudio" (PyTorch Tutorials 1.10.0+cu102 documentation).

I am trying to get the model to predict spoken words, but nearly every prediction is wrong: the model outputs "backward" for almost every input. What am I doing wrong? I have trained the model for 20 epochs and also for 30 epochs, and the results are the same. I used Google Colab to run the tutorial:

Google Colab.

My output is the following:

Expected: backward. Predicted: right.
Expected: bed. Predicted: backward.
Expected: bird. Predicted: backward.
Expected: cat. Predicted: backward.
Expected: dog. Predicted: backward.
Expected: down. Predicted: backward.
Expected: eight. Predicted: backward.
Expected: five. Predicted: backward.
Expected: follow. Predicted: backward.
Expected: forward. Predicted: backward.
Expected: four. Predicted: backward.
Expected: go. Predicted: backward.
Expected: happy. Predicted: backward.
Expected: house. Predicted: one.
Expected: learn. Predicted: backward.
Expected: left. Predicted: backward.
Expected: marvin. Predicted: one.
Expected: nine. Predicted: backward.
Expected: no. Predicted: backward.
Expected: off. Predicted: backward.
Expected: on. Predicted: backward.
Expected: one. Predicted: backward.
Expected: right. Predicted: backward.
Expected: seven. Predicted: backward.
Expected: sheila. Predicted: backward.
Expected: six. Predicted: backward.
Expected: stop. Predicted: backward.
Expected: three. Predicted: backward.
Expected: tree. Predicted: backward.
Expected: two. Predicted: backward.
Expected: up. Predicted: backward.
Expected: visual. Predicted: backward.
Expected: wow. Predicted: backward.
Expected: yes. Predicted: backward.
Expected: zero. Predicted: backward.
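
In case it helps with diagnosing this, here is a minimal sketch in plain Python (not part of the tutorial) that tallies the predictions in output like the above, to show how strongly the model has collapsed onto one class. The names `LINE_RE` and `summarize` are my own, and the sample below is only a few lines copied from the output above.

```python
import re
from collections import Counter

# Matches lines in the "Expected: <label>. Predicted: <label>." format shown above.
LINE_RE = re.compile(r"Expected: (\w+)\. Predicted: (\w+)\.")


def summarize(log_text):
    """Return (prediction counts, number correct, total) for pasted output."""
    pairs = LINE_RE.findall(log_text)
    # How often each label was predicted, regardless of the expected label.
    counts = Counter(predicted for _, predicted in pairs)
    # How many lines have matching expected and predicted labels.
    correct = sum(expected == predicted for expected, predicted in pairs)
    return counts, correct, len(pairs)


# A few lines taken from the output above; paste the full output to summarize all of it.
sample = """\
Expected: backward. Predicted: right.
Expected: bed. Predicted: backward.
Expected: bird. Predicted: backward.
Expected: house. Predicted: one.
"""

counts, correct, total = summarize(sample)
print(counts.most_common())
print(f"{correct}/{total} correct")
```

Counting the full output above by hand gives the same picture: "backward" is predicted 32 times, "one" twice, and "right" once, with 0 of 35 predictions correct.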