Training Trigram Language model gives indexing error

Aabid_Karim · September 11, 2024, 7:12pm

I am training a trigram model, and I have created the two dictionaries below to assign an index to each character.

stoi = {s: i+1 for i, s in enumerate(chars)}
stoi['.'] = 0
itos = {i:s for s,i in stoi.items()}

I have a 3D torch tensor, N, and I converted all of its entries to probabilities as shown below.

P = (N+2).float()
P = P / P.sum(1, keepdims = True)

I didn’t violate any broadcasting rules, and P is a 3D tensor.
Now I am using torch.multinomial to draw samples from my probability distribution with the code below.

g = torch.Generator().manual_seed(2147483647)

for i in range(10):
    out = []
    ix = 0  
    while True:
        p = P[ix]
        ix = torch.multinomial(p, num_samples=1, replacement=True, generator=g)
        out.append(itos[ix])  
        if ix == 0:  
            break
    print(''.join(out))

But the above code does not draw samples from the probability distribution; instead, it gives me the output below.

KeyError Traceback (most recent call last)
in <cell line: 3>()
7 p = P[ix]
8 ix = torch.multinomial(p, num_samples=1, replacement=True, generator=g)
----> 9 out.append(itos[ix])
10 if ix == 0:
11 break

KeyError: tensor([[10],
        [14],
        [18],
        [10],
        [ 1],
        [14],
        [ 0],
        [17],
        [ 5],
        [15],
        [ 0],
        [25],
        [ 0],
        [ 3],
        [ 1],
        [ 2],
        [ 5],
        [16],
        [ 5],
        [25],
        [ 8],
        [ 0],
        [17],
        [ 8],
        [24],
        [ 1],
        [25]])

Can someone explain where the indexing error is, and why the itos dictionary doesn’t get a proper index from the probability distribution?
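
A quick shape check may help narrow this down. The sketch below uses random counts as a hypothetical stand-in for N (the real counts are not shown in the thread) and repeats the same normalization and indexing as the post. It suggests that P[ix] with a single index on a 3D tensor is still 2D, so torch.multinomial returns one sample per row rather than a single index, and that whole tensor is what gets passed to itos.

```python
import torch

# Hypothetical stand-in for the count tensor N from the post (random counts).
g = torch.Generator().manual_seed(0)
N = torch.randint(0, 5, (27, 27, 27), generator=g)

# Same normalization as in the post.
P = (N + 2).float()
P = P / P.sum(1, keepdim=True)

# A single index on a 3D tensor leaves a 2D slice.
p = P[0]

# With a 2D input, multinomial draws num_samples values for every row.
ix = torch.multinomial(p, num_samples=1, replacement=True, generator=g)

print(p.shape)   # torch.Size([27, 27])
print(ix.shape)  # torch.Size([27, 1])
```

The (27, 1) shape matches the 27-row tensor shown in the KeyError above.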

ptrblck · September 11, 2024, 8:18pm

Could you post the size of itos as well as how P (or N) was created?

Aabid_Karim · September 12, 2024, 4:11am

There are 26 letters in total, and itos (integer to string) is a dictionary that maps each index to its character. The code is below.

N = torch.zeros((27,27,27), dtype = torch.int32)
chars = sorted(list(set(''.join(words))))
stoi = {s: i+1 for i, s in enumerate(chars)}
stoi['.'] = 0
itos = {i:s for s,i in stoi.items()}
itos

itos:
{1: 'a',
 2: 'b',
 3: 'c',
 4: 'd',
 5: 'e',
 6: 'f',
 7: 'g',
 8: 'h',
 9: 'i',
 10: 'j',
 11: 'k',
 12: 'l',
 13: 'm',
 14: 'n',
 15: 'o',
 16: 'p',
 17: 'q',
 18: 'r',
 19: 's',
 20: 't',
 21: 'u',
 22: 'v',
 23: 'w',
 24: 'x',
 25: 'y',
 26: 'z',
 0: '.'}
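
Given that N has shape (27, 27, 27), one possible way to sample is to index P with two context characters, so that p is 1D, and to convert the sampled tensor to a Python int with .item() before the dictionary lookup. The sketch below is only an illustration under stated assumptions: the counts are random placeholders, N is assumed to be indexed as N[first, second, next], the normalization is therefore taken over the last axis (not axis 1 as in the post), and each name is assumed to start from a '.', '.' context.

```python
import torch

# Hypothetical counts; in the thread, N would be filled from the word list.
g = torch.Generator().manual_seed(2147483647)
N = torch.randint(0, 5, (27, 27, 27), generator=g)

chars = "abcdefghijklmnopqrstuvwxyz"
stoi = {s: i + 1 for i, s in enumerate(chars)}
stoi['.'] = 0
itos = {i: s for s, i in stoi.items()}

# Assumption: the last axis holds the next character, so normalize over it.
# P[a, b] is then a distribution over the character following (a, b).
P = (N + 2).float()
P = P / P.sum(dim=2, keepdim=True)

names = []
for _ in range(10):
    out = []
    ix1, ix2 = 0, 0  # assumed start context: two '.' tokens
    while True:
        p = P[ix1, ix2]  # 1D tensor of shape (27,)
        # .item() turns the one-element tensor into a plain int usable as a dict key
        ix = torch.multinomial(p, num_samples=1, replacement=True, generator=g).item()
        out.append(itos[ix])
        if ix == 0:
            break
        ix1, ix2 = ix2, ix  # slide the two-character context window
    names.append(''.join(out))
    print(names[-1])
```

With random counts the output is gibberish; the point is only that p is 1D and ix is an int, so itos[ix] resolves to a character.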