RuntimeError: invalid argument 4: Index tensor must have same dimensions as input tensor at

Hi,

I am new to this forum and to PyTorch.
Could someone help me with this error?
RuntimeError: invalid argument 4: Index tensor must have same dimensions as input tensor at

def learn(self, batch, gamma):
    """Update value parameters using given batch of experience tuples.

    Params
    ======
        experiences (Tuple[torch.Variable]): tuple of (s, a, r, s', done) tuples 
        gamma (float): discount factor
    """
    
    #states, actions, rewards, next_states, dones = experiences
    states = np.array([each[0][0] for each in batch], ndmin=3)
    actions = np.array([each[0][1] for each in batch])
    rewards = np.array([each[0][2] for each in batch]) 
    next_states = np.array([each[0][3] for each in batch], ndmin=3)

    dones = np.array([each[0][4] for each in batch])
    #dones  = np.array([map(lambda x: 1 if x else 0, dones)],dtype=np.float16)
    dones = dones.astype(np.int16)
    states = torch.from_numpy(states).float().cuda()
    next_states = torch.from_numpy(next_states).float().cuda()
    rewards = torch.from_numpy(rewards).float().cuda()
    actions = torch.from_numpy(actions).long().cuda()
    dones = torch.from_numpy(dones).float().cuda()
    #next_states = torch.from_numpy(next_states).float().unsqueeze(0).cuda()
    #print(states.shape)
    states = states.view(32,8)
    #print(states.shape)
    next_states = next_states.view(32,8)
    ## TODO: compute and minimize the loss
    "*** YOUR CODE HERE ***"
    Q_targets_next = self.qnetwork_target(next_states).detach().max(1)[0].unsqueeze(1)
    print(rewards.shape)
    print(Q_targets_next)
    Q_targets =  rewards + (gamma * Q_targets_next * (1-dones))
    
    Q_expected = self.qnetwork_local(states).gather(1, actions)
    
    loss = F.mse_loss(Q_expected, Q_targets)
    self.optimizer.zero_grad()
    loss.backward()
    self.optimizer.step()

    # ------------------- update target network ------------------- #
    self.soft_update(self.qnetwork_local, self.qnetwork_target, self.tau) 

I don’t know if this gather is correct here. I found the code online and it worked;
now I am trying to adapt it to my problem.

Thank you very much for your help.
If you need more information, let me know.
Best Regards
Chris

Which line throws this error?
Could you post the shapes and if necessary the values of all used tensors in this operation?

Oh yes, I forgot to post the traceback:

/usr/local/lib/python3.5/dist-packages/ipykernel_launcher.py:67: RuntimeWarning: divide by zero encountered in double_scalars

RuntimeError Traceback (most recent call last)
in ()
----> 1 scores = dqn()
2
3 # plot the scores
4 fig = plt.figure()
5 ax = fig.add_subplot(111)

in dqn(n_episodes, max_t, eps_start, eps_end, eps_decay)
23 score += reward
24 #reward = np.tanh(reward)
---> 25 agent.step(state, action, reward, next_state, done)
26 state = next_state
27 if done:

in step(self, state, action, reward, next_state, done)
57 if np.count_nonzero(self.memory.tree.tree) > self.batch_size:
58 tree_idx, batch, ISWeights_mb = self.memory.sample(BATCH_SIZE)
---> 59 self.learn(batch, GAMMA)
60
61 def act(self, state, eps=0.):

in learn(self, batch, gamma)
114 Q_targets = rewards + (gamma * Q_targets_next * (1-dones))
115
--> 116 Q_expected = self.qnetwork_local(states).gather(1, actions)
117
118 loss = F.mse_loss(Q_expected, Q_targets)

RuntimeError: invalid argument 4: Index tensor must have same dimensions as input tensor at /pytorch/aten/src/THC/generic/THCTensorScatterGather.cu:16

Thank you for your help

Thanks for the stack trace. Could you print out the shapes of self.qnetwork_local(states) and actions? It might be that actions has to be unsqueezed, but I would need to know your shapes.

for actions : torch.Size([32])
for the network: torch.Size([32, 4])
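
With those shapes, the index has one dimension while the network output has two, which is what gather is complaining about. A minimal sketch of the usual fix, using random stand-in tensors of the same shapes (the names q_values and q_expected are only illustrative and are not taken from the posted code):

```python
import torch

# Stand-ins for the tensors in the thread (random values; only the shapes matter).
q_values = torch.randn(32, 4)          # same shape as self.qnetwork_local(states)
actions = torch.randint(0, 4, (32,))   # same shape as actions, integer dtype

# gather needs an index with the same number of dimensions as the input,
# so turn the [32] index into a [32, 1] column first.
q_expected = q_values.gather(1, actions.unsqueeze(1))

print(q_expected.shape)
```

Each row of q_expected then holds the Q-value of the action that was taken in that row, so the result has shape [32, 1].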


RuntimeError Traceback (most recent call last)
in ()
----> 1 scores = dqn()
2
3 # plot the scores
4 fig = plt.figure()
5 ax = fig.add_subplot(111)

in dqn(n_episodes, max_t, eps_start, eps_end, eps_decay)
23 score += reward
24 #reward = np.tanh(reward)
---> 25 agent.step(state, action, reward, next_state, done)
26 state = next_state
27 if done:

in step(self, state, action, reward, next_state, done)
57 if np.count_nonzero(self.memory.tree.tree) > self.batch_size:
58 tree_idx, batch, ISWeights_mb = self.memory.sample(BATCH_SIZE)
---> 59 self.learn(batch, GAMMA)
60
61 def act(self, state, eps=0.):

in learn(self, batch, gamma)
120 Q_expected = self.qnetwork_local(states).gather(1, actions.unsqueeze(1))
121
--> 122 loss = F.mse_loss(Q_expected, Q_targets)
123 self.optimizer.zero_grad()
124 loss.backward()

/usr/local/lib/python3.5/dist-packages/torch/nn/functional.py in mse_loss(input, target, size_average, reduce)
1567 """
1568 return _pointwise_loss(lambda a, b: (a - b) ** 2, torch._C._nn.mse_loss,
-> 1569 input, target, size_average, reduce)
1570
1571

/usr/local/lib/python3.5/dist-packages/torch/nn/functional.py in _pointwise_loss(lambd, lambd_optimized, input, target, size_average, reduce)
1535 return torch.mean(d) if size_average else torch.sum(d)
1536 else:
-> 1537 return lambd_optimized(input, target, size_average, reduce)
1538
1539

RuntimeError: input and target shapes do not match: input [32 x 1], target [32 x 32] at /pytorch/aten/src/THCUNN/generic/MSECriterion.cu:15

I tried it and it worked, but now I have a new problem.
Is there a tutorial that explains things like unsqueeze? I have no idea what I’m doing there. OK, I see the tensors need to have the right shapes, but still.

Thanks for your help.
Maybe you also have an idea for this one.
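
One likely cause of the [32 x 32] target, assuming rewards and dones still have shape [32] as they are built in learn() while Q_targets_next is [32, 1] after unsqueeze(1): broadcasting expands the sum to [32, 32]. A small sketch with random stand-in tensors (the values and gamma = 0.99 are made up for illustration):

```python
import torch

gamma = 0.99                          # illustrative discount factor
rewards = torch.rand(32)              # shape [32], one reward per sample
dones = torch.zeros(32)               # shape [32], one flag per sample
q_targets_next = torch.rand(32, 1)    # shape [32, 1], as after .max(1)[0].unsqueeze(1)

# Combining a [32] tensor with a [32, 1] tensor broadcasts to [32, 32].
bad = rewards + gamma * q_targets_next * (1 - dones)
print(bad.shape)

# Giving rewards and dones a trailing dimension keeps everything at [32, 1],
# which matches the shape of the gathered Q_expected.
good = rewards.unsqueeze(1) + gamma * q_targets_next * (1 - dones.unsqueeze(1))
print(good.shape)
```

Printing the shape of every tensor right before the line that fails is usually the quickest way to spot this kind of mismatch.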

I finally managed it. After a lot of reshaping of the tensors it worked. Thanks for your help.

I am facing the same error. Can you please help me?