Using Tensor Instead of numpy array in reinforcement learning tasks

Hi, I am trying to build an RL agent and environment. In this process I use NumPy arrays, mainly for the action space, which has shape [3]. So the main input and output of the neural-net agent class, which extends torch.nn.Module, is a NumPy ndarray, and the reward and other primary variables are NumPy arrays too. But the input of the agent should be a torch tensor. For example:

action, prediction = agent.act(state)
next_state, reward, done = env.step(action)
states.append(np.expand_dims(state, axis=0))
next_states.append(np.expand_dims(next_state, axis=0))
action_onehot = np.zeros(3)

The code shown uses NumPy-based functions such as np.expand_dims, and the challenge is transforming the NumPy arrays to torch tensors and vice versa.
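The NumPy-to-tensor round trip could be sketched like this (the state array and its shape below are hypothetical placeholders, not taken from my actual code):

```python
import numpy as np
import torch

# hypothetical state coming from a NumPy-based environment
state = np.zeros(4, dtype=np.float32)

# NumPy -> tensor: from_numpy shares memory with the underlying array
state_t = torch.from_numpy(state).unsqueeze(0)  # add a batch dim, like np.expand_dims

# placeholder for the agent's forward pass
prediction = state_t * 2.0

# tensor -> NumPy: call detach() first if the tensor is part of a graph
prediction_np = prediction.detach().numpy()
```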
In the end, I came up with three options and now I want to know which one is better to use (or any other option you may suggest):
1. Use the from_numpy function to convert the array to a tensor before feeding it to the agent, and then transform it back to NumPy after prediction?
2. Do the conversions (from_numpy and .numpy()) inside the forward function of the agent class?
3. Use torch tensors instead of NumPy arrays throughout the whole process (which may not be possible)?
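As a sketch of option 2 (the Agent class, layer size, and state dimension below are made up for illustration, not my real model), the conversions could live inside forward:

```python
import numpy as np
import torch
import torch.nn as nn

class Agent(nn.Module):
    def __init__(self, state_dim=4, n_actions=3):
        super().__init__()
        self.net = nn.Linear(state_dim, n_actions)

    def forward(self, state):
        # accept a NumPy array and convert it inside forward (option 2)
        if isinstance(state, np.ndarray):
            state = torch.from_numpy(state).float()
        logits = self.net(state)
        # convert back to NumPy; detach() is needed because logits require grad
        return logits.detach().numpy()

agent = Agent()
out = agent(np.zeros(4, dtype=np.float32))
```

Note that the .detach() call breaks the computation graph, so a loss computed from the returned array could not backpropagate into the network.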
I believe keeping the grad of the torch tensors is crucial, so please take this into account.
Please explain how I should approach this.
Your answer is much appreciated.

Would you please help @ptrblck

I don’t fully understand your use case, but in case you are not using custom autograd.Functions, you are already breaking the computation graph by converting to NumPy, so keeping the grad might not be important (assuming your current code works fine)?

In any case, I would try to use PyTorch tensors and operations only, as it would also allow you to use the GPU without moving the data back and forth between the GPU and CPU.
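A minimal sketch of that all-tensor approach (the environment state, shapes, and one-layer agent here are hypothetical stand-ins):

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# hypothetical state kept as a tensor from the start (option 3)
state = torch.zeros(4, device=device)
agent = torch.nn.Linear(4, 3).to(device)

logits = agent(state.unsqueeze(0))    # stays on the GPU if one is available
action = torch.argmax(logits, dim=1)  # no .numpy() round trip needed
action_onehot = torch.zeros(3, device=device)
action_onehot[action] = 1.0
```

Since nothing leaves the tensor world, gradients flow through logits as usual, and no CPU/GPU transfers are forced by the conversions.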