Deploying a Reinforcement Learning Model (DDPG)

Is there any literature that talks about deploying a trained agent? I've been looking, but I couldn't find any. I want to know how to deploy a trained agent, in my case one trained with the DDPG algorithm. Do I just instantiate the Actor with the same parameters as my trained Actor and load its weights? Do I need to include the learning algorithm from DDPG? Is the reward still needed, or can I leave it out?
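To make the question concrete, here is a minimal sketch of what I mean by "just calling the Actor": rebuild the network, load only the actor's saved weights, and run it in inference mode. The tiny `nn.Sequential` stand-in and the file name `actor_ddpg.pt` are placeholders, not my actual `ActorNetwork` — I'm assuming the weights were saved with `state_dict()` during training.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the trained ActorNetwork:
# 2 state inputs -> 2 actions, mirroring input_dims=(2,), n_actions=2.
actor = nn.Sequential(
    nn.Linear(2, 400), nn.ReLU(),
    nn.Linear(400, 300), nn.ReLU(),
    nn.Linear(300, 2), nn.Tanh(),
)

# During training, only the actor's weights would have been saved:
torch.save(actor.state_dict(), "actor_ddpg.pt")

# At deployment: rebuild the same architecture, load the weights,
# switch to eval mode, and run forward passes without gradients.
actor.load_state_dict(torch.load("actor_ddpg.pt"))
actor.eval()

state = torch.tensor([[0.5, -0.3]], dtype=torch.float)
with torch.no_grad():
    action = actor(state)
print(action.shape)  # torch.Size([1, 2])
```

My understanding (which I'd like confirmed) is that the critic, replay buffer, exploration noise, and reward signal are only needed for training, so none of them would appear in a script like this.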

import torch
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation
from matplotlib.widgets import TextBox
from tclab import clock

# ActorNetwork and PIEnv are defined in my training script
actor = ActorNetwork(alpha=0.0001, input_dims=(2,), fc1_dims=400,
                     fc2_dims=300, n_actions=2, name='actor')
env = PIEnv()

facecolor = 'lightgoldenrodyellow'
fig, axs = plt.subplots(2, figsize=(12, 8), tight_layout=True)

SP_input = env.states[1]

SP_mem = []
PV_mem = []
OP_mem = []
time_mem = []

states = env.reset()
action = None

def animate(i, PV_mem, SP_mem, OP_mem, time_mem):
    global states, action
    with torch.no_grad():
        state = torch.tensor([states], dtype=torch.float)
        if i % 180 == 0 or action is None:
            action = actor.forward(state)
    states_, PV, SP, OP, Ks = env.step(action.detach().numpy()[0], 50)
    states = states_
    print(f'Time: {int(i)}, PV: {PV:.2f}, SP: {SP:.0f}, OP: {OP:.2f}')

    # record history so the plots have something to draw
    time_mem.append(i)
    SP_mem.append(SP)
    PV_mem.append(PV)
    OP_mem.append(OP)


    # clear the axes each frame so lines don't pile up
    axs[0].cla()
    axs[1].cla()

    axs[0].plot(time_mem, SP_mem, 'r--', label='Setpoint')
    axs[0].plot(time_mem, PV_mem, 'b-', label='Process Variable')
    axs[0].set_ylabel('Temperature °C')

    axs[1].plot(time_mem, OP_mem, 'g-', label='Heater Output')
    axs[1].set_ylabel('Heater Output')
    axs[1].set_xlabel('Time (seconds)')

    axs[0].set_title(f'IAE: {states_[0]:.0f}', loc='left')
    axs[0].set_title(f'Kp: {Ks[0]:.2f}, Ki: {Ks[1]:.4f}', loc='right')

ani = FuncAnimation(fig, animate, fargs=(PV_mem, SP_mem, OP_mem, time_mem), interval=1000)
plt.show()

That is what I have, and I don't know whether it captures all the information I need to deploy my RL model. Any advice would be much appreciated. Thanks!