I guess your use case should use torch.distributions instead of a sampled tensor.
I'm not sure what the current status of distributions support in libtorch is, though; @yf225 might know.
Thanks for your reply! Yes, I knew that the Python framework has a Normal distribution object, but apparently libtorch doesn't, so this is what I have done:
To select an action I do:
torch::Tensor out = policy_mu.forward(state);
this->mu = out[0];
auto action = torch::normal(this->mu.item<double>(), sigma, {1, 1});
action = torch::clamp(action, -2.0, 2.0);
To update my policy I do:
The first part is basically what TensorFlow does in its log_prob() function:
auto log_unnormalized = -0.5 * pow((action / this->sigma) - (this->mu / this->sigma),2);
auto log_normalization = 0.5 * std::log(2. * M_PI) + std::log(this->sigma);
auto log_prob = log_unnormalized - log_normalization;
auto loss = log_prob * (reward + gamma*next_state_value- state_value);
//auto loss = -this->dist.log_prob(action) * ((reward + gamma*next_state_value) - state_value);
policy_mu.zero_grad();
//sigma_optim.zero_grad();
loss.backward();
for (auto& p : policy_mu.parameters())
{
    //std::cout << "----- p grad " << p.grad() << std::endl;
    auto tmp = 0.001 * p.grad();
    p = p.detach();
    p -= tmp;
}
This isn't working, though. Maybe I'm doing something wrong in the algorithm; I don't know.
If anyone can spot something wrong, it would be much appreciated!
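As a reference point for the log_prob computation in the post above, here is a minimal standalone sketch in plain C++ (standard library only, no libtorch) of the same Normal log-density, split into the same unnormalized and normalization terms. The function names normal_log_prob and policy_loss are hypothetical and only for illustration. One thing that may be worth checking: the commented-out loss line negates the log-probability, while the active loss line does not. With a plain gradient-descent step (p -= lr * grad), the usual policy-gradient loss is the negated form, which is what policy_loss below assumes.

```cpp
#include <cmath>

// Log-density of Normal(mu, sigma) at x, using the same two terms as the post:
// an unnormalized quadratic term and a normalization constant.
double normal_log_prob(double x, double mu, double sigma) {
    const double pi = std::acos(-1.0);
    const double z = (x - mu) / sigma;
    const double log_unnormalized = -0.5 * z * z;
    const double log_normalization = 0.5 * std::log(2.0 * pi) + std::log(sigma);
    return log_unnormalized - log_normalization;
}

// Hypothetical helper (an assumption, not the poster's code): the loss to
// MINIMIZE for a policy-gradient step. The minus sign means that descending
// on this loss raises the log-probability of actions with positive advantage.
double policy_loss(double log_prob, double advantage) {
    return -log_prob * advantage;
}
```

Comparing normal_log_prob against the tensor version for a few hand-picked values of action, mu and sigma is a cheap way to rule out the density computation as the source of the problem.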
Hi Ptrblck,
Sorry, I need to compute the CDF of a distribution. Which function can do that for me? I searched but only found torch.distributions.cauchy, which is confusing for me.
Distributions in torch.distributions expose a cdf() method: pass in a value and it returns the CDF evaluated at that value. Cauchy provides it as well, via torch.distributions.cauchy.Cauchy(loc, scale).cdf(value).
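To make concrete what such a cdf call computes, and for anyone on the libtorch side, where the earlier post suggests the distribution objects are not available, here is a minimal sketch of the closed-form CDFs for the Normal and Cauchy distributions in plain C++. It uses only the standard library; the function names normal_cdf and cauchy_cdf are hypothetical, and the code works on scalars rather than tensors.

```cpp
#include <cmath>

// CDF of Normal(mu, sigma) at x: 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2)))).
double normal_cdf(double x, double mu, double sigma) {
    return 0.5 * (1.0 + std::erf((x - mu) / (sigma * std::sqrt(2.0))));
}

// CDF of Cauchy(loc, scale) at x: atan((x - loc) / scale) / pi + 0.5.
double cauchy_cdf(double x, double loc, double scale) {
    const double pi = std::acos(-1.0);
    return std::atan((x - loc) / scale) / pi + 0.5;
}
```

Both functions return 0.5 at the center of the distribution (x equal to mu or loc), which makes for a quick sanity check.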