bitrl & cuberl Documentation
Simulation engine for reinforcement learning agents
a2c_config.h
1 #ifndef A2C_CONFIG_H
2 #define A2C_CONFIG_H
3
5 //#include "bitrl/rlenvs_consts.h"
7
8 #include <ostream>
9 #include <string>
10
11 namespace cuberl {
12 namespace rl {
13 namespace algos {
14 namespace pg {
15
16 //using namespace rlenvscpp::consts;
17
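22 struct A2CConfig{
23
27     real_t gamma;
28
32     real_t lambda;
33
37     real_t beta;
38
43     real_t policy_loss_weight;
44
48     real_t value_loss_weight;
49
53     bool clip_policy_grad{false};
54
58     bool clip_critic_grad{false};
59
63     real_t max_grad_norm_policy;
64
68     real_t max_grad_norm_critic;
69
73     uint_t n_episodes;
74
78     uint_t max_itrs_per_episode;
79
83     uint_t buffer_size;
84
88     bool normalize_advantages;
89
93     DeviceType device_type{DeviceType::CPU};
94
98     std::string save_model_path{""};
99
105     std::ostream& print(std::ostream& out)const;
106
110     void load_from_json(const std::string& filename);
111 };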
112
113
114 inline
115 std::ostream& operator<<(std::ostream& out, const A2CConfig& opts){
116     return opts.print(out);
117 }
118
119 }
120 }
121 }
122 }
123
124 #endif
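The stream operator in the listing simply forwards to `print`, so a configuration can be written to any `std::ostream`. A minimal, self-contained sketch of that pattern follows. `DemoConfig`, its two fields, the `render` helper, and the `name=value` output format are hypothetical stand-ins chosen for illustration; what `A2CConfig::print` actually emits is not shown on this page.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for A2CConfig. Only the print/operator<< pattern
// mirrors the header; the fields and output format are assumptions.
struct DemoConfig
{
    bool clip_policy_grad{false};
    std::string save_model_path{""};

    // Write one "name=value" pair per line (format is an assumption).
    std::ostream& print(std::ostream& out) const
    {
        out << "clip_policy_grad=" << (clip_policy_grad ? "true" : "false") << "\n";
        out << "save_model_path=" << save_model_path << "\n";
        return out;
    }
};

// Forward to print(), following the same pattern the header uses.
inline std::ostream& operator<<(std::ostream& out, const DemoConfig& opts)
{
    return opts.print(out);
}

// Hypothetical helper: capture the printed form as a string.
inline std::string render(const DemoConfig& opts)
{
    std::ostringstream ss;
    ss << opts;
    return ss.str();
}
```

Keeping the formatting logic in a member function and forwarding from a free `operator<<` avoids the need for a `friend` declaration and lets the same output be reused for logging or file dumps.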
double real_t
real_t
Definition bitrl_types.h:23
std::size_t uint_t
uint_t
Definition bitrl_types.h:43
DeviceType
Enumeration of various device types.
Definition bitrl_types.h:159
std::ostream & operator<<(std::ostream &out, const A2CConfig &opts)
Definition a2c_config.h:115
Various utilities used when working with RL problems.
Definition cuberl_types.h:16
The A2CConfig struct. Configuration for A2CSolver class.
Definition a2c_config.h:22
real_t lambda
GAE lambda.
Definition a2c_config.h:32
std::string save_model_path
Definition a2c_config.h:98
DeviceType device_type
Definition a2c_config.h:93
real_t beta
Coefficient of the entropy contribution.
Definition a2c_config.h:37
uint_t buffer_size
Definition a2c_config.h:83
bool normalize_advantages
Definition a2c_config.h:88
real_t gamma
Discount factor.
Definition a2c_config.h:27
real_t max_grad_norm_policy
The value to clip the gradient for the policy.
Definition a2c_config.h:63
std::ostream & print(std::ostream &out) const
Print the configuration to the given stream.
bool clip_critic_grad
Flag indicating whether to clip the critic grad.
Definition a2c_config.h:58
bool clip_policy_grad
Flag indicating whether to clip the policy grad.
Definition a2c_config.h:53
real_t value_loss_weight
Definition a2c_config.h:48
real_t policy_loss_weight
Weight given to the policy loss when forming the global loss.
Definition a2c_config.h:43
void load_from_json(const std::string &filename)
Load the configuration from the given json file.
uint_t n_episodes
Number of training episodes.
Definition a2c_config.h:73
real_t max_grad_norm_critic
The value to clip the gradient for the critic.
Definition a2c_config.h:68
uint_t max_itrs_per_episode
Number of iterations per episode.
Definition a2c_config.h:78
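The `gamma`, `lambda`, and `normalize_advantages` fields typically enter an actor-critic update through generalized advantage estimation (GAE). How `A2CSolver` actually consumes them is not shown on this page, so the following is only a sketch under stated assumptions: `compute_gae`, its argument layout, and the treatment of the final step as terminal (bootstrap value 0) are all hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using real_t = double;       // same underlying type as the real_t alias above
using uint_t = std::size_t;  // same underlying type as the uint_t alias above

// Hypothetical helper: GAE over one trajectory that ends in a terminal
// state. Assumes rewards and values have the same length.
//   delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)
//   A_t     = delta_t + gamma * lambda * A_{t+1}
std::vector<real_t> compute_gae(const std::vector<real_t>& rewards,
                                const std::vector<real_t>& values,
                                real_t gamma, real_t lambda,
                                bool normalize_advantages)
{
    const uint_t n = rewards.size();
    std::vector<real_t> adv(n, 0.0);
    real_t running = 0.0;

    // Walk the trajectory backwards, accumulating discounted TD errors.
    for (uint_t i = n; i-- > 0;) {
        const real_t next_value = (i + 1 < n) ? values[i + 1] : 0.0;
        const real_t delta = rewards[i] + gamma * next_value - values[i];
        running = delta + gamma * lambda * running;
        adv[i] = running;
    }

    // Optionally rescale to zero mean and unit (population) variance.
    if (normalize_advantages && n > 1) {
        real_t mean = 0.0;
        for (real_t a : adv) mean += a;
        mean /= static_cast<real_t>(n);

        real_t var = 0.0;
        for (real_t a : adv) var += (a - mean) * (a - mean);
        const real_t stddev = std::sqrt(var / static_cast<real_t>(n));

        for (real_t& a : adv) a = (a - mean) / (stddev + 1e-8);
    }
    return adv;
}
```

With all value estimates at zero, rewards `{1, 1, 1}`, `gamma = 0.5`, and `lambda = 1`, the advantages reduce to the discounted returns `1.75, 1.5, 1`.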