The ReinforceConfig struct. Holds various configuration options for the REINFORCE algorithm.

#include <reinforce_config.h>

Public Member Functions
std::ostream &	print (std::ostream &out) const
	Print the configuration to the given output stream.

void	load_from_json (const std::string &filename)
	Load the configuration from the given json file.

Public Attributes
bool	normalize_rewards {false}
	Whether to normalize the rewards.

cuberl::utils::TrainEnumType	train_type {cuberl::utils::TrainEnumType::BATCH}
	How to train the algorithm.

BaselineEnumType	baseline_type {BaselineEnumType::NONE}
	The baseline to use.

DeviceType	device_type
	The device type on which PyTorch calculations take place.

uint_t	n_episodes
	The number of episodes.

uint_t	max_itrs_per_episode
	Max number of iterations per episode.

real_t	gamma
	The discount factor.

real_t	baseline_constant {0.0}
	The constant to use when baseline_type = BaselineEnumType::CONSTANT.

real_t	eps {bitrl::consts::TOLERANCE}
	Small constant used as a tolerance. Used when baseline_type = BaselineEnumType::STANDARDIZE.

Detailed Description

The ReinforceConfig struct. Holds various configuration options for the REINFORCE algorithm.

Member Function Documentation

◆ load_from_json()

void cuberl::rl::algos::pg::ReinforceConfig::load_from_json ( const std::string & filename )

Load the configuration from the given json file.

◆ print()

std::ostream & cuberl::rl::algos::pg::ReinforceConfig::print ( std::ostream & out ) const

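Print the configuration to the given output stream.

Parameters

out	The output stream to write to.

Returns

A reference to the output stream.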

Member Data Documentation

◆ baseline_constant

real_t cuberl::rl::algos::pg::ReinforceConfig::baseline_constant {0.0}

The constant to use when baseline_type = BaselineEnumType::CONSTANT.

◆ baseline_type

BaselineEnumType cuberl::rl::algos::pg::ReinforceConfig::baseline_type {BaselineEnumType::NONE}

The baseline to use.

◆ device_type

DeviceType cuberl::rl::algos::pg::ReinforceConfig::device_type

The device type on which PyTorch calculations take place.

◆ eps

real_t cuberl::rl::algos::pg::ReinforceConfig::eps {bitrl::consts::TOLERANCE}

Small constant used as a tolerance. Used when baseline_type = BaselineEnumType::STANDARDIZE.
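The interplay of baseline_type, baseline_constant, and eps can be illustrated with a small sketch. It is not taken from the cuberl sources: the helper apply_baseline is hypothetical, the enum is a local stand-in that reproduces only the three enumerators named on this page, real_t is assumed to be double, and the STANDARDIZE branch assumes the common form (G - mean) / (std + eps) with a population standard deviation.

```cpp
#include <cmath>
#include <vector>

// Local stand-in for the library's BaselineEnumType (assumption: only the
// enumerators mentioned in this reference are reproduced here).
enum class BaselineEnumType { NONE, CONSTANT, STANDARDIZE };

// Hypothetical helper, not part of cuberl: adjusts a vector of returns
// according to the selected baseline.
std::vector<double> apply_baseline(std::vector<double> returns,
                                   BaselineEnumType baseline_type,
                                   double baseline_constant,
                                   double eps) {
    if (returns.empty() || baseline_type == BaselineEnumType::NONE) {
        return returns;  // no baseline: returns are used unchanged
    }
    if (baseline_type == BaselineEnumType::CONSTANT) {
        for (double& g : returns) {
            g -= baseline_constant;  // subtract the fixed constant
        }
        return returns;
    }
    // STANDARDIZE: shift to zero mean and scale by the standard deviation.
    const double n = static_cast<double>(returns.size());
    double mean = 0.0;
    for (double g : returns) {
        mean += g;
    }
    mean /= n;
    double var = 0.0;
    for (double g : returns) {
        var += (g - mean) * (g - mean);
    }
    const double stddev = std::sqrt(var / n);
    for (double& g : returns) {
        // eps keeps the division well defined when all returns are equal
        g = (g - mean) / (stddev + eps);
    }
    return returns;
}
```

Under these assumptions, a vector of identical returns standardizes to zeros instead of dividing by zero, which is the role a tolerance such as eps typically plays.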

◆ gamma

real_t cuberl::rl::algos::pg::ReinforceConfig::gamma

The discount factor.
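As an illustration of what the discount factor controls, the sketch below computes per-step discounted returns for one episode using the recurrence G_t = r_t + gamma * G_{t+1}. The function name discounted_returns is hypothetical and real_t is assumed to be double; the library's own implementation may differ.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of cuberl: computes the discounted return
// for every step of an episode by walking the rewards backwards.
std::vector<double> discounted_returns(const std::vector<double>& rewards,
                                       double gamma) {
    std::vector<double> returns(rewards.size(), 0.0);
    double running = 0.0;  // G_{t+1}, zero past the end of the episode
    for (std::size_t i = rewards.size(); i-- > 0;) {
        running = rewards[i] + gamma * running;
        returns[i] = running;
    }
    return returns;
}
```

For rewards {1, 1, 1} and gamma = 0.5 this gives {1.75, 1.5, 1.0}: a smaller gamma weights immediate rewards more heavily than later ones.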

◆ max_itrs_per_episode

uint_t cuberl::rl::algos::pg::ReinforceConfig::max_itrs_per_episode

Max number of iterations per episode.

◆ n_episodes

uint_t cuberl::rl::algos::pg::ReinforceConfig::n_episodes

The number of episodes.
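A rough sketch of how n_episodes and max_itrs_per_episode would typically bound a training loop is given below. The function run_training and its step callback are hypothetical and uint_t is assumed to be std::size_t; the actual solver loop in cuberl is not shown on this page.

```cpp
#include <cstddef>
#include <functional>

// Hypothetical driver loop, not part of cuberl. The callback receives the
// episode and iteration indices and returns true when the episode is done.
// The function returns the total number of steps taken.
std::size_t run_training(
    std::size_t n_episodes,
    std::size_t max_itrs_per_episode,
    const std::function<bool(std::size_t, std::size_t)>& step) {
    std::size_t total_steps = 0;
    for (std::size_t episode = 0; episode < n_episodes; ++episode) {
        for (std::size_t itr = 0; itr < max_itrs_per_episode; ++itr) {
            ++total_steps;
            if (step(episode, itr)) {
                break;  // terminal state reached before the iteration cap
            }
        }
    }
    return total_steps;
}
```

In this sketch an episode ends either when the callback reports termination or when max_itrs_per_episode steps have been taken, whichever comes first.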

◆ normalize_rewards

bool cuberl::rl::algos::pg::ReinforceConfig::normalize_rewards {false}

Whether to normalize the rewards.

◆ train_type

cuberl::utils::TrainEnumType cuberl::rl::algos::pg::ReinforceConfig::train_type {cuberl::utils::TrainEnumType::BATCH}

How to train the algorithm.

The documentation for this struct was generated from the following file:

libs/cuberl/include/cuberl/rl/algorithms/pg/reinforce_config.h
