
Policy Gradient Methods for Reinforcement Learning with Function Approximation and Action-Dependent Baselines

06/20/2017


We show how an action-dependent baseline can be used in the policy gradient theorem with function approximation, a result originally presented with action-independent baselines by Sutton et al. (2000).
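To illustrate why the action-independent case is the simpler one, here is a minimal numerical sketch. It is not the paper's derivation. It assumes a single state and a softmax policy over three actions; the names `softmax` and `baseline_term` and all numeric values are hypothetical. It computes the term that a baseline contributes to the policy gradient, sum over a of b(a) * dπ(a)/dθ_j. With a constant baseline this term is zero, so the gradient is unchanged; with an action-dependent baseline it is generally nonzero and would need to be accounted for.

```python
import math

def softmax(theta):
    """Softmax policy over actions for a single state (illustrative assumption)."""
    m = max(theta)  # subtract the max for numerical stability
    exps = [math.exp(t - m) for t in theta]
    z = sum(exps)
    return [e / z for e in exps]

def baseline_term(theta, baseline):
    """Return sum_a b(a) * d pi(a) / d theta_j for each parameter j.

    For a softmax policy, d pi(a) / d theta_j = pi(a) * (1[a == j] - pi(j)).
    """
    pi = softmax(theta)
    n = len(theta)
    grad = []
    for j in range(n):
        total = 0.0
        for a in range(n):
            dpi = pi[a] * ((1.0 if a == j else 0.0) - pi[j])
            total += baseline[a] * dpi
        grad.append(total)
    return grad

theta = [0.2, -0.5, 1.0]  # hypothetical policy parameters

# Action-independent baseline: the same value for every action.
independent = baseline_term(theta, [3.0, 3.0, 3.0])

# Action-dependent baseline: a different value per action.
dependent = baseline_term(theta, [3.0, 0.0, -1.0])

print("action-independent term:", independent)
print("action-dependent term:  ", dependent)
```

The constant baseline factors out of the sum, leaving b times the gradient of the action probabilities' total, which is always 1, so the term vanishes. The action-dependent baseline does not factor out, which is why it needs separate treatment.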
