Variance reduction for policy gradient with action-dependent factorized baselines
