Expected Policy Gradients
55 views, 7 years ago
This talk is meant to accompany the AAAI-18 paper "Expected Policy Gradients" by K. Ciosek and S. Whiteson.
*Be sure to turn on the sound to hear the voiceover*
In the paper, we propose expected policy gradients (EPG), a new policy gradient method that integrates over actions when estimating the gradient, instead of relying only on the action taken in the sampled trajectory.
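The difference between integrating over actions and using only the sampled action can be sketched on a toy problem. The example below is an illustration, not the paper's implementation: it assumes a single state, a discrete softmax policy, and action values `q` that are known exactly, and the function names `spg_sample` and `epg` are hypothetical.

```python
import numpy as np


def softmax(theta):
    """Numerically stable softmax over action preferences."""
    z = np.exp(theta - theta.max())
    return z / z.sum()


def spg_sample(theta, q, rng):
    """One-sample estimate: grad log pi(a) * Q(a), with a drawn from pi."""
    pi = softmax(theta)
    a = rng.choice(len(pi), p=pi)
    grad_log_pi = -pi.copy()
    grad_log_pi[a] += 1.0  # gradient of log softmax(theta)[a] is e_a - pi
    return grad_log_pi * q[a]


def epg(theta, q):
    """Integrated estimate: sum over all actions of grad pi(a) * Q(a)."""
    pi = softmax(theta)
    # Softmax Jacobian: d pi_a / d theta_k = pi_a * (delta_ak - pi_k)
    jac = np.diag(pi) - np.outer(pi, pi)
    return jac.T @ q


# Assumed toy values (not taken from the paper).
theta = np.array([0.1, -0.3, 0.5])
q = np.array([1.0, 2.0, -1.0])
rng = np.random.default_rng(0)

samples = np.array([spg_sample(theta, q, rng) for _ in range(20000)])
print("EPG gradient:          ", epg(theta, q))
print("mean of SPG samples:   ", samples.mean(axis=0))
print("per-sample SPG std dev:", samples.std(axis=0))
```

In this fixed-state setting the sampled estimates scatter around the integrated gradient, while the integrated gradient itself involves no action sampling at all, which is the intuition behind the variance claim.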
There are three main takeaways. First, EPG has lower variance than stochastic policy gradients (SPG). Second, it leads to a new superior exploration s
Published on 1396/12/17 (Solar Hijri calendar).