Expected Policy Gradients

ثاقب
This talk is meant to accompany the AAAI-18 paper "Expected Policy Gradients" by K. Ciosek and S. Whiteson. *Be sure to turn on the sound to hear the voiceover.* In the paper, we propose expected policy gradients (EPG), a new policy gradient method that integrates across actions when estimating the gradient, instead of relying only on the action in the sampled trajectory. There are three main takeaways. First, EPG has reduced variance compared to stochastic policy gradients (SPG). Second, it leads to a new superior exploration s
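To make the contrast between sampling one action and integrating across actions concrete, here is a minimal sketch. It is not the paper's implementation: it assumes a single state, a softmax policy over three discrete actions, and a known action-value vector `q`, and every name in it (`spg_estimate`, `epg_estimate`, `theta`, `q`) is hypothetical. In this toy setting the sum over actions is exact, so the EPG-style estimate has no sampling variance, while the SPG-style estimate only matches it on average.

```python
import numpy as np

def softmax(theta):
    # Numerically stable softmax over action preferences.
    z = np.exp(theta - theta.max())
    return z / z.sum()

def grad_log_pi(theta, a):
    # Gradient of log softmax(theta)[a] w.r.t. theta: one_hot(a) - pi.
    g = -softmax(theta)
    g[a] += 1.0
    return g

def spg_estimate(theta, q, rng):
    # SPG-style estimate: use only the single sampled action.
    pi = softmax(theta)
    a = rng.choice(len(pi), p=pi)
    return grad_log_pi(theta, a) * q[a]

def epg_estimate(theta, q):
    # EPG-style estimate: sum over all actions, weighted by the policy.
    pi = softmax(theta)
    return sum(pi[a] * grad_log_pi(theta, a) * q[a] for a in range(len(pi)))

rng = np.random.default_rng(0)
theta = np.array([0.5, -0.2, 0.1])   # assumed policy parameters
q = np.array([1.0, 3.0, -2.0])       # assumed known action values

samples = np.array([spg_estimate(theta, q, rng) for _ in range(20000)])
spg_mean = samples.mean(axis=0)      # approaches the EPG value
spg_var = samples.var(axis=0)        # strictly positive per-sample variance
epg = epg_estimate(theta, q)         # deterministic given theta and q

print("SPG mean:", spg_mean)
print("SPG variance:", spg_var)
print("EPG:", epg)
```

With continuous actions the sum becomes an integral, which would have to be computed analytically or by numerical quadrature; the sketch does not cover that case.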
Published 7 years ago, on 1396/12/17.
55 views