Dota 2 Reinforcement Learning

BeyondGodlike Bot
The challenge of creating a bot for Dota 2 is the vast amount of information available at every frame, and the continuous set of actions possible.

This video shows one model I am testing, in which I discretize the actions into 8 directions + 1 hold and use only the (x,y) offsets of the creeps relative to the hero as input/state data. The objective is to block the creeps as much as possible. Every 5 episodes, I use a hardcoded bot to "bootstrap" the training in an effort to get the model to learn faster.
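
As a rough illustration of the setup described above, here is a minimal Python sketch of the 9-way action space, the hero-relative creep offsets, and the every-5-episodes bootstrap schedule. This is not the code from the linked repository: the function names, the `step` distance, the 45-degree spacing of the directions, and counting episodes from 0 are all assumptions made for the example.

```python
import math

NUM_DIRECTIONS = 8
HOLD = 8                          # index of the "hold" action
NUM_ACTIONS = NUM_DIRECTIONS + 1  # 8 directions + 1 hold


def action_to_offset(action, step=100.0):
    """Map a discrete action index to a (dx, dy) movement offset.

    Actions 0..7 are assumed to be directions spaced 45 degrees apart;
    action 8 is "hold". `step` is a hypothetical move distance.
    """
    if action == HOLD:
        return (0.0, 0.0)
    angle = action * (2.0 * math.pi / NUM_DIRECTIONS)
    return (step * math.cos(angle), step * math.sin(angle))


def build_state(hero_pos, creep_positions):
    """Flatten creep positions into (x, y) offsets relative to the hero."""
    hx, hy = hero_pos
    state = []
    for cx, cy in creep_positions:
        state.extend([cx - hx, cy - hy])
    return state


def use_hardcoded_bot(episode, every=5):
    """True on the episodes where the scripted bot drives the hero.

    Assumes episodes are counted from 0, so episodes 0, 5, 10, ... qualify.
    """
    return episode % every == 0
```

With these helpers, a hero at (10, 20) and creeps at (13, 24) and (7, 20) produce a flat state vector of four numbers, one (x,y) pair per creep.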

I trained this model using actor-critic, with a value network and a policy network. Both networks take the state data as input. The value network outputs a single value that estimates the score of a given state, and the policy network outputs 9 values that estimate the probability of each action in that state.
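
The two-network layout can be sketched in plain Python as follows. This only shows the forward pass (state in, one value out; state in, 9 action probabilities out) and action sampling. The single linear layer per network, the softmax on the policy outputs, the weight initialization, and the class and method names are assumptions for illustration; the actual architecture and training update are in the linked code.

```python
import math
import random

NUM_ACTIONS = 9  # 8 directions + 1 hold


def linear(weights, bias, inputs):
    """Compute weights @ inputs + bias for a list-of-lists weight matrix."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax: returns probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class ActorCritic:
    """Two separate (here: linear) networks fed the same state vector."""

    def __init__(self, state_size, rng):
        def init(rows):
            return [[rng.uniform(-0.1, 0.1) for _ in range(state_size)]
                    for _ in range(rows)]
        self.policy_w, self.policy_b = init(NUM_ACTIONS), [0.0] * NUM_ACTIONS
        self.value_w, self.value_b = init(1), [0.0]

    def policy(self, state):
        """Probability of each of the 9 actions in this state."""
        return softmax(linear(self.policy_w, self.policy_b, state))

    def value(self, state):
        """Single scalar estimate of the score of this state."""
        return linear(self.value_w, self.value_b, state)[0]

    def act(self, state, rng):
        """Sample an action index according to the policy probabilities."""
        return rng.choices(range(NUM_ACTIONS), weights=self.policy(state))[0]


rng = random.Random(0)
model = ActorCritic(state_size=4, rng=rng)
state = [3.0, 4.0, -3.0, 0.0]  # two creeps, (x, y) offsets from the hero
probs = model.policy(state)
action = model.act(state, rng)
```

Sampling from the policy output, rather than always taking the most probable action, is one common way to keep the agent exploring during training; whether the original model does this is not stated in the description.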

Link to Thread: http://dev.dota2.com/showthread.php?t...
Link to Code: https://github.com/BeyondGodlikeBot/C...
Published 7 years ago, on 1396/05/30 (Solar Hijri calendar).
Viewed 2,302 times