With the rapid growth of video games and the increasing number of players, only games with strong policies, actions, and tactics survive. How a game responds to opponent actions is a key issue for popular games. Many algorithms have been proposed to solve this problem, such as Least-Squares Policy Iteration (LSPI) and State-Action-Reward-State-Action (SARSA), but they mainly depend on discrete actions, while agents in such settings must learn from the consequences of their continuous actions in order to maximize the total reward over time. In this paper we propose a new algorithm based on LSPI, called Least-Squares Continuous Action Policy Iteration (LSCAPI). LSCAPI was implemented and tested on three games: one board game, 8 Queens, and two real-time strategy (RTS) games, StarCraft: Brood War and Glest. Evaluation showed that LSCAPI outperforms LSPI in runtime, policy learning ability, and effectiveness.
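To make the LSPI baseline concrete, the following is a minimal sketch of least-squares policy iteration on a hypothetical toy MDP (not one of the paper's games): LSTDQ fits linear Q-weights for the current greedy policy from a fixed batch of samples, and the loop alternates this fit with greedy policy improvement. The 2-state, 2-action MDP, the one-hot features, and all function names here are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

# Toy 2-state, 2-action MDP (hypothetical example, not the paper's games).
# Features are one-hot over (state, action) pairs, so LSPI is exact here.
N_S, N_A, GAMMA = 2, 2, 0.9

def phi(s, a):
    """One-hot feature vector for the (state, action) pair."""
    f = np.zeros(N_S * N_A)
    f[s * N_A + a] = 1.0
    return f

def greedy(w, s):
    """Greedy action under the linear Q-function phi(s, a) @ w."""
    return int(np.argmax([phi(s, a) @ w for a in range(N_A)]))

def lstdq(samples, w):
    """One LSTDQ solve: fit Q-weights for the greedy policy induced by w."""
    k = N_S * N_A
    A = np.eye(k) * 1e-6          # small ridge term keeps A invertible
    b = np.zeros(k)
    for s, a, r, s2 in samples:
        f = phi(s, a)
        A += np.outer(f, f - GAMMA * phi(s2, greedy(w, s2)))
        b += f * r
    return np.linalg.solve(A, b)

# Toy dynamics: action 1 in state 0 moves to state 1 with reward 1;
# every other transition stays in place with reward 0.
def step(s, a):
    if s == 0 and a == 1:
        return 1.0, 1
    return 0.0, s

# Collect a batch of samples by uniform random exploration.
rng = np.random.default_rng(0)
samples = []
for _ in range(500):
    s = int(rng.integers(N_S))
    a = int(rng.integers(N_A))
    r, s2 = step(s, a)
    samples.append((s, a, r, s2))

# Policy iteration: alternate LSTDQ evaluation and greedy improvement.
w = np.zeros(N_S * N_A)
for _ in range(20):
    w_new = lstdq(samples, w)
    if np.linalg.norm(w_new - w) < 1e-6:
        break
    w = w_new

print(greedy(w, 0))  # the learned policy should pick action 1 in state 0
```

Because the greedy step is an `argmax` over a finite action set, this formulation is inherently discrete; the continuous-action setting that motivates LSCAPI is exactly where this enumeration breaks down.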