We introduce a reinforcement learning architecture designed for problems with an infinite number of states, where each state can be seen as a vector of real numbers, and with a finite number of actions, where each action requires a vector of real numbers as parameters. The main objective of this architecture is to distribute the work required to learn the final policy between two actors: one actor decides what action must be performed, while a second actor determines the right parameters for the selected action. We tested our architecture, and one algorithm based on it, by solving the robot dribbling problem, a challenging robot control problem taken from the RoboCup competitions. Our experimental work with three different function approximators provides enough evidence to show that the proposed architecture can be used to implement fast, robust, and reliable reinforcement learning algorithms.
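The two-actor decomposition described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the action set, state dimension, per-action parameter counts, and the linear scorers are all assumptions chosen for clarity.

```python
import random

ACTIONS = ["kick", "dash", "turn"]              # finite action set (assumed)
STATE_DIM = 4                                    # states are vectors of reals
PARAM_DIM = {"kick": 2, "dash": 1, "turn": 1}    # params per action (assumed)

random.seed(0)

def make_weights(n_out, n_in):
    """Random linear weights, standing in for a learned approximator."""
    return [[random.uniform(-0.1, 0.1) for _ in range(n_in)]
            for _ in range(n_out)]

# First actor: one linear scorer per discrete action.
action_w = make_weights(len(ACTIONS), STATE_DIM)

# Second actor: one linear map per action, producing its continuous parameters.
param_w = {a: make_weights(PARAM_DIM[a], STATE_DIM) for a in ACTIONS}

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def select_action(state):
    """First actor: choose the action with the highest score."""
    scores = [dot(w, state) for w in action_w]
    return ACTIONS[scores.index(max(scores))]

def select_parameters(action, state):
    """Second actor: compute the chosen action's continuous parameters."""
    return [dot(w, state) for w in param_w[action]]

state = [0.5, -0.2, 1.0, 0.3]
a = select_action(state)
p = select_parameters(a, state)
print(a, p)
```

The key design point is that each actor learns a simpler mapping than a single monolithic policy would: the first maps states to a finite choice, while the second maps states to a low-dimensional real vector conditioned on that choice.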