DNT-RL: Soccer Behavior Learning in Parallel Environments

Macha Meijer1,2, Gijs de Jong1,2, Joshua Rosenthal1,2
1University of Amsterdam, 2Dutch Nao Team

DNT-RL learning a walk-to-point policy with 64 parallel agents.

PAGE IS UNDER CONSTRUCTION

Abstract

Reinforcement Learning (RL) has shown significant potential in the RoboCup Soccer leagues, for example in the Standard Platform League (SPL).

Unlike traditional hand-crafted behaviors, RL enables the development of a more flexible behavior engine, allowing for more dynamic gameplay. However, developing RL behaviors is resource-intensive, both in the time needed to train each behavior and in the time needed to develop it.

In this paper, we introduce DNT-RL, a framework designed for the efficient training of 2D RL behaviors using parallel training with multiple environments. The framework integrates a physics-based simulation with a standardized way of training behaviors, and supports parallelized, GPU-accelerated training to reduce training time. We show the effectiveness of DNT-RL through several behaviors that were trained using the framework and have been successfully deployed in RoboCup tournaments. These results show the framework's potential for scalable and practical development of RL-based behaviors.
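To make the idea of parallel environments concrete, the following is a minimal sketch of a batched 2D walk-to-point task, in which one array operation advances every agent at once. It is not DNT-RL's actual API: the class name `BatchedWalkToPoint`, its parameters, the field dimensions, and the progress-based reward are all assumptions chosen for illustration, simple kinematics stand in for the physics-based simulation, and NumPy on the CPU stands in for GPU acceleration.

```python
import numpy as np


class BatchedWalkToPoint:
    """Hypothetical batched 2D task: each agent walks toward its own target."""

    def __init__(self, num_envs=64, field_half_size=(4.5, 3.0),
                 max_speed=0.3, dt=0.1, tolerance=0.1, seed=0):
        # All values here are illustrative assumptions, not DNT-RL settings.
        self.num_envs = num_envs
        self.half = np.asarray(field_half_size, dtype=np.float64)
        self.max_speed = max_speed
        self.dt = dt
        self.tolerance = tolerance
        self.rng = np.random.default_rng(seed)
        self.pos = np.zeros((num_envs, 2))
        self.target = np.zeros((num_envs, 2))

    def _sample(self, n):
        # Uniformly sample n points inside the field rectangle.
        return self.rng.uniform(-self.half, self.half, size=(n, 2))

    def _obs(self):
        # Observation: vector from each agent to its target.
        return self.target - self.pos

    def reset(self):
        self.pos = self._sample(self.num_envs)
        self.target = self._sample(self.num_envs)
        return self._obs()

    def step(self, actions):
        # actions: (num_envs, 2) velocity commands, clipped to [-1, 1].
        actions = np.clip(actions, -1.0, 1.0)
        prev_dist = np.linalg.norm(self.target - self.pos, axis=1)
        # One vectorized update moves all agents; positions stay on the field.
        self.pos = np.clip(self.pos + actions * self.max_speed * self.dt,
                           -self.half, self.half)
        dist = np.linalg.norm(self.target - self.pos, axis=1)
        reward = prev_dist - dist          # progress made toward the target
        done = dist < self.tolerance
        # Environments that reached their target restart immediately.
        n_done = int(done.sum())
        if n_done:
            self.pos[done] = self._sample(n_done)
            self.target[done] = self._sample(n_done)
        return self._obs(), reward, done


if __name__ == "__main__":
    env = BatchedWalkToPoint(num_envs=64)
    obs = env.reset()
    for _ in range(200):
        # Placeholder policy: head straight for the target at full speed.
        actions = obs / (np.linalg.norm(obs, axis=1, keepdims=True) + 1e-8)
        obs, reward, done = env.step(actions)
    print(obs.shape, reward.shape, done.shape)
```

In a training loop, a learned policy would replace the placeholder controller and consume the whole `(num_envs, 2)` observation batch in a single forward pass, which is where batching reduces wall-clock training time.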

Related Links

There's a lot of excellent work that was introduced around the same time as ours.

Reinforcement Learning Within the Classical Robotics Stack: A Case Study in Robot Soccer introduces an idea similar to ours: a classical robotics stack enhanced with RL behaviors.

BibTeX

@article{meijer2025dntrl,
  author    = {Meijer, Macha and de Jong, Gijs and Rosenthal, Joshua},
  title     = {DNT-RL: Soccer Behavior Learning in Parallel Environments},
  journal   = {% TODO},
  year      = {2025},
}