Abstract: This paper explores multi-agent reinforcement learning (MARL), trained with Proximal Policy Optimization (PPO), as a way to improve the coordination and exploration efficiency of ...