Abstract:
This paper investigates optimal synchronization control for heterogeneous multi-robot systems with unknown dynamics under a model-free reinforcement learning framework. Heterogeneous multi-robot systems have attracted increasing attention because of their broad applications in intelligent manufacturing, cooperative transportation, environmental monitoring, and search-and-rescue missions. Because the robots differ in physical structure, dynamic characteristics, and actuation capability, the overall system typically exhibits strong nonlinearity, parameter uncertainty, and complex coupling effects, which makes the design of high-performance synchronization controllers particularly challenging. In addition, most existing control approaches rely heavily on accurate mathematical models of the controlled plants, whereas in practical engineering scenarios it is often difficult or even impossible to obtain a precise dynamic model for every robot in the network. Traditional model-based control schemes may therefore suffer from degraded performance or limited applicability. Motivated by these challenges, this paper develops a novel model-free optimal synchronization control approach for heterogeneous multi-robot systems that does not require exact knowledge of the system dynamics. First, the dynamic model of a multi-degree-of-freedom robotic system is established and transformed into the standard control-affine nonlinear form. This form provides a convenient theoretical basis for the subsequent controller design and learning algorithm development. Compared with conventional formulations, it is better suited to integrating adaptive approximation techniques and reinforcement learning methods into the synchronization control framework.
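As an illustration of the transformation described above, the following is a minimal sketch that assumes the standard rigid-body manipulator model. The abstract does not state the model used in the paper, so the symbols M (inertia matrix), C (Coriolis and centrifugal matrix), G (gravity vector), and the torque input τ are assumptions introduced here for illustration only.

```latex
% Assumed n-degree-of-freedom rigid-body dynamics (illustrative, not taken from the paper)
M(q)\,\ddot{q} + C(q,\dot{q})\,\dot{q} + G(q) = \tau

% Choose the state x = [q^\top, \dot{q}^\top]^\top and the input u = \tau.
% Since M(q) is symmetric positive definite, it is invertible, which gives
\dot{x} = f(x) + g(x)\,u,
\qquad
f(x) =
\begin{bmatrix}
\dot{q} \\
-M^{-1}(q)\bigl(C(q,\dot{q})\,\dot{q} + G(q)\bigr)
\end{bmatrix},
\qquad
g(x) =
\begin{bmatrix}
0_{n\times n} \\
M^{-1}(q)
\end{bmatrix}
```

In this form the unknown terms are collected in the drift f(x) and the input gain g(x), which is what makes function approximation of the dynamics convenient.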
On this basis, the heterogeneous multi-robot system can be described in a unified manner, which facilitates both theoretical analysis and algorithm implementation. Next, to handle the unknown dynamics and the lack of accurate state information during synchronization, a novel identification network and its weight update law are proposed. The identifier approximates the unknown nonlinear dynamics online and captures the essential behavior of each robot without relying on prior model knowledge. By incorporating the leader-following synchronization objective into the network design, the identifier both improves the estimation accuracy of the system dynamics and drives each follower robot toward the leader’s motion trajectory. The scheme thus achieves online learning and identification of the system states, laying the foundation for optimal control design under unknown dynamics. Furthermore, a critic-network-based model-free optimal synchronization control algorithm is developed for the heterogeneous multi-robot system. Unlike conventional optimal control methods, which require solving the Hamilton–Jacobi–Bellman equation with an exact system model, the proposed approach uses reinforcement learning to approximate the performance index function online and derives the corresponding optimal control policy in a data-driven manner. The algorithm thereby balances synchronization performance against control energy consumption without depending on precise model information, which significantly enhances its applicability to practical robotic systems with uncertain or partially unknown dynamics.
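To make the role of the critic network concrete, the following is a minimal sketch of the standard continuous-time formulation. It assumes a quadratic cost and control-affine synchronization error dynamics. The symbols e, F, g, Q, R, φ, and Ŵ are assumptions introduced here, and the paper's actual cost function and network structure may differ.

```latex
% Assumed synchronization error dynamics and performance index (illustrative)
\dot{e} = F(e) + g(e)\,u,
\qquad
V(e(t)) = \int_{t}^{\infty} \bigl( e^\top Q\, e + u^\top R\, u \bigr)\, d\tau,
\quad Q \succeq 0,\; R \succ 0

% Hamilton--Jacobi--Bellman equation for the optimal value function V^*
0 = \min_{u} \Bigl[\, e^\top Q\, e + u^\top R\, u
      + (\nabla V^*)^\top \bigl( F(e) + g(e)\,u \bigr) \Bigr]

% Stationarity in u, 2Ru + g^\top(e)\,\nabla V^* = 0, yields the optimal policy
u^* = -\tfrac{1}{2}\, R^{-1} g^\top(e)\, \nabla V^*(e)

% Critic approximation with basis vector \varphi(e) and weight estimate \hat{W}
\hat{V}(e) = \hat{W}^\top \varphi(e),
\qquad
\hat{u} = -\tfrac{1}{2}\, R^{-1} g^\top(e)\, \nabla\varphi^\top(e)\, \hat{W}
```

Here ∇φ denotes the Jacobian of φ with respect to e. In a model-free setting, F and g would be replaced by the identifier's estimates rather than taken from an exact model. The weighting matrices Q and R set the trade-off between synchronization accuracy and control energy.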
To guarantee the reliability of the proposed approach, a rigorous stability analysis of the closed-loop system is carried out. It is proven that all closed-loop signals remain uniformly bounded throughout the learning and control process. More importantly, the synchronization errors among the robots converge asymptotically to zero, so that all follower robots ultimately achieve coordinated motion with the leader. These theoretical results establish the feasibility, stability, and optimality of the proposed control strategy. Finally, simulation examples are provided to validate the effectiveness of the method. The simulation results show that, even in the presence of unknown dynamics and heterogeneity among robots, the developed algorithm achieves optimal synchronization tracking of the leader, with strong learning capability and satisfactory synchronization accuracy. These results indicate that the presented model-free reinforcement learning method offers a promising solution for the optimal synchronization control of heterogeneous multi-robot systems.