To address the challenge of unmanned vehicles capturing an escaping ground target, a multi-agent reinforcement learning-based capture algorithm is proposed. First, the environment and motion models are established for the cooperative capture scenario, and the criteria for capture success are defined to meet the requirements of safety and coordination. The Soft Actor-Critic (SAC) algorithm is then adopted as the training framework. However, as the number of agents increases, the multi-agent environment becomes more complex and unstable, which can lead to problems such as dimension explosion and convergence failure; SAC alone struggles with such high-dimensional state spaces. Moreover, the Critic network of the standard SAC treats all state features equally when processing state information, which can cause overestimation of the value function. In this study, an attention mechanism is therefore introduced into the SAC Critic network so that it selectively weights state features and focuses on those most relevant to the task. This enables the capturing agents to concentrate on the behavior and position of the target agent, enhancing coordination and cooperation during pursuit; this focus maximizes capture effectiveness and yields a more accurate estimate of the true value function. Unnecessary actions and wasted effort are reduced, improving efficiency and robustness, and the attention weights adapt dynamically to environmental changes, increasing adaptability to variations in vehicle behavior and state. Furthermore, designing an appropriate reward function is crucial in reinforcement learning, as it directly affects the performance and effectiveness of the unmanned vehicles during learning. To tackle the problem of sparse rewards in multi-vehicle capture scenarios, the paper decouples the reward function.
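To make the Critic-side attention concrete, the following is a minimal sketch of scaled dot-product attention weighting per-agent state features before value estimation. The function names, embedding dimension, and agent count are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention_critic_features(query, keys, values):
    """Scaled dot-product attention over per-agent state features.

    query:  (d,)   embedding of the evaluating agent's own state
    keys:   (n, d) embeddings of the other agents' states
    values: (n, d) per-agent feature vectors to aggregate

    Returns the attended feature vector fed to the Critic and the
    attention weights, which indicate how strongly each agent
    (e.g. the target) influences the value estimate.
    """
    d = query.shape[-1]
    scores = keys @ query / np.sqrt(d)   # (n,) relevance of each agent
    weights = softmax(scores)            # nonnegative, sums to 1
    context = weights @ values           # (d,) attended feature vector
    return context, weights

# Illustrative call: two teammates plus the target agent, d = 8
rng = np.random.default_rng(0)
q = rng.normal(size=8)
k = rng.normal(size=(3, 8))
v = rng.normal(size=(3, 8))
context, w = attention_critic_features(q, k, v)
```

Because the weights adapt to the current observations, the Critic can emphasize the target agent's features at each step rather than treating all state components uniformly.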
Specifically, the reward function is segmented into individual rewards and cooperative rewards, so that both global and local incentives are maximized. The secondary objective is to shrink the target unmanned vehicle's action space through cooperation among the capturing unmanned vehicles, ultimately achieving the target's capture. Compared with the existing SAC algorithm, the proposed method significantly improves the capture success rate. Finally, simulation experiments comparing the algorithm with other learning methods verify the effectiveness and superiority of the proposed algorithm and reward function design.
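The decoupling of individual and cooperative rewards can be sketched as below. The shaping terms, capture radius, and bonus values are illustrative assumptions for a 2-D setting, not the paper's actual reward design.

```python
import numpy as np

def individual_reward(pursuer_pos, target_pos, prev_dist):
    """Local incentive for one pursuer: positive when it has closed
    the distance to the target since the previous step (dense
    shaping that mitigates the sparse-reward problem)."""
    dist = float(np.linalg.norm(pursuer_pos - target_pos))
    return prev_dist - dist, dist

def cooperative_reward(pursuer_positions, target_pos, capture_radius=1.0):
    """Shared team incentive: a large terminal bonus once every
    pursuer is within the capture radius, otherwise a small time
    penalty that discourages stalling."""
    dists = np.linalg.norm(pursuer_positions - target_pos, axis=1)
    captured = bool(np.all(dists <= capture_radius))
    return (10.0 if captured else -0.01), captured

# Example: three pursuers converging on a target at the origin
target = np.zeros(2)
pursuers = np.array([[0.5, 0.0], [0.0, 0.8], [-0.6, 0.3]])
r_coop, captured = cooperative_reward(pursuers, target)
r_ind, d = individual_reward(pursuers[0], target, prev_dist=0.7)
```

Each agent's training signal would then combine its own shaped term with the shared term, e.g. a weighted sum, so that local progress and team-level coordination are both rewarded.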