Abstract:
Cross-source point cloud registration is a key technique for multisensor fusion in mobile robotics and intelligent perception, where point clouds acquired from heterogeneous modalities (e.g., LiDAR, depth cameras, and structure-from-motion reconstructions) must be aligned into a unified coordinate system to support reliable pose estimation and 3D scene understanding. In practical indoor and industrial environments, cross-source registration remains challenging owing to large differences in sampling density and spatial distribution, distinct noise patterns, limited field-of-view overlap, and the frequent presence of repeated structures or multiple similar objects. These factors make correspondence search highly ambiguous, so existing optimization-based pipelines and learning-based methods struggle to achieve global consistency and high local accuracy simultaneously, particularly in complex multi-object scenes. To address these issues, this paper proposes adaptive instance segmentation hierarchical cross-source registration (AIS-HCSR), a hierarchical method built on adaptive instance segmentation that registers progressively from the scene level to the object level and finally to the point cloud level. At the scene level, an adaptive geometric feature encoding scheme is designed. This scheme jointly models pairwise distance relations and triplet angle relations and dynamically reweights the two types of geometric embeddings according to local geometric complexity. The resulting geometric structural embedding is injected into a transformer-based geometric perception network that computes self- and cross-attention, enabling robust feature extraction and initial matching across modalities. Based on the optimized correlation matrix, the top-ranked correspondences are selected to estimate an initial rigid transformation, providing a globally consistent prior for subsequent refinement.
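As an illustration of the last scene-level step, the sketch below selects the top-ranked entries of a correlation matrix and fits a rigid transformation to them. The abstract does not state which solver is used, so the weighted Kabsch (SVD) solution here is an assumption, and the function names `top_k_correspondences` and `weighted_kabsch`, the value `k=20`, and the synthetic data are hypothetical.

```python
import numpy as np

def top_k_correspondences(corr, k):
    """Return (rows, cols, scores) of the k largest entries of a correlation matrix."""
    flat = np.argsort(corr, axis=None)[::-1][:k]
    rows, cols = np.unravel_index(flat, corr.shape)
    return rows, cols, corr[rows, cols]

def weighted_kabsch(src, dst, w):
    """Weighted least-squares rigid transform (R, t) such that dst ≈ R @ src + t."""
    w = w / w.sum()
    src_c = (w[:, None] * src).sum(axis=0)           # weighted centroids
    dst_c = (w[:, None] * dst).sum(axis=0)
    H = (src - src_c).T @ (w[:, None] * (dst - dst_c))  # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))           # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

# Synthetic check (assumed data): a known rotation about z plus a translation.
rng = np.random.default_rng(0)
src_pts = rng.normal(size=(50, 3))
angle = np.deg2rad(30.0)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0,            0.0,           1.0]])
t_true = np.array([0.5, -0.2, 1.0])
dst_pts = src_pts @ R_true.T + t_true

# Stand-in correlation matrix: true matches (the diagonal) score highest.
corr = 0.1 * rng.random((50, 50)) + np.eye(50)
rows, cols, scores = top_k_correspondences(corr, k=20)
R_est, t_est = weighted_kabsch(src_pts[rows], dst_pts[cols], scores)
```

With noise-free correspondences the estimate reproduces `R_true` and `t_true` up to floating-point error; with real cross-source data the result would only serve as the coarse prior that later stages refine.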
At the object level, a density-aware adaptive Euclidean clustering algorithm is introduced to segment each point cloud into instances with explicit physical meaning. An instance correspondence mechanism is then constructed by propagating superpoint matches to instance pairs, computing the matching frequency, and fusing it with centroid-based positional similarity to form an instance similarity matrix. The optimal instance associations are obtained by solving a bipartite matching problem. To further suppress ambiguities caused by multiple similar objects and improve robustness under imperfect clustering (e.g., over-segmentation or under-segmentation in contact scenarios), a spatial layout consistency verification strategy is proposed. The strategy evaluates triangle-based configurations of instances and filters out instance correspondences that violate global spatial relations, preventing incorrect matches from propagating to later optimization. At the point cloud level, for each matched instance pair, point-to-plane residual minimization is performed to obtain a locally refined transformation with improved convergence under cross-source density gaps. The set of local transformations is then integrated through a global least-squares formulation and solved efficiently to yield a single final transformation, achieving fine registration while preserving global consistency across instances. Experiments on the 3DCSR dataset (including LiDAR–Kinect and Kinect–SfM modality pairs) demonstrate that AIS-HCSR achieves a recall of 81.19%, outperforming the previous state-of-the-art FF-LOGO by 5.45 percentage points, with translation and rotation errors of 0.08 m and 2.42°, respectively. The average end-to-end runtime is 1.33 s per point cloud pair. Ablation studies further verify that scene-level registration and object-level matching with local-to-global optimization are complementary and jointly contribute to the performance gain.
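The instance correspondence step described above can be sketched as follows. This is a minimal illustration, not the paper's formulation: the normalization of the matching frequency, the Gaussian form of the centroid similarity, the weighted-sum fusion, and the parameters `alpha`, `sigma`, and `min_sim` are all assumptions, and the function names are hypothetical. The bipartite matching uses SciPy's `linear_sum_assignment`.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def instance_similarity(sp_matches, src_labels, dst_labels,
                        src_centroids, dst_centroids, alpha=0.5, sigma=1.0):
    """Fuse superpoint-match frequency with centroid proximity (assumed forms).

    sp_matches    : (M, 2) int array of matched superpoint indices (src, dst)
    src_labels    : instance id of each source superpoint
    dst_labels    : instance id of each target superpoint
    src_centroids : (n_src, 3) instance centroids, already moved into the
                    target frame by the initial scene-level transformation
    dst_centroids : (n_dst, 3) instance centroids in the target frame
    """
    votes = np.zeros((len(src_centroids), len(dst_centroids)))
    # Each superpoint match votes for the instance pair that contains it.
    np.add.at(votes, (src_labels[sp_matches[:, 0]], dst_labels[sp_matches[:, 1]]), 1.0)
    freq = votes / max(votes.max(), 1.0)              # scale to [0, 1]
    dists = np.linalg.norm(src_centroids[:, None, :] - dst_centroids[None, :, :], axis=2)
    pos = np.exp(-dists ** 2 / (2.0 * sigma ** 2))    # Gaussian proximity score
    return alpha * freq + (1.0 - alpha) * pos

def match_instances(sim, min_sim=0.0):
    """Solve the bipartite matching that maximizes total similarity."""
    rows, cols = linear_sum_assignment(sim, maximize=True)
    return [(int(r), int(c)) for r, c in zip(rows, cols) if sim[r, c] > min_sim]

# Toy scene (assumed data): three instances whose order differs between clouds.
src_centroids = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 3.0, 0.0]])
dst_centroids = np.array([[2.05, 0.0, 0.0], [0.0, 2.95, 0.0], [0.0, 0.05, 0.0]])
src_labels = np.array([0, 0, 1, 1, 2, 2])
dst_labels = np.array([2, 2, 0, 0, 1, 1])
# The last superpoint match is deliberately wrong (instance 2 -> instance 2).
sp_matches = np.array([[0, 0], [1, 1], [2, 2], [3, 3], [4, 4], [5, 0]])

sim = instance_similarity(sp_matches, src_labels, dst_labels,
                          src_centroids, dst_centroids)
pairs = match_instances(sim)
```

In this toy case the spurious superpoint match is outvoted because the centroid term penalizes the distant instance pair; the spatial layout verification described in the text would act as a further filter on `pairs`.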
Overall, AIS-HCSR improves registration robustness in low-overlap and large density-difference settings by explicitly combining scene-scale structural priors with instance-scale geometric constraints, providing an effective solution for cross-source point cloud registration in complex multi-object environments.