Robotic grasping has extensive applications in fields such as logistics sorting, automated assembly, and medical surgery, and grasp detection is a key step in robotic grasping. In recent years, as the cost of 3D sensors has fallen, depth cameras have been increasingly used for grasp detection, and pose estimation-based methods have been adopted for robotic grasping. However, most publicly available RGB-D pose estimation datasets require expensive 3D scanning devices to obtain the 3D models of the target objects. Moreover, their annotation relies on manual operation, which is time-consuming and labor-intensive and hinders the creation of large-scale datasets. To address this issue, this paper implements an automatic dataset acquisition and annotation system for pose estimation that does not require a 3D scanning device: it only needs to capture and analyze RGB-D image sequences from a depth camera to reconstruct the 3D model of the target object and to automatically annotate the pose information and the 2D segmentation mask. In the experiments, a dataset containing 84 objects and 8,400 RGB-D images is created by the system. The automatically annotated data and manually annotated data are compared, revealing a segmentation mask overlap rate of 98%. Additionally, the automatically annotated pose information can be used to align the model point cloud with the full scene point cloud, which demonstrates the accuracy and reliability of the proposed system.
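The reported 98% overlap between automatic and manual segmentation masks can be measured as an intersection-over-union (IoU) score. A minimal sketch of such a comparison, assuming both masks are boolean NumPy arrays of the same shape (the function name `mask_overlap` is hypothetical, not from the paper):

```python
import numpy as np

def mask_overlap(auto_mask: np.ndarray, manual_mask: np.ndarray) -> float:
    """Intersection-over-union of two boolean segmentation masks."""
    inter = np.logical_and(auto_mask, manual_mask).sum()
    union = np.logical_or(auto_mask, manual_mask).sum()
    # Two empty masks are defined as a perfect match.
    return float(inter) / float(union) if union else 1.0

# Toy example: two masks that disagree on a single pixel.
a = np.zeros((4, 4), dtype=bool)
b = np.zeros((4, 4), dtype=bool)
a[1:3, 1:3] = True
b[1:3, 1:3] = True
b[0, 0] = True  # one pixel of disagreement
print(mask_overlap(a, b))  # 4 / 5 = 0.8
```

Averaging this score over all annotated images gives a single overlap rate for the dataset.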
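Aligning the model point cloud with the scene point cloud amounts to applying the annotated 6-DoF pose, typically stored as a 4x4 homogeneous transform, to the model points. A minimal sketch under that assumption (the function name `transform_points` is illustrative, not from the paper):

```python
import numpy as np

def transform_points(points: np.ndarray, pose: np.ndarray) -> np.ndarray:
    """Apply a 4x4 rigid-body pose (rotation + translation) to Nx3 points."""
    # Append a homogeneous coordinate so rotation and translation apply in one product.
    homog = np.hstack([points, np.ones((points.shape[0], 1))])
    return (pose @ homog.T).T[:, :3]

# Example pose: 90-degree rotation about z, then translation of 1 along x.
pose = np.array([
    [0.0, -1.0, 0.0, 1.0],
    [1.0,  0.0, 0.0, 0.0],
    [0.0,  0.0, 1.0, 0.0],
    [0.0,  0.0, 0.0, 1.0],
])
model = np.array([[1.0, 0.0, 0.0]])
print(transform_points(model, pose))  # [[1. 1. 0.]]
```

If the annotated pose is accurate, the transformed model points should lie on the corresponding object surface in the scene point cloud, which is the alignment check the experiments describe.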