-
摘要: 基于高性能的YOLOv3目标检测算法,提出一种分阶段高效火车号识别算法。整个识别过程分为两个阶段:第一阶段在低分辨率全局图像中检测出火车号区域位置;第二阶段在局部高分辨率图像中检测出组成火车号的字符,根据字符的空间位置关系搜索得到12位火车号,并利用每个字符的识别置信度及火车号编码规则进行校验得到最终火车号。另外,本文提出一种结合批一化因子和滤波器相关度的剪枝算法,通过对两个阶段检测模型的剪枝,在保证识别准确率不降(实验中略有提升)的条件下降低了存储空间占用率和计算复杂度。在现场采集的1072幅火车号图像上的实验结果表明,本文提出的火车号识别算法达到了96.92%的整车号识别正确率,平均识别时间仅为191 ms。Abstract: The automatic recognition of a wagon number plays an important role in railroad transportation systems. However, the wagon number character only occupies a very small area of the entire wagon image, and it is often accompanied by uneven illumination, a complex background, image contamination, and character stroke breakage, which makes the high-precision automatic recognition difficult. In recent years, object detection algorithm based on deep learning has made great progress, and it provides a solid technical basis for us to improve the performance of the train number recognition algorithm. This paper proposes a two-phase efficient wagon number recognition algorithm based on the high-performance YOLOv3 object detection algorithm. The entire recognition process is divided into two phases. In the first phase, the region of the wagon number in an image is detected from a low-resolution global image; in the second stage, the characters are detected in a high-resolution local image, formed into the wagon number according to their spatial position, and the final wagon number is obtained after verification based on the recognition confidence of each character and international wagon number coding rules. In addition, we proposed a new deep learning network-pruning algorithm based on the batch normalize scale factor and filter correlation. The importance of every filter was computed by considering the correlation between filter weights and the scale factor generated via batch normalization. By pruning and retraining the region detection model and character detection model, the storage space occupation and computational complexity were reduced without sacrificing recognition accuracy (which is even slightly improved in our experiment). Finally, we tested the proposed two-phase wagon number recognition algorithm on 1072 images from practical engineering application scenarios, and the results show that the proposed algorithm achieves 96.9% of the overall correct ratio (here, “correct” means all 12 characters are detected and recognized correctly), and the average recognition time is only 191 ms.
-
Key words:
- pattern recognition /
- wagon number recognition /
- deep learning /
- neural network /
- object detection /
- model pruning
-
表 1 火车号区域检测和字符检测的实验结果
Table 1. Results of wagon number region detection and character detection
Phase Detectionmodel Pruning mAP/% Model size/MB Runtime memory/MB Mean time/ms Region detection YOLOv3 N 95.31 241 1625 44.06 Y 95.37 93 1155 31.41 Faster-RCNN 95.38 323 1121 103.16 SSD 95.20 192 833 62.01 Character detection YOLOv3 N 90.39 241 1307 29.22 Y 90.69 84 881 18.43 Faster-RCNN 90.68 323 1161 110.09 SSD 90.40 201 851 62.49 表 2 校验和模型剪枝对识别结果的影响
Table 2. Influence of model pruning and verification on the recognition results
Verification & correction Pruning Accuracy rate/% Error rate/% Rejection rate/% Mean time/ms N N 93.56 4.76 1.68 221 Y N 96.36 1.96 1.68 224 Y Y 96.92 2.15 0.93 191 -
[1] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks // 26th Annual Conference on Neural Information Processing Systems. Lake Tahoe, 2012: 1097 [2] LeCun Y, Bengio Y, Hinton G. Deep learning. Nature, 2015, 521: 436 doi: 10.1038/nature14539 [3] 廖健. 基于深度卷积神经网络的货车车号识别研究. 交通运输工程与信息学报, 2016, 14(4):64 doi: 10.3969/j.issn.1672-4747.2016.04.010Liao J. Research on recognition of railway wagon numbers based on deep convolutional neural networks. J Transp Eng Inf, 2016, 14(4): 64 doi: 10.3969/j.issn.1672-4747.2016.04.010 [4] Li H, Wang P, You M Y, et al. Reading car license plates using deep neural networks. Image Vision Comput, 2018, 72: 14 doi: 10.1016/j.imavis.2018.02.002 [5] Li H, Wang P, Shen C H. Toward end-to-end car license plate detection and recognition with deep neural networks. IEEE Trans Intell Transp Syst, 2019, 20(3): 1126 doi: 10.1109/TITS.2018.2847291 [6] Montazzolli S, Jung C. Real-time Brazilian license plate detection and recognition using deep convolutional neural networks // 2017 30th SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI). Niterói, 2017: 52 [7] Laroca R, Severo E, Zanlorensi L A, et al. A robust real-time automatic license plate recognition based on the YOLO detector // 2018 International Joint Conference on Neural Networks (IJCNN). Rio de Janeiro, 2018: 1 [8] 张强、李嘉锋、卓力. 车辆识别技术综述. 北京工业大学学报, 2018, 44(3):382Zhang Q, Li J F, Zhuo L. Review of Vehicle Recognition Technology. J Beijing Univ Technol, 2018, 44(3): 382 [9] Zhao Z Q, Zheng P, Xu S T, et al. Object detection with deep learning: a review. IEEE Trans Neural Networks Learning Syst, 2019, 30(11): 3212 doi: 10.1109/TNNLS.2018.2876865 [10] Ren S Q, He K M, Girshick R, et al. Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell, 2017, 39(6): 1137 doi: 10.1109/TPAMI.2016.2577031 [11] Lin T Y, Dollár P, Girshick R, et al. Feature pyramid networks for object detection // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, 2017: 2117 [12] He K M, Gkioxari G, Dollár P, et al. Mask R-CNN. [J/OL]. arXiv preprint (2018-01-24) [2019-12-15]. https://arxiv.org/abs/1703.06870 [13] Lu X, Li B Y, Yue Y X, et al. Grid R-CNN plus: faster and better [J/OL]. arXiv preprint (2019-06-13) [2019-12-15]. https://arxiv.org/abs/1906.05688v1 [14] Cai Z W, Vasconcelos N. Cascade R-CNN: high quality object detection and instance segmentation[J/OL]. arXiv preprint (2019-06-24) [2019-11-12]. https://arxiv.org/abs/1906.09756v1 [15] Liu W, Anguelov D, Erhan D, et al. SSD: Single shot multibox detector // European Conference on Computer Vision. Amsterdam, 2016: 21 [16] Redmon J, Divvala S, Girshick R, et al. You only look once: Unified, real-time object detection // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, 2016: 779 [17] Rezatofighi H, Tsoi N, Gwak J Y, et al. Generalized intersection over union: a metric and a loss for bounding box regression // 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, 2019: 2961 [18] Zheng Z H, Wang P, Liu W, et al. Distance-IoU Loss: faster and better learning for bounding box regression [J/OL]. arXiv preprint (2019-11-19) [2019-12-15]. https://arxiv.org/abs/1911.08287 [19] Lin T Y, Goyal P, Girshick R, et al. Focal loss for dense object detection. IEEE Trans Pattern Anal Mach Intell, 2020, 42(2): 318 doi: 10.1109/TPAMI.2018.2858826 [20] Shen Z Q, Liu Z, Li J G, et al. DSOD: Learning deeply supervised object detectors from scratch // Proceedings of the IEEE International Conference on Computer Vision. Venice, 2017: 1919 [21] Law H, Deng J. CornerNet: detecting objects as paired keypoints [J/OL]. arXiv preprint (2019-03-18) [2019-12-15]. https://arxiv.org/abs/1808.01244v2 [22] Duan K W, Bai S, Xie L X, et al. CenterNet: Keypoint triplets for object detection [J/OL]. arXiv preprint (2019-04-19) [2019-12-15]. https://arxiv.org/abs/1904.08189v3 [23] Rashwan A, Agarwal R, Kalra A, et al. MatrixNets: a new scale and aspect ratio aware architecture for object detection[J/OL]. arXiv preprint (2020-01-09) [2020-01-15]. https://arxiv.org/abs/2001.03194v1 [24] Redmon J, Farhadi A. YOLOv3: an incremental improvement [J/OL]. arXiv preprint (2018-04-08) [2019-11-12]. https://arxiv.org/abs/1804.02767 [25] Liu Z, Li J G, Shen Z Q, et al. Learning efficient convolutional networks through network slimming // 2017 IEEE International Conference on Computer Vision. Venice, 2017: 2755 [26] Chen K, Wang J Q, Pang J M, et al. MMDetection: Open MMLab detection toolbox and benchmark [J/OL]. arXiv preprint (2019-06-17) [2019-11-12]. https://arxiv.org/abs/1906.07155v1 -