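苏越, 唐朝晖, 谢永芳, 高小亮 (corresponding author), 张虎, 马炜烨, 汤海玚. Sparse Attention Convolution-ViT Model for Working Condition Recognition in Zinc Flotation[J]. Chinese Journal of Engineering. DOI: 10.13374/j.issn2095-9389.2024.05.13.004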

Sparse Attention Convolution-ViT Model for Working Condition Recognition in Zinc Flotation

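  • Abstract: Accurately recognizing zinc flotation working conditions and using the result to guide flotation operation can improve flotation efficiency and optimize the beneficiation process. At present, flotation plants rely mainly on operators observing the froth by eye and judging the working condition from experience. This approach is easily affected by subjective factors, which makes it hard to evaluate the working condition objectively and accurately. To address this problem, this paper studies the close relationship between the visual features of zinc flotation froth and the flotation working condition and proposes a zinc flotation working condition recognition method based on a sparse attention convolution-ViT model. First, the proposed model combines the structures and strengths of convolutional neural networks (CNN) and the vision transformer (ViT), perceiving both the local spatial information and the global information of the froth so that the froth image is represented completely. Second, the model introduces a sparse multi-head attention mechanism in which each attention head processes features at a different degree of sparsity and so perceives global information at different scales. It also introduces an attention gated unit to optimize feature transfer, and finally performs zinc flotation working condition recognition. Experimental results show that the proposed method reaches an accuracy of 88.62% on a zinc flotation froth image dataset. It addresses the problem that traditional CNN and ViT models cannot make full use of the global information in froth images and cannot adaptively capture their important features, and it provides strong support for flotation process optimization.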
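The abstract does not say how each attention head is made sparse. The sketch below is one possible reading, not the authors' implementation: it assumes top-k sparsification, where every query in head h attends only to its `keep_per_head[h]` highest-scoring keys. The function name `sparse_multihead_attention` and the parameter `keep_per_head` are hypothetical names chosen for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax; entries equal to -inf receive weight 0.
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


def sparse_multihead_attention(q, k, v, keep_per_head):
    """Top-k sparse attention with a different k for each head (assumed scheme).

    q, k, v: float arrays of shape (heads, tokens, dim).
    keep_per_head[h]: number of keys each query in head h may attend to.
    Returns (output, weights) with shapes (heads, tokens, dim) and
    (heads, tokens, tokens).
    """
    heads, tokens, dim = q.shape
    out = np.empty_like(v)
    weights = np.empty((heads, tokens, tokens))
    for h in range(heads):
        scores = q[h] @ k[h].T / np.sqrt(dim)  # (tokens, tokens)
        keep = max(1, min(int(keep_per_head[h]), tokens))
        # The keep-th largest score in each row is the cut-off for that row.
        kth = tokens - keep
        cutoff = np.partition(scores, kth, axis=-1)[:, kth][:, None]
        masked = np.where(scores >= cutoff, scores, -np.inf)
        weights[h] = softmax(masked, axis=-1)
        out[h] = weights[h] @ v[h]
    return out, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q, k, v = (rng.standard_normal((4, 16, 8)) for _ in range(3))
    # Hypothetical sparsity levels: head 0 is dense, head 3 is the sparsest.
    out, w = sparse_multihead_attention(q, k, v, keep_per_head=[16, 8, 4, 2])
    print(out.shape, [(w[h] > 0).sum(axis=-1).max() for h in range(4)])
```

Under this assumption, heads with a small `keep` value focus on a few strongly related patches, while heads with a large `keep` value aggregate context more broadly, which is one way to obtain global information at several scales.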

     

    Abstract: Accurate recognition of zinc flotation working conditions can optimize the zinc flotation process and improve its efficiency. Currently, the recognition of zinc flotation working conditions relies on manual visual observation of the froth guided by experience, which is easily influenced by subjective factors and makes it difficult to recognize working conditions objectively and accurately. To solve this problem, machine vision is used to investigate the relationship between the visual features of zinc flotation froth and the working conditions, and a sparse attention convolution-ViT model is proposed to recognize the working condition of zinc flotation. Firstly, because convolutional neural networks (CNN) are good at extracting local features and the vision transformer (ViT) is good at extracting global features, the sparse attention convolution-ViT model combines the structures of CNN and ViT so that it can extract both local features, such as froth size, texture, and color, and global features, such as the distribution of froth sizes. Secondly, a sparse multi-head attention mechanism is introduced into ViT to fully process the global features of froth images. In this mechanism, each attention head processes the global features of the froth image at a different level of sparsity, which yields more generalized froth features. Finally, an attention gated unit is proposed to process the features further. It adaptively adjusts the weight assigned to each feature in the image, which enhances the interpretability of the model, optimizes feature transfer, and captures essential features effectively. Experimental results show that the sparse attention convolution-ViT model can recognize zinc flotation conditions accurately: its recognition accuracy on the zinc flotation froth image dataset reached 88.62%, which is higher than that of traditional CNN and ViT models. Ablation experiments were conducted to verify the necessity of the sparse multi-head attention mechanism and the attention gated unit, which improved the recognition accuracy by 0.92% and 2.63%, respectively. Grad-CAM (Gradient-weighted Class Activation Mapping) was used to visualize the weights of different features and to verify that the proposed model characterizes the froth image with both local and global features and accurately recognizes zinc flotation conditions. These results show that the proposed sparse attention convolution-ViT model can accurately recognize zinc flotation working conditions and has good application value.
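The abstract describes the attention gated unit only by its effect: it adaptively adjusts the weight assigned to each feature. As a minimal sketch under that description, and not the authors' design, the code below assumes a sigmoid gate computed from the features themselves and applied element-wise. The names `attention_gate`, `w_gate`, and `b_gate` are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def attention_gate(features, w_gate, b_gate):
    """Element-wise sigmoid gating of token features (assumed form).

    features: (tokens, dim) array of token features.
    w_gate:   (dim, dim) gate projection, learned in a real model.
    b_gate:   (dim,) gate bias.
    Returns (gated_features, gate); every gate value lies in (0, 1).
    """
    gate = sigmoid(features @ w_gate + b_gate)
    return gate * features, gate


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((16, 8))
    w = 0.1 * rng.standard_normal((8, 8))  # stand-in for learned weights
    gated, gate = attention_gate(x, w, np.zeros(8))
    print(gated.shape, float(gate.min()), float(gate.max()))
```

Gate values near 1 pass a feature through almost unchanged and values near 0 suppress it, so the gate map itself can be inspected to see which features the model emphasizes.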

     
