Localization model of traditional Chinese medicine Zang-fu organs based on ALBERT and Bi-GRU

  • Abstract: Zang-fu localization, i.e., identifying the Zang-fu organs in which a lesion is located, is an important stage of Zang-fu syndrome differentiation in traditional Chinese medicine (TCM). The combination of artificial intelligence (AI) and TCM offers new technical support for TCM-assisted diagnosis and treatment, and this paper aims to support Zang-fu localization with AI techniques. A neural network model is built that takes symptom text as input and outputs the corresponding lesion Zang-fu labels, thereby supporting Zang-fu syndrome differentiation in TCM-assisted diagnosis and treatment. The Zang-fu localization problem is modeled as multi-label text classification in natural language processing, and, using TCM medical record data, a localization model based on the pretrained model A Lite BERT (ALBERT) and a bidirectional gated recurrent unit (Bi-GRU) is proposed. Comparison and ablation experiments show that the proposed method is more accurate on Zang-fu localization than multilayer perceptron and decision tree models, and that, compared with a Word2Vec text representation, the ALBERT pretrained representation effectively improves model accuracy. In terms of model parameters, ALBERT greatly reduces the parameter count compared with BERT and thus effectively reduces the model size. The proposed Zang-fu localization model reaches an F1-score of 0.8013 on the test set.
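As a toy illustration of the multi-label framing described above (not code from the paper), each record's symptom text can be paired with a multi-hot vector over the Zang-fu label set; the label inventory below is hypothetical and the paper's actual label set may differ.

```python
# Hypothetical Zang-fu label set; the paper's actual inventory may differ.
ZANGFU_LABELS = ["liver", "heart", "spleen", "lung", "kidney",
                 "gallbladder", "stomach", "large intestine",
                 "small intestine", "bladder"]

def to_multi_hot(organs: set) -> list:
    """Encode the set of lesion organs as a 0/1 vector aligned with ZANGFU_LABELS."""
    return [1 if label in organs else 0 for label in ZANGFU_LABELS]

# A record whose lesion involves both the spleen and the stomach:
print(to_multi_hot({"spleen", "stomach"}))  # -> [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
```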

     
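The ALBERT + Bi-GRU architecture named in the abstract can be sketched as follows. This is a minimal illustration assuming PyTorch and the HuggingFace transformers library; the checkpoint name, hidden size, and final-state pooling strategy are assumptions for the sketch, not details taken from the paper.

```python
import torch
import torch.nn as nn
from transformers import AlbertModel

class ZangFuLocalizer(nn.Module):
    """ALBERT text encoder followed by a Bi-GRU and a multi-label output layer."""

    def __init__(self, num_labels: int, albert_name: str, gru_hidden: int = 256):
        super().__init__()
        # Pretrained ALBERT produces contextual token representations.
        # `albert_name` should point at a Chinese ALBERT checkpoint (assumed).
        self.albert = AlbertModel.from_pretrained(albert_name)
        # Bidirectional GRU reads the token sequence in both directions.
        self.bigru = nn.GRU(
            input_size=self.albert.config.hidden_size,
            hidden_size=gru_hidden,
            batch_first=True,
            bidirectional=True,
        )
        # One logit per Zang-fu label; a sigmoid (not softmax) is applied at
        # inference because several organs may be involved at once.
        self.classifier = nn.Linear(2 * gru_hidden, num_labels)

    def forward(self, input_ids, attention_mask):
        token_states = self.albert(
            input_ids=input_ids, attention_mask=attention_mask
        ).last_hidden_state                    # (batch, seq_len, hidden)
        _, h_n = self.bigru(token_states)      # h_n: (2, batch, gru_hidden)
        # Concatenate the final forward and backward hidden states.
        summary = torch.cat([h_n[0], h_n[1]], dim=-1)
        return self.classifier(summary)        # raw logits, one per label

# Multi-label training pairs the logits with binary cross-entropy:
#   loss = nn.BCEWithLogitsLoss()(model(ids, mask), multi_hot_targets)
```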
