Abstract:
Sleep occupies approximately one-third of a person's lifetime; therefore, its quality profoundly affects learning, physical recovery, and metabolism. Clinically relevant human physiological data are collected using polysomnography, which is analyzed by sleep technologists to determine sleep stages. However, this manual approach imposes a heavy workload because of the large volume of data to analyze and the heterogeneous data formats involved. Moreover, manually analyzed results are influenced by the clinician's experience, which may lead to inconsistent diagnoses. Recently, advances in artificial intelligence, computer science, and related technologies, together with their interdisciplinary applications, have yielded a series of notable achievements in intelligent diagnosis, laying the foundation for medical artificial intelligence in sleep medicine. In sleep research, automatic analysis and recognition of sleep signals assists doctors in diagnosis and reduces their workload, and therefore has important clinical significance and application value. Although deep neural networks trained with supervised learning are becoming popular for automatic sleep stage classification, large-scale labeled datasets remain difficult to acquire. Learning from raw polysomnography signals and their derived time-frequency image representations is a promising solution. However, extracting features from only a single view leads to inadequate feature extraction and thus limited accuracy. Hence, this paper aims to learn multi-view representations of physiological signals with semi-supervised learning. Specifically, we make the following contributions: (1) We propose a multi-view hybrid neural network model comprising a multichannel-view time-frequency feature extraction mechanism, an attention mechanism, and a feature fusion module. The multichannel-view time-frequency mechanism extracts time-domain and frequency-domain signal features to achieve multi-view feature extraction; the attention mechanism enhances salient features and achieves interclass feature extraction in the frequency domain; and the feature fusion module fuses and classifies these features. (2) A semi-supervised learning strategy is used to learn from unlabeled electroencephalogram (EEG) data, addressing the underutilization of sleep data caused by insufficient labeling of EEG signals in clinical practice. (3) Extensive experiments on sleep stage classification demonstrate state-of-the-art performance compared with supervised learning and a semi-supervised baseline. Experimental results on three public databases (Sleep-EDF, DOD-H, and DOD-O) and one private database show that our semi-supervised method achieves accuracies of 81.6%, 81.5%, 79.2%, and 75.4%, respectively. These results show that our proposed model is comparable to a fully supervised sleep staging model while substantially reducing the technician's data-labeling workload.
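To make the multi-view architecture outlined in contribution (1) concrete, the sketch below shows one possible realization: a time-domain branch over raw EEG epochs, a frequency-domain branch with a simple attention re-weighting over time-frequency images, and a fusion head over the concatenated features. This is not the authors' implementation; the PyTorch framing, all layer sizes, the 5-class output, and the assumed input shapes (a 30-s epoch at 100 Hz and a 64x64 spectrogram) are illustrative assumptions.

```python
# Minimal sketch (not the authors' code) of a multi-view hybrid network:
# time-domain branch + frequency-domain branch with attention + fusion head.
# All shapes and layer configurations are illustrative assumptions.
import torch
import torch.nn as nn


class MultiViewSleepNet(nn.Module):
    def __init__(self, n_classes: int = 5):
        super().__init__()
        # Time-domain view: 1-D convolutions over a raw 30-s EEG epoch
        # (assumed shape: [batch, 1, 3000] at 100 Hz).
        self.time_branch = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=50, stride=6), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=8, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),  # -> [batch, 64]
        )
        # Frequency-domain view: 2-D convolutions over a time-frequency image
        # (assumed shape: [batch, 1, 64, 64], e.g. a spectrogram).
        self.freq_branch = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),  # -> [batch, 64]
        )
        # Simple gating attention over the frequency features; a stand-in
        # for the paper's attention module, which is not specified here.
        self.freq_attention = nn.Sequential(nn.Linear(64, 64), nn.Sigmoid())
        # Fusion module: concatenate both views and classify.
        self.classifier = nn.Sequential(
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, n_classes),
        )

    def forward(self, raw_epoch, tf_image):
        t_feat = self.time_branch(raw_epoch)
        f_feat = self.freq_branch(tf_image)
        f_feat = f_feat * self.freq_attention(f_feat)  # emphasize salient features
        fused = torch.cat([t_feat, f_feat], dim=1)     # multi-view fusion
        return self.classifier(fused)


# Example forward pass with dummy tensors matching the assumed shapes.
model = MultiViewSleepNet()
logits = model(torch.randn(4, 1, 3000), torch.randn(4, 1, 64, 64))
print(logits.shape)  # torch.Size([4, 5])
```

In a semi-supervised setting such as the one described in contribution (2), a model of this form would typically be trained on the labeled subset with cross-entropy while a consistency or pseudo-labeling objective exploits the unlabeled EEG epochs; the specific strategy used by the authors is not detailed in the abstract.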