U-Time: A Fully Convolutional Network for Time Series Segmentation Applied to Sleep Staging Summary

From statwiki
Revision as of 22:36, 25 November 2021 by A2mossma (talk | contribs)

Introduction

During sleep, the brain passes through distinct sleep stages, each characterised by particular patterns of brain and body activity. The stages can be determined from a so-called polysomnography (PSG) study, which records brain activity by EEG together with eye movement and facial muscle activity. Mapping the transitions between sleep stages is called sleep staging, and it provides the basis for diagnosing sleep disorders. Traditionally, sleep staging is done manually by splitting the PSG measurements into 30 s segments, each containing multiple channels of data, and classifying the segments individually. Because this requires considerable expertise and time, automation is of interest. Fast and reliable automated sleep staging could support diagnosis and help find novel biomarkers for disorders.
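
The segmentation step of the manual workflow can be sketched in a few lines of Python. This is a minimal illustration only: the `split_into_epochs` helper, the list-of-lists data layout, and the 100 Hz sampling rate are assumptions made for the example and are not specified by the paper summary.

```python
from typing import List


def split_into_epochs(
    channels: List[List[float]],
    sample_rate_hz: int,
    epoch_seconds: int = 30,
) -> List[List[List[float]]]:
    """Cut a multi-channel recording into fixed-length epochs.

    Returns a list of epochs; each epoch holds one slice per channel.
    Trailing samples that do not fill a whole epoch are dropped.
    """
    if not channels:
        return []
    samples_per_epoch = sample_rate_hz * epoch_seconds
    # Use the shortest channel so every epoch has data for all channels.
    n_samples = min(len(ch) for ch in channels)
    n_epochs = n_samples // samples_per_epoch
    return [
        [ch[i * samples_per_epoch:(i + 1) * samples_per_epoch] for ch in channels]
        for i in range(n_epochs)
    ]


# Hypothetical recording: 2 channels, 95 s of data sampled at 100 Hz.
recording = [[0.0] * 9500, [1.0] * 9500]
epochs = split_into_epochs(recording, sample_rate_hz=100)
print(len(epochs), len(epochs[0]), len(epochs[0][0]))  # prints: 3 2 3000
```

Each resulting epoch would then be handed to a human scorer or a classifier; the last 5 s of the hypothetical recording are discarded because they do not fill a full 30 s segment.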

State-of-the-art sleep staging classifiers employ convolutional and recurrent layers. Recurrent neural networks, however, can be difficult to tune and optimize, and they often need their hyperparameters re-tuned for each new dataset. As a result, such models are often trained for a single dataset and can be difficult for non-experts to use in a more general setting.

This paper introduces U-Time, a feed-forward convolutional network for sleep staging that treats time series segmentation much as the popular U-Net architecture treats image segmentation. It needs no hyperparameter or architectural tuning to be applied to different datasets, and it can classify sleep stages at any temporal resolution.
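
One way to read "any temporal resolution" is that class scores produced for every time point can be pooled over segments of whatever length is wanted. The sketch below assumes such per-sample scores already exist; the `segment_labels` helper, the score values, and the choice of mean pooling are illustrative assumptions, not a description of the authors' exact procedure.

```python
from typing import List


def segment_labels(scores: List[List[float]], segment_len: int) -> List[int]:
    """Average per-sample class scores over each segment, then take the argmax.

    `scores[t][c]` is the (hypothetical) score of class c at time point t.
    Trailing samples that do not fill a whole segment are ignored.
    """
    if segment_len <= 0:
        raise ValueError("segment_len must be positive")
    labels = []
    for s in range(len(scores) // segment_len):
        chunk = scores[s * segment_len:(s + 1) * segment_len]
        n_classes = len(chunk[0])
        mean = [sum(row[c] for row in chunk) / segment_len for c in range(n_classes)]
        labels.append(max(range(n_classes), key=mean.__getitem__))
    return labels


# Hypothetical scores for 6 time points and 2 classes.
scores = [[0.9, 0.1], [0.8, 0.2], [0.6, 0.4], [0.3, 0.7], [0.2, 0.8], [0.4, 0.6]]
print(segment_labels(scores, segment_len=3))  # coarse: one label per 3 samples
print(segment_labels(scores, segment_len=1))  # fine: one label per sample
```

The same per-sample scores yield either coarse or fine labels simply by changing `segment_len`, which is the property the paragraph above describes.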


Previous Work

Recently, there has been much interest in using machine learning techniques to analyze physiological time series (Faust et al., 2018). Multiple neural network-based systems have been developed to classify sleep-wake stages in humans (Principe and Tome, 1989), babies (Pfurtscheller et al., 1992), and even cats (Mamelak et al., 1991). However, recurrent neural networks are difficult to tune and optimize in practice, and many have been replaced with feed-forward systems without losing accuracy (Bai et al., 2018; Chen and Wu, 2017; Vaswani et al., 2017). U-Time is a feed-forward convolutional neural network that does not require hyperparameter or architectural tuning; in particular, it uses dilated convolutions to aggregate multi-scale contextual information without losing resolution or requiring the input to be rescaled.
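
To make the dilated-convolution idea concrete, the sketch below applies a zero-padded 1D dilated convolution in plain Python. The `dilated_conv1d` and `receptive_field` helpers, the kernel weights, and the dilation rates are illustrative assumptions and are not taken from the paper. It shows that the output keeps the input length while the span of input seen by each output sample grows with the dilation rate.

```python
from typing import List


def receptive_field(kernel_size: int, dilation: int) -> int:
    """Number of input samples spanned by one dilated kernel."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(x: List[float], kernel: List[float], dilation: int = 1) -> List[float]:
    """Zero-padded ('same') 1D dilated convolution.

    Computed as a cross-correlation, as is conventional in deep learning
    libraries. Assumes an odd kernel length so the kernel can be centred.
    """
    if len(kernel) % 2 == 0:
        raise ValueError("kernel length must be odd")
    half = len(kernel) // 2
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t + (k - half) * dilation  # taps are spaced `dilation` samples apart
            if 0 <= idx < len(x):            # positions outside the signal count as zeros
                acc += w * x[idx]
        out.append(acc)
    return out


signal = [float(i) for i in range(10)]
kernel = [1.0, 1.0, 1.0]  # hypothetical weights
for d in (1, 2, 4):
    y = dilated_conv1d(signal, kernel, dilation=d)
    # dilation, receptive field, output length (always equal to the input length)
    print(d, receptive_field(len(kernel), d), len(y))
```

With three taps, a dilation of 4 already covers nine input samples without adding weights or downsampling, which is why stacking such layers can gather context at several scales.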

Methods

Results

U-Time was applied to seven different PSG datasets with a fixed architecture and fixed hyperparameters, so there was no dataset-specific tuning. Furthermore, U-Time received only one EEG channel as input.

The performance of U-Time was compared with that of existing models trained for a specific dataset, where such models were available. As a baseline, the authors use an improved version of DeepSleepNet, which employs convolutional and recurrent layers and was designed to be applicable to different datasets. In the table summarising the results, this model is denoted CNN-LSTM (LSTM stands for long short-term memory, an example of a recurrent architecture).

Across all datasets, U-Time achieved performance similar to or higher than both the baseline and any known state-of-the-art automated method designed specifically for that dataset.


Conclusion

The same U-Time architecture and hyperparameter settings were used across all seven PSG datasets. This avoids overfitting the hyperparameters to any one dataset, and the robustness of U-Time's fully convolutional, feed-forward architecture allows it to be readily used by non-experts across health-related disciplines. U-Time also has desirable properties such as computational efficiency, an input window T that can be adjusted dynamically (so that, for example, an entire PSG record can be scored in a single pass), and high temporal resolution in the sleep stage output. While the authors chose to consider only a single EEG channel, it would be of interest to give U-Time multiple input channels for sleep staging, including EOG (eye movement), which often provides important information for distinguishing between wake and REM sleep. Overall, U-Time is a robust and efficient approach to time series segmentation that health and computational researchers can apply with ease.