With >50 million people affected globally and roughly 5 million new cases per year, epilepsy is one of the most common neurological disorders affecting the human brain [1]. Despite the recently reported decrease in epilepsy-related death and disability-adjusted life-years [2], it still represents one of the most burdensome brain disorders worldwide, affecting people of all ages and often being associated with social stigma, psychiatric comorbidities, and high economic costs [3].
According to the latest International League Against Epilepsy (ILAE) definition, the diagnosis of epilepsy requires: 1) at least two unprovoked seizures occurring >24 h apart, or 2) one unprovoked or reflex seizure and a probability of further seizures over the next 10 years similar to the general recurrence risk after two unprovoked seizures (at least 60 %), or 3) the diagnosis of an epilepsy syndrome [4,5]. From a clinical standpoint, an epileptic seizure can be defined as a transient occurrence of signs and/or symptoms due to abnormal excessive or synchronous neuronal activity in the brain [4]. The clinical features of seizures vary widely, ranging from subtle, transitory sensory (e.g., olfactory) misperceptions to tonic-clonic events lasting up to a few minutes with loss of consciousness; loss of consciousness may also occur in the absence of tonic-clonic manifestations [6]. Because an epileptic seizure is intrinsically paroxysmal and short-lived, its diagnosis is often supported by the identification of abnormal oscillatory activity patterns on electroencephalographic (EEG) recordings.
EEG is a neurophysiological tool that, through electrodes positioned on the scalp or, alternatively, intracranially, along the cortical surface of the brain (i.e., electrocorticography) or in deep brain structures (i.e., stereotactic EEG), records the summation of graded post-synaptic potentials generated in the dendrites of pyramidal cells in the superficial layers of the cerebral cortex [7]. This allows clinicians to detect changes in neuronal activity due to several neurological disorders [8]. Qualitative analysis and interpretation of specific EEG features that are usually present before, during, or right after a seizure (i.e., pre-ictal, ictal, and post-ictal abnormalities, respectively) or between seizures (i.e., inter-ictal abnormalities) play a pivotal role in the diagnosis and classification of different forms of epilepsy [9]. The detection of specific epileptic abnormalities during EEG recordings is regarded as an indispensable diagnostic step for epilepsy in the appropriate clinical setting. Furthermore, EEG examination contributes to the characterization and classification of seizures and epilepsy type (i.e., focal or generalized), thus providing essential therapeutic and prognostic information for the management of the disease in clinical practice. Finally, EEG is routinely used in the follow-up of epileptic patients, for instance, to assess response or adherence to pharmacological therapy and/or for surgical planning in drug-resistant epilepsy [10].
Despite the pivotal supportive role of EEG in the diagnosis of epilepsy, its sensitivity varies between 29 % and 55 % in patients presenting with a suspected first-ever seizure [11]. Hence, roughly half or more of epileptic patients have a normal routine EEG after their first clinical episode. This increases the time from symptom onset to diagnosis and treatment initiation, further exposing epileptic patients to the risk of subsequent seizures. Even though inter-ictal EEG abnormalities are highly specific when detected, artifacts or non-epileptic activity are often misread as epileptic [12]. This contributes considerably to the rate of misdiagnosis among adults presenting with a suspected first-time seizure, which ranges between 20 % and 30 % [13]. To further improve the sensitivity and specificity of EEG recordings for the diagnosis of epilepsy, as well as to reduce the time physicians spend analyzing recordings for diagnostic purposes, novel approaches such as artificial intelligence (AI) methods have recently been applied to EEG data. These techniques, including deep learning, mainly rely on large amounts of prelabeled data, which are often difficult or impossible to obtain. Therefore, unsupervised or semi-supervised methods show promise for solving medical classification problems with little to no label information.
Unsupervised learning is a branch of machine learning where data is provided as input without supervised target outputs (or labels), external rewards, or constraints [14]. The model learns to find patterns within the EEG data provided as input. These patterns may later be used for decision-making. Classes are assigned based on the characteristics extracted from the data. Self-supervised or semi-supervised learning is an implementation of unsupervised learning where the input data itself is used for training and supervision. Either part of the dataset or the full dataset is used as labels or targets from which the applied algorithm extracts features. The output of self-supervised methods depends on the applied algorithm and on the data itself, and can be used together with, or in a similar fashion to, standard classification techniques. The ability of unsupervised algorithms to learn from the data itself without any external information brings the advantage of gaining and adapting knowledge from the data without prior biases drawn from past expert knowledge [15].
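As a minimal illustration of the unsupervised setting described above, the following sketch clusters synthetic single-channel epochs into two groups without using any labels. Everything in it is an assumption made for illustration and is not taken from any of the reviewed studies: the toy signal model (Gaussian noise, optionally with a large 5-cycle rhythm added), the two hand-picked features (log variance and log line length), the helper names `synthetic_epoch`, `features`, and `kmeans`, and the choice of k-means with farthest-first initialization as the clustering algorithm.

```python
import math
import random


def synthetic_epoch(rng, n_samples=256, seizure_like=False):
    """Toy epoch: unit Gaussian noise, plus a large rhythm if seizure_like (assumed model)."""
    amp = 5.0 if seizure_like else 0.0
    return [amp * math.sin(2 * math.pi * 5 * t / n_samples) + rng.gauss(0.0, 1.0)
            for t in range(n_samples)]


def features(epoch):
    """Two illustrative features per epoch: log variance and log line length."""
    mean = sum(epoch) / len(epoch)
    variance = sum((x - mean) ** 2 for x in epoch) / len(epoch)
    line_length = sum(abs(b - a) for a, b in zip(epoch, epoch[1:]))
    return (math.log(variance), math.log(line_length))


def kmeans(points, k=2, n_iter=20):
    """Lloyd's algorithm with deterministic farthest-first initialization."""
    def dist2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    # Start from the first point, then repeatedly add the point farthest
    # from all centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist2(p, c) for c in centroids)))

    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda j: dist2(p, centroids[j])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


rng = random.Random(0)
# 30 background-like epochs followed by 10 seizure-like epochs.
# This ordering is known only to us; the clustering never sees it.
epochs = [synthetic_epoch(rng) for _ in range(30)]
epochs += [synthetic_epoch(rng, seizure_like=True) for _ in range(10)]

points = [features(e) for e in epochs]
labels = kmeans(points, k=2)
print(labels)
```

The cluster indices carry no clinical meaning by themselves. Deciding which cluster, if any, corresponds to epileptic activity still requires expert review or a small labeled subset, which is where semi-supervised approaches come in.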
A wide range of classic supervised machine learning paradigms have previously been proposed for problems in epilepsy, using a variety of preprocessing, feature extraction, and classification techniques [16]. These methods extract specific features from the data using various time and frequency transforms and train algorithms on ground truths provided by epileptologists [17]. Thus, the output of these classic paradigms depends on how well the features represent the data for the specific problem, as well as on issues related to inter-rater reliability and the ground truth provided by the annotators [18]. Unsupervised learning techniques might bypass these issues by focusing on extracting information from the provided input data rather than from human expertise. This could lead to further insights not commonly obtained with supervised learning techniques or with feature extraction methods based on expert knowledge. By combining unsupervised and supervised learning into algorithms with self-supervision or semi-supervision, the potential for new insights is further increased.
This paper aims to provide an overview of the state of the art in unsupervised learning methods applied to EEG data for epilepsy. Previous reviews have already addressed the topic of AI in epilepsy, with some of them specifically looking at: big data for seizure detection and forecasting [19], theoretical and methodological analysis of EEG seizure detection [20], seizure detection for closed-loop therapeutic neurostimulation [21], graph theory in epilepsy [22], seizure prediction using deep learning [23], seizure onset identification from invasive EEG [24], published EEG databases for epilepsy detection [25], performance evaluation [26], and seizure classification on the Temple University Hospital Seizure Corpus [27]. Nevertheless, a systematic investigation of the application and potential clinical advantages of unsupervised learning on EEG data for epilepsy is still lacking. Hence, we systematically reviewed the literature aiming to: (i) identify trends in the use of unsupervised methods on EEG data in epilepsy patients, (ii) identify gaps and points of improvement, and (iii) carve out perspectives for future work.
The paper is organized as follows: the second section describes the methods for the systematic literature review; the third section presents the results of the review with a focus on differences in (i) training datasets, (ii) algorithm architectures, (iii) validation and performance metrics, and (iv) clinical applications of previous studies using unsupervised learning on EEG data in epilepsy; the fourth and fifth sections discuss the results and identify opportunities for further development of this AI method when applied to EEG data.