EEG Stress Level Detector

This project is an EEG Stress Level Detector, a machine learning application that predicts stress levels from EEG signals by analyzing beta wave intensity. Using a Random Forest Classifier and Butterworth filtering, it estimates stress levels and identifies the specific time segments where beta waves are amplified or less amplified.

Introduction

The EEG Stress Level Detector is a machine learning-based application designed to predict stress levels from EEG (electroencephalogram) signals. EEG signals record the brain's electrical activity through electrodes placed on the scalp. Stress detection using EEG is a growing field in neuroscience and mental health, as stress is a critical factor affecting cognitive performance and overall well-being. This project focuses on analyzing beta waves (13–30 Hz), which are associated with active thinking, focus, and stress. By preprocessing EEG data, training a machine learning model, and visualizing results, the application provides an interactive way to detect stress levels.

Objectives

The primary objectives of this project are to:

Predict Stress Levels:

Detect stress levels on a scale of 1 to 10 based on EEG signals.

Analyze Beta Waves:

Focus on beta wave intensity to identify stress levels.

Provide Insights:

Highlight sample index ranges where beta waves are amplified or less amplified.

Interactive GUI:

Allow users to load EEG data, visualize signals, and view stress predictions.

EEG Waves Classification

Gamma: above 30 Hz
Beta: 13–30 Hz
Alpha: 8–13 Hz
Theta: 4–8 Hz
Delta: 1–4 Hz

Description of Machine Learning and Signal Processing Methods

1. Machine Learning Model: Random Forest Classifier

Random Forest is a robust ensemble learning method that combines multiple decision trees to improve accuracy and reduce overfitting.
It works well with high-dimensional data like EEG signals and is resistant to noise.
It provides feature importance, which can help in understanding which EEG channels are most relevant for stress detection.

How It Works

The model is trained on preprocessed EEG data to classify stress levels.
It builds multiple decision trees, each trained on a bootstrap sample (a random sample drawn with replacement) of the training data.
While each tree is grown, only a random subset of the features is considered at each split, which keeps the trees from all learning the same rules.
For prediction, each tree "votes" for a class, and the class with the most votes becomes the final prediction.
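
The voting step described above can be made concrete with scikit-learn. This is a minimal sketch on synthetic data, not the project's training code: the array shapes, the three classes, the number of trees, and the random seed are all assumptions chosen for illustration. Note that scikit-learn's RandomForestClassifier averages the per-tree class probabilities rather than counting hard votes, so the explicit vote count is shown only to illustrate the idea and may occasionally differ from forest.predict.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for EEG data (assumption): 200 samples, 18 features, 3 classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 18))
y = rng.integers(0, 3, size=200)
# Shift feature number y[i] of sample i so the classes are learnable.
X[np.arange(200), y] += 3.0

forest = RandomForestClassifier(n_estimators=25, random_state=0)
forest.fit(X, y)

sample = X[:1]
# Each fitted tree predicts an index into forest.classes_ (returned as a float).
tree_votes = np.array(
    [tree.predict(sample)[0] for tree in forest.estimators_]
).astype(int)
vote_counts = np.bincount(tree_votes, minlength=len(forest.classes_))
majority_class = forest.classes_[np.argmax(vote_counts)]

# The forest's own prediction (based on averaged probabilities).
forest_prediction = forest.predict(sample)[0]
print("votes per class:", vote_counts, "majority:", majority_class)
```

After fitting, forest.feature_importances_ holds one importance value per feature, which is the measure referred to earlier for ranking EEG channels.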

Advantages of Random Forest:

High accuracy and robustness
Resistance to overfitting
Ability to handle large datasets with many features
Provides feature importance measures
Can handle both categorical and numerical data

Disadvantages of Random Forest:

Can be computationally expensive for very large datasets
Less interpretable than single decision trees

2. Preprocessing with Butterworth Low-Pass Filter

Why Butterworth Filter?

EEG signals are often noisy due to artifacts (e.g., eye blinks, muscle movements) and external interference (e.g., power line noise).
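The Butterworth low-pass filter removes high-frequency noise above its cutoff, such as power line interference, while preserving the signal of interest, in this case beta waves (13–30 Hz). Low-frequency artifacts such as eye blinks fall inside the passband and are not removed by this filter alone.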
Butterworth filters are known for their flat passband response, meaning they don't introduce significant distortion to the frequencies within the passband.

How It Works

The filter is applied to each EEG feature and retains frequencies up to 30 Hz. This keeps the beta band together with the delta, theta, and alpha bands below it; isolating beta alone would require a band-pass filter (13–30 Hz).
The Butterworth filter is defined by its order and cutoff frequency.

The order determines the steepness of the filter's roll-off (i.e., how quickly it attenuates frequencies outside the passband).
The cutoff frequency is the frequency at which the filter's gain has dropped by 3 dB (to about 70.7% of the passband amplitude). In this case, the cutoff frequency is 30 Hz.

The filter works by attenuating frequencies above the cutoff frequency, allowing the frequencies below the cutoff to pass through with minimal attenuation.

Filter Characteristics

Passband: Frequencies that are allowed to pass through the filter with minimal attenuation (0 to 30 Hz in this case).
Stopband: Frequencies that are significantly attenuated by the filter (frequencies above 30 Hz).
Cutoff Frequency: The frequency at which the filter transitions from the passband to the stopband (30 Hz).
Roll-off: The rate at which the filter attenuates frequencies beyond the cutoff (determined by the filter order; roughly 20 dB per decade for each order).
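
The filter characteristics above can be tried out with SciPy. The sketch below is illustrative only: the 128 Hz sampling rate, the order of 4, the helper name butter_lowpass_filter, and the synthetic test signal are assumptions, not values taken from the project. It uses filtfilt, which runs the filter forward and backward so the output has no phase shift.

```python
import numpy as np
from scipy.signal import butter, filtfilt


def butter_lowpass_filter(signal, cutoff_hz=30.0, fs_hz=128.0, order=4):
    """Zero-phase Butterworth low-pass filter for a 1-D signal."""
    nyquist = 0.5 * fs_hz
    # butter expects the cutoff as a fraction of the Nyquist frequency.
    b, a = butter(order, cutoff_hz / nyquist, btype="low")
    return filtfilt(b, a, signal)


fs_hz = 128.0  # assumed sampling rate
t = np.arange(0, 4.0, 1.0 / fs_hz)
beta_component = np.sin(2 * np.pi * 20.0 * t)          # 20 Hz, inside the passband
mains_component = 0.5 * np.sin(2 * np.pi * 50.0 * t)   # 50 Hz, in the stopband
noisy = beta_component + mains_component

filtered = butter_lowpass_filter(noisy, fs_hz=fs_hz)

# How far each signal is from the clean 20 Hz component (RMS error).
rms_before = np.sqrt(np.mean((noisy - beta_component) ** 2))
rms_after = np.sqrt(np.mean((filtered - beta_component) ** 2))
print(f"RMS error before: {rms_before:.3f}, after: {rms_after:.3f}")
```

Passing btype="band" with a pair of cutoffs to butter would give a band-pass filter instead, which is one way to restrict the output to 13–30 Hz only.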

3. Weighted Averaging for Stress Prediction

Why Weighted Averaging?

Beta wave intensity varies across different time segments (sample indexes) of the EEG signal.
Weighted averaging ensures that time segments with higher beta wave intensity, which are more indicative of stress, contribute more significantly to the final stress level prediction.
Compared with a simple average, this lets strongly activated segments dominate the estimate instead of being diluted by low-activity segments.

How It Works

Stress levels are calculated for specific sample index ranges where beta waves are amplified or less amplified.
For each range, the intensity of the beta waves is used as the weight.
The weighted average is computed as follows:

Weighted Average Stress = (Sum of (Stress Level * Beta Intensity)) / (Sum of Beta Intensities)

This calculation is performed separately for ranges of high beta wave intensity and low beta wave intensity to provide a more detailed analysis.
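
The weighted average formula above translates directly into a few lines of Python. This is a sketch with made-up numbers; the function name weighted_average_stress and the example values are hypothetical and only illustrate the arithmetic.

```python
def weighted_average_stress(stress_levels, beta_intensities):
    """Average the stress levels, weighting each by its beta intensity."""
    if len(stress_levels) != len(beta_intensities):
        raise ValueError("stress_levels and beta_intensities must have equal length")
    total_weight = sum(beta_intensities)
    if total_weight == 0:
        raise ValueError("beta intensities sum to zero; weighted average is undefined")
    weighted_sum = sum(s * w for s, w in zip(stress_levels, beta_intensities))
    return weighted_sum / total_weight


# Illustrative values (assumption): three segments with their predicted
# stress level and measured beta intensity.
stress_levels = [4, 6, 8]
beta_intensities = [1.0, 1.0, 2.0]

result = weighted_average_stress(stress_levels, beta_intensities)
print(result)  # (4*1 + 6*1 + 8*2) / (1 + 1 + 2) = 6.5
```

A simple average of the same stress levels would be 6.0; the segment with the highest beta intensity pulls the weighted result up to 6.5.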

Figure 2 - Waveform Visualisation

The SAM40 database contains EEG recordings from 40 subjects performing cognitive tasks (such as Stroop tests and arithmetic problems) designed to induce stress. It aims to provide data for researchers to identify EEG patterns associated with stress.

Step 1: Data Loading

EEG data is loaded from a CSV file.
The dataset contains 18 features (EEG channels) and a target column (Stress Level). The features represent the electrical activity recorded from different locations on the scalp.
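
Loading such a file might look like the following pandas sketch. The column names, including the target name "Stress Level", and the three-channel in-memory CSV are assumptions used to keep the example short and self-contained; the real file has 18 feature columns.

```python
import io

import pandas as pd

# Tiny in-memory CSV standing in for the real EEG file (assumption).
csv_text = (
    "ch1,ch2,ch3,Stress Level\n"
    "0.1,0.2,0.3,2\n"
    "0.4,0.5,0.6,7\n"
)
data = pd.read_csv(io.StringIO(csv_text))

target_column = "Stress Level"  # hypothetical column name
features = data.drop(columns=[target_column]).to_numpy(dtype=float)
labels = data[target_column].to_numpy()

print(features.shape, labels)
```

With a file on disk, the io.StringIO object would simply be replaced by the file path.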

Step 2: Preprocessing

Handling Missing Values:

Missing values are replaced with the mean of the respective column. This lets the model handle incomplete data, although mean imputation reduces the variance of the affected column and can introduce bias when many values are missing.
Other strategies for handling missing data could include:

Median imputation
K-nearest neighbors (KNN) imputation
Deletion of rows with missing values (if the amount of missing data is small)
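
The mean imputation described above can be sketched with NumPy as follows. The helper name impute_column_means and the sample matrix are hypothetical; a column that is entirely missing would still produce NaN and would need separate handling.

```python
import numpy as np


def impute_column_means(X):
    """Return a copy of X with each NaN replaced by its column mean."""
    X = np.array(X, dtype=float)          # np.array copies the input
    col_means = np.nanmean(X, axis=0)     # per-column mean, ignoring NaNs
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = col_means[cols]
    return X


raw = [
    [1.0, np.nan],
    [3.0, 4.0],
    [np.nan, 8.0],
]
imputed = impute_column_means(raw)
print(imputed)
```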

Feature Handling:

The data is forced to have exactly 18 features. If there are fewer than 18 features, the data is padded with zero-valued columns. If there are more than 18 features, the extra columns are truncated.
The trained model expects a fixed input width, so this step keeps arbitrary input files compatible with it.
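
A minimal sketch of this pad-or-truncate step, assuming the data is a 2-D NumPy array with one row per sample; the function name fit_to_feature_count is hypothetical.

```python
import numpy as np

N_FEATURES = 18


def fit_to_feature_count(X, n_features=N_FEATURES):
    """Pad with zero columns or truncate so X has exactly n_features columns."""
    X = np.asarray(X, dtype=float)
    n_samples, n_cols = X.shape
    if n_cols >= n_features:
        return X[:, :n_features].copy()
    padded = np.zeros((n_samples, n_features))
    padded[:, :n_cols] = X
    return padded


narrow = fit_to_feature_count(np.ones((5, 14)))   # 4 zero columns appended
wide = fit_to_feature_count(np.ones((5, 20)))     # last 2 columns dropped
print(narrow.shape, wide.shape)
```

Zero-padded columns carry no information, so predictions on files with missing channels should be treated with caution.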

Denoising:

A Butterworth low-pass filter is applied to remove noise above 30 Hz while keeping the beta band (13–30 Hz).
This step is crucial for improving the signal-to-noise ratio and focusing on the relevant frequency band for stress detection.

Filter parameters:

Cutoff Frequency: 30 Hz
Filter Type: Low-pass
Filter Order: not fixed here; it must be chosen (e.g., 4)

Scaling:

The data is normalized using StandardScaler to ensure all features are on the same scale.
StandardScaler standardizes the features by subtracting the mean and dividing by the standard deviation.
This is important because EEG signals can have different amplitudes, and scaling prevents features with larger amplitudes from dominating the model.
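
The effect of StandardScaler can be checked against the formula it implements. The small matrix below is made up for illustration; the two columns deliberately differ in amplitude by a factor of 100.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Two features with very different amplitudes (illustrative values).
X = np.array([
    [1.0, 100.0],
    [2.0, 200.0],
    [3.0, 300.0],
])

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# The same result by hand: subtract the column mean, divide by the column
# standard deviation (population standard deviation, ddof=0).
manual = (X - X.mean(axis=0)) / X.std(axis=0)

print(X_scaled)
```

After scaling, both columns are identical, so neither dominates because of its original amplitude. A common practice is to fit the scaler on the training set only and reuse it with transform on new data.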

Step 3: Model Training

Train-Test Split:

The data is split into training (80%) and testing (20%) sets.
This ensures that the model is evaluated on unseen data, providing a more accurate estimate of its performance.
The training set is used to train the Random Forest model, while the test set is used to evaluate its performance.

Random Forest Classifier:

The model is trained on the training set to classify stress levels.
Key hyperparameters of the Random Forest model include:

Number of trees in the forest
Maximum depth of the trees
Minimum number of samples required to split an internal node
Minimum number of samples required to be at a leaf node
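
The split and the hyperparameters listed above map onto scikit-learn as sketched below. The data is synthetic and the hyperparameter values shown are scikit-learn's defaults, not values confirmed for this project; the two-class target is also an assumption made to keep the example small.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for preprocessed EEG data (assumption).
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 18))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# 80% training, 20% testing; stratify keeps class proportions similar.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

model = RandomForestClassifier(
    n_estimators=100,      # number of trees in the forest
    max_depth=None,        # maximum depth (None = grow until leaves are pure)
    min_samples_split=2,   # minimum samples required to split an internal node
    min_samples_leaf=1,    # minimum samples required at a leaf node
    random_state=42,
)
model.fit(X_train, y_train)

test_accuracy = model.score(X_test, y_test)
print(f"test accuracy: {test_accuracy:.2f}")
```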

Accuracy Calculation:

The model's accuracy is evaluated on the test set.
Accuracy is calculated as the proportion of correctly classified samples.
Other relevant metrics for evaluating the model's performance could include:

Precision
Recall
F1-score
Confusion matrix
Area Under the Receiver Operating Characteristic Curve (AUC-ROC)
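
Most of the metrics listed above are available in sklearn.metrics. The labels below are invented for a two-class case to keep the arithmetic easy to follow. For a multi-class target such as a 1 to 10 stress scale, precision, recall, and F1 need an averaging choice (for example average="macro"), and AUC-ROC needs predicted scores rather than labels, so it is not shown here.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)

# Invented labels: 1 = stressed, 0 = not stressed.
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]

accuracy = accuracy_score(y_true, y_pred)     # 4 correct out of 6
precision = precision_score(y_true, y_pred)   # 2 true positives / 3 predicted positives
recall = recall_score(y_true, y_pred)         # 2 true positives / 3 actual positives
f1 = f1_score(y_true, y_pred)                 # harmonic mean of precision and recall
cm = confusion_matrix(y_true, y_pred)         # rows: true class, columns: predicted class

print(accuracy, precision, recall, f1)
print(cm)
```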

Step 4: Stress Prediction

Beta Wave Analysis:

Beta wave intensity is calculated as the mean absolute value of the EEG features within the beta frequency range (13-30 Hz).
The mean absolute value provides a measure of the average amplitude of the beta waves, which is indicative of their intensity.
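
One way to compute this intensity is sketched below. It assumes the filtered data is a 2-D array with one row per sample index and one column per EEG feature, and that the mean is taken across features for each sample; the source does not state the axis, so that choice and the function name beta_intensity are assumptions.

```python
import numpy as np


def beta_intensity(filtered):
    """Mean absolute value across features, one value per sample index."""
    filtered = np.asarray(filtered, dtype=float)
    return np.mean(np.abs(filtered), axis=1)


# Three samples, two features (illustrative values).
filtered = np.array([
    [1.0, -3.0],
    [0.5, 0.5],
    [-2.0, 2.0],
])
intensity = beta_intensity(filtered)
print(intensity)
```

Taking the absolute value first keeps positive and negative swings from cancelling out in the mean.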

Thresholding:

Thresholds are used to categorize beta wave intensity as high or low.
High beta waves: Intensity > 70% of the maximum intensity observed in the data.
Low beta waves: Intensity < 30% of the maximum intensity observed in the data.
These thresholds help to identify time segments where beta wave activity is significantly elevated or suppressed.
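
The two thresholds can be applied as boolean masks over the intensity values. This sketch assumes a 1-D NumPy array of non-negative intensities; the function name split_by_intensity and the sample values are hypothetical.

```python
import numpy as np


def split_by_intensity(intensity, high_frac=0.7, low_frac=0.3):
    """Return boolean masks for high and low beta intensity."""
    intensity = np.asarray(intensity, dtype=float)
    peak = intensity.max()
    high_mask = intensity > high_frac * peak   # above 70% of the maximum
    low_mask = intensity < low_frac * peak     # below 30% of the maximum
    return high_mask, low_mask


intensity = np.array([1.0, 2.0, 8.0, 10.0, 5.0])
high_mask, low_mask = split_by_intensity(intensity)
print(high_mask, low_mask)
```

Samples between the two thresholds (the value 5.0 here) belong to neither group.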

Weighted Averaging:

Stress levels are calculated separately for high and low beta wave ranges using weighted averaging.
The beta wave intensity is used as the weight for each stress level in the average.
This gives more importance to stress levels associated with higher beta wave intensity when calculating the average stress level for the high beta range, and vice-versa for the low beta range.

Sample Index Ranges:

The application identifies and reports the sample index ranges corresponding to amplified (high) and less amplified (low) beta waves.
This provides insights into the specific time periods where the model detects significant stress-related brain activity.
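
Turning a per-sample high or low flag into index ranges amounts to finding runs of consecutive True values. The sketch below is one simple way to do that; the function name mask_to_ranges is hypothetical, and ranges are reported as inclusive (start, end) pairs, which is an assumption about the reporting format.

```python
def mask_to_ranges(mask):
    """Convert a sequence of booleans into inclusive (start, end) index ranges."""
    ranges = []
    start = None
    for i, flag in enumerate(mask):
        if flag and start is None:
            start = i                       # a new run begins
        elif not flag and start is not None:
            ranges.append((start, i - 1))   # the run ended at the previous index
            start = None
    if start is not None:                   # run extends to the last sample
        ranges.append((start, len(mask) - 1))
    return ranges


high_mask = [False, True, True, False, True]
high_ranges = mask_to_ranges(high_mask)
print(high_ranges)  # [(1, 2), (4, 4)]
```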

Step 5: Visualization

The GUI visualizes the EEG signals and displays the predicted stress levels.
Visualizations may include:

Plots of the raw EEG signals over time, with highlighted regions indicating high and low beta wave activity.
Bar graphs or numerical displays of the average stress levels for high and low beta wave ranges.
A plot of the beta wave intensity over time.

Step 1: Run the Python File

Step 2: Load CSV EEG Data File from Drive

Step 3: EEG Data Waveform Visualisation

Step 4: High Intensity and Low Intensity Stress Level Interpolation of Beta Waves

Conclusion and Future Prospects

This project demonstrates the potential of EEG signal analysis for stress detection. By combining Butterworth filtering, beta wave intensity analysis, and a Random Forest classifier, it identifies patterns in brain activity that correlate with varying stress levels. The findings contribute to the development of objective stress assessment methods, moving beyond subjective self-reporting.

Future Prospects:

Real-time Stress Monitoring: Develop wearable EEG devices integrated with AI to provide real-time stress feedback and alerts, enabling timely intervention.
Personalized Stress Management: Tailor stress management techniques (e.g., biofeedback, meditation guidance) based on individual EEG patterns and stress responses.
Integration with Virtual/Augmented Reality (VR/AR): Combine EEG-based stress detection with VR/AR environments to create immersive experiences that promote relaxation or simulate stressful scenarios for training purposes.
Longitudinal Stress Tracking: Use the technology for long-term monitoring of stress levels in individuals, providing insights into the impact of lifestyle, work, and interventions on mental well-being.
Multimodal Signal Fusion: Combine EEG signals with other physiological data, such as heart rate variability (HRV), galvanic skin response (GSR), and respiration rate, to improve the accuracy and robustness of stress detection.

Improving Accuracy and Application Methods:

Advanced Signal Processing: Utilize more sophisticated signal processing techniques, such as wavelet transform, independent component analysis (ICA), and adaptive filtering, to remove noise and artifacts from EEG signals.
Deep Learning Models: Implement deep learning architectures like convolutional neural networks (CNNs) and recurrent neural networks (RNNs) to automatically extract relevant features from EEG data and improve classification accuracy.
Feature Engineering: Explore novel EEG features beyond traditional frequency bands (e.g., connectivity measures, non-linear dynamics) that may provide more sensitive indicators of stress.
Personalized Models: Develop machine learning models that are personalized to individual differences in brain activity and stress responses, potentially improving accuracy and reliability.
Context-Awareness: Integrate contextual information, such as time of day, task type, and environmental factors, to enhance the accuracy and relevance of stress detection in real-world settings.
Hybrid Approaches: Combine machine learning with traditional statistical methods to leverage the strengths of both and improve the robustness of the system.

Advantages of this Technique

Provides an objective and non-invasive method for stress assessment.