This repository contains a MATLAB-based implementation for lie detection using EEG signals, leveraging wavelet-based feature extraction, advanced feature selection techniques, and ensemble classification methods. The project processes EEG data from multiple channels, applies Variational Mode Decomposition (VMD), extracts a comprehensive set of features, performs feature ranking, and evaluates classification performance using various methods.
Deception detection using physiological signals has been an active area of research in neuroscience and machine learning. This project focuses on analyzing EEG signals from multiple brain regions to identify patterns associated with truthful vs. deceptive responses.
- Multi-channel EEG Analysis: Processes signals from 5 EEG channels (AF3, T7, PZ, T8, AF4)
- Advanced Signal Decomposition: Uses Variational Mode Decomposition (VMD) to extract 9 Intrinsic Mode Functions (IMFs)
- Comprehensive Feature Extraction: Extracts 16 different features per IMF per channel (720 total features)
- Multiple Feature Selection Methods: Implements 4 different ranking algorithms
- Ensemble Learning: Combines multiple classifiers and feature selection methods
- Data Augmentation: Applies noise addition, scaling, and SMOTE-like techniques
- Cross-Validation: 10-fold cross-validation for robust performance evaluation
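The data augmentation item above can be sketched as follows. This is a minimal illustration on a synthetic feature matrix, assuming augmentation operates on extracted features; the noise level, scaling range, and variable names are hypothetical and not taken from the repository code.

```matlab
rng(1);                                   % reproducible example
X = randn(20, 720);                       % placeholder feature matrix (samples x features)
y = [zeros(12, 1); ones(8, 1)];           % imbalanced labels: 0 = truth, 1 = lie

% 1) SMOTE-like step: interpolate between a minority sample and its nearest
%    minority-class neighbour until both classes have the same size.
%    (Class 1 is assumed to be the minority in this toy data.)
Xmin = X(y == 1, :);
nNew = sum(y == 0) - sum(y == 1);
Xsyn = zeros(nNew, size(X, 2));
for k = 1:nNew
    a = randi(size(Xmin, 1));             % random minority sample
    d = sum((Xmin - Xmin(a, :)).^2, 2);   % squared distances to sample a
    d(a) = inf;                           % exclude the sample itself
    [~, b] = min(d);                      % nearest minority neighbour
    Xsyn(k, :) = Xmin(a, :) + rand * (Xmin(b, :) - Xmin(a, :));
end
Xbal = [X; Xsyn];
ybal = [y; ones(nNew, 1)];

% 2) Gaussian noise scaled to each feature's spread (noiseStd is hypothetical).
noiseStd = 0.05;
Xnoise = Xbal + noiseStd * std(Xbal, 0, 1) .* randn(size(Xbal));

% 3) Random per-sample amplitude scaling in [0.9, 1.1] (range is hypothetical).
scale = 0.9 + 0.2 * rand(size(Xbal, 1), 1);
Xscale = Xbal .* scale;

Xaug = [Xbal; Xnoise; Xscale];
yaug = [ybal; ybal; ybal];
fprintf('%d samples after augmentation (%d truth, %d lie)\n', ...
    size(Xaug, 1), sum(yaug == 0), sum(yaug == 1));
```

The sketch uses only base MATLAB (implicit expansion requires R2016b or later).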
EEG Data
↓
VMD Decomposition
↓
Feature Extraction
↓
Feature Selection
↓
Classification
Details:
- EEG Data → 5 Channels
- VMD Decomposition → 9 IMFs per Channel
- Feature Extraction → 16 Features per IMF
- Feature Selection → Multiple Methods
- Classification → Ensemble Models
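To make the feature layout concrete, here is a rough sketch of the decomposition and feature extraction steps on placeholder data. It assumes the `vmd` function from the Signal Processing Toolbox (R2020a or later) and computes only four illustrative statistics rather than the project's full set of 16; the signal length and the choice of statistics are assumptions.

```matlab
rng(0);                                   % reproducible placeholder data
nSamples = 512;                           % hypothetical recording length
eeg = randn(nSamples, 5);                 % columns: AF3, T7, PZ, T8, AF4
nIMF = 9;

% Four example statistics; the project extracts 16 per IMF.
featFcns = {@mean, @median, @std, @(v) max(v) - min(v)};
nFeat = numel(featFcns);

features = zeros(1, size(eeg, 2) * nIMF * nFeat);
idx = 1;
for ch = 1:size(eeg, 2)
    imf = vmd(eeg(:, ch), 'NumIMFs', nIMF);   % nSamples x 9, one IMF per column
    for m = 1:nIMF
        for f = 1:nFeat
            features(idx) = featFcns{f}(imf(:, m));
            idx = idx + 1;
        end
    end
end

% 5 channels x 9 IMFs x 4 features = 180 here; with 16 features it is 720.
fprintf('%d features in this sketch, %d with all 16 features\n', ...
    numel(features), size(eeg, 2) * nIMF * 16);
```

Each recording becomes one row vector ordered by channel, then IMF, then feature, so stacking the rows over recordings gives the matrix used for feature selection.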
├── src/ # Source code for the main analysis and supporting functions
│ ├── main_analysis.m # Main script orchestrating the analysis pipeline
│ ├── feature_extraction/ # Functions for feature extraction and VMD decomposition
│ ├── feature_selection/ # Feature ranking and selection algorithms
│ ├── classification/ # Cross-validation and classifier implementations
│ ├── visualization/ # Visualization functions for results analysis
│ └── utils/ # Utility functions for data augmentation and preprocessing
│
├── data/ # Placeholder for EEG data files and data description
├── results/ # Output files including feature rankings and visualizations
├── docs/ # Documentation on methodology, features, and results
├── examples/ # Example scripts for quick start and custom analysis
├── full_code.m # Complete pipeline as a single executable script
└── requirements.txt # MATLAB toolbox requirements
- MATLAB (R2020a or later recommended)
- Required toolboxes (see requirements.txt)
Clone the repository:
git clone https://github.com/diptiman-mohanta/eeg-lie-detection.git
cd eeg-lie-detection
Ensure MATLAB is installed with the required toolboxes.
Place your EEG data (CSV files) in the data/sample_data/ directory.
- Update the dataDir path in src/main_analysis.m to point to your EEG data directory.
- Run the main analysis script in MATLAB:
run('src/main_analysis.m')
Results will be saved in the results/ directory, including:
- Feature rankings
- Visualizations
- A comprehensive results file: enhanced_analysis_results.mat
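Since the variables stored in the results file are not documented here, one way to explore it is to list its contents before loading. The snippet below is a small sketch that assumes the file sits directly in the results/ directory and makes no assumption about variable names.

```matlab
resultsFile = fullfile('results', 'enhanced_analysis_results.mat');

if isfile(resultsFile)
    % List the stored variables without loading them into memory.
    info = whos('-file', resultsFile);
    for k = 1:numel(info)
        fprintf('%-30s %-10s %s\n', info(k).name, info(k).class, mat2str(info(k).size));
    end
    results = load(resultsFile);          % struct with one field per variable
else
    warning('Results file not found; run src/main_analysis.m first.');
end
```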
EEG data should be in CSV format with columns corresponding to the following channels:
AF3,T7,PZ,T8,AF4
File names must include either truth or lie to indicate the ground truth label.
The original dataset uses different file names; they were renamed for this project, so rename your files to follow the same convention.
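The data format and naming convention above could be handled along these lines. This is only a sketch: the directory is the one named in the setup notes, while the case-insensitive matching and the variable names are assumptions rather than the repository's actual loader.

```matlab
% Assumption: CSV files live in data/sample_data/.
dataDir = fullfile('data', 'sample_data');
files = dir(fullfile(dataDir, '*.csv'));

channels = {'AF3', 'T7', 'PZ', 'T8', 'AF4'};   % expected column order
signals = cell(numel(files), 1);
labels = false(numel(files), 1);               % true = lie, false = truth

for k = 1:numel(files)
    name = files(k).name;
    isLie = contains(name, 'lie', 'IgnoreCase', true);
    isTruth = contains(name, 'truth', 'IgnoreCase', true);
    if isLie == isTruth
        % Name contains neither keyword, or both.
        error('Cannot infer label from file name: %s', name);
    end
    labels(k) = isLie;
    signals{k} = readmatrix(fullfile(dataDir, name));   % samples x channels
    assert(size(signals{k}, 2) == numel(channels), ...
        'Expected %d channel columns in %s', numel(channels), name);
end

fprintf('Loaded %d recordings (%d lie, %d truth)\n', ...
    numel(files), sum(labels), sum(~labels));
```

`readmatrix` (R2019a or later) detects a header row automatically, so files with or without channel names in the first line should both load as numeric matrices.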
The analysis pipeline includes the following stages:
- Data Loading: Reads EEG data from CSV files.
- VMD Decomposition: Decomposes each channel into 9 Intrinsic Mode Functions (IMFs).
- Feature Extraction: Computes 16 statistical features (Mean, Mode, Median, etc.) per IMF per channel.
- Data Augmentation: Applies techniques like Gaussian noise, scaling, and SMOTE-like methods to balance and expand the dataset.
- Feature Selection: Utilizes Correlation, Chi-square, Relief-F, and Mutual Information to rank and select optimal features.
- Classification: Evaluates multiple classifiers (KNN, SVM, Decision Tree, Naive Bayes, Ensemble methods) using 10-fold cross-validation.
- Visualization: Produces plots for accuracy trends, training curves, confusion matrices, feature importance, and ROC curves.
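The feature selection and classification stages can be illustrated with a short sketch. It uses synthetic data, Relief-F ranking, and a KNN classifier from the Statistics and Machine Learning Toolbox; the number of retained features, the neighbour counts, and the choice to rank features inside each fold are assumptions made for this example, not a description of the repository's implementation.

```matlab
rng(0);                                   % reproducible synthetic example
nSamples = 200;
nFeatures = 720;
X = randn(nSamples, nFeatures);
y = categorical(randi([0 1], nSamples, 1), [0 1], {'truth', 'lie'});
X(y == 'lie', 1:5) = X(y == 'lie', 1:5) + 1.5;   % make 5 features informative

nTop = 20;                                % hypothetical number of features to keep
cvp = cvpartition(y, 'KFold', 10);        % stratified 10-fold split
acc = zeros(cvp.NumTestSets, 1);

for f = 1:cvp.NumTestSets
    tr = training(cvp, f);
    te = test(cvp, f);

    % Rank features on the training fold only (Relief-F, 10 neighbours).
    ranked = relieff(X(tr, :), y(tr), 10);
    top = ranked(1:nTop);

    mdl = fitcknn(X(tr, top), y(tr), 'NumNeighbors', 5);
    pred = predict(mdl, X(te, top));
    acc(f) = mean(pred == y(te));
end

fprintf('Mean 10-fold accuracy: %.2f%%\n', 100 * mean(acc));
```

Ranking inside each fold keeps the held-out samples out of the selection step. Note that `relieff` treats a numeric `y` as a regression target by default, which is why the labels are categorical here.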
The best model achieved a classification accuracy of 99.07% using the Relief-F feature selection method.
For detailed results, refer to:
- results/feature_rankings.csv
- docs/results_analysis.md
If you use this work in your research, please cite:
@misc{eeg-lie-detection,
title={EEG Lie Detection using Wavelet based Feature Extraction and Ensemble Methods},
author={Diptiman Mohanta},
year={2025},
url={https://github.com/diptiman-mohanta/eeg-lie-detection.git}
}
The LieWaves dataset is used in this work:
Aslan, Musa; Baykara, Muhammet; Alakuş, Talha Burak (2024).
LieWaves: dataset for lie detection based on EEG signals and wavelets, Mendeley Data, V2.
DOI: 10.17632/5gzxb2bzs2.2
Used for reference and basic understanding:
Aslan M, Baykara M, Alakus TB. LieWaves: dataset for lie detection based on EEG signals and wavelets. Med Biol Eng Comput. 2024 May;62(5):1571–1588.
doi: 10.1007/s11517-024-03021-2. PMID: 38311647