Tech Stack
Autoclean EEG uses Python for core pipeline development, which keeps it compatible with a broad range of signal processing tools. It reads common EEG file formats (EEGLAB .set, MNE .fif, BIDS) and writes results as JSON files and PDF reports. Data is managed in a NoSQL document database, with REDCap integration handled by serverless functions. Dependencies are managed with uv, a modern Python packaging tool.
Table of Contents
- Introduction
- Core Technologies
- File Formats Supported
- Data Management
- Integration
- Dependencies
- Additional Resources
Introduction
Autoclean EEG employs a robust and modern technology stack to ensure efficient, scalable, and reproducible EEG processing. This document details the key technologies and their roles within the pipeline.
Core Technologies
- Programming Language: Python - A versatile, high-level programming language with a rich ecosystem of scientific computing libraries, making it ideal for signal processing and data analysis.
- Database: NoSQL document database (e.g., MongoDB) - NoSQL databases offer flexible schema design, which is crucial for handling diverse EEG data structures and metadata.
- Packaging: uv - An extremely fast Python package installer and resolver, written in Rust. Using uv enables very fast installs and ensures reproducible builds.
Dependencies: A Focus on Signal Processing Libraries
Autoclean EEG builds on a set of Python libraries for signal processing. The key dependencies for EEG data manipulation and analysis are listed below.
Core Preprocessing
- PyLossless: Handles the core preprocessing tasks within Autoclean EEG. It is a Python implementation of the Lossless EEG pipeline, which annotates bad channels, time periods, and components rather than altering the underlying recording.
Core Scientific Libraries
- NumPy: The fundamental library for numerical computing in Python. It provides efficient data structures (arrays) and mathematical operations essential for EEG signal processing tasks.
- SciPy: Building upon NumPy, SciPy offers a collection of advanced algorithms for scientific computing, including signal processing functions like filtering, spectral analysis, and statistical analysis.
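As an illustration of the kind of filtering and spectral analysis these two libraries provide, here is a minimal sketch on a synthetic signal. The function names (`bandpass`, `peak_frequency`), the 1-40 Hz band, and the 250 Hz sampling rate are assumptions chosen for the example, not settings taken from Autoclean EEG.

```python
import numpy as np
from scipy import signal


def bandpass(data, sfreq, low, high, order=4):
    """Zero-phase Butterworth band-pass filter along the last axis."""
    sos = signal.butter(order, [low, high], btype="bandpass", fs=sfreq, output="sos")
    return signal.sosfiltfilt(sos, data, axis=-1)


def peak_frequency(data, sfreq):
    """Frequency (Hz) at which the Welch power spectrum is largest."""
    freqs, psd = signal.welch(data, fs=sfreq, nperseg=int(2 * sfreq))
    return freqs[np.argmax(psd)]


# Synthetic one-channel signal: 10 Hz rhythm, 60 Hz interference, white noise.
sfreq = 250.0  # assumed sampling rate in Hz
t = np.arange(0, 10, 1 / sfreq)
rng = np.random.default_rng(0)
x = (
    np.sin(2 * np.pi * 10 * t)
    + 0.5 * np.sin(2 * np.pi * 60 * t)
    + 0.1 * rng.standard_normal(t.size)
)

filtered = bandpass(x, sfreq, low=1.0, high=40.0)
print("Peak frequency after filtering (Hz):", peak_frequency(filtered, sfreq))
```

Second-order sections (`output="sos"`) with `sosfiltfilt` are used here because they are numerically more stable than transfer-function coefficients and introduce no phase shift.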
EEG-Specific Libraries
- EEGLAB: EEGLAB itself is a MATLAB toolbox, so Autoclean EEG interoperates with it at the file level through eeglabio. This Python library writes data in the EEGLAB .set format, so files produced by the pipeline can be opened in EEGLAB for further inspection or analysis.
- MNE (MNE-Python): A powerful Python library specifically designed for working with magnetoencephalography (MEG) and EEG data. MNE provides a comprehensive suite of tools for data loading, pre-processing, source localization, connectivity analysis, and visualization. Autoclean EEG utilizes MNE for advanced EEG analysis tasks.
- Autoreject: This library automatically identifies bad epochs and sensors using data-driven amplitude thresholds, then rejects or repairs (interpolates) the affected segments. Autoclean EEG uses autoreject to obtain cleaner and more reliable data for further analysis.
Data Handling and Visualization
- Pandas: A versatile library for data manipulation and analysis. Pandas provides data structures (DataFrames) well-suited for organizing and handling EEG metadata and results within Autoclean EEG.
- Matplotlib: The cornerstone library for creating static, animated, and interactive visualizations in Python. Autoclean EEG utilizes Matplotlib to generate informative plots and figures for EEG signal inspection and analysis results.
File Formats Supported
- EEGLAB .set: The standard file format for EEGLAB, a widely used MATLAB toolbox for EEG analysis, ensuring compatibility with existing EEG datasets.
- MNE .fif: The file format used by MNE for storing raw data, epochs, and evoked data.
- BIDS (Brain Imaging Data Structure): A standardized format for organizing and describing neuroimaging data, promoting data sharing and interoperability.
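One way to handle several input formats is to dispatch on the file extension. The sketch below maps the two single-file formats listed above to the corresponding MNE readers (`mne.io.read_raw_eeglab` and `mne.io.read_raw_fif`). The names `READERS`, `pick_reader`, and `load_raw` are hypothetical, and this is not necessarily how Autoclean EEG selects a reader. BIDS datasets are directory trees rather than single files and are usually read with the separate mne-bids package, so they are not covered here.

```python
from pathlib import Path

# Hypothetical mapping from file suffix to the name of an MNE reader function.
READERS = {
    ".set": "read_raw_eeglab",
    ".fif": "read_raw_fif",
}


def pick_reader(path):
    """Return the name of the MNE reader for this file, based on its suffix."""
    suffix = Path(path).suffix.lower()
    try:
        return READERS[suffix]
    except KeyError:
        raise ValueError(f"Unsupported EEG file format: {suffix!r}") from None


def load_raw(path):
    """Load a recording with the matching MNE reader."""
    import mne  # imported lazily so pick_reader works without MNE installed

    reader = getattr(mne.io, pick_reader(path))
    return reader(path, preload=True)


print(pick_reader("sub-01_task-rest_eeg.set"))
```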
Data Management
Data within Autoclean EEG is managed in a NoSQL document database. Each record can hold EEG file references, associated metadata, processing logs, and results without a fixed schema, and the database supports efficient querying and retrieval of this information.
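As a rough sketch of what one such document might look like, the example below models a single processing run and serializes it to JSON. Every field name and the `ProcessingRun` class itself are assumptions made for illustration; the actual Autoclean EEG schema is not described in this document.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class ProcessingRun:
    """Hypothetical record for one pipeline run (field names are illustrative)."""

    subject_id: str
    source_file: str
    status: str = "pending"
    metadata: dict = field(default_factory=dict)
    log: list = field(default_factory=list)
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def add_log(self, step, message):
        """Append one processing-log entry."""
        self.log.append({"step": step, "message": message})

    def to_document(self):
        """Return a plain dict suitable for JSON or a document store."""
        return asdict(self)


run = ProcessingRun(
    subject_id="sub-01",
    source_file="sub-01_task-rest_eeg.set",
    metadata={"sfreq": 250.0, "n_channels": 64},
)
run.add_log("filter", "band-pass 1-40 Hz")
run.status = "completed"

document = run.to_document()
print(json.dumps(document, indent=2))
```

Because the document is a plain dictionary, new metadata keys can be added per recording without a schema migration. With MongoDB, for instance, a dict like this could be stored using pymongo's `insert_one`.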
Integration
Autoclean EEG integrates with REDCap, a secure web application for building and managing online surveys and databases. In addition, we have implemented Azure backend cloud storage for files (with alternative S3 storage providers in development). This enables the preprocessing, analysis, and storage of raw data files within a fully integrated system - encompassing output files, metadata, and automated analysis results.
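The sketch below shows how a function (for example, one running in a serverless environment) might prepare a record import for the REDCap API, which accepts form-encoded POST requests. The URL, token, and field names such as `autoclean_status` are placeholders invented for this example. The function only builds the request and does not send it.

```python
import json
from urllib import parse, request


def build_redcap_import(api_url, token, records):
    """Build (but do not send) a POST request that imports records into REDCap."""
    fields = {
        "token": token,
        "content": "record",
        "action": "import",
        "format": "json",
        "type": "flat",
        "data": json.dumps(records),
        "returnContent": "count",
        "returnFormat": "json",
    }
    body = parse.urlencode(fields).encode("utf-8")
    return request.Request(api_url, data=body, method="POST")


# Placeholder values: the URL, token, and field names are assumptions.
records = [{"record_id": "1", "autoclean_status": "completed"}]
req = build_redcap_import("https://redcap.example.org/api/", "YOUR_API_TOKEN", records)
print(req.get_method(), req.full_url)
```

Sending the request would be a matter of passing `req` to `urllib.request.urlopen`. In practice the token would come from a secret store rather than from source code.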
Dependencies
Autoclean EEG keeps its dependency footprint small and manages it with modern Python packaging practices, primarily uv. This streamlines installation and reduces the risk of dependency conflicts. A requirements.txt or pyproject.toml file captures all dependencies for reproducibility.
Additional Resources
- Python Documentation
- BIDS Specification
- EEGLAB Wiki