🎯 Multimodal DSP Challenges

An advanced suite for image restoration, audio denoising, and phonetic formant analysis using Python 3.14.

Author: Mobi


✨ Overview

This project applies Digital Signal Processing (DSP) techniques to two modalities: images and audio. Built with Python 3.14, it covers noise removal, spectral analysis, and phonetic characterization.

🚀 Features

  • 🖼️ Image Processing - Frequency-domain noise removal and blur estimation
  • 🔊 Audio Analysis - IIR filtering, spectral leakage, and formant extraction
  • 📊 Spectral Visualization - Comprehensive plots and analysis reports
  • 🌐 Interactive Report - Live Persian dashboard with real-time simulations
  • ⚡ High Performance - Optimized with NumPy and SciPy
  • 🎯 Modular Design - Clean, reusable components for each task

📦 Installation

Prerequisites

  • Python 3.12 or higher
  • uv package manager

Setup

# Clone the repository
git clone https://github.com/Mobiwn/multimodal-dsp-challenges.git
cd multimodal-dsp-challenges

# Install dependencies
uv sync

🏃 Quick Start

Run the challenges individually to generate analysis results in the results/ folder:

# Image Processing Tasks
uv run src/image_dsp.py

# Audio Processing Tasks
uv run src/audio_dsp.py

# Spectral Analysis
uv run src/spectral_utils.py

🌐 Interactive Report

Explore the DSP challenges interactively with a Persian-language dashboard hosted on GitHub Pages.

  • Live Demo: View Interactive Report
  • Features include real-time notch filter adjustments, motion blur analysis, spectral leakage comparisons, and vowel formant visualizations.

📁 Project Structure

multimodal-dsp-challenges/
├── src/
│   ├── audio_dsp.py       # Audio processing and IIR filtering
│   ├── data_utils.py      # Data loading and utilities
│   ├── image_dsp.py       # Image processing in frequency domain
│   └── spectral_utils.py  # Spectral analysis tools
├── data/                  # Input audio files
│   ├── input_speech.wav
│   └── test_speech.wav
├── docs/
│   └── index.html         # Interactive Persian report (hosted on GitHub Pages)
├── results/
│   ├── audio/             # Processed audio output
│   └── figures/           # Analysis visualizations
├── pyproject.toml         # Project configuration
├── requirements.txt       # Python dependencies
└── README.md              # This file

🎮 Challenges

Challenge 1: Notch Filtering in Image Processing 🖼️

Goal: Remove periodic noise from images using 2D frequency domain filtering.

[Figure: Notch Filter Analysis]

Techniques:

  • 2D FFT transformation
  • Frequency domain notch filter design
  • Inverse FFT reconstruction
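
The three steps above can be sketched with NumPy alone. This is a minimal illustration, not the code in src/image_dsp.py: the function name `notch_filter_2d`, the disc-shaped notch, the `radius` parameter, and the synthetic striped image are all assumptions made for the example.

```python
import numpy as np

def notch_filter_2d(image, notch_offsets, radius=1):
    """Suppress periodic noise by zeroing small discs in the 2D spectrum.

    notch_offsets holds (row, col) offsets from the centre of the shifted
    spectrum; each notch is mirrored so the filtered image stays real.
    (Hypothetical helper for illustration.)
    """
    rows, cols = image.shape
    # Step 1: 2D FFT, shifted so the DC term sits at the centre
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    v, u = np.ogrid[:rows, :cols]
    cy, cx = rows // 2, cols // 2
    # Step 2: build the notch mask (1 = keep, 0 = reject)
    mask = np.ones((rows, cols))
    for dv, du in notch_offsets:
        for sv, su in ((dv, du), (-dv, -du)):  # conjugate-symmetric pair
            dist2 = (v - (cy + sv)) ** 2 + (u - (cx + su)) ** 2
            mask[dist2 <= radius ** 2] = 0.0
    # Step 3: inverse FFT back to the spatial domain
    return np.fft.ifft2(np.fft.ifftshift(spectrum * mask)).real

# Synthetic demo: random texture plus vertical stripes (8 cycles per width)
rng = np.random.default_rng(0)
clean = rng.random((64, 64))
stripes = 0.5 * np.sin(2 * np.pi * 8 * np.arange(64) / 64)
noisy = clean + stripes[np.newaxis, :]
restored = notch_filter_2d(noisy, [(0, 8)], radius=1)

err_before = np.abs(noisy - clean).mean()
err_after = np.abs(restored - clean).mean()
print(f"mean abs error before: {err_before:.4f}, after: {err_after:.4f}")
```

In a real image the notch positions are usually read off the magnitude spectrum as isolated bright peaks; here they are known because the noise is synthetic.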

Challenge 2: Motion Blur Estimation 📸

Goal: Estimate blur length in motion-blurred images using spectral nulls.

[Figure: Motion Blur Analysis]

Techniques:

  • Spectral null detection
  • Radial averaging in frequency domain
  • Blur parameter estimation
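
A uniform blur of length L along one axis has a sinc-like frequency response with nulls at multiples of N/L, where N is the image width in pixels, so the index of the first null gives L = N / k. The sketch below illustrates this on synthetic data. It simplifies the approach listed above by assuming a known horizontal blur direction and averaging over rows rather than radially; `estimate_blur_length` and `rel_threshold` are hypothetical names, and the blur is applied circularly so the nulls are exact.

```python
import numpy as np

def estimate_blur_length(blurred, rel_threshold=0.01):
    """Estimate a horizontal box-blur length from the first spectral null."""
    width = blurred.shape[1]
    # Average the row-wise magnitude spectra so image content evens out
    mag = np.abs(np.fft.rfft(blurred, axis=1)).mean(axis=0)
    mag = mag[1:]  # drop the DC bin
    nulls = np.flatnonzero(mag < rel_threshold * mag.max())
    if nulls.size == 0:
        return None  # no clear null found
    first_null = nulls[0] + 1  # +1 restores the bin index after dropping DC
    return width / first_null

# Synthetic demo: random texture blurred by an 8-pixel horizontal box kernel
rng = np.random.default_rng(0)
image = rng.random((128, 256))
true_length = 8
kernel = np.zeros(256)
kernel[:true_length] = 1.0 / true_length
# Circular convolution along each row via the FFT
blurred = np.fft.ifft(np.fft.fft(image, axis=1) * np.fft.fft(kernel), axis=1).real

estimated_length = estimate_blur_length(blurred)
print("estimated blur length:", estimated_length)
```

With real photographs the nulls are shallow dips rather than exact zeros (because of noise and non-circular boundaries), so a fixed threshold like this one would likely need to be replaced by local-minimum detection on a smoothed log spectrum.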

Challenge 3: IIR Notch Filter 🔊

Goal: Design and implement an IIR notch filter to remove 50 Hz hum.

[Figure: Notch Filter Report]

Techniques:

  • IIR filter design
  • Pole-zero plot analysis
  • Stability assessment
  • Real-time audio filtering
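
As a rough sketch of these steps, SciPy's `scipy.signal.iirnotch` can design a second-order notch, the poles can be checked against the unit circle for stability, and `scipy.signal.lfilter` applies the filter causally, as a real-time implementation would. The sample rate, quality factor, and test signal below are assumptions for the example and are not taken from src/audio_dsp.py.

```python
import numpy as np
from scipy import signal

fs = 8000.0   # assumed sample rate in Hz
f0 = 50.0     # hum frequency to remove
Q = 30.0      # assumed quality factor (bandwidth = f0 / Q)

# Design the notch: b holds the zeros, a holds the poles
b, a = signal.iirnotch(f0, Q, fs=fs)

# Stability: every pole must lie strictly inside the unit circle
poles = np.roots(a)
is_stable = bool(np.all(np.abs(poles) < 1.0))

# Gain at the hum frequency and at a nearby speech-band frequency
_, h = signal.freqz(b, a, worN=[f0, 440.0], fs=fs)
gain_hum, gain_tone = np.abs(h)

# Causal filtering of a 440 Hz tone contaminated with 50 Hz hum
t = np.arange(int(2 * fs)) / fs
tone = np.sin(2 * np.pi * 440.0 * t)
hum = 0.5 * np.sin(2 * np.pi * f0 * t)
cleaned = signal.lfilter(b, a, tone + hum)

# Measure the residual after the start-up transient has decayed
half = len(t) // 2
residual_rms = np.sqrt(np.mean((cleaned[half:] - tone[half:]) ** 2))
print(f"stable: {is_stable}, residual RMS: {residual_rms:.4f}")
```

A higher Q narrows the notch and leaves more of the neighbouring spectrum intact, but moves the poles closer to the unit circle, which lengthens the start-up transient.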

Challenge 4: Spectral Leakage Analysis 📊

Goal: Understand and visualize spectral leakage effects with different windows.

[Figure: Spectral Leakage Report]

Techniques:

  • Windowing functions (Hann, Hamming, Blackman)
  • FFT bin analysis
  • Leakage quantification
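
One way to quantify leakage is the fraction of spectral energy that falls outside a few bins around the peak. The sketch below compares the windows listed above on a tone placed halfway between two FFT bins, which is the worst case for a rectangular window. The `leakage_ratio` helper, the `guard` width, and the signal parameters are assumptions for illustration.

```python
import numpy as np

fs = 1000.0            # assumed sample rate in Hz
n = 256                # frame length
f_tone = 26.5 * fs / n # halfway between bins 26 and 27
t = np.arange(n) / fs
x = np.sin(2 * np.pi * f_tone * t)

windows = {
    "rectangular": np.ones(n),
    "hann": np.hanning(n),
    "hamming": np.hamming(n),
    "blackman": np.blackman(n),
}

def leakage_ratio(frame, window, guard=4):
    """Fraction of energy outside +/- guard bins around the spectral peak."""
    power = np.abs(np.fft.rfft(frame * window)) ** 2
    peak = int(np.argmax(power))
    lo, hi = max(peak - guard, 0), peak + guard + 1
    return 1.0 - power[lo:hi].sum() / power.sum()

leakage = {name: leakage_ratio(x, w) for name, w in windows.items()}
for name, value in leakage.items():
    print(f"{name:12s} leakage fraction: {value:.2e}")
```

The tapered windows trade a wider main lobe for much lower sidelobes, which is why the guard band has to be wide enough to cover the widest main lobe being compared.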

Challenge 5: Vowel Formant Extraction 🗣️

Goal: Extract and analyze vowel formants using LPC modeling.

[Figures: Vowel Formant Report, Vowel Formant Detail]

Techniques:

  • Linear Predictive Coding (LPC)
  • Formant frequency extraction
  • Phonetic vowel characterization
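
The LPC route to formants can be sketched as follows: fit an all-pole model with the autocorrelation method, find the roots of the prediction polynomial, and convert the angles of the complex roots to frequencies. This is an illustrative sketch under stated assumptions, not the project's implementation: the helper names, the LPC order, and the synthetic two-formant "vowel" (resonances at 700 Hz and 1200 Hz) are all hypothetical, and steps such as pre-emphasis are omitted.

```python
import numpy as np
from scipy import signal

def lpc_coefficients(frame, order):
    """Autocorrelation-method LPC; returns [1, a1, ..., ap]."""
    windowed = frame * np.hamming(len(frame))
    n = len(windowed)
    # Autocorrelation at lags 0..order
    r = np.correlate(windowed, windowed, mode="full")[n - 1 : n + order]
    # Solve the Yule-Walker (normal) equations R a = -r[1:]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, -r[1 : order + 1])
    return np.concatenate(([1.0], a))

def formants_from_lpc(a, fs, min_hz=90.0):
    """Convert LPC polynomial roots to sorted formant frequencies in Hz."""
    roots = np.roots(a)
    roots = roots[np.imag(roots) > 0]  # keep one root of each conjugate pair
    freqs = np.angle(roots) * fs / (2 * np.pi)
    return np.sort(freqs[freqs > min_hz])

# Synthetic vowel: white noise through two resonators (assumed values)
fs = 8000.0
true_formants = [700.0, 1200.0]
bandwidth = 80.0
pole_radius = np.exp(-np.pi * bandwidth / fs)
poles = []
for f in true_formants:
    p = pole_radius * np.exp(2j * np.pi * f / fs)
    poles.extend([p, np.conj(p)])
a_true = np.real(np.poly(poles))

rng = np.random.default_rng(0)
excitation = rng.standard_normal(4096)
vowel = signal.lfilter([1.0], a_true, excitation)

a_est = lpc_coefficients(vowel, order=4)
formants = formants_from_lpc(a_est, fs)
print("estimated formants (Hz):", np.round(formants, 1))
```

For recorded speech, a common rule of thumb is an LPC order of roughly 2 + fs/1000, with the resulting roots filtered by bandwidth as well as frequency to discard spurious poles.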

📈 Results

All analysis results are automatically generated and saved to:

  • Audio Outputs: results/audio/ - Cleaned and processed audio files
  • Visualizations: results/figures/ - High-quality analysis plots

🛠 Dependencies

Core dependencies managed by uv:

| Package      | Version  | Purpose             |
|--------------|----------|---------------------|
| NumPy        | ≥2.4.0   | Numerical computing |
| SciPy        | ≥1.16.3  | Signal processing   |
| Matplotlib   | ≥3.10.8  | Data visualization  |
| OpenCV       | ≥4.11.0  | Image processing    |
| scikit-image | ≥0.26.0  | Image algorithms    |
| SoundFile    | ≥0.13.1  | Audio I/O           |
| Streamlit    | ≥1.52.2  | Web interface       |

See pyproject.toml for the complete dependency list.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

👨‍💻 Author

Created with ❤️ by Mobi

🤝 Contributing

Contributions, issues, and feature requests are welcome! Feel free to check the issues page.

📜 Acknowledgments

  • Built with modern Python 3.14 features
  • Powered by the scientific Python ecosystem
  • Inspired by advanced DSP challenges
