
Enhanced Classification Performance in Diabetic Retinopathy Detection Using Knowledge Distillation

Full Paper:

Introduction

This repository demonstrates an approach to improving diabetic retinopathy detection by leveraging Transformer-based knowledge distillation. The primary objective is to boost classification performance while maintaining efficient inference. By combining pre-trained models, Transformer architectures, and best practices such as early stopping, selective layer updating, and hyperparameter tuning, this project aims to offer a more reliable diagnostic tool for retinal image analysis.


Key Features

  • Transformer-Based Knowledge Distillation: Transfer knowledge from a high-capacity teacher model to a lighter student network.
  • Early Stopping & Layer Freezing: Prevent overfitting and reduce training time by selectively freezing layers.
  • Hyperparameter Optimization: Systematic search to find optimal learning rates, batch sizes, and regularization parameters.
  • Robust Performance Metrics: Evaluated using weighted Kappa Score and other classification metrics to ensure clinical relevance.
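As a rough illustration of how the early-stopping, layer-freezing, and weighted-Kappa features above could fit together, here is a minimal PyTorch/scikit-learn sketch. The helper names (`freeze_backbone`, `EarlyStopping`), the prefix convention, and the patience value are assumptions for illustration, not the repository's actual code.

```python
import torch
from sklearn.metrics import cohen_kappa_score

def freeze_backbone(model, trainable_prefixes=("fc",)):
    # Freeze every parameter except those whose name starts with a
    # listed prefix (e.g. the final classifier head).
    for name, param in model.named_parameters():
        param.requires_grad = any(name.startswith(p) for p in trainable_prefixes)
    return model

class EarlyStopping:
    """Stop training when the validation metric stops improving."""
    def __init__(self, patience=5):
        self.patience = patience
        self.best = float("-inf")
        self.counter = 0

    def step(self, metric):
        # Returns True when training should stop.
        if metric > self.best:
            self.best, self.counter = metric, 0
        else:
            self.counter += 1
        return self.counter >= self.patience

# Quadratic weighted Kappa, the metric commonly used for ordinal
# DR severity grades (illustrative labels).
y_true, y_pred = [0, 1, 2, 2], [0, 1, 2, 2]
kappa = cohen_kappa_score(y_true, y_pred, weights="quadratic")
```

In a training loop, `EarlyStopping.step` would be called once per epoch with the validation Kappa, and training halts when it returns `True`.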

Technologies Used

  • Programming Language: Python
  • Deep Learning Framework: PyTorch
  • Machine Learning Libraries: scikit-learn, Keras
  • Plotting & Visualization: Matplotlib

Methodology & Architecture

Below is a high-level overview of the knowledge distillation process and training setups:

Architecture of KD
  1. Teacher Model (MobileVitV2)
  2. Student Model (GoogleNet)
  3. Distillation Loss combining teacher outputs and ground truth.
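The distillation loss in step 3 can be sketched as the standard soft-target formulation: a temperature-softened KL-divergence term against the teacher's outputs, blended with cross-entropy against the ground truth. The temperature and mixing weight `alpha` below are illustrative defaults, not values taken from this repository.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets,
                      temperature=4.0, alpha=0.7):
    # Soft targets: KL divergence between the softened teacher and
    # student distributions, scaled by T^2 to keep gradients comparable.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Hard targets: standard cross-entropy against the ground truth.
    hard_loss = F.cross_entropy(student_logits, targets)
    return alpha * soft_loss + (1 - alpha) * hard_loss
```

During training, the teacher (MobileVitV2) runs in inference mode to produce `teacher_logits`, while only the student (GoogleNet) receives gradient updates.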

Results & Analysis

We compared multiple models (ResNet34, MobileVitV2, DenseNet121, GoogleNet, DeiT3) under three different strategies (Imbalance, Augmented, and Cost-Sensitive Learning). The bar chart below shows training times for each combination:

Training Time of All The Models
  • Cost-Sensitive Learning reduces training time by ~5% on average relative to the Imbalance approach, and by ~25% relative to Augmented methods.
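One common way to implement a cost-sensitive strategy like the one compared above is to weight the loss by inverse class frequency, so that rarer severity grades cost more to misclassify. The class counts below are purely hypothetical; the actual weighting scheme used in this project may differ.

```python
import torch
import torch.nn as nn

# Hypothetical per-class sample counts for the 5 DR severity grades.
class_counts = torch.tensor([1805.0, 370.0, 999.0, 193.0, 295.0])

# Inverse-frequency weights, normalized so they average to 1.
weights = class_counts.sum() / (len(class_counts) * class_counts)

# Weighted cross-entropy: errors on rare grades are penalized more,
# without duplicating samples as augmentation-based balancing would.
criterion = nn.CrossEntropyLoss(weight=weights)
```

Because no extra samples are generated, each epoch sees the same amount of data as the plain imbalanced setup, which is consistent with the training-time savings over augmentation reported above.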

Explainable AI with SHAP

We employed SHAP (SHapley Additive exPlanations) to interpret and visualize the model’s predictions. This helps identify the critical regions in retinal images that drive classification decisions:

Explainable AI

License

This project is licensed under the MIT License. Feel free to use, modify, and distribute it as per the terms of this license.


Contact

If you have any questions or suggestions, please reach out:


Thank you for your interest! If you find this project useful or have ideas for improvement, consider starring the repository or contributing.