This repository collects all relevant resources about interpretability in LLMs
Updated Nov 1, 2024
MICCAI 2022 (Oral): Interpretable Graph Neural Networks for Connectome-Based Brain Disorder Analysis
[KDD'22] Source codes of "Graph Rationalization with Environment-based Augmentations"
(ICML 2023) Discover and Cure: Concept-aware Mitigation of Spurious Correlation
Official code for the CVPR 2022 (oral) paper "OrphicX: A Causality-Inspired Latent Variable Model for Interpreting Graph Neural Networks."
[ICCV 2023] Learning Support and Trivial Prototypes for Interpretable Image Classification
[TPAMI 2025] Mixture of Gaussian-distributed Prototypes with Generative Modelling for Interpretable and Trustworthy Image Recognition
[CVPR 2025] Concept Bottleneck Autoencoder (CB-AE): efficiently transforms any pretrained (black-box) image generative model into an interpretable generative concept bottleneck model (CBM) with minimal concept supervision, while preserving image quality
Codebase for the paper "The Remarkable Robustness of LLMs: Stages of Inference?"
hopwise: A Python Library for Explainable Recommendation based on Path Reasoning over Knowledge Graphs, ACM CIKM '25
TraceFL is a novel mechanism for Federated Learning that achieves interpretability by tracking neuron provenance. It identifies clients responsible for global model predictions, achieving 99% accuracy across diverse datasets (e.g., medical imaging) and neural networks (e.g., GPT).
Explainable AI: From Simple Rules to Complex Generative Models
Layer-wise Semantic Dynamics (LSD) is a model-agnostic framework for hallucination detection in Large Language Models (LLMs). It analyzes the geometric evolution of hidden-state semantics across transformer layers, using contrastive alignment between model activations and ground-truth embeddings to detect factual drift and semantic inconsistency.
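The repository's exact method is not reproduced here; as a rough illustration of the idea of tracking semantic drift across transformer layers, one might compare each layer's pooled hidden state against a reference embedding (the real LSD framework uses contrastive alignment, not the raw cosine similarity sketched below, and `layerwise_drift` is a hypothetical name):

```python
import numpy as np

def layerwise_drift(layer_states, reference):
    """Cosine similarity of each layer's mean-pooled hidden state
    against a reference (ground-truth) embedding. A sharp drop in
    similarity across layers would be read as semantic drift.
    Illustrative sketch only, not the LSD implementation."""
    sims = []
    for h in layer_states:                     # h: (seq_len, hidden_dim)
        pooled = h.mean(axis=0)                # mean-pool over tokens
        sims.append(float(
            pooled @ reference
            / (np.linalg.norm(pooled) * np.linalg.norm(reference) + 1e-9)
        ))
    return sims

# toy usage: 4 layers of activations, seq_len 5, hidden_dim 8
rng = np.random.default_rng(0)
states = [rng.normal(size=(5, 8)) for _ in range(4)]
ref = rng.normal(size=8)
scores = layerwise_drift(states, ref)          # one similarity per layer
```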
This repository contains the official code of the paper: "Comprehensive Information Bottleneck for Unveiling Universal Attribution to Interpret Vision Transformers", which is published in CVPR 2025.
Explainable Boosting Machines
Semi-supervised Concept Bottleneck Models (SSCBM)
Code for locating "critical neurons" in LLMs. We show that masking as few as 3 neurons can cripple a model's capabilities (ICLR 2026).
Explainable Speaker Recognition
An interpretable model for survival prediction in competing-risks settings. Check out our blog! https://vectorinstitute.github.io/crisp-nam/blog/
Build a neural net from scratch, without Keras or PyTorch, using only NumPy for the math and pandas for data loading.
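A minimal sketch of the kind of from-scratch NumPy network that description suggests: a two-layer net trained with manual backpropagation on toy data (all names and the toy task below are illustrative, not taken from the repository):

```python
import numpy as np

# Toy data: XOR-like binary labels from 2-D inputs.
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(float)[:, None]

# Two-layer network: 2 -> 8 (tanh) -> 1 (sigmoid).
W1, b1 = rng.normal(scale=0.5, size=(2, 8)), np.zeros(8)
W2, b2 = rng.normal(scale=0.5, size=(8, 1)), np.zeros(1)
lr = 0.5

for _ in range(500):
    # forward pass
    h = np.tanh(X @ W1 + b1)
    p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))   # sigmoid output

    # backward pass (gradient of binary cross-entropy w.r.t. logits)
    dz2 = (p - y) / len(X)
    dW2, db2 = h.T @ dz2, dz2.sum(axis=0)
    dh = (dz2 @ W2.T) * (1.0 - h ** 2)         # tanh derivative
    dW1, db1 = X.T @ dh, dh.sum(axis=0)

    # plain gradient-descent updates
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

accuracy = float(((p > 0.5) == y).mean())
```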