antdragiotis/Fairlearn-Credit-Quality-Bias-Analysis

Bank Loan Credit Quality – Fairness & Bias Analysis

A Python/Jupyter notebook that detects and mitigates age-based bias in bank loan credit quality predictions using the Fairlearn library.


Overview

The notebook trains a Decision Tree classifier to predict the credit quality (Credit_Mix) of bank loans, labelled as Bad or Good/Standard, and then investigates whether the model treats borrowers of different age groups fairly. It applies two bias mitigation strategies, one in-processing and one post-processing, and compares their results in the following chart:

[Figure: Bias Mitigation Strategies Results]


Dataset

An Excel file (~4,300 records) containing individual loan applicant data with fields such as Age, Annual_Income, Outstanding_Debt, Num_of_Delayed_Payment, Interest_Rate, and Credit_Mix (the target variable).
The Excel file is a cleaned subset of the "Credit score classification" dataset on Kaggle (https://www.kaggle.com/datasets/parisrohan/credit-score-classification), which contains individuals' credit-related information.


Process Pipeline

1. Data Loading & Preprocessing

  • Validates required columns and coerces numeric types
  • Builds a binary target: Bad (1) vs Good/Standard (0)
  • Bins Age into five groups: ≤25, 26–35, 36–45, 46–60, 60+
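
The preprocessing steps above can be sketched roughly as follows. This is an illustrative outline, not the notebook's actual code: the `preprocess` helper, the `is_bad` and `age_group` column names, and the sample rows are all hypothetical, and the bin edges assume right-inclusive intervals (so an applicant aged exactly 60 falls into 46–60).

```python
import pandas as pd

# Column names follow the Dataset section; everything else is illustrative.
REQUIRED = ["Age", "Annual_Income", "Outstanding_Debt", "Credit_Mix"]
NUMERIC = ["Age", "Annual_Income", "Outstanding_Debt"]
AGE_LABELS = ["≤25", "26–35", "36–45", "46–60", "60+"]


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Validate columns, coerce numerics, build the target and age groups."""
    missing = [c for c in REQUIRED if c not in df.columns]
    if missing:
        raise ValueError(f"Missing required columns: {missing}")

    out = df.copy()
    # Unparseable numeric values become NaN and those rows are dropped.
    out[NUMERIC] = out[NUMERIC].apply(pd.to_numeric, errors="coerce")
    out = out.dropna(subset=NUMERIC + ["Credit_Mix"])

    # Binary target: Bad -> 1, Good/Standard -> 0.
    out["is_bad"] = (out["Credit_Mix"] == "Bad").astype(int)

    # Right-inclusive bins: (0, 25], (25, 35], (35, 45], (45, 60], (60, inf).
    out["age_group"] = pd.cut(
        out["Age"],
        bins=[0, 25, 35, 45, 60, float("inf")],
        labels=AGE_LABELS,
    )
    return out


# Made-up rows, including a string-typed number and an unparseable age.
sample = pd.DataFrame({
    "Age": [22, "31", 45, 61, "n/a"],
    "Annual_Income": [20000, 35000, "52000", 61000, 40000],
    "Outstanding_Debt": [1500.0, 800.0, 300.0, 0.0, 950.0],
    "Credit_Mix": ["Bad", "Standard", "Good", "Good", "Bad"],
})
clean = preprocess(sample)
print(clean[["Age", "Credit_Mix", "is_bad", "age_group"]])
```

Dropping rows with unparseable values is one possible policy; imputing them would be another, and the README does not say which the notebook uses.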

2. Exploratory Data Analysis

  • Visualises age distribution, credit mix per age group, bad-loan rates, and sample representation. The plots reveal that older borrowers are underrepresented and show near-zero bad-loan rates, which can bias the model against younger age groups.

3. Baseline Model & Bias Detection

  • Trains a Decision Tree and evaluates it using Fairlearn's MetricFrame
  • Measures Demographic Parity (selection rate equality) and Equal Opportunity (true positive rate equality) across age groups
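
To make the two fairness criteria concrete, the sketch below computes them by hand with pandas on a tiny made-up example. It is not the notebook's code: the `group_rates` helper, the group names, and the labels are hypothetical. Fairlearn's MetricFrame, used with its `selection_rate` and `true_positive_rate` metrics, reports the same per-group quantities.

```python
import pandas as pd


def group_rates(y_true, y_pred, groups) -> pd.DataFrame:
    """Per-group selection rate and true positive rate."""
    df = pd.DataFrame({"y_true": y_true, "y_pred": y_pred, "group": groups})
    # Selection rate: share of each group predicted positive (here, Bad = 1).
    selection = df.groupby("group")["y_pred"].mean()
    # True positive rate: share of actual positives that are predicted positive.
    tpr = df[df["y_true"] == 1].groupby("group")["y_pred"].mean()
    return pd.DataFrame({"selection_rate": selection, "tpr": tpr})


# Hypothetical labels and predictions for two age groups of four borrowers each.
y_true = [1, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
groups = ["young"] * 4 + ["older"] * 4

rates = group_rates(y_true, y_pred, groups)
# Largest minus smallest group value: 0 would mean perfect parity.
gaps = rates.max() - rates.min()
# Smallest divided by largest selection rate: 1 would mean perfect parity.
selection_ratio = rates["selection_rate"].min() / rates["selection_rate"].max()

print(rates)
print(gaps)
```

Demographic Parity compares the `selection_rate` column across groups and Equal Opportunity compares the `tpr` column. A gap near zero on either one means the groups are treated similarly on that criterion.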

4. Bias Mitigation (two strategies)

| Method | Type | Mechanism |
|---|---|---|
| ExponentiatedGradient | In-processing | Re-weights training to satisfy EqualizedOdds constraints |
| ThresholdOptimizer | Post-processing | Applies group-specific classification thresholds |

5. Comparative Analysis

  • Summary table of Accuracy, DPD, DPR, EOD, EOR across all models
  • Bar charts, radar/spider chart, and heatmaps for visual comparison

Requirements

fairlearn · scikit-learn · pandas · numpy · matplotlib · seaborn

Usage

jupyter notebook bias_analysis.ipynb

About

A comprehensive bias analysis on bank loan data, examining potential unfairness in credit quality predictions across age groups
