ML Lifecycle

Turbofan Predictive Maintenance

This repository implements a portfolio Machine Learning project for predictive maintenance on turbofan engine data. It demonstrates the full ML lifecycle end-to-end: training, deployment, monitoring, drift detection, and automated retraining — all runnable locally with Docker and Python. The focus is on lifecycle robustness and reproducibility, not squeezing the last decimals of accuracy.

Project Overview

  • Problem: predict Remaining Useful Life (RUL) / failure risk for turbofan engines.
  • Lifecycle stages: training, model serving, feedback collection, monitoring, drift detection, automated retraining.
  • Production-like aspects: microservices (BentoML), metrics (Prometheus), dashboards (Grafana), experiment tracking (MLflow), Docker Compose orchestration, local demo traffic generator.
  • Positioning: end-to-end ML engineering/MLOps project, not state-of-the-art RUL. The project is designed to be fast and to run on any computer, not to produce competitive RUL predictions.
  • Code quality assurance: Tox, Mypy, and GitHub Actions CI/CD.

System Architecture

```mermaid
flowchart TD
    %% User Layer
    Demo[Demo Users<br/>continuous_predict.py]

    %% API Services Layer
    PredAPI[Prediction API<br/>:3000]
    FeedAPI[Feedback API<br/>:3001]
    DriftAPI[Drift Detection<br/>:3003]
    RetrainAPI[Retraining Service<br/>:3004]

    %% Storage Layer
    FeedStore[(Feedback Store<br/>rul_feedback.jsonl)]
    ModelStore[(Model Store<br/>models/*.joblib)]

    %% External Services
    Monitor[Prometheus :9090<br/>Grafana :3002]
    MLflow[MLflow :5000]

    %% Main Flow
    Demo -->|HTTP requests| PredAPI
    PredAPI -->|predictions| FeedAPI
    FeedAPI --> FeedStore
    FeedStore --> DriftAPI
    DriftAPI -->|trigger| RetrainAPI
    RetrainAPI --> ModelStore
    ModelStore -.->|hot reload| PredAPI

    %% Monitoring
    DriftAPI -.-> Monitor
    Monitor -.-> DriftAPI

    %% Experiment Tracking
    RetrainAPI --> MLflow

    %% Styling
    classDef service fill:#1976d2,stroke:#0d47a1,color:#ffffff
    classDef storage fill:#f57c00,stroke:#e65100,color:#ffffff
    classDef external fill:#7b1fa2,stroke:#4a148c,color:#ffffff
    classDef user fill:#388e3c,stroke:#1b5e20,color:#ffffff

    class Demo user
    class PredAPI,FeedAPI,DriftAPI,RetrainAPI service
    class FeedStore,ModelStore storage
    class Monitor,MLflow external
```
  • Demo: a traffic generator that simulates users calling the prediction endpoint concurrently.
  • Serving: BentoML microservices for prediction, feedback, drift detection, and retraining.
  • Feedback: JSONL storage (a flat file, for simplicity) plus basic RUL statistics computed from it.
  • Monitoring: Prometheus metrics from services + Grafana dashboards.
  • Drift detection: PSI/KS-style feature drift metrics and RMSE deltas against the model baseline; can trigger retraining.
  • Local-first: designed to run on a single machine via Docker Compose.
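As a rough sketch of the append-only JSONL feedback store described above (the field names here are hypothetical; the real service may record more per prediction):

```python
import json
from pathlib import Path


def append_feedback(path: Path, unit_id: int, predicted_rul: float, actual_rul: float) -> None:
    """Append one feedback record as a single JSON line (simple, append-only storage)."""
    record = {"unit_id": unit_id, "predicted_rul": predicted_rul, "actual_rul": actual_rul}
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a") as f:
        f.write(json.dumps(record) + "\n")


def load_feedback(path: Path) -> list[dict]:
    """Read all feedback records back for stats or drift computation."""
    if not path.exists():
        return []
    return [json.loads(line) for line in path.read_text().splitlines() if line]
```

A flat JSONL file keeps writes atomic-enough for a local demo while staying trivially inspectable with standard tools.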

Tech Stack

  • Python (pydantic, pandas, numpy, scikit-learn)
  • Serving: BentoML
  • Monitoring: Prometheus + Grafana (pre-provisioned dashboards)
  • Experiment tracking: MLflow (for runs/metrics; not using Model Registry)
  • Orchestration: Docker & Docker Compose

Dashboards and explanations

Overview

Overview Monitoring

Shows the health of each service, the number of predictions made, and errors (generated from bad input in the demo script).
Each color is a different trained model.
Vertical purple lines mark retraining triggers.

Drift

Drift Monitoring

Grafana dashboard (10-minute window) for drift metrics and retraining triggers.

Metrics

  • RMSE baseline: RMSE over the last minute of predictions vs. actual RUL, compared to the baseline of the last trained model. Crossing the warning level (orange line) triggers retraining.
  • KS: Kolmogorov-Smirnov statistic for feature drift (threshold KS > 0.1). Crossing the warning level (orange line) triggers retraining.
  • PSI: percentage of features that have shifted significantly (PSI > 0.1). Crossing the warning level (orange line) triggers retraining.

Each orange rectangle signifies a drift signal, calling for a retraining.
Each purple line marks a newly trained model (v1.0, v2.0, v3.0, etc.).
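A minimal sketch of how PSI and KS drift checks like these can be computed with NumPy/SciPy (the thresholds match the 0.1 values above; the decile binning and epsilon smoothing are common choices, not necessarily the project's exact implementation):

```python
import numpy as np
from scipy.stats import ks_2samp


def psi(baseline: np.ndarray, current: np.ndarray, bins: int = 10) -> float:
    """Population Stability Index between two samples of one feature.
    Buckets are decile edges of the baseline; epsilon avoids log(0)."""
    edges = np.quantile(baseline, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf  # catch values outside the baseline range
    p = np.histogram(baseline, bins=edges)[0] / len(baseline)
    q = np.histogram(current, bins=edges)[0] / len(current)
    eps = 1e-6
    return float(np.sum((p - q) * np.log((p + eps) / (q + eps))))


def feature_drifted(baseline: np.ndarray, current: np.ndarray,
                    psi_thresh: float = 0.1, ks_thresh: float = 0.1) -> bool:
    """Flag drift when either the PSI or the KS statistic crosses its threshold."""
    ks_stat = ks_2samp(baseline, current).statistic
    return psi(baseline, current) > psi_thresh or ks_stat > ks_thresh
```

Both statistics are distribution-free, which is why they work on raw sensor features without assuming normality.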

Without any retraining, it looks like this:

Drift Monitoring no retrain

Data & Modeling

Dataset: NASA C-MAPSS turbofan engine degradation time series (multiple units, cycles, sensor readings).
Source: https://www.kaggle.com/datasets/behrad3d/nasa-cmaps

turbofan

  • Features: per-engine time-based features on multiple cycles (3 settings, 21 sensors).
  • The dataset ships with a train split and a test split; ground-truth RUL is available for both.
  • Model: Random Forest for speed/simplicity and quick iterations. Model bundles and feature names are tracked and hot-reloaded by the prediction service.
  • Results: RUL prediction performance is sufficient to demonstrate lifecycle behaviors; the emphasis is on system behavior rather than SOTA metrics (this is a portfolio project).
  • Future: explore LSTM/GRU or other sequence models for better temporal modeling.
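A rough sketch of what training and bundling a Random Forest with its feature names might look like (the bundle layout, the `version` field, and the hyperparameters are illustrative assumptions):

```python
import joblib
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def train_and_bundle(X: np.ndarray, y: np.ndarray,
                     feature_names: list[str], path: str) -> dict:
    """Fit a small Random Forest and save a self-describing model bundle."""
    model = RandomForestRegressor(n_estimators=50, random_state=42)
    model.fit(X, y)
    bundle = {"model": model, "feature_names": feature_names, "version": "v1.0"}
    joblib.dump(bundle, path)
    return bundle


def load_bundle(path: str) -> dict:
    """The prediction service can reload this bundle whenever the file changes."""
    return joblib.load(path)
```

Storing the feature names next to the model lets the serving layer validate and order incoming sensor columns before predicting.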

Current performance with a Random Forest, simple hyperparameters, all feature engineering, training on all train data and testing on all test data:
rul-results

Demo explanation - drift strategy

  • The Train and Test sets have the same number of engine units (but different time series for each). The Test set simply has different 'flight conditions' (different distributions of sensor data over time).
  • We train on a subset of the Train data and evaluate on the corresponding units of the Test set.
  • The demo starts with the first 10% of the Train set to train the first model (with the RMSE baseline computed on the corresponding 10% of the Test set).
  • The demo script then sends data (from the Test set) to the Prediction API and collects feedback (RUL predictions and ground truths).
  • The feedback is used to compute metrics and trigger drift detection in another service (see the drift dashboard above).
  • If drift is detected, the retraining service retrains on the Train-set units corresponding to the Test-set units seen in 'production'.
  • As soon as a new model is trained, the prediction service picks up the new model bundle via hot reload.
  • The demo continues until the end of the Test set, which is programmed to last 10 minutes.
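The hot-reload step can be as simple as polling the bundle file's modification time. A stdlib-only sketch of that pattern (the real prediction service may watch the model store differently):

```python
import os
from typing import Callable, Any


class HotReloader:
    """Reload a model bundle whenever its file's mtime changes.

    `loader` is any function that turns a path into a model object
    (e.g. joblib.load); the polling-on-read scheme is an assumption.
    """

    def __init__(self, path: str, loader: Callable[[str], Any]):
        self.path = path
        self.loader = loader
        self._mtime: float | None = None
        self.model: Any = None

    def get(self) -> Any:
        """Return the current model, reloading it first if the file changed."""
        mtime = os.path.getmtime(self.path)
        if mtime != self._mtime:
            self._mtime = mtime
            self.model = self.loader(self.path)
        return self.model
```

Checking the mtime on every request is cheap and avoids restarting the service when the retrainer writes a new bundle.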

Quickstart

Prereqs: macOS/Linux, Python 3.13+, Docker, a Kaggle Legacy API Key (.json), and a populated .env file.

  1. Clone and enter the repo
git clone https://github.com/<you>/Turbofan-ML-lifecycle.git
cd Turbofan-ML-lifecycle
  2. Create .env from the example (mandatory)
cp .env.example .env
  3. Create a venv and install deps via uv
python -m venv .venv && source .venv/bin/activate
python -m pip install uv && uv sync
  4. Configure a Kaggle "Legacy API Key", then download & prepare the data:
    • How to get the token (while logged in to your Kaggle account):
    • Then run:
      uv run initialize
      This downloads the data, prepares it, and performs feature engineering (mandatory).

If, for any reason, Kaggle authentication fails, just paste:

{"username": "your_kaggle_username","key": "your_api_key_here"}

in your ~/.kaggle/kaggle.json file.

  5. Start the stack (build all images)
docker compose up --build

Dashboards:

Then wait for all services to be up (see the Grafana 'Overview' dashboard) and, in another terminal, run the demo script:

source .venv/bin/activate && uv run continuous

From here, the demo takes 10 minutes to complete.

After that, you can stop the demo script with Ctrl+C in the terminal running it, and then stop the stack with docker compose down.

Other links:

What this project contains

  • Time-series predictive maintenance feature engineering
  • Reproducible training/evaluation with experiment tracking (MLflow)
  • Model serving with a proper API (BentoML)
  • Metrics instrumentation and dashboards (Prometheus + Grafana)
  • Drift detection and automated retraining loop
  • Containerized, local, production-like environment (Docker Compose)

Roadmap / Future Work

  • Sequence models (LSTM/GRU) for better temporal dynamics
  • Hardening for production: CI/CD, more tests, container hardening, k8s (mini kube) deployment

About

Production-grade machine learning system demonstrating the entire ML lifecycle, including training, deployment, monitoring, automated drift detection, and retraining. Uses real-world data from Kaggle: the NASA Turbofan Engine Degradation Dataset.
