nsia-henry/ussd-analytics-etl-pipeline

USSD Analytics ETL Pipeline

A portfolio project demonstrating how raw USSD session events can be transformed into analytics-ready datasets and dashboards for product, operations, and business decision-making.

This project is inspired by a production analytics workflow I previously worked on, where I designed and supported real-time data pipelines and dashboards for USSD applications. The production architecture was Kafka → Flink SQL → ClickHouse → Superset + CubeJS. For this portfolio version, I recreated the core concepts in a simplified environment suitable for public sharing and demonstration.

#ETL #Kafka #Flink #ClickHouse #ApacheSuperset #SQL #DataEngineering #DataAnalytics #BusinessIntelligence #PortfolioProject

Project Goal

USSD applications generate high volumes of interaction data, but raw session events are not immediately useful for analytics teams. This project shows how those events can be modeled, transformed, and delivered into dashboards that answer key business questions such as:

  • How many sessions are being initiated over time?
  • How many users have visited the app?
  • Which menu paths are most used?
  • What is the completion vs timeout rate?
  • How do different telcos compare in usage patterns?
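As a rough illustration of how the questions above map onto session-level data, here is a minimal Python sketch. The record fields (`user`, `telco`, `menu_path`, `status`, `day`) and the sample values are assumptions made for this example, not the schema used in this repository.

```python
from collections import Counter

# Hypothetical session records; field names and values are illustrative only.
SAMPLE_SESSIONS = [
    {"session_id": "s1", "user": "u1", "telco": "TelcoA",
     "menu_path": "1>2", "status": "completed", "day": "2024-01-01"},
    {"session_id": "s2", "user": "u2", "telco": "TelcoA",
     "menu_path": "1>2", "status": "timeout", "day": "2024-01-01"},
    {"session_id": "s3", "user": "u1", "telco": "TelcoB",
     "menu_path": "1>3", "status": "completed", "day": "2024-01-02"},
    {"session_id": "s4", "user": "u3", "telco": "TelcoB",
     "menu_path": "1>2", "status": "completed", "day": "2024-01-02"},
]


def summarize(sessions):
    """Compute one answer per business question from session records."""
    total = len(sessions)
    status_counts = Counter(s["status"] for s in sessions)
    return {
        # Sessions initiated over time (here: per day).
        "sessions_per_day": dict(Counter(s["day"] for s in sessions)),
        # Distinct users who opened at least one session.
        "unique_users": len({s["user"] for s in sessions}),
        # Most frequently used menu paths.
        "top_menu_paths": Counter(s["menu_path"] for s in sessions).most_common(3),
        # Completion vs timeout rate as a share of all sessions.
        "completion_rate": status_counts["completed"] / total if total else 0.0,
        "timeout_rate": status_counts["timeout"] / total if total else 0.0,
        # Usage comparison across telcos.
        "sessions_per_telco": dict(Counter(s["telco"] for s in sessions)),
    }


summary = summarize(SAMPLE_SESSIONS)
```

In the dashboard these aggregates would come from SQL over the analytics tables; the Python version only makes the metric definitions explicit.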

Architecture Overview

Production-inspired architecture

Kafka → Flink SQL → ClickHouse → Superset / CubeJS

This portfolio's implementation

Modeled event data → SQL-based transformations → analytics-ready database tables → Preset dashboard
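The flow above can be sketched end to end in a few lines. The example below uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL so that it runs without any setup; the table names (`raw_events`, `session_summary`), columns, and event types are assumptions for illustration and may differ from the SQL in `portfolio-demo/`.

```python
import sqlite3

# Hypothetical raw events:
# (session_id, msisdn, telco, event_type, event_ts in epoch seconds)
RAW_EVENTS = [
    ("s1", "user-1", "TelcoA", "start", 100),
    ("s1", "user-1", "TelcoA", "menu_select", 105),
    ("s1", "user-1", "TelcoA", "end", 112),
    ("s2", "user-2", "TelcoB", "start", 200),
    ("s2", "user-2", "TelcoB", "timeout", 230),
]

# Collapse event rows into one analytics-ready row per session.
TRANSFORM_SQL = """
CREATE TABLE session_summary AS
SELECT
    session_id,
    MIN(msisdn) AS msisdn,
    MIN(telco) AS telco,
    MIN(event_ts) AS started_at,
    MAX(event_ts) - MIN(event_ts) AS duration_seconds,
    COUNT(*) AS event_count,
    CASE
        WHEN SUM(CASE WHEN event_type = 'end' THEN 1 ELSE 0 END) > 0
            THEN 'completed'
        WHEN SUM(CASE WHEN event_type = 'timeout' THEN 1 ELSE 0 END) > 0
            THEN 'timeout'
        ELSE 'unknown'
    END AS status
FROM raw_events
GROUP BY session_id
"""


def run_pipeline(events):
    """Load raw events, run the SQL transformation, return summary rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE raw_events ("
            "session_id TEXT, msisdn TEXT, telco TEXT, "
            "event_type TEXT, event_ts INTEGER)"
        )
        conn.executemany("INSERT INTO raw_events VALUES (?, ?, ?, ?, ?)", events)
        conn.execute(TRANSFORM_SQL)
        return conn.execute(
            "SELECT session_id, telco, duration_seconds, event_count, status "
            "FROM session_summary ORDER BY session_id"
        ).fetchall()
    finally:
        conn.close()


rows = run_pipeline(RAW_EVENTS)
```

A dashboard tool would then query `session_summary` directly instead of scanning raw events on every chart refresh.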

Key Features

  • Designed a USSD session event/object model for analytics processing
  • Defined transformation logic for session-level analytics
  • Built analytics-ready tables for dashboard consumption
  • Created dashboard wireframes before implementation
  • Developed a live USSD analytics dashboard
  • Documented the technical design, metrics, and tradeoffs
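To give a sense of what a session event model might look like, here is a small hypothetical sketch. The `UssdEvent` class, its fields, and the `menu_path` helper are invented for this example; the model actually used in the project is described in `docs/data-model.md`.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class UssdEvent:
    """One interaction within a USSD session (illustrative fields only)."""
    session_id: str
    msisdn: str
    telco: str
    event_type: str  # assumed values: "start", "menu_select", "end", "timeout"
    event_ts: int    # epoch seconds
    menu_option: Optional[str] = None  # set only for "menu_select" events


def menu_path(events: Iterable[UssdEvent], session_id: str) -> str:
    """Join a session's menu selections, in time order, into a path like '1>3'."""
    selections = sorted(
        (
            e for e in events
            if e.session_id == session_id
            and e.event_type == "menu_select"
            and e.menu_option is not None
        ),
        key=lambda e: e.event_ts,
    )
    return ">".join(e.menu_option for e in selections)


# Events may arrive out of order; the helper sorts them by timestamp.
events = [
    UssdEvent("s1", "user-1", "TelcoA", "menu_select", 110, "3"),
    UssdEvent("s1", "user-1", "TelcoA", "start", 100),
    UssdEvent("s1", "user-1", "TelcoA", "menu_select", 105, "1"),
    UssdEvent("s1", "user-1", "TelcoA", "end", 115),
]
path = menu_path(events, "s1")
```

Deriving the path once per session keeps the "most used menu paths" metric a simple group-by on a single column.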

Dashboard Preview

[Image: USSD analytics dashboard preview]

Live Dashboard

Public dashboard link: here

Tech Stack

Portfolio stack

  • PostgreSQL
  • SQL
  • AWS
  • Preset (managed Apache Superset) for dashboards

Production-inspired stack

  • Apache Kafka
  • Flink SQL
  • ClickHouse
  • Apache Superset
  • CubeJS

Repository Structure

ussd-analytics-etl-pipeline/
│
├── README.md
├── requirements.md
│
├── docs/
│   ├── project-overview.md
│   ├── architecture.md
│   ├── data-model.md
│   ├── dashboard-metrics.md
│   └── artifacts-guide.md
│
├── production-artifacts/
│   ├── flink-session-transformations.sql
│   ├── kafka-to-clickhouse-config.sql
│   └── cubejs-session-history-schema.js
│
├── portfolio-demo/
│   └── postgres-setup-and-sample-data.sql
│
├── wireframes/
│   ├── ussd-analytics-dashboard.bmpr
