
AI-Enhanced W-TinyLFU Cache Prototype

A Java-based high-performance caching library that extends the W-TinyLFU algorithm with AI-based retention and eviction policies. Inspired by Caffeine, this prototype demonstrates how machine learning can enhance traditional cache admission policies.

Overview

This prototype implements a Window TinyLFU (W-TinyLFU) cache with an AI enhancement layer that predicts future access patterns, enabling admission and eviction decisions that go beyond traditional frequency-based heuristics.

Key Features

  • W-TinyLFU Architecture: Three-tiered cache structure (window, probationary, protected)
  • AI-Enhanced Admission Policy: ML-based prediction of entry value
  • Access Pattern Classification: Identifies hot spots, sequential scans, temporal patterns
  • Count-Min Sketch: Space-efficient frequency estimation
  • Online Learning: Continuous model improvement based on workload
  • Comprehensive Statistics: Detailed metrics for performance analysis
  • Thread-Safe: Concurrent access support with lock amortization

Architecture

Cache Structure

┌─────────────────────────────────────────────────────────┐
│                    AI-Enhanced Cache                     │
├─────────────────────────────────────────────────────────┤
│  Window Cache (1%)                                       │
│  ├─ Admission buffer for new entries                    │
│  └─ LRU eviction                                         │
├─────────────────────────────────────────────────────────┤
│  Probationary Segment (20%)                              │
│  ├─ Entries promoted from window                        │
│  └─ Candidates for eviction                             │
├─────────────────────────────────────────────────────────┤
│  Protected Segment (80%)                                 │
│  ├─ Hot entries with proven value                       │
│  └─ Promoted from probationary on re-access             │
└─────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────┐
│                   AI Enhancement Layer                   │
├─────────────────────────────────────────────────────────┤
│  Count-Min Sketch                                        │
│  ├─ Frequency estimation (4-bit counters)               │
│  └─ Periodic aging mechanism                            │
├─────────────────────────────────────────────────────────┤
│  ML Predictor                                            │
│  ├─ Feature extraction (7 features)                     │
│  ├─ Linear regression model                             │
│  └─ Online gradient descent training                    │
├─────────────────────────────────────────────────────────┤
│  Pattern Classifier                                      │
│  ├─ Hot spot detection                                  │
│  ├─ Sequential scan identification                      │
│  ├─ Temporal pattern recognition                        │
│  └─ Working set analysis                                │
└─────────────────────────────────────────────────────────┘
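
To make the frequency layer concrete, here is a toy count-min sketch in which byte counters saturating at 15 stand in for packed 4-bit counters, and a halving step implements the periodic aging. This is an illustrative sketch under those assumptions, not the project's CountMinSketch class:

import java.util.concurrent.ThreadLocalRandom;

// Toy count-min sketch: byte counters saturating at 15 stand in for
// packed 4-bit counters; age() halves everything so stale keys decay.
final class TinyCountMinSketch {
    private final byte[][] table;     // depth rows of width counters
    private final int[] seeds;        // one odd hash seed per row
    private final int width;
    private final long resetThreshold;
    private long additions;

    TinyCountMinSketch(int width, int depth) {
        this.width = width;
        this.table = new byte[depth][width];
        this.seeds = new int[depth];
        this.resetThreshold = 10L * width;
        for (int i = 0; i < depth; i++) {
            seeds[i] = ThreadLocalRandom.current().nextInt() | 1;
        }
    }

    private int index(Object key, int row) {
        int h = key.hashCode() * seeds[row];
        return Math.floorMod(h ^ (h >>> 16), width);
    }

    void increment(Object key) {
        for (int row = 0; row < table.length; row++) {
            int i = index(key, row);
            if (table[row][i] < 15) table[row][i]++;   // saturate at 15
        }
        if (++additions >= resetThreshold) age();      // periodic aging
    }

    int estimate(Object key) {
        int min = 15;   // the minimum over rows bounds the true count
        for (int row = 0; row < table.length; row++) {
            min = Math.min(min, table[row][index(key, row)]);
        }
        return min;
    }

    private void age() {               // halve all counters
        for (byte[] row : table) {
            for (int i = 0; i < row.length; i++) row[i] >>= 1;
        }
        additions = 0;
    }
}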

AI Features

The AI layer extracts the following features from each cache entry; a minimal extraction sketch follows the list:

  1. Access Count (log-normalized): Historical access frequency
  2. Recency: Seconds since last access
  3. Age: Seconds since entry creation
  4. Access Rate: Accesses per second
  5. Access Variance: Pattern regularity measure
  6. Time of Day: Temporal cyclical feature
  7. Size: Entry value size proxy
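
As a sketch of how such a feature vector might be computed, the snippet below derives the seven values from per-entry metadata. The parameter names and normalizations here are assumptions for illustration, not the project's FeatureExtractor API:

double[] extractFeatures(long accessCount, long lastAccessNanos,
                         long createdNanos, double accessVariance,
                         int valueSize, long nowNanos) {
    double recencySec = (nowNanos - lastAccessNanos) / 1e9;
    double ageSec = Math.max((nowNanos - createdNanos) / 1e9, 1e-9);
    // Simple time-of-day proxy; a sin/cos encoding would be properly cyclical.
    double hourOfDay = java.time.LocalTime.now().getHour() / 24.0;
    return new double[] {
        Math.log1p(accessCount),   // 1. access count, log-normalized
        recencySec,                // 2. recency: seconds since last access
        ageSec,                    // 3. age: seconds since creation
        accessCount / ageSec,      // 4. access rate: accesses per second
        accessVariance,            // 5. access variance: regularity measure
        hourOfDay,                 // 6. time of day, cyclical feature proxy
        Math.log1p(valueSize)      // 7. size proxy for the value
    };
}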

Admission Decision

The admission policy combines traditional frequency-based scoring with AI predictions:

score = (aiWeight × predictedValue) + (frequencyWeight × frequencyScore)

Default weights: 70% AI, 30% frequency
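
A minimal sketch of that combination, assuming the two weights sum to 1 and that a candidate must out-score the would-be eviction victim to be admitted (names are illustrative):

// Illustrative names; assumes frequencyWeight = 1 - aiWeight.
double admissionScore(double predictedValue, double frequencyScore, double aiWeight) {
    return aiWeight * predictedValue + (1.0 - aiWeight) * frequencyScore;
}

// A candidate is admitted only if it out-scores the would-be victim.
boolean admit(double candidateScore, double victimScore) {
    return candidateScore > victimScore;
}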

Installation

Prerequisites

  • Java 21 or higher
  • Gradle 8.0 or higher

Build

# Build the project
./gradlew build

# Run tests
./gradlew test

# Run example
./gradlew run

Usage

Basic Usage

// Create a cache with default configuration
AIEnhancedCacheConfig config = AIEnhancedCacheConfig.builder()
        .maximumSize(10_000)
        .enableAI(true)
        .aiWeight(0.7)
        .build();

AIEnhancedCache<String, String> cache = new AIEnhancedCache<>(config);

// Put entries
cache.put("user:123", "Alice");
cache.put("product:456", "Widget");

// Get entries
String user = cache.get("user:123");    // Cache hit
String missing = cache.get("user:999"); // Cache miss (returns null)

// View statistics
System.out.println(cache.getStatistics());

Advanced Configuration

AIEnhancedCacheConfig config = AIEnhancedCacheConfig.builder()
    .maximumSize(50_000)           // Maximum cache entries
    .windowSize(500)               // Admission window size (1% recommended)
    .featureHistorySize(20)        // Access history per entry
    .enableAI(true)                // Enable AI predictions
    .aiWeight(0.8)                 // 80% AI, 20% frequency
    .maxTrainingExamples(2000)     // Training set size
    .learningRate(0.01)            // Gradient descent learning rate
    .build();

AIEnhancedCache<Integer, DataObject> cache = new AIEnhancedCache<>(config);

Disable AI (Traditional W-TinyLFU)

AIEnhancedCacheConfig config = AIEnhancedCacheConfig.builder()
    .maximumSize(10_000)
    .enableAI(false)  // Use only frequency-based admission
    .build();

AIEnhancedCache<String, String> cache = new AIEnhancedCache<>(config);

Performance Characteristics

Time Complexity

  • Get: O(1) average
  • Put: O(1) average
  • Eviction: O(1) amortized
  • AI Prediction: O(1) (linear model with fixed features)

Space Complexity

  • Hash Table: O(n) where n is number of entries
  • LRU Queues: O(n) for doubly-linked lists
  • Count-Min Sketch: O(m × d), where m is the width (typically ~4n) and d is the depth (typically 4)
  • ML Model: O(f) where f is number of features (7)
  • Training Buffer: O(t) where t is max training examples

Memory Overhead

Per cache entry:

  • Base entry: ~64 bytes (key, value, metadata)
  • Access history: ~8 bytes × history size
  • Frequency sketch: ~8 bytes (amortized)
  • Total: ~100-200 bytes per entry
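
For example, with featureHistorySize set to 8, a single entry costs roughly 64 + (8 × 8) + 8 ≈ 136 bytes.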

Benchmarks

Example workload (10,000 accesses, 80/20 skew: 80% of accesses go to 20% of keys):

Configuration              Hit Rate   Evictions   AI Influence
-------------------------  ---------  ----------  ------------
Traditional (no AI)        65-70%     8,500       0%
AI-Enhanced (70% weight)   72-78%     7,800       60-70%
AI-Enhanced (90% weight)   75-82%     7,200       80-90%

Note: results vary with workload characteristics.

Components

Core Classes

  • AIEnhancedCache: Main cache implementation
  • CacheEntry: Enhanced entry with AI metadata
  • CountMinSketch: Frequency estimation data structure
  • AIEnhancedAdmissionPolicy: Hybrid admission decision logic

AI Components

  • AIPredictor: Interface for prediction models
  • SimplePredictorModel: Linear regression implementation
  • FeatureExtractor: Feature engineering from cache entries
  • AccessPattern: Pattern classification enum

Configuration

  • AIEnhancedCacheConfig: Builder-based configuration
  • CacheStatistics: Performance metrics tracking

Testing

Run the test suite:

./gradlew test

Run the example program:

./gradlew run

The example demonstrates:

  1. Basic cache operations
  2. Workload simulation with a Zipfian distribution (see the sketch after this list)
  3. Access pattern detection
  4. AI vs traditional comparison
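
The skewed key stream in step 2 can be produced with a simple inverse-CDF sampler like the one below; the exponent, key space, and seed are illustrative choices, not the example program's actual generator:

import java.util.Arrays;
import java.util.Random;

// Inverse-CDF Zipf sampler: precompute cumulative probabilities for
// ranks 1..numKeys, then binary-search a uniform draw into them.
final class ZipfGenerator {
    private final double[] cdf;                // cumulative probabilities
    private final Random rnd = new Random(42);

    ZipfGenerator(int numKeys, double exponent) {
        cdf = new double[numKeys];
        double sum = 0;
        for (int k = 1; k <= numKeys; k++) sum += 1.0 / Math.pow(k, exponent);
        double running = 0;
        for (int k = 1; k <= numKeys; k++) {
            running += 1.0 / Math.pow(k, exponent) / sum;
            cdf[k - 1] = running;
        }
    }

    // Draw a key index; low indices are disproportionately popular.
    int next() {
        int i = Arrays.binarySearch(cdf, rnd.nextDouble());
        int idx = i >= 0 ? i : -i - 1;         // insertion point if not found
        return Math.min(idx, cdf.length - 1);  // guard against rounding
    }
}

With numKeys = 1000 and exponent = 1.0, roughly 80% of draws land on the most popular fifth of the keys, matching the 80/20 shape used in the benchmarks.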

Design Decisions

Why Linear Regression?

For the prototype, we chose a simple linear regression model (sketched after this list) because:

  • Fast inference: O(1) prediction time
  • Online learning: Easy to update incrementally
  • Interpretable: Weights show feature importance
  • Low overhead: Minimal memory and computation
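
All four properties show up in a minimal online linear model trained with stochastic gradient descent; this sketch is independent of the project's SimplePredictorModel:

// Minimal online linear regression trained by stochastic gradient descent.
final class OnlineLinearModel {
    private final double[] weights;   // one weight per feature
    private double bias;
    private final double learningRate;

    OnlineLinearModel(int numFeatures, double learningRate) {
        this.weights = new double[numFeatures];
        this.learningRate = learningRate;
    }

    // O(f) inference: dot product plus bias.
    double predict(double[] features) {
        double y = bias;
        for (int i = 0; i < weights.length; i++) y += weights[i] * features[i];
        return y;
    }

    // One SGD step on squared error: w_i -= lr * (pred - target) * x_i.
    void train(double[] features, double target) {
        double error = predict(features) - target;
        for (int i = 0; i < weights.length; i++) {
            weights[i] -= learningRate * error * features[i];
        }
        bias -= learningRate * error;
    }
}

The weights array doubles as a feature-importance readout, which is where the interpretability claim comes from.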

Future versions could explore:

  • Gradient boosting for non-linear patterns
  • LSTM networks for temporal sequences
  • Reinforcement learning for policy optimization

Why 70% AI Weight?

The default 70/30 split balances:

  • AI predictions: Capture complex patterns and temporal trends
  • Frequency data: Proven signal from traditional caching
  • Robustness: Falls back to frequency when AI confidence is low

Pattern-Specific Adjustments

The cache applies heuristic score adjustments based on detected access patterns (sketched after this list):

  • Sequential scans: 50% penalty (avoid pollution)
  • Hot spots: 30% boost (high value)
  • Temporal patterns: 10% boost (predictable)
  • Working sets: 20% boost (locality)
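
Applied to a raw admission score, those adjustments look roughly like this; the AccessPattern constants are assumed names mirroring the patterns above:

enum AccessPattern { SEQUENTIAL_SCAN, HOT_SPOT, TEMPORAL, WORKING_SET, UNKNOWN }

double adjustForPattern(double score, AccessPattern pattern) {
    switch (pattern) {
        case SEQUENTIAL_SCAN: return score * 0.5;  // 50% penalty: avoid pollution
        case HOT_SPOT:        return score * 1.3;  // 30% boost: high value
        case TEMPORAL:        return score * 1.1;  // 10% boost: predictable
        case WORKING_SET:     return score * 1.2;  // 20% boost: locality
        default:              return score;
    }
}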

Limitations

This is a prototype for demonstration purposes:

  1. Single-threaded ML: Model training is not parallelized
  2. Simple features: More sophisticated features could improve accuracy
  3. No persistence: Cache is in-memory only
  4. Basic model: Linear regression is a starting point
  5. No distributed support: Single-node only

Future Enhancements

Potential improvements for production use:

  • Advanced ML models: LSTM, transformers for temporal patterns
  • Reinforcement learning: Learn optimal policies through interaction
  • Distributed training: Federated learning across cache nodes
  • Adaptive features: Automatic feature engineering
  • Multi-objective optimization: Balance hit rate, latency, cost
  • Explainability: Visualize why entries were admitted/evicted

References

  1. TinyLFU Paper: "TinyLFU: A Highly Efficient Cache Admission Policy" (Einziger et al., 2015)
  2. Caffeine: https://github.com/ben-manes/caffeine
  3. Count-Min Sketch: Cormode & Muthukrishnan, 2005
  4. W-TinyLFU: https://highscalability.com/design-of-a-modern-cache/

License

MIT License - See LICENSE file for details

Contributing

This is a prototype for educational and research purposes. Contributions, suggestions, and feedback are welcome!

Author

dhrubacb


Note: This is a prototype implementation for demonstration and learning. For production use, consider battle-tested libraries like Caffeine, Guava Cache, or Ehcache.
