A Java-based, high-performance caching library that extends the W-TinyLFU algorithm with AI-based admission and eviction policies. Inspired by Caffeine, this prototype demonstrates how machine learning can enhance traditional cache admission policies.
This prototype implements a Window TinyLFU (W-TinyLFU) cache with an AI enhancement layer that predicts future access patterns and makes smarter admission and eviction decisions beyond traditional frequency-based approaches.
- W-TinyLFU Architecture: Three-tiered cache structure (window, probationary, protected)
- AI-Enhanced Admission Policy: ML-based prediction of entry value
- Access Pattern Classification: Identifies hot spots, sequential scans, temporal patterns
- Count-Min Sketch: Space-efficient frequency estimation
- Online Learning: Continuous model improvement based on workload
- Comprehensive Statistics: Detailed metrics for performance analysis
- Thread-Safe: Concurrent access support with lock amortization (sketched below)
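For background on the lock-amortization technique, reads can be recorded in a small lossy buffer and applied to the shared policy state in batches, roughly as in this simplified sketch (an illustration of the idea, not this library's actual implementation):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.locks.ReentrantLock;

// Simplified lock amortization: reads enqueue into a lossy bounded buffer,
// and the shared eviction-policy state is only touched when one thread
// wins the lock and drains the whole batch.
class AmortizedReadRecorder<K> {
    private final ArrayBlockingQueue<K> readBuffer = new ArrayBlockingQueue<>(128);
    private final ReentrantLock maintenanceLock = new ReentrantLock();

    void recordRead(K key) {
        // offer() never blocks; dropping a sample under contention is acceptable
        if (!readBuffer.offer(key) && maintenanceLock.tryLock()) {
            try {
                K k;
                while ((k = readBuffer.poll()) != null) {
                    applyPolicyUpdate(k);  // e.g. bump frequency, reorder LRU
                }
            } finally {
                maintenanceLock.unlock();
            }
        }
    }

    private void applyPolicyUpdate(K key) { /* update eviction policy state */ }
}
```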
```
┌─────────────────────────────────────────────────────────┐
│                    AI-Enhanced Cache                    │
├─────────────────────────────────────────────────────────┤
│  Window Cache (1%)                                      │
│  ├─ Admission buffer for new entries                    │
│  └─ LRU eviction                                        │
├─────────────────────────────────────────────────────────┤
│  Probationary Segment (20%)                             │
│  ├─ Entries promoted from window                        │
│  └─ Candidates for eviction                             │
├─────────────────────────────────────────────────────────┤
│  Protected Segment (80%)                                │
│  ├─ Hot entries with proven value                       │
│  └─ Promoted from probationary on re-access             │
└─────────────────────────────────────────────────────────┘
```
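As a rough illustration of the sizing above, the three segment capacities could be derived from the configured maximum size along these lines, interpreting the 20% / 80% split as fractions of the main (non-window) region. The variable names here are hypothetical, not the library's API:

```java
// Hypothetical sizing for the three W-TinyLFU segments, using the
// 1% / 20% / 80% ratios from the diagram above. Names are illustrative.
class SegmentSizing {
    public static void main(String[] args) {
        long maximumSize = 10_000;
        long windowCapacity = Math.max(1, maximumSize / 100);      // ~1% admission window
        long mainCapacity = maximumSize - windowCapacity;          // remaining main space
        long protectedCapacity = (long) (mainCapacity * 0.80);     // ~80% of main: protected
        long probationCapacity = mainCapacity - protectedCapacity; // ~20% of main: probationary
        System.out.printf("window=%d probation=%d protected=%d%n",
                windowCapacity, probationCapacity, protectedCapacity);
    }
}
```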
```
┌─────────────────────────────────────────────────────────┐
│                  AI Enhancement Layer                   │
├─────────────────────────────────────────────────────────┤
│  Count-Min Sketch                                       │
│  ├─ Frequency estimation (4-bit counters)               │
│  └─ Periodic aging mechanism                            │
├─────────────────────────────────────────────────────────┤
│  ML Predictor                                           │
│  ├─ Feature extraction (7 features)                     │
│  ├─ Linear regression model                             │
│  └─ Online gradient descent training                    │
├─────────────────────────────────────────────────────────┤
│  Pattern Classifier                                     │
│  ├─ Hot spot detection                                  │
│  ├─ Sequential scan identification                      │
│  ├─ Temporal pattern recognition                        │
│  └─ Working set analysis                                │
└─────────────────────────────────────────────────────────┘
```
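For intuition, a minimal Count-Min Sketch with saturating 4-bit counters and halving-based aging might look like the sketch below (a toy illustration of the technique, not the library's CountMinSketch class):

```java
import java.util.Random;

// Minimal Count-Min Sketch: d hash rows of m counters, each saturating
// at 15 (4 bits). estimate() returns the minimum across rows, and age()
// halves every counter so stale entries fade over time.
class TinyCountMinSketch {
    private final int[][] table;
    private final int[] seeds;
    private final int width;

    TinyCountMinSketch(int width, int depth) {
        this.width = width;
        this.table = new int[depth][width];
        this.seeds = new Random(42).ints(depth).toArray();  // one hash seed per row
    }

    private int index(Object key, int row) {
        int h = key.hashCode() ^ seeds[row];
        h ^= (h >>> 16);  // spread the bits before reducing modulo width
        return Math.floorMod(h, width);
    }

    void increment(Object key) {
        for (int row = 0; row < table.length; row++) {
            int i = index(key, row);
            if (table[row][i] < 15) table[row][i]++;  // saturate at the 4-bit max
        }
    }

    int estimate(Object key) {
        int min = Integer.MAX_VALUE;
        for (int row = 0; row < table.length; row++) {
            min = Math.min(min, table[row][index(key, row)]);
        }
        return min;
    }

    void age() {
        for (int[] row : table) {
            for (int i = 0; i < row.length; i++) row[i] >>>= 1;  // halve all counters
        }
    }
}
```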
The AI layer extracts the following features from each cache entry (a sketch of the extraction follows the list):
- Access Count (log-normalized): Historical access frequency
- Recency: Seconds since last access
- Age: Seconds since entry creation
- Access Rate: Accesses per second
- Access Variance: Pattern regularity measure
- Time of Day: Temporal cyclical feature
- Size: Entry value size proxy
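Assembled into a vector, the extraction could look roughly like this sketch. The parameter names are assumptions for illustration; the actual FeatureExtractor may differ:

```java
// Illustrative 7-element feature vector matching the list above.
// The inputs (accessCount, lastAccessNanos, ...) are assumed names for
// this sketch, not the library's actual CacheEntry API.
class FeatureSketch {
    static double[] extract(long accessCount, long lastAccessNanos,
                            long createdNanos, double accessVariance,
                            int valueSize, long nowNanos) {
        double recencySec = (nowNanos - lastAccessNanos) / 1e9;
        double ageSec = Math.max(1e-9, (nowNanos - createdNanos) / 1e9);
        double hourOfDay = java.time.LocalTime.now().getHour() / 24.0;
        return new double[] {
            Math.log1p(accessCount),  // access count, log-normalized
            recencySec,               // seconds since last access
            ageSec,                   // seconds since creation
            accessCount / ageSec,     // access rate (accesses per second)
            accessVariance,           // pattern regularity measure
            hourOfDay,                // cyclical time-of-day feature
            valueSize                 // entry size proxy
        };
    }
}
```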
The admission policy combines traditional frequency-based scoring with AI predictions:
```
score = (aiWeight × predictedValue) + (frequencyWeight × frequencyScore)
```
Default weights: 70% AI, 30% frequency
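In code, the hybrid score could be computed along these lines (a sketch assuming frequencyWeight is the complement of aiWeight and that the frequency score is the Count-Min Sketch estimate normalized to [0, 1]):

```java
// Hypothetical hybrid admission decision: admit the candidate only if its
// combined score beats the eviction victim's. Names are illustrative.
class AdmissionSketch {
    static double score(double predictedValue, int sketchEstimate, double aiWeight) {
        double frequencyScore = sketchEstimate / 15.0;  // normalize 4-bit counter to [0, 1]
        double frequencyWeight = 1.0 - aiWeight;        // assumed complement of aiWeight
        return aiWeight * predictedValue + frequencyWeight * frequencyScore;
    }

    static boolean admit(double candidateScore, double victimScore) {
        return candidateScore > victimScore;  // keep whichever entry scores higher
    }
}
```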
- Java 21 or higher
- Gradle 8.0 or higher
```bash
# Build the project
./gradlew build

# Run tests
./gradlew test

# Run example
./gradlew run
```

```java
// Create cache with default configuration
AIEnhancedCacheConfig config = AIEnhancedCacheConfig.builder()
    .maximumSize(10_000)
    .enableAI(true)
    .aiWeight(0.7)
    .build();
AIEnhancedCache<String, String> cache = new AIEnhancedCache<>(config);
// Put entries
cache.put("user:123", "Alice");
cache.put("product:456", "Widget");
// Get entries
String user = cache.get("user:123"); // Cache hit
String missing = cache.get("user:999"); // Cache miss (returns null)
// View statistics
System.out.println(cache.getStatistics());
```

Advanced configuration:

```java
AIEnhancedCacheConfig config = AIEnhancedCacheConfig.builder()
    .maximumSize(50_000)        // Maximum cache entries
    .windowSize(500)            // Admission window size (1% recommended)
    .featureHistorySize(20)     // Access history per entry
    .enableAI(true)             // Enable AI predictions
    .aiWeight(0.8)              // 80% AI, 20% frequency
    .maxTrainingExamples(2000)  // Training set size
    .learningRate(0.01)         // Gradient descent learning rate
    .build();
AIEnhancedCache<Integer, DataObject> cache = new AIEnhancedCache<>(config);
```

To fall back to purely frequency-based admission, disable the AI layer:

```java
AIEnhancedCacheConfig config = AIEnhancedCacheConfig.builder()
    .maximumSize(10_000)
    .enableAI(false)  // Use only frequency-based admission
    .build();
AIEnhancedCache<String, String> cache = new AIEnhancedCache<>(config);
```

Time complexity:
- Get: O(1) average
- Put: O(1) average
- Eviction: O(1) amortized
- AI Prediction: O(1) (linear model with fixed features)
Space complexity:
- Hash Table: O(n), where n is the number of entries
- LRU Queues: O(n) for the doubly-linked lists
- Count-Min Sketch: O(m × d), where m is the sketch width and d is the depth (typically 4n × 4)
- ML Model: O(f), where f is the number of features (7)
- Training Buffer: O(t), where t is the maximum number of training examples
Per cache entry:
- Base entry: ~64 bytes (key, value, metadata)
- Access history: ~8 bytes × history size
- Frequency sketch: ~8 bytes (amortized)
- Total: ~100-200 bytes per entry
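At these figures, a fully populated 10,000-entry cache (the Quick Start default) carries roughly 10,000 × 100-200 bytes ≈ 1-2 MB of overhead, excluding the keys and values themselves.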
Example workload (10,000 accesses, 80/20 distribution):
| Configuration | Hit Rate | Evictions | AI Influence |
|---|---|---|---|
| Traditional (no AI) | 65-70% | 8,500 | 0% |
| AI-Enhanced (70% weight) | 72-78% | 7,800 | 60-70% |
| AI-Enhanced (90% weight) | 75-82% | 7,200 | 80-90% |
Note: Results vary based on workload characteristics
- AIEnhancedCache: Main cache implementation
- CacheEntry: Enhanced entry with AI metadata
- CountMinSketch: Frequency estimation data structure
- AIEnhancedAdmissionPolicy: Hybrid admission decision logic
- AIPredictor: Interface for prediction models
- SimplePredictorModel: Linear regression implementation
- FeatureExtractor: Feature engineering from cache entries
- AccessPattern: Pattern classification enum
- AIEnhancedCacheConfig: Builder-based configuration
- CacheStatistics: Performance metrics tracking
Run the test suite:

```bash
./gradlew test
```

Run the example program:

```bash
./gradlew run
```

The example demonstrates:
- Basic cache operations
- Workload simulation with Zipfian distribution
- Access pattern detection
- AI vs traditional comparison
For the prototype, we chose a simple linear regression model (sketched after this list) because:
- Fast inference: O(1) prediction time
- Online learning: Easy to update incrementally
- Interpretable: Weights show feature importance
- Low overhead: Minimal memory and computation
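A minimal version of that online update could look like the following (a sketch of stochastic gradient descent on squared error, not the actual SimplePredictorModel):

```java
// Illustrative online linear model: predict() is a dot product, and each
// observed (features, actualValue) pair nudges the weights by one SGD step.
class OnlineLinearModel {
    private final double[] weights;
    private final double learningRate;

    OnlineLinearModel(int numFeatures, double learningRate) {
        this.weights = new double[numFeatures];
        this.learningRate = learningRate;
    }

    double predict(double[] features) {
        double sum = 0.0;
        for (int i = 0; i < weights.length; i++) sum += weights[i] * features[i];
        return sum;
    }

    // One SGD step on squared error: w <- w - lr * (pred - actual) * x
    // (the constant factor 2 is folded into the learning rate)
    void train(double[] features, double actual) {
        double error = predict(features) - actual;
        for (int i = 0; i < weights.length; i++) {
            weights[i] -= learningRate * error * features[i];
        }
    }
}
```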
Future versions could explore:
- Gradient boosting for non-linear patterns
- LSTM networks for temporal sequences
- Reinforcement learning for policy optimization
The default 70/30 split balances:
- AI predictions: Capture complex patterns and temporal trends
- Frequency data: Proven signal from traditional caching
- Robustness: Falls back to frequency when AI confidence is low
The cache applies heuristics based on detected patterns (see the sketch after this list):
- Sequential scans: 50% penalty (avoid pollution)
- Hot spots: 30% boost (high value)
- Temporal patterns: 10% boost (predictable)
- Working sets: 20% boost (locality)
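Expressed as code, the adjustments might be simple multipliers on the predicted score (the enum constants here are assumed names for illustration, not necessarily the library's AccessPattern values):

```java
// Hypothetical score adjustment per detected access pattern, using the
// percentages listed above.
enum AccessPattern { SEQUENTIAL_SCAN, HOT_SPOT, TEMPORAL, WORKING_SET, UNKNOWN }

class PatternHeuristics {
    static double adjust(double score, AccessPattern pattern) {
        return switch (pattern) {
            case SEQUENTIAL_SCAN -> score * 0.5;  // 50% penalty: avoid cache pollution
            case HOT_SPOT        -> score * 1.3;  // 30% boost: high-value entry
            case TEMPORAL        -> score * 1.1;  // 10% boost: predictable access
            case WORKING_SET     -> score * 1.2;  // 20% boost: locality
            case UNKNOWN         -> score;        // no adjustment
        };
    }
}
```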
This is a prototype for demonstration purposes:
- Single-threaded ML: Model training is not parallelized
- Simple features: More sophisticated features could improve accuracy
- No persistence: Cache is in-memory only
- Basic model: Linear regression is a starting point
- No distributed support: Single-node only
Potential improvements for production use:
- Advanced ML models: LSTM, transformers for temporal patterns
- Reinforcement learning: Learn optimal policies through interaction
- Distributed training: Federated learning across cache nodes
- Adaptive features: Automatic feature engineering
- Multi-objective optimization: Balance hit rate, latency, cost
- Explainability: Visualize why entries were admitted/evicted
- TinyLFU Paper: "TinyLFU: A Highly Efficient Cache Admission Policy" (Einziger et al., 2015)
- Caffeine: https://github.com/ben-manes/caffeine
- Count-Min Sketch: "An Improved Data Stream Summary: The Count-Min Sketch and Its Applications" (Cormode & Muthukrishnan, 2005)
- W-TinyLFU: https://highscalability.com/design-of-a-modern-cache/
MIT License - See LICENSE file for details
This is a prototype for educational and research purposes. Contributions, suggestions, and feedback are welcome!
dhrubacb
Note: This is a prototype implementation for demonstration and learning. For production use, consider battle-tested libraries like Caffeine, Guava Cache, or Ehcache.