This competition is part of the September Data Science Training Camp of Sapienza University, organized by Target Reply Roma.
Our challenge was to predict the Remaining Useful Life (RUL) of aircraft turbofan engines using time-series sensor data.
The dataset comes from the well-known NASA C-MAPSS simulator, which models how aircraft engines degrade under varying operational conditions. Each engine is run until failure and multivariate time-series data from onboard sensors are collected at each cycle.
In this challenge, our task is to implement and train models that can learn from historical engine runs and generalize to unseen engines, accurately predicting their RUL from sensor data.
For this challenge we work with two specific datasets:
- `train_challenge.txt`
- `test_challenge.txt`
These datasets contain 26 columns in total, representing multivariate time series. Each row corresponds to a single time step for a specific engine unit.
| Column | Description |
|---|---|
| 1 | Unit number (engine identifier) |
| 2 | Time cycles (operational time) |
| 3–5 | Operational settings (3 variables) |
| 6–26 | Sensor measurements (21 sensors) |
Submissions are scored using Mean Squared Error (MSE) between predicted and true RUL values:

$$\text{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( \hat{RUL}_i - RUL_i \right)^2$$

where:

- $N$ = number of engines in the test set
- $\hat{RUL}_i$ = predicted Remaining Useful Life for engine $i$
- $RUL_i$ = true Remaining Useful Life for engine $i$
A lower MSE means better predictions.
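As a quick illustration, the metric can be computed in a few lines of NumPy (array values below are made up):

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean Squared Error between true and predicted RUL values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_pred - y_true) ** 2)

# Example with three engines: squared errors are 100, 25, 0
print(mse([100, 50, 20], [90, 55, 20]))  # (100 + 25 + 0) / 3 ≈ 41.67
```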
```
├── Dataset/                     # Dataset folder
│   ├── train_challenge.txt
│   └── test_challenge.txt
├── Models & Functions/
│   ├── models                   # Hybrid attention-LSTM model for predictions
│   ├── preprocessing
│   └── training
├── Notebook/
│   └── Reply_Challenge.ipynb    # Notebook for the submission
├── README.md
└── LICENSE
```
We propose a hybrid architecture for Remaining Useful Life (RUL) prediction:
- Feature-level Self-Attention (multi-head, residual, LayerNorm) to capture dependencies across sensors.
- Stacked LSTM layers to model temporal dynamics.
- MLP head for final regression output.
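The feature-level attention step can be sketched in plain NumPy (single-head for brevity, with random arrays standing in for learned projection weights; the actual model uses multiple heads):

```python
import numpy as np

rng = np.random.default_rng(0)

def feature_self_attention(x, d_k=8):
    """Minimal single-head self-attention across the feature (sensor) axis.

    x: array of shape (n_features, d_model) — one time step, sensors as tokens.
    Random projections stand in for learned parameters.
    """
    d_model = x.shape[-1]
    Wq = rng.normal(size=(d_model, d_k)) / np.sqrt(d_model)
    Wk = rng.normal(size=(d_model, d_k)) / np.sqrt(d_model)
    Wv = rng.normal(size=(d_model, d_model)) / np.sqrt(d_model)

    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(d_k)                 # (n_features, n_features)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over features
    attended = weights @ v

    # Residual connection followed by layer normalization
    out = x + attended
    mu, sigma = out.mean(-1, keepdims=True), out.std(-1, keepdims=True)
    return (out - mu) / (sigma + 1e-6)

x = rng.normal(size=(21, 16))            # 21 sensors embedded in a 16-dim space
print(feature_self_attention(x).shape)   # (21, 16)
```

In the full model, the attended features then feed the stacked LSTM layers, whose final hidden state goes to the MLP regression head.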
- The target variable `RUL` was clipped at 170, since the average RUL in the training set was about 105; other clipping thresholds were tested but yielded worse results.
- Normalization was applied per cluster after K-Means clustering on the three operational settings. This normalizes the data within homogeneous operating conditions, reducing the impact of varying environmental and operational factors across different engines.
- Sliding windows of 70 time steps (stride = 1) were used to segment the multivariate time series. Each window captures the temporal dynamics of sensor signals over 70 consecutive cycles, enabling the model to learn degradation patterns.
The stride of 1 ensures that the windows overlap, maximizing the amount of training data available.
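A minimal sketch of the preprocessing steps above, assuming cluster labels have already been obtained from K-Means on the three settings (function and variable names are illustrative):

```python
import numpy as np

def clip_rul(rul, cap=170):
    """Clip the RUL target at a fixed cap (170 here, chosen empirically)."""
    return np.minimum(rul, cap)

def normalize_by_cluster(X, labels):
    """Z-score each feature within its operating-condition cluster.

    X: (n_samples, n_features) sensor matrix.
    labels: cluster index per row, e.g. from K-Means on the 3 settings.
    """
    X = X.astype(float).copy()
    for c in np.unique(labels):
        m = labels == c
        mu, sigma = X[m].mean(axis=0), X[m].std(axis=0)
        X[m] = (X[m] - mu) / (sigma + 1e-8)
    return X

def sliding_windows(X, window=70, stride=1):
    """Segment one engine's time series into overlapping windows."""
    n = X.shape[0]
    return np.stack([X[i:i + window] for i in range(0, n - window + 1, stride)])

X = np.random.rand(100, 21)   # one engine: 100 cycles, 21 sensors
w = sliding_windows(X)
print(w.shape)                # (31, 70, 21): 100 - 70 + 1 overlapping windows
```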
- Adam optimizer with a cosine learning-rate schedule and warmup.
- MSE loss; gradient clipping (max norm 1.0) to prevent exploding gradients; dropout of 0.5 to reduce overfitting.
- Exponential Moving Average (EMA) of weights.
- Early stopping based on validation loss.
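The EMA of weights keeps a smoothed copy of the parameters that is used at evaluation time. A framework-agnostic sketch (the decay value is illustrative):

```python
import numpy as np

class EMAWeights:
    """Exponential moving average of model parameters.

    After each optimizer step, call update() with the current parameters;
    the smoothed `shadow` copy is used for evaluation/inference.
    """
    def __init__(self, params, decay=0.999):
        self.decay = decay
        self.shadow = [p.copy() for p in params]

    def update(self, params):
        for s, p in zip(self.shadow, params):
            s *= self.decay
            s += (1.0 - self.decay) * p

# Toy usage with a single "parameter" array
p = np.zeros(3)
ema = EMAWeights([p], decay=0.9)
for step in range(5):
    p = p + 1.0          # pretend this is an optimizer step
    ema.update([p])
print(ema.shadow[0])     # lags behind the raw parameters (all 5.0)
```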
- Exponential smoothing over the last 8 windows per unit (higher weight to recent data).
- Negative RUL values clipped to 0.
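A sketch of this post-processing step; the exact weighting scheme is an assumption, the key point being geometrically decaying weights toward older windows and clipping negatives to zero:

```python
import numpy as np

def smooth_last_windows(preds, k=8, alpha=0.6):
    """Exponentially weighted average of the last k window predictions
    for one unit, with higher weight on more recent windows.

    alpha is illustrative: weights decay geometrically going back in time.
    """
    last = np.asarray(preds[-k:], dtype=float)
    weights = alpha ** np.arange(len(last) - 1, -1, -1)  # oldest -> newest
    rul = float(np.average(last, weights=weights))
    return max(rul, 0.0)                                  # clip negatives to 0

# Per-window predictions for one unit (most recent last)
preds = [120, 118, 115, 113, 110, 108, 106, 104, 101]
print(round(smooth_last_windows(preds), 2))
```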
Output: submission.csv with predicted RUL for 259 units.
