fix: use endoflife.date for all EOL data, fix Temporal payload limit, and improve classification #17

Merged
bakayolo merged 8 commits into main from fix/snapshot-payload-size
Apr 15, 2026
Conversation

@bakayolo bakayolo commented Apr 15, 2026

Summary

Fixes the detection pipeline to produce real, usable compliance data for EKS and ElastiCache using only endoflife.date for EOL data — no AWS API credentials needed.

What changed

Temporal payload fix

  • Removed the RetrieveFindings activity, which returned all findings through Temporal gRPC (12K+ findings serialize to about 10MB, exceeding the 4MB message limit)
  • CreateSnapshot now reads directly from the in-memory store — findings never transit Temporal
  • Fails hard when every detection workflow fails (which would otherwise produce an empty snapshot) or when reading from the store returns an error
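
A minimal sketch of the pattern described in this list, assuming a mutex-guarded map as the in-memory store. `Store`, `Finding`, `ErrNoResourceTypes`, and the `CreateSnapshot` signature shown here are hypothetical names for illustration and are not taken from the PR's code; the Temporal activity wiring is omitted. The point is that only resource type names cross the activity boundary, while the findings stay in the worker process.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Finding is a hypothetical, trimmed-down finding record.
type Finding struct {
	ResourceID string
	Version    string
	Status     string
}

// Store is a hypothetical in-memory store shared by the detection and
// snapshot activities running in the same worker process.
type Store struct {
	mu       sync.RWMutex
	findings map[string][]Finding // keyed by resource type
}

// NewStore returns an empty store.
func NewStore() *Store {
	return &Store{findings: make(map[string][]Finding)}
}

// Put replaces the findings recorded for one resource type.
func (s *Store) Put(resourceType string, fs []Finding) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.findings[resourceType] = fs
}

// Get returns the findings for a resource type, or an error if none were stored.
func (s *Store) Get(resourceType string) ([]Finding, error) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	fs, ok := s.findings[resourceType]
	if !ok {
		return nil, fmt.Errorf("no findings stored for %q", resourceType)
	}
	return fs, nil
}

// ErrNoResourceTypes signals that every detection workflow failed.
var ErrNoResourceTypes = errors.New("no successful resource types: refusing to create an empty snapshot")

// CreateSnapshot receives only resource type names and reads the findings
// from the store itself. This sketch returns the number of findings per
// resource type; a real activity would persist the findings instead.
func CreateSnapshot(store *Store, resourceTypes []string) (map[string]int, error) {
	if len(resourceTypes) == 0 {
		return nil, ErrNoResourceTypes
	}
	counts := make(map[string]int, len(resourceTypes))
	for _, rt := range resourceTypes {
		fs, err := store.Get(rt)
		if err != nil {
			// Fail hard instead of silently skipping the resource type.
			return nil, fmt.Errorf("read findings for %s: %w", rt, err)
		}
		counts[rt] = len(fs)
	}
	return counts, nil
}
```

Calling `CreateSnapshot(store, []string{"eks"})` after a detection step has called `store.Put("eks", ...)` returns the per-type counts; passing an empty list or a type that was never stored returns an error rather than a partial snapshot.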

EOL data from endoflife.date only

  • ProductCycle.EOL and .Support changed from string to any (endoflife.date returns booleans or date strings)
  • Enabled EKS via endoflife.date (was blocked by ProductsWithNonStandardSchema)
  • Fixed aurora-postgresql mapping to amazon-aurora-postgresql
  • Added aurora-mysql mapping (pending endoflife.date#9534)
  • Removed ProductsWithNonStandardSchema blocklist and dead code
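
To illustrate the `string` to `any` change, here is a hedged sketch of decoding a cycle whose `eol` is a boolean and whose `support` is a date string. The struct is a reduced stand-in for `ProductCycle`: the JSON tags, the `parseCycles` helper, and the body of `anyToDateString` are assumptions of this sketch, not the PR's actual implementation.

```go
package main

import (
	"encoding/json"
	"time"
)

// ProductCycle mirrors a subset of one endoflife.date cycle entry.
// The JSON tags are assumed for this sketch.
type ProductCycle struct {
	Cycle   string `json:"cycle"`
	EOL     any    `json:"eol"`     // date string or boolean
	Support any    `json:"support"` // date string or boolean
}

// anyToDateString returns v when it is a "YYYY-MM-DD" string, and ""
// when v is a boolean, nil, or a string that does not parse as a date.
func anyToDateString(v any) string {
	s, ok := v.(string)
	if !ok {
		return ""
	}
	if _, err := time.Parse("2006-01-02", s); err != nil {
		return ""
	}
	return s
}

// parseCycles decodes a JSON array of cycles (hypothetical helper).
func parseCycles(data []byte) ([]ProductCycle, error) {
	var cycles []ProductCycle
	if err := json.Unmarshal(data, &cycles); err != nil {
		return nil, err
	}
	return cycles, nil
}
```

With fields typed as `string`, a payload such as `[{"cycle":"5.0.6","eol":false}]` fails in `json.Unmarshal`; with `any`, the boolean decodes as a Go `bool` and `anyToDateString` simply reports that no date is available.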

Version matching

  • EOL provider uses prefix matching: cycle 8.0 matches resource version 8.0.35
  • Policy uses prefix matching with k8s- prefix normalization
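
The matching rule can be sketched as follows. `versionMatches` is named in the commit message, but the body here is an assumption; `normalizeVersion` and `findCycle` are hypothetical helpers. Two choices in this sketch go beyond what the PR states: the prefix must end on a dot boundary (so cycle `1.3` does not match version `1.30`), and when several cycles match, the longest one wins.

```go
package main

import "strings"

// normalizeVersion trims whitespace and a leading "k8s-" prefix.
// Stripping it from both sides is an assumption of this sketch.
func normalizeVersion(v string) string {
	return strings.TrimPrefix(strings.TrimSpace(v), "k8s-")
}

// versionMatches reports whether a resource version belongs to a cycle:
// either the two are equal, or the version starts with the cycle
// followed by a dot.
func versionMatches(cycle, version string) bool {
	c := normalizeVersion(cycle)
	v := normalizeVersion(version)
	if c == "" || v == "" {
		return false
	}
	return v == c || strings.HasPrefix(v, c+".")
}

// findCycle returns the longest cycle that matches the version, so that
// "7.1" is preferred over "7" for version "7.1.0".
func findCycle(cycles []string, version string) (string, bool) {
	best, found := "", false
	for _, c := range cycles {
		if versionMatches(c, version) && len(c) > len(best) {
			best, found = c, true
		}
	}
	return best, found
}
```

Under these assumptions, `versionMatches("8.0", "8.0.35")` and `versionMatches("1.30", "k8s-1.30")` are both true, while `versionMatches("1.3", "1.30")` is false.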

Classification

  • Extended support is now YELLOW, not RED (check order fix + policy fix)
  • EKS 1.30/1.32 correctly show "in extended support (6x standard cost)"
  • ElastiCache Redis 5.0.6 correctly shows extended support
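
The check-order fix can be illustrated with a small classifier. This is a sketch under assumptions: `Lifecycle`, `classify`, and `date` are hypothetical names, the two timestamps stand in for the end of standard support and the end of extended support, and the UNKNOWN status is left out. What it shows is that testing the extended support window before the generic past-EOL test is what keeps such resources YELLOW.

```go
package main

import "time"

// Status is the compliance color assigned to a finding.
type Status string

const (
	Green  Status = "GREEN"
	Yellow Status = "YELLOW"
	Red    Status = "RED"
)

// Lifecycle is a hypothetical view of one version's support dates.
// A zero time means the date is not known or does not apply.
type Lifecycle struct {
	EndOfStandardSupport time.Time
	EndOfExtendedSupport time.Time
}

// date builds a UTC midnight timestamp (helper for readability).
func date(year int, month time.Month, day int) time.Time {
	return time.Date(year, month, day, 0, 0, 0, 0, time.UTC)
}

// classify checks the extended support window first, then falls back
// to the past-standard-support check.
func classify(lc Lifecycle, now time.Time) Status {
	pastStandard := !lc.EndOfStandardSupport.IsZero() && !now.Before(lc.EndOfStandardSupport)
	inExtended := pastStandard && !lc.EndOfExtendedSupport.IsZero() && now.Before(lc.EndOfExtendedSupport)

	if inExtended {
		return Yellow // still supported, at extended support pricing
	}
	if pastStandard {
		return Red // no extended window, or the window has closed
	}
	return Green
}
```

If the two `if` statements were swapped, every resource past standard support would return RED before the extended support window was ever consulted, which is the behavior this PR corrects.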

Validated with docker-compose (real Wiz data)

| Resource | Findings | Yellow | Green | Unknown |
|---|---|---|---|---|
| EKS | 155 | 90 (1.30, 1.32 in ext. support) | 65 | 0 |
| ElastiCache | 3,974 | 138 (Redis 5.0.6) | 3,739 | 97 |
| Aurora MySQL | 12,238 | 0 | 0 | 12,238 (no EOL source yet) |

Single-pod assumption

The in-memory store is shared between detection and snapshot activities because everything runs on a single Temporal worker. This is intentional — we don't anticipate scaling this service to multiple pods.

@bakayolo bakayolo force-pushed the fix/snapshot-payload-size branch from 429a645 to e96b271 April 15, 2026 20:05
…d limit

The RetrieveFindings activity returned all findings as a single Temporal
activity result, which exceeded the 4MB gRPC message limit for large
inventories (12K+ Aurora clusters = ~10MB serialized).

Instead of passing findings through Temporal payloads, CreateSnapshot now
reads directly from the in-memory store. Findings stay within the worker
process and never transit Temporal's gRPC layer.

- Remove RetrieveFindings activity and its registration
- CreateSnapshot accepts ResourceTypes and reads from store itself
- Orchestrator passes successful resource types instead of findings
- All detection workflows still run in parallel via child workflows

Amp-Thread-ID: https://ampcode.com/threads/T-019d92b6-b80d-731a-8a83-64e6442ae52c
Co-authored-by: Amp <amp@ampcode.com>
@bakayolo bakayolo force-pushed the fix/snapshot-payload-size branch from e96b271 to a69cdaa April 15, 2026 20:20
bakayolo and others added 5 commits April 15, 2026 14:35
endoflife.date returns EOL and Support as either date strings or booleans.
The ProductCycle struct typed them as string, causing JSON unmarshal errors
for products like ElastiCache Redis where eol: false.

Also, endoflife.date uses coarse cycles (e.g., '8.0', '7') while Wiz
reports full versions (e.g., '8.0.35', '7.1.0'). Both the EOL provider
and policy now use prefix matching to find the right cycle.

- Change ProductCycle.EOL and .Support from string to any
- Add anyToDateString helper to extract dates from any-typed fields
- Add prefix-based version matching in GetVersionLifecycle
- Add versionMatches to policy for cycle-to-version comparison
- Fix EKS provider to handle EOL as any type

Results: ElastiCache went from 0% to 94.1% compliance with real EOL data.
Amp-Thread-ID: https://ampcode.com/threads/T-019d92b6-b80d-731a-8a83-64e6442ae52c
Co-authored-by: Amp <amp@ampcode.com>
- Unblock EKS from ProductsWithNonStandardSchema — endoflife.date data
  works correctly (eol = end of standard support; extendedSupport = the actual end of life)
- Fix aurora-postgresql mapping to amazon-aurora-postgresql (was wrong)
- Remove aurora-mysql mapping (no endoflife.date product exists; needs AWS API)
- Handle k8s- version prefix in policy version matching
- EKS: 41.9% compliance (65 GREEN, 90 RED), 0 UNKNOWN
- ElastiCache: 94.1% compliance

Amp-Thread-ID: https://ampcode.com/threads/T-019d92b6-b80d-731a-8a83-64e6442ae52c
Co-authored-by: Amp <amp@ampcode.com>
Resources past standard support but still in extended support were
incorrectly classified as RED because:
1. convertCycle checked EOL date before extended support window
2. Policy treated IsDeprecated as RED even when IsExtendedSupport was set

Now: extended support check runs first, and policy excludes extended
support resources from RED status. EKS k8s 1.30/1.32 correctly show
as YELLOW with 'in extended support (6x standard cost)' message.

Amp-Thread-ID: https://ampcode.com/threads/T-019d92b6-b80d-731a-8a83-64e6442ae52c
Co-authored-by: Amp <amp@ampcode.com>
Keep aurora-mysql in ProductMapping with a TODO referencing the pending
endoflife.date PR #9534. Remove the now-empty ProductsWithNonStandardSchema
blocklist and its guard code. All EOL data comes from endoflife.date.

Amp-Thread-ID: https://ampcode.com/threads/T-019d92b6-b80d-731a-8a83-64e6442ae52c
Co-authored-by: Amp <amp@ampcode.com>
- Fail if all detection workflows fail (no empty snapshots)
- Fail on store read errors in CreateSnapshot (was silently skipping)
- Remove unused SignalActWorkflowInput
- Remove stale Stage 3 / non-standard schema / EKSEOLProvider comments
- Add kubernetes to required engines test

Amp-Thread-ID: https://ampcode.com/threads/T-019d92b6-b80d-731a-8a83-64e6442ae52c
Co-authored-by: Amp <amp@ampcode.com>
@bakayolo bakayolo force-pushed the fix/snapshot-payload-size branch from 164a1d8 to 6dc70cc April 15, 2026 22:46
- Replace 'hybrid EOL' (AWS APIs + endoflife.date) with endoflife.date only
- Add supported resources table with status, inventory source, and EOL source
- Fix workflow start command (OrchestratorWorkflow, not VersionGuardOrchestratorWorkflow)
- Remove AWS credentials from prerequisites
- Document how to add new resource types (2 steps)

Co-authored-by: Amp <amp@ampcode.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019d92b6-b80d-731a-8a83-64e6442ae52c
@bakayolo bakayolo force-pushed the fix/snapshot-payload-size branch from 6dc70cc to 4900fb5 April 15, 2026 22:47
@bakayolo bakayolo changed the title from "fix: eliminate RetrieveFindings activity to avoid Temporal 4MB payload limit" to "fix: use endoflife.date for all EOL data, fix Temporal payload limit, and improve classification" Apr 15, 2026
@bakayolo bakayolo merged commit 484b10a into main Apr 15, 2026
7 checks passed
@bakayolo bakayolo deleted the fix/snapshot-payload-size branch April 15, 2026 22:58