[AURON #2163] Support native Iceberg scans with residual filters via scan pruning and post-scan native filter (#2164)
# Which issue does this PR close?
Closes #2163
# Rationale for this change
The previous behavior was too conservative for Iceberg scans with
residual filters. Even when the scan could still be executed natively
and the remaining filter logic could be handled above the scan, the
planner would fall back entirely.
This PR improves native coverage for Iceberg reads by:
- preserving correctness for unsupported predicates
- increasing native scan applicability for common filter patterns
- reusing the existing native filter path instead of requiring full
scan-level predicate support up front
This is an incremental improvement to Iceberg native execution, not full
Iceberg feature coverage.
# What changes are included in this PR?
This PR:
- removes the unconditional fallback for Iceberg scans with
non-`alwaysTrue` residual filters
- extends `IcebergScanPlan` to carry `pruningPredicates`
- extracts Iceberg scan filter expressions and converts a supported
subset into Spark expressions
- converts those Spark expressions into native scan pruning predicates
- passes pruning predicates down through `NativeIcebergTableScanExec`
- keeps unsupported predicates on the upper `NativeFilter` path
- adds integration coverage for:
  - equality-based pruning
  - `IN`-based pruning
  - partial pushdown, where only part of the predicate is pushed to scan
    pruning
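
The partial-pushdown idea above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: the `Expr` types, `PartialPushdown`, `isPrunable`, and `split` are hypothetical stand-ins, and the real change works on Iceberg and Spark expressions. The sketch also assumes that only the non-pushed conjuncts stay above the scan; the description does not say whether the upper `NativeFilter` re-checks pushed predicates as well.

```scala
// Toy expression model. These types are hypothetical stand-ins,
// not Auron, Spark, or Iceberg classes.
sealed trait Expr
final case class Eq(column: String, value: Long) extends Expr
final case class In(column: String, values: Set[Long]) extends Expr
// Stands in for any predicate the scan cannot use for pruning.
final case class Udf(name: String, column: String) extends Expr
final case class And(left: Expr, right: Expr) extends Expr

object PartialPushdown {
  // Flatten nested ANDs into a flat list of conjuncts.
  def conjuncts(e: Expr): List[Expr] = e match {
    case And(l, r) => conjuncts(l) ++ conjuncts(r)
    case other     => List(other)
  }

  // Assumption for this sketch: only Eq and In are usable for pruning.
  def isPrunable(e: Expr): Boolean = e match {
    case _: Eq | _: In => true
    case _             => false
  }

  // Returns (predicates handed to scan pruning, predicates left for the
  // post-scan filter). Order within each list follows the input.
  def split(e: Expr): (List[Expr], List[Expr]) =
    conjuncts(e).partition(isPrunable)
}
```

For a filter such as `id = 1 AND f(name) AND id IN (1, 2)`, `split` would hand the two `id` predicates to scan pruning and leave `f(name)` for the filter above the scan.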
## Supported predicate scope in this PR
The scan-pruning conversion added here supports a limited subset of
Iceberg expressions, including:
- `AND`
- `OR`
- `NOT`
- `IS NULL`
- `IS NOT NULL`
- `IS NAN`
- `NOT NAN`
- comparison predicates such as `=`, `!=`, `<`, `<=`, `>`, `>=`
- `IN`
- `NOT IN`
The current implementation intentionally avoids pushing some types
through scan pruning, including:
- `StringType`
- `BinaryType`
- `DecimalType`
Unsupported predicates are not pushed into scan pruning and are instead
left for post-scan native filtering.
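
A rough sketch of how such type-based gating could look is shown below. Everything in it is hypothetical (`PruningGate`, `toPruning`, and the toy `DataType` and `Pred` models are not taken from the PR), and the treatment of `AND` and `OR` is an assumption based on the general rule that a pruning predicate may be weaker than the full filter but never stricter: one side of an `AND` can be dropped safely, while an `OR` is only usable when both sides convert.

```scala
object PruningGate {
  // Toy type and predicate models; hypothetical, not Spark classes.
  sealed trait DataType
  case object LongType extends DataType
  case object StringType extends DataType
  case object BinaryType extends DataType
  case object DecimalType extends DataType

  sealed trait Pred
  final case class Compare(op: String, column: String, dataType: DataType) extends Pred
  final case class And(left: Pred, right: Pred) extends Pred
  final case class Or(left: Pred, right: Pred) extends Pred

  // Types that the PR description says are not pushed through scan pruning.
  private val excluded: Set[DataType] = Set(StringType, BinaryType, DecimalType)

  // Returns Some(pruning predicate) when at least part of `p` can be used
  // for pruning, or None when nothing can be pushed.
  def toPruning(p: Pred): Option[Pred] = p match {
    case c: Compare =>
      if (excluded(c.dataType)) None else Some(c)
    case And(l, r) =>
      (toPruning(l), toPruning(r)) match {
        case (Some(a), Some(b)) => Some(And(a, b))
        // Dropping one conjunct only loosens pruning, so it stays safe.
        case (a, b)             => a.orElse(b)
      }
    case Or(l, r) =>
      // Both branches must convert; otherwise pruning could skip matching rows.
      for (a <- toPruning(l); b <- toPruning(r)) yield Or(a, b)
  }
}
```

Under these assumptions, `id = 1 AND name = 'x'` would prune on `id = 1` alone, while `id = 1 OR name = 'x'` would not be pushed at all; in both cases the remaining logic is left to the filter above the scan.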
# How was this patch tested?
Integration coverage was added in `AuronIcebergIntegrationSuite`.
`thirdparty/auron-iceberg/src/main/scala/org/apache/spark/sql/execution/auron/plan/NativeIcebergTableScanExec.scala` (3 additions, 4 deletions)
@@ -59,6 +59,7 @@ case class NativeIcebergTableScanExec(basedScan: BatchScanExec, plan: IcebergSca