Xyseries command implementation by asifabashar · Pull Request #5343 · opensearch-project/sql

asifabashar · 2026-04-14T02:06:10Z

Description

Problem Statement
xyseries command is missing in PPL which is part of todo in roadmap

This RFC proposes adding a new PPL transforming command, xyseries, with syntax and behavior aligned with SPL .

xyseries converts row-oriented grouped results into a wide table where:

One field is the X axis (x-field) and stays as a row key.
One field (y-name-field) provides a part of column name from proivded paramter values to choose from and used as pivot values in conjuction with (y-data-field) field name , where data for this column are pivoted cells that are agrregated value fields.

OpenSearch PPL already supports stats, chart, and timechart, but there is no direct equivalent for SPL xyseries.

Adding xyseries improves:

Query readability for pivot-style result shaping.
Dashboard interoperability for charts requiring wide-table series layout.

Current State
There is no xyseries compatiable command.

Long-Term Goals
Provide xyseries functionality.

Proposal
Summarizes the suggested solution or improvement.

Approach

User Syntax

xyseries  [sep=<string>] [format=<string>] <x-field> <y-name-field> in (<value1>, <value2>, ...) <y-data-field>[,<y-data-field2>

Arguments

x-field (required): row key in output.
y-name-field (required): Number of values used to generate output series column names based on provided in paramter values such as <value1>, <value2> etc
y-data-field... (required, at least one): value field(s) used to fill cells.

Options

Option	Default	Behavior
`sep`	`":"`	Separator between `y-data-field` name and 'in' parameter values , etc.
`format`	None	Naming template for output series columns represented as `$AGG$<sep>$VAL$` . For each data row pivot value for `y-name-field` and $AGG$ (`y-data-field` name). The format parameter inserts the (`y-data-field` name) uses “:” as the built-in separator and then pivot value. $VAL$

If format is omitted:

Single y-data-field: output column is based row number .
Multiple y-data-field: output column is $AGG$<sep>$VAL$ where $VAL$ represent provided pivot values in 'in' operator.

Examples

Actual data:
ppl
 source=weblogs| stats count(host) as host_cnt, count(method) as method_cnt by url, response

host_cnt	method_cnt	response	url
3	3	200	/page1
1	1	404	/page1
5	5	200	/page2
2	2	500	/page2

After xyseries transformation:
ppl

source=weblogs
| stats count(host) as host_cnt, count(method) as method_cnt by url, response
| xyseries url response in ("200","404","500") host_cnt, method_cnt

url	host_cnt: 200	host_cnt: 404	host_cnt: 500	method_cnt: 200	method_cnt: 404	method_cnt: 500
/page1	3	1	null	3	1	null
/page2	5	null	2	5	null	2

If format is not specified, by default, you’d see something like:
url:count(host) and url:count(method) as column names.

If you provide your own separator through the format option, it overrides anything defined with sep. If for example, sep is set to “-”, but format specifies “+”, and because format has higher priority, the “+” is used.
Here, $VAL$ and $AGG$ are placeholders represent and the respectively. In the output, you can see that the name field (url) and the data field count(host) may appear in the position of $VAL$ and $AGG$, depending on how the format string arranges them.

Semantics

Input shape and type rules

Support xyseries only when pivot values are explicitly provided.

x-field and y-name-field must exist in input schema..
y-name-field values are converted to string for output column naming.
y-data-field One or more fields that contain the data to chart. If there are multiple fields specified, separate the field names with commas.

Null handling

Rows with null in a given y-data-field do not contribute a value for that generated series column.

Implementation Plan

Validate required fields/options and defaults.
Project to x, y_name selected from provided pivot values in 'in' operator, and selected y_data fields.
Build generated series column name using format or default naming with sep .
Pivot to wide schema based on provided pivot values passed as parameter.

Today our direction is to keep PPL commands translatable to SQL/Calcite plans as much as possible. We also plan to run PPL on Spark; SQL-native xyseries (with explicit in (...) values) is portable to Spark SQL, while post-processing would require separate Spark-side custom implementation.
Composability: if xyseries is implemented after query execution, it effectively must be terminal and cannot be reliably chained with downstream commands. (Reference: Comments from penghuo )

out of scope

grouping option which does not apply for multifile input in OpenSearch.
Alternative
chart , timeseries can be used for similar use cases but not exactly same.

Limitations

As the rows will be transposed to columns, a limit is required field as calcite planning phase number of rows are not known.

Related Issues

Resolves [ #5142 ]

Check List

[ x] New functionality includes testing.
[x ] New functionality has been documented.
[x ] New functionality has javadoc added.
New functionality has a user manual doc added.
[x ] New PPL command checklist all confirmed.
API changes companion pull request created.
[ x] Commits are signed per the DCO using --signoff or -s.
Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Asif Bashar <asif.bashar@gmail.com>

github-actions · 2026-04-14T02:07:52Z

PR Reviewer Guide 🔍

(Review updated until commit `1858a98`)

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 Multiple PR themes Sub-PR theme: Xyseries AST node, grammar, and parser Relevant files: core/src/main/java/org/opensearch/sql/ast/tree/Xyseries.java core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java core/src/main/java/org/opensearch/sql/analysis/Analyzer.java ppl/src/main/antlr/OpenSearchPPLLexer.g4 ppl/src/main/antlr/OpenSearchPPLParser.g4 ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java ppl/src/test/java/org/opensearch/sql/ppl/parser/AstBuilderTest.java Sub-PR theme: Xyseries Calcite execution, integration tests, and documentation Relevant files: core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteXySeriesCommandIT.java integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java integ-test/src/test/java/org/opensearch/sql/ppl/NewAddedCommandsIT.java integ-test/src/test/java/org/opensearch/sql/security/CrossClusterSearchIT.java docs/user/ppl/cmd/xyseries.md
⚡ Recommended focus areas for review Empty Pivot Values If `pivotValues` is empty (no values in the IN clause), the pivot operation will produce only the x-field column with no pivoted columns. There is no validation or error thrown for an empty pivot values list, which could lead to confusing results or downstream errors. List<String> pivotValues = node.getPivotValues() != null ? node.getPivotValues() : List.of(); String separator = node.getSeparator(); String format = node.getFormat(); // Build the pivot axis - cast to VARCHAR if needed for string comparison RexNode yNameRef = b.field(yNameFieldName); RexNode axis; if (!SqlTypeUtil.isCharacter(yNameRef.getType())) { RelDataType varchar = rx.getTypeFactory() .createTypeWithNullability( rx.getTypeFactory().createSqlType(SqlTypeName.VARCHAR), true); axis = rx.makeCast(varchar, yNameRef, true); } else { axis = yNameRef; } // Build aggregate calls - MAX for each y-data field List<AggCall> aggCalls = yDataFieldNames.stream() .map(name -> b.max(b.field(name)).as(name)) .collect(Collectors.toList()); // Build pivot value entries: alias -> [literal(value)] // LinkedHashMap preserves insertion order for deterministic column ordering LinkedHashMap<String, List<RexNode>> pivotValueMap = new LinkedHashMap<>(); for (String val : pivotValues) { pivotValueMap.put(val, ImmutableList.of(b.literal(val))); } // Execute pivot: decomposes into GROUP BY x-field with FILTER-based aggregation // Produces columns: x-field, {val1}_{agg1}, {val1}_{agg2}, {val2}_{agg1}, ... b.pivot( b.groupKey(b.field(xFieldName)), aggCalls, ImmutableList.of(axis), pivotValueMap.entrySet()); Pivot Column Name Assumption The reorder projection step assumes pivot output columns are named `{value}_{agg}` (line 3458). This relies on Calcite's internal pivot column naming convention, which may not be stable across versions or may differ for special characters in values/field names. If the naming convention changes or values contain underscores, column lookup by `b.field(pivotColName)` will fail at runtime. for (String aggName : yDataFieldNames) { for (String pivotVal : pivotValues) { // Reference pivot output column by its generated name: {value}_{agg} String pivotColName = pivotVal + "_" + aggName; reorderProjections.add(b.field(pivotColName)); reorderNames.add(generateColumnName(aggName, pivotVal, separator, format)); } } Weak Test Assertions The `testXyseriesCommand` tests in `NewAddedCommandsIT` catch `ResponseException` and call `verifyQuery(result)` on the error response, meaning the test passes even if the command fails with an error. The tests do not assert on actual output schema or data values, making them ineffective for validating correctness. public void testXyseriesCommand() throws IOException { JSONObject result; try { result = executeQuery( StringEscapeUtils.escapeJson( String.format( "search source=%s \| stats avg(balance) as avg_balance by gender, state" + " \| xyseries state gender in (\"F\", \"M\") avg_balance", TEST_INDEX_BANK))); } catch (ResponseException e) { result = new JSONObject(TestUtils.getResponseBody(e.getResponse())); } verifyQuery(result); } @Test public void testXyseriesCommandMultipleDataFields() throws IOException { JSONObject result; try { result = executeQuery( StringEscapeUtils.escapeJson( String.format( "search source=%s \| stats avg(balance) as avg_balance, count() as cnt by" + " gender, state \| xyseries state gender in (\"F\", \"M\") avg_balance," + " cnt", TEST_INDEX_BANK))); } catch (ResponseException e) { result = new JSONObject(TestUtils.getResponseBody(e.getResponse())); } verifyQuery(result); } @Test public void testXyseriesCommandWithSep() throws IOException { JSONObject result; try { result = executeQuery( StringEscapeUtils.escapeJson( String.format( "search source=%s \| stats avg(balance) as avg_balance by gender, state" + " \| xyseries sep=\"-\" state gender in (\"F\", \"M\") avg_balance", TEST_INDEX_BANK))); } catch (ResponseException e) { result = new JSONObject(TestUtils.getResponseBody(e.getResponse())); } verifyQuery(result); } @Test public void testXyseriesCommandWithFormat() throws IOException { JSONObject result; try { result = executeQuery( StringEscapeUtils.escapeJson( String.format( "search source=%s \| stats avg(balance) as avg_balance by gender, state" + " \| xyseries format=\"$VAL$_$AGG$\" state gender in (\"F\", \"M\")" + " avg_balance", TEST_INDEX_BANK))); } catch (ResponseException e) { result = new JSONObject(TestUtils.getResponseBody(e.getResponse())); } verifyQuery(result); } Null Separator Handling* In `generateColumnName`, if `format` is null the method uses `separator` directly. However, if `separator` is also null (e.g., due to unexpected AST construction), a NullPointerException will occur. The `Xyseries` AST node does not enforce non-null separator at construction time. private String generateColumnName( String yDataFieldName, String pivotValue, String separator, String format) { if (format != null) { return format.replace("$AGG$", yDataFieldName).replace("$VAL$", pivotValue); } return yDataFieldName + separator + pivotValue; }

github-actions · 2026-04-14T02:08:29Z

PR Code Suggestions ✨

Latest suggestions up to 1858a98

Explore these optional code suggestions:

Category	Suggestion	Impact
Possible issue	Use actual pivot output column names instead of assumed names The pivot column name is constructed as `pivotVal + "_" + aggName`, but Calcite's pivot output column naming convention may differ (e.g., it may use a different delimiter or ordering). If the generated pivot column name doesn't match what Calcite actually produces, `b.field(pivotColName)` will throw a runtime exception. You should verify the exact column names produced by Calcite's pivot operation and use those names, or use positional references instead of name-based lookups. core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [3455-3462] +RelNode pivotRel = b.peek(); +List<String> pivotOutputNames = pivotRel.getRowType().getFieldNames(); +// Skip the x-field (first column) +int colIdx = 1; for (String aggName : yDataFieldNames) { for (String pivotVal : pivotValues) { - // Reference pivot output column by its generated name: {value}_{agg} - String pivotColName = pivotVal + "_" + aggName; + String pivotColName = pivotOutputNames.get(colIdx++); reorderProjections.add(b.field(pivotColName)); reorderNames.add(generateColumnName(aggName, pivotVal, separator, format)); } } Suggestion importance[1-10]: 7 __ Why: The assumption that Calcite's pivot produces columns named `{value}_{agg}` may not hold in all versions or configurations. If the naming convention differs, `b.field(pivotColName)` will throw a runtime exception. Using actual output column names from `pivotRel.getRowType().getFieldNames()` is more robust.	Medium
General	Don't silently swallow query execution failures in tests Catching `ResponseException` and continuing to call `verifyQuery(result)` on an error response means the test will pass even when the query fails with an error. The test should either not catch the exception (letting it fail the test) or explicitly assert that the result is not an error response before verifying it. integ-test/src/test/java/org/opensearch/sql/ppl/NewAddedCommandsIT.java [538-552] -JSONObject result; -try { - result = - executeQuery( - StringEscapeUtils.escapeJson(...)); -} catch (ResponseException e) { - result = new JSONObject(TestUtils.getResponseBody(e.getResponse())); -} +JSONObject result = + executeQuery( + StringEscapeUtils.escapeJson( + String.format( + "search source=%s \| stats avg(balance) as avg_balance by gender, state" + + " \| xyseries state gender in (\"F\", \"M\") avg_balance", + TEST_INDEX_BANK))); verifyQuery(result); Suggestion importance[1-10]: 6 __ Why: Catching `ResponseException` and calling `verifyQuery(result)` on an error response means the test can pass even when the query fails. Removing the try-catch would make the test correctly fail on query errors, improving test reliability.	Low
	Validate that pivot values list is not empty If `pivotValues` is empty, the inner loop body never executes, but `b.project(reorderProjections, reorderNames, true)` will still be called with only the x-field projection. This results in a plan that silently drops all data columns rather than failing with a meaningful error. An empty pivot values list should be validated and rejected early with a clear error message. core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [3409] List<String> pivotValues = node.getPivotValues() != null ? node.getPivotValues() : List.of(); -... -for (String aggName : yDataFieldNames) { - for (String pivotVal : pivotValues) { - String pivotColName = pivotVal + "_" + aggName; - reorderProjections.add(b.field(pivotColName)); +if (pivotValues.isEmpty()) { + throw new SemanticCheckException("xyseries requires at least one pivot value in the IN clause"); +} Suggestion importance[1-10]: 5 __ Why: An empty `pivotValues` list would silently produce a plan with only the x-field column, which is misleading. Adding an early validation with a clear error message improves user experience, though the grammar already requires at least one value in the `IN` clause.	Low
	Guard against null pivot values context If `ctx.xyseriesPivotValues()` returns `null` (e.g., due to a parse error or optional grammar rule), calling `.stringLiteral()` on it will throw a `NullPointerException`. A null check or grammar-level enforcement should be added to handle this case gracefully. ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java [1674-1677] List<String> pivotValues = - ctx.xyseriesPivotValues().stringLiteral().stream() - .map(s -> StringUtils.unquoteText(s.getText())) - .collect(Collectors.toList()); + ctx.xyseriesPivotValues() == null + ? List.of() + : ctx.xyseriesPivotValues().stringLiteral().stream() + .map(s -> StringUtils.unquoteText(s.getText())) + .collect(Collectors.toList()); Suggestion importance[1-10]: 3 __ Why: The grammar rule `xyseriesPivotValues` is required (not optional) in the `xyseriesCommand` rule, so `ctx.xyseriesPivotValues()` should never be null in practice. This is a low-impact defensive check.	Low

Previous suggestions

Suggestions up to commit 13ddcb5

Category	Suggestion	Impact
Possible issue	Avoid fragile pivot column name assumptions The pivot output column name is hardcoded as `pivotVal + "_" + aggName`, but Calcite's `RelBuilder.pivot()` may generate column names in a different format (e.g., `aggName_pivotVal` or with different delimiters depending on the Calcite version). If the generated name doesn't match, `b.field(pivotColName)` will throw a runtime exception. The actual generated column name should be verified against Calcite's pivot output naming convention, or the columns should be referenced by index rather than by name. core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [3457-3459] -// Reference pivot output column by its generated name: {value}_{agg} -String pivotColName = pivotVal + "_" + aggName; -reorderProjections.add(b.field(pivotColName)); +// Reference pivot output column by index to avoid dependency on Calcite's internal naming +// Pivot produces: xField, then for each pivotVal: agg1, agg2, ... +// So column index = 1 + (pivotValIndex * yDataFieldNames.size()) + aggIndex +int pivotValIndex = pivotValues.indexOf(pivotVal); +int aggIndex = yDataFieldNames.indexOf(aggName); +int colIndex = 1 + pivotValIndex * yDataFieldNames.size() + aggIndex; +reorderProjections.add(b.field(colIndex)); Suggestion importance[1-10]: 6 __ Why: The hardcoded `pivotVal + "_" + aggName` naming convention for pivot output columns may not match Calcite's actual generated column names, which could cause runtime exceptions. However, the integration tests appear to pass with this naming, suggesting it may work with the current Calcite version, making this a moderate concern.	Low
	Test silently swallows query execution failures The test silently catches `ResponseException` and continues to call `verifyQuery(result)` on the error response body. This means if the query fails (e.g., because Calcite is not enabled), the test will pass as long as `verifyQuery` doesn't fail on the error JSON, masking real failures. The exception should either be rethrown or the test should assert that no error occurred. integ-test/src/test/java/org/opensearch/sql/ppl/NewAddedCommandsIT.java [538-552] -JSONObject result; -try { - result = - executeQuery( - StringEscapeUtils.escapeJson( - String.format( - "search source=%s \| stats avg(balance) as avg_balance by gender, state" - + " \| xyseries state gender in (\"F\", \"M\") avg_balance", - TEST_INDEX_BANK))); - -} catch (ResponseException e) { - result = new JSONObject(TestUtils.getResponseBody(e.getResponse())); -} +JSONObject result = + executeQuery( + StringEscapeUtils.escapeJson( + String.format( + "search source=%s \| stats avg(balance) as avg_balance by gender, state" + + " \| xyseries state gender in (\"F\", \"M\") avg_balance", + TEST_INDEX_BANK))); verifyQuery(result); Suggestion importance[1-10]: 5 __ Why: The catch block silently swallows `ResponseException` and passes the error JSON to `verifyQuery`, which could mask real test failures. However, looking at the context, this pattern appears intentional to handle both Calcite-enabled and non-Calcite environments, so removing it entirely may break the test's design intent.	Low
	Guard against null separator causing NullPointerException The `separator` field is declared as `final` and set via the constructor, but the default value `": "` is only enforced in the `AstBuilder` parser. If `Xyseries` is constructed directly (e.g., in tests or other code paths) with a `null` separator, `generateColumnName` in `CalciteRelNodeVisitor` will produce a `NullPointerException` when concatenating `yDataFieldName + separator + pivotValue`. A null-check or default fallback should be added in `generateColumnName`. core/src/main/java/org/opensearch/sql/ast/tree/Xyseries.java [3482-3488] -/** Separator between y-data-field name and pivot value in column names. Default ":". */ -private final String separator; +private String generateColumnName( + String yDataFieldName, String pivotValue, String separator, String format) { + if (format != null) { + return format.replace("$AGG$", yDataFieldName).replace("$VAL$", pivotValue); + } + String sep = separator != null ? separator : ": "; + return yDataFieldName + sep + pivotValue; +} Suggestion importance[1-10]: 5 __ Why: The `separator` field could be `null` if `Xyseries` is constructed directly without going through `AstBuilder`, which would cause a `NullPointerException` in `generateColumnName`. Adding a null-check fallback is a valid defensive measure, though the risk is limited since the parser enforces the default.	Low
General	Document aggregation assumption to prevent silent data errors Using `MAX` hardcoded for all y-data fields may produce incorrect results when the upstream data has multiple rows per group (e.g., for `count` fields, `MAX` instead of `SUM` would give wrong answers). The aggregation function used in the pivot should match the semantic intent, or the command should document that it only works correctly when the input already has at most one row per (x-field, y-name-field) combination. core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [3427-3430] +// Note: xyseries assumes input is already aggregated to at most one row per +// (xField, yNameField) group (e.g., output of a preceding stats command). +// MAX is used as a passthrough aggregation; if multiple rows exist per group, +// results may be incorrect. List<AggCall> aggCalls = yDataFieldNames.stream() .map(name -> b.max(b.field(name)).as(name)) .collect(Collectors.toList()); Suggestion importance[1-10]: 2 __ Why: This suggestion only adds a comment to document an existing assumption without changing any code logic. The `improved_code` is functionally identical to the `existing_code`, making it a documentation-only change with minimal impact.	Low

Suggestions up to commit a72ffe1

Category	Suggestion	Impact
Possible issue	Fix tests silently swallowing query failures Silently catching `ResponseException` and continuing to call `verifyQuery` on the error response means the test will pass even when the query fails with an error. This pattern masks failures — if the non-Calcite engine throws `getOnlyForCalciteException`, the test should either assert the error or skip, not silently accept it as a valid result. integ-test/src/test/java/org/opensearch/sql/ppl/NewAddedCommandsIT.java [538-552] JSONObject result; try { result = executeQuery( StringEscapeUtils.escapeJson(...)); + verifyQuery(result); } catch (ResponseException e) { result = new JSONObject(TestUtils.getResponseBody(e.getResponse())); + // xyseries is only supported with Calcite; verify the error is the expected one + Assert.assertTrue( + result.optString("details", "").contains("xyseries") + \|\| result.optString("type", "").contains("SemanticCheckException")); } -verifyQuery(result); Suggestion importance[1-10]: 5 __ Why: The concern is valid — catching `ResponseException` and calling `verifyQuery` on an error response could mask failures. However, the `improved_code` uses `...` placeholders and doesn't accurately reflect the actual code structure, making it hard to evaluate precisely. The pattern of accepting both success and expected error responses is intentional for non-Calcite engine compatibility tests.	Low
Possible issue	Guard against missing pivot column references The pivot output column name is hardcoded as `{value}_{agg}` with an underscore separator, but this naming convention depends on Calcite's internal pivot implementation and may not always match. If Calcite uses a different delimiter or ordering, `b.field(pivotColName)` will throw a runtime exception. The actual generated column name should be verified against Calcite's pivot output, or the column should be referenced by position rather than by name. core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [3457-3460] // Reference pivot output column by its generated name: {value}_{agg} +// Calcite pivot generates columns as: {value}_{agg} String pivotColName = pivotVal + "_" + aggName; -reorderProjections.add(b.field(pivotColName)); +try { + reorderProjections.add(b.field(pivotColName)); +} catch (IllegalArgumentException e) { + throw new IllegalStateException( + "Pivot column '" + pivotColName + "' not found in pivot output. " + + "Available fields: " + b.peek().getRowType().getFieldNames(), e); +} Suggestion importance[1-10]: 4 __ Why: The concern about Calcite's internal pivot column naming is valid, but wrapping with a try-catch for `IllegalArgumentException` is not the cleanest approach. The improved code adds error handling that would help diagnose issues, but this is a minor defensive improvement rather than a critical bug fix.	Low
General	Throw on unsupported field expression types The fallback `expr.toString()` for unrecognized expression types will silently produce an incorrect field name (e.g., the full expression string), causing a runtime failure when `b.field(...)` is called with that name. An explicit exception should be thrown for unsupported expression types to surface the problem clearly. core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [3472-3480] private String resolveFieldName(UnresolvedExpression expr) { if (expr instanceof Field) { return ((Field) expr).getField().toString(); } if (expr instanceof Alias) { return ((Alias) expr).getName(); } - return expr.toString(); + throw new IllegalArgumentException( + "Unsupported expression type for xyseries field resolution: " + + expr.getClass().getSimpleName() + " (" + expr + ")"); } Suggestion importance[1-10]: 5 __ Why: Replacing the silent `expr.toString()` fallback with an explicit exception is a good defensive practice that surfaces bugs early. The improved code accurately reflects the change and would make debugging easier when unsupported expression types are encountered.	Low
General	Validate non-empty pivot values list If `pivotValues` is empty, the pivot operation will produce no pivoted columns, and the subsequent `b.field(pivotColName)` calls in the reorder loop will never execute — but the `b.pivot(...)` call itself may fail or produce an unexpected result. An explicit check for an empty pivot values list should be added to provide a clear error message. core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [3409-3437] List<String> pivotValues = node.getPivotValues(); +if (pivotValues == null \|\| pivotValues.isEmpty()) { + throw new IllegalArgumentException("xyseries requires at least one pivot value in the IN clause"); +} ... LinkedHashMap<String, List<RexNode>> pivotValueMap = new LinkedHashMap<>(); for (String val : pivotValues) { pivotValueMap.put(val, ImmutableList.of(b.literal(val))); } Suggestion importance[1-10]: 4 __ Why: Adding a guard for empty `pivotValues` is a reasonable defensive check, but the grammar already requires at least one `stringLiteral` in `xyseriesPivotValues`, making this scenario unlikely in practice. The `improved_code` also uses `...` placeholders which don't accurately reflect the actual code.	Low

Suggestions up to commit a72ffe1

Category	Suggestion	Impact
Possible issue	Avoid fragile hardcoded pivot column name assumption The pivot output column name is hardcoded as `{value}_{agg}` with an underscore separator, but this naming convention depends on Calcite's internal pivot implementation and may not always match. If Calcite uses a different delimiter or ordering, `b.field(pivotColName)` will throw a runtime exception. The actual generated column name should be verified against Calcite's pivot output, or the column should be referenced by position rather than by name. core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [3457-3459] // Reference pivot output column by its generated name: {value}_{agg} -String pivotColName = pivotVal + "_" + aggName; -reorderProjections.add(b.field(pivotColName)); +// Calcite pivot generates columns in order: for each pivot value, for each agg +// Use positional reference to avoid dependency on Calcite's internal naming +int pivotColIndex = 1 + yDataFieldNames.indexOf(aggName) + pivotValues.indexOf(pivotVal) * yDataFieldNames.size(); +reorderProjections.add(b.field(pivotColIndex)); Suggestion importance[1-10]: 7 __ Why: The hardcoded `{value}_{agg}` naming convention for pivot output columns depends on Calcite's internal implementation. If Calcite changes its naming or uses a different delimiter, `b.field(pivotColName)` will throw a runtime exception. However, the improved code using positional indexing also has potential issues with ordering assumptions, so the score is moderate.	Medium
Possible issue	Silent exception swallowing masks test failures Catching `ResponseException` and continuing to call `verifyQuery` on the error response silently masks test failures. If the query fails (e.g., because Calcite is not enabled or the command is unsupported), the test will pass with an error response body instead of failing. The exception should either be rethrown or the test should assert that no error occurred. integ-test/src/test/java/org/opensearch/sql/ppl/NewAddedCommandsIT.java [538-552] -JSONObject result; -try { - result = - executeQuery( - StringEscapeUtils.escapeJson(...)); -} catch (ResponseException e) { - result = new JSONObject(TestUtils.getResponseBody(e.getResponse())); -} +JSONObject result = + executeQuery( + StringEscapeUtils.escapeJson( + String.format( + "search source=%s \| stats avg(balance) as avg_balance by gender, state" + + " \| xyseries state gender in (\"F\", \"M\") avg_balance", + TEST_INDEX_BANK))); verifyQuery(result); Suggestion importance[1-10]: 6 __ Why: Catching `ResponseException` and continuing with `verifyQuery` on the error response body silently masks test failures. If the query fails, the test would pass incorrectly, reducing test reliability. However, the `existing_code` in the suggestion uses a placeholder `...` rather than the actual code, making the match imprecise.	Low
General	Throw explicit error for unsupported expression types The fallback `expr.toString()` for unrecognized expression types will likely produce an incorrect field name (e.g., the full expression string representation), causing a runtime failure when `b.field(...)` is called with that name. An explicit error should be thrown for unsupported expression types to provide a clear and actionable error message. core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [3472-3480] private String resolveFieldName(UnresolvedExpression expr) { if (expr instanceof Field) { return ((Field) expr).getField().toString(); } if (expr instanceof Alias) { return ((Alias) expr).getName(); } - return expr.toString(); + throw new IllegalArgumentException( + "xyseries: unsupported expression type for field resolution: " + + expr.getClass().getSimpleName() + " (" + expr + ")"); } Suggestion importance[1-10]: 5 __ Why: The fallback `expr.toString()` in `resolveFieldName` could produce incorrect field names for unsupported expression types, leading to confusing runtime errors. Throwing an explicit `IllegalArgumentException` would provide a clearer error message, improving debuggability.	Low
General	Document aggregation assumption for pivot correctness Using `MAX` as the aggregation function for all y-data fields is incorrect in the general case. The input to `xyseries` is already aggregated (typically from a preceding `stats` command), so the pivot should use the actual value rather than applying `MAX`. This could produce wrong results if multiple rows share the same x-field and pivot value combination, and silently discards data. Consider using `MAX` only as a fallback and documenting this assumption clearly, or using a more appropriate aggregation like `ANY_VALUE` if supported. core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [3427-3430] +// Use MAX as the pivot aggregation - input is expected to be pre-aggregated by stats, +// so each (xField, yNameField) combination should have at most one row. +// MAX is used as a standard SQL-compatible way to select the single value. List<AggCall> aggCalls = yDataFieldNames.stream() .map(name -> b.max(b.field(name)).as(name)) .collect(Collectors.toList()); Suggestion importance[1-10]: 2 __ Why: The suggestion only adds comments to document the existing `MAX` aggregation behavior without changing the logic. This is a documentation-only improvement with minimal functional impact.	Low

Suggestions up to commit c7b7536

Category	Suggestion	Impact
Possible issue	Resolve unresolved merge conflict in code The code contains unresolved Git merge conflict markers (`<<<<<<< HEAD`, `=======`, `>>>>>>> 59ad4dc...`). This will cause a compilation failure. The conflict must be resolved by choosing one of the two implementations (or merging them) and removing all conflict markers before merging this PR. core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [3457-3473] for (String aggName : yDataFieldNames) { for (String pivotVal : pivotValues) { -<<<<<<< HEAD - // CASE WHEN CAST(y_name_field AS VARCHAR) = 'pivotVal' THEN y_data_field ELSE NULL END - RexNode condition = b.call(SqlStdOperatorTable.EQUALS, yNameAsString, b.literal(pivotVal)); - RexNode dataRef = b.field(yDataFieldName); - RexNode caseExpr = b.call(SqlStdOperatorTable.CASE, condition, dataRef, b.literal(null)); - - // Generate the output column name - String colName = generateColumnName(yDataFieldName, pivotVal, separator, format); - projections.add(b.alias(caseExpr, colName)); - projectionNames.add(colName); -======= - // Reference pivot output column by its generated name: {value}_{agg} - String pivotColName = pivotVal + "_" + aggName; - reorderProjections.add(b.field(pivotColName)); - reorderNames.add(generateColumnName(aggName, pivotVal, separator, format)); ->>>>>>> 59ad4dc48ce98b5121320f8a226e6148a9386e27 + // Reference pivot output column by its generated name: {value}_{agg} + String pivotColName = pivotVal + "_" + aggName; + reorderProjections.add(b.field(pivotColName)); + reorderNames.add(generateColumnName(aggName, pivotVal, separator, format)); } } Suggestion importance[1-10]: 10 __ Why: The code contains unresolved Git merge conflict markers (`<<<<<<< HEAD`, `=======`, `>>>>>>>`), which will cause a compilation failure. This is a critical issue that must be resolved before the PR can be merged.	High
Possible issue	Silent exception swallowing hides test failures The tests silently catch `ResponseException` and continue to call `verifyQuery(result)` on the error response body. This means if the `xyseries` command throws an error (e.g., because Calcite is not enabled), the test will not fail but will instead verify the error JSON, potentially masking real failures. The exception should be re-thrown or the test should assert that no exception occurred. integ-test/src/test/java/org/opensearch/sql/ppl/NewAddedCommandsIT.java [538-552] -JSONObject result; -try { - result = - executeQuery( - StringEscapeUtils.escapeJson(...)); -} catch (ResponseException e) { - result = new JSONObject(TestUtils.getResponseBody(e.getResponse())); -} +JSONObject result = + executeQuery( + StringEscapeUtils.escapeJson( + String.format( + "search source=%s \| stats avg(balance) as avg_balance by gender, state" + + " \| xyseries state gender in (\"F\", \"M\") avg_balance", + TEST_INDEX_BANK))); verifyQuery(result); Suggestion importance[1-10]: 6 __ Why: The `catch (ResponseException e)` block silently swallows errors and continues to call `verifyQuery(result)` on the error response, which can mask real test failures. However, this pattern appears to be intentional for testing both Calcite-enabled and non-Calcite scenarios, so the impact depends on the test's intent.	Low
General	Hardcoded MAX aggregation may fail for non-numeric types The pivot always uses `MAX` as the aggregation function regardless of the original aggregation applied in the upstream `stats` command. Since the pivot is applied after the `stats` aggregation, the y-data fields already contain aggregated values. Using `MAX` is a reasonable choice here, but if the upstream aggregation produces non-numeric types (e.g., strings), `MAX` may fail or produce unexpected results. Consider using a more neutral aggregation like `ANY_VALUE` or `FIRST_VALUE` to avoid type-related issues. core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [3427-3431] -// Build aggregate calls - MAX for each y-data field +// Build aggregate calls - use ANY_VALUE since data is already aggregated upstream List<AggCall> aggCalls = yDataFieldNames.stream() - .map(name -> b.max(b.field(name)).as(name)) + .map(name -> b.aggregateCall(SqlStdOperatorTable.ANY_VALUE, b.field(name)).as(name)) .collect(Collectors.toList()); Suggestion importance[1-10]: 4 __ Why: Using `MAX` as the pivot aggregation is a reasonable choice since data is already aggregated upstream, but it could fail for non-numeric types. The suggestion to use `ANY_VALUE` is valid but the `improved_code` references `SqlStdOperatorTable.ANY_VALUE` which may not exist in all Calcite versions, making this a minor concern.	Low
General	Mutable list field allows unintended external modification The `pivotValues` list is declared as `final` but its contents can still be mutated if a mutable list (e.g., `ArrayList`) is passed in. Since `Xyseries` is an AST node that should be immutable after construction, consider wrapping the list in `Collections.unmodifiableList()` or using `ImmutableList.copyOf()` in the constructor to prevent external mutation. core/src/main/java/org/opensearch/sql/ast/tree/Xyseries.java [35-36] /** Explicit pivot values from the IN (...) clause. */ private final List<String> pivotValues; +// In the constructor or via a defensive copy: +// this.pivotValues = ImmutableList.copyOf(pivotValues); + Suggestion importance[1-10]: 3 __ Why: While the immutability concern is valid, the `improved_code` is essentially the same as `existing_code` with only a comment added, making it a low-impact suggestion. The fix would need to be applied in the constructor, not just as a comment.	Low

Suggestions up to commit 59ad4dc

Category	Suggestion	Impact
Possible issue	Remove silent exception swallowing in tests The test silently catches `ResponseException` and continues to call `verifyQuery(result)` on the error response body. This means if the query fails (e.g., because Calcite is not enabled), the test will pass as long as `verifyQuery` doesn't fail on the error JSON, masking real failures. The exception should either be rethrown or the test should assert that no exception occurred. integ-test/src/test/java/org/opensearch/sql/ppl/NewAddedCommandsIT.java [538-552] -JSONObject result; -try { - result = - executeQuery( - StringEscapeUtils.escapeJson( - String.format( - "search source=%s \| stats avg(balance) as avg_balance by gender, state" - + " \| xyseries state gender in (\"F\", \"M\") avg_balance", - TEST_INDEX_BANK))); - -} catch (ResponseException e) { - result = new JSONObject(TestUtils.getResponseBody(e.getResponse())); -} +JSONObject result = + executeQuery( + StringEscapeUtils.escapeJson( + String.format( + "search source=%s \| stats avg(balance) as avg_balance by gender, state" + + " \| xyseries state gender in (\"F\", \"M\") avg_balance", + TEST_INDEX_BANK))); verifyQuery(result); Suggestion importance[1-10]: 6 __ Why: The pattern of catching `ResponseException` and calling `verifyQuery` on the error response is a real testing anti-pattern that can mask failures. The suggestion to remove the try-catch is valid and would make the tests more reliable, though the existing pattern appears intentional to handle both Calcite-enabled and non-Calcite environments.	Low
Possible issue	Fix hardcoded pivot column name ordering The pivot output column name is hardcoded as `pivotVal + "_" + aggName`, but the actual column name generated by Calcite's `pivot()` may differ (e.g., it could be `aggName_pivotVal` or use a different separator). If the generated column name doesn't match, `b.field(pivotColName)` will throw a runtime exception. You should verify the exact column naming convention used by Calcite's RelBuilder pivot and align accordingly, or look up the column by index rather than by name. core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [3457-3459] // Reference pivot output column by its generated name: {value}_{agg} -String pivotColName = pivotVal + "_" + aggName; +// Calcite pivot generates columns as: {agg}_{value} - verify this matches your Calcite version +String pivotColName = aggName + "_" + pivotVal; reorderProjections.add(b.field(pivotColName)); Suggestion importance[1-10]: 5 __ Why: The suggestion raises a valid concern about whether Calcite's `pivot()` generates columns as `pivotVal_aggName` or `aggName_pivotVal`. However, the integration tests in `CalciteXySeriesCommandIT` appear to pass with the current naming, suggesting the existing code may already be correct for the Calcite version in use. The suggestion to verify is reasonable but the proposed fix (reversing the order) may itself be wrong.	Low
General	Guard against missing child node The `getChild()` method returns an empty list when `child` is null, but `visitChildren` in the visitor may rely on having a child to traverse. If `Xyseries` is visited before `attach()` is called (e.g., during validation or early traversal), this could silently skip child processing. Consider adding a null-check guard or ensuring `attach()` is always called before visiting. core/src/main/java/org/opensearch/sql/ast/tree/Xyseries.java [55-58] @Override public List<UnresolvedPlan> getChild() { - return this.child == null ? ImmutableList.of() : ImmutableList.of(this.child); + if (this.child == null) { + throw new IllegalStateException("Xyseries node has no child attached"); + } + return ImmutableList.of(this.child); } Suggestion importance[1-10]: 2 __ Why: Other `UnresolvedPlan` nodes in the codebase (like `Chart`) follow the same pattern of returning an empty list when `child` is null, making this a consistent design choice. Throwing an exception would break the existing pattern and could cause issues during early traversal phases.	Low
General	Document implicit MAX aggregation assumption Using `MAX` as the aggregation function for all y-data fields is a hardcoded assumption. The xyseries command is typically applied after a `stats` command that has already aggregated the data, so the pivot should use `MAX` (or `MIN`) to select the single non-null value per group. However, if the upstream data has multiple rows per (x-field, pivot-value) combination, silently using `MAX` could produce incorrect results without any warning to the user. This assumption should at least be documented clearly in the code. core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [3427-3430] +// Use MAX to select the single non-null value per (x-field, pivot-value) group. +// xyseries is expected to be applied after stats, so each group should have at most one row. List<AggCall> aggCalls = yDataFieldNames.stream() .map(name -> b.max(b.field(name)).as(name)) .collect(Collectors.toList()); Suggestion importance[1-10]: 2 __ Why: This is purely a documentation/comment suggestion with no code change. The `improved_code` only adds a comment, which doesn't warrant a high score per the scoring guidelines.	Low

Signed-off-by: Asif Bashar <asif.bashar@gmail.com>

github-actions · 2026-04-14T02:32:01Z

Persistent review updated to latest commit 59ad4dc

Signed-off-by: Asif Bashar <asif.bashar@gmail.com>

github-actions · 2026-04-14T18:15:54Z

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit c7b7536.

Path	Line	Severity	Description
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java	3430	medium	Unresolved merge conflict markers (<<<<<<< HEAD / ======= / >>>>>>>) left in production code. The HEAD branch block references undefined variables (yNameAsString, yDataFieldName, projections, projectionNames) that do not exist in the surrounding scope. If this somehow bypassed compilation checks, the dead code block could produce unexpected runtime behavior. This appears to be an accidental inclusion of conflicting code rather than intentional malice, but warrants investigation to ensure the correct implementation was chosen and no unintended logic was merged.

The table above displays the top 10 most important findings.

Total: 1 | Critical: 0 | High: 0 | Medium: 1 | Low: 0

Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.

⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

github-actions · 2026-04-14T18:17:04Z

Persistent review updated to latest commit c7b7536

Signed-off-by: Asif Bashar <asif.bashar@gmail.com>

github-actions · 2026-04-14T18:25:42Z

Persistent review updated to latest commit a72ffe1

github-actions · 2026-04-14T18:26:34Z

Persistent review updated to latest commit a72ffe1

github-actions · 2026-04-14T18:28:05Z

Persistent review updated to latest commit 13ddcb5

Signed-off-by: Asif Bashar <asif.bashar@gmail.com>

github-actions · 2026-04-14T18:34:50Z

Persistent review updated to latest commit 1858a98

asifabashar added 4 commits April 13, 2026 16:15

xyseries command implementation

e550a17

Signed-off-by: Asif Bashar <asif.bashar@gmail.com>

xyseries implementation

349378e

Signed-off-by: Asif Bashar <asif.bashar@gmail.com>

xyseries implementation

6483d6e

Signed-off-by: Asif Bashar <asif.bashar@gmail.com>

xyseries implementation

ecc60f7

Signed-off-by: Asif Bashar <asif.bashar@gmail.com>

xyseries implementation

59ad4dc

Signed-off-by: Asif Bashar <asif.bashar@gmail.com>

asifabashar added 2 commits April 14, 2026 11:10

xyseries implementation

7f57b6c

Signed-off-by: Asif Bashar <asif.bashar@gmail.com>

merge incoming changes

c7b7536

Signed-off-by: Asif Bashar <asif.bashar@gmail.com>

asifabashar added 2 commits April 14, 2026 11:24

merge incoming changes

c264365

Signed-off-by: Asif Bashar <asif.bashar@gmail.com>

merge incoming changes

a72ffe1

Signed-off-by: Asif Bashar <asif.bashar@gmail.com>

Merge remote-tracking branch 'oss/main' into xyseries2

13ddcb5

merge incoming changes

1858a98

Signed-off-by: Asif Bashar <asif.bashar@gmail.com>

asifabashar marked this pull request as ready for review April 14, 2026 18:41

asifabashar requested review from Swiddis, acarbonetto, anirudha, dai-chen, joshuali925, mengweieric, penghuo, ps48 and vamsimanohar as code owners April 14, 2026 18:41

asifabashar requested review from LantaoJin, RyanL1997, ahkcs, noCharger, qianheng-aws, songkant-aws, ykmr1224 and yuancu as code owners April 14, 2026 18:41

asifabashar changed the title ~~Xyseries2~~ Xyseries command implementation Apr 14, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Xyseries command implementation#5343

Xyseries command implementation#5343
asifabashar wants to merge 11 commits intoopensearch-project:mainfrom
asifabashar:xyseries2

asifabashar commented Apr 14, 2026 •

edited

Loading

Uh oh!

github-actions bot commented Apr 14, 2026 •

edited

Loading

Uh oh!

github-actions bot commented Apr 14, 2026 •

edited

Loading

Uh oh!

github-actions bot commented Apr 14, 2026

Uh oh!

github-actions bot commented Apr 14, 2026

Uh oh!

github-actions bot commented Apr 14, 2026

Uh oh!

github-actions bot commented Apr 14, 2026

Uh oh!

github-actions bot commented Apr 14, 2026

Uh oh!

github-actions bot commented Apr 14, 2026

Uh oh!

github-actions bot commented Apr 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

asifabashar commented Apr 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

User Syntax

Arguments

Options

Examples

Semantics

Input shape and type rules

Null handling

Implementation Plan

out of scope

Limitations

Related Issues

Check List

Uh oh!

github-actions bot commented Apr 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Reviewer Guide 🔍

(Review updated until commit 1858a98)

Uh oh!

github-actions bot commented Apr 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Code Suggestions ✨

Previous suggestions

Uh oh!

github-actions bot commented Apr 14, 2026

Uh oh!

github-actions bot commented Apr 14, 2026

PR Code Analyzer ❗

Uh oh!

github-actions bot commented Apr 14, 2026

Uh oh!

github-actions bot commented Apr 14, 2026

Uh oh!

github-actions bot commented Apr 14, 2026

Uh oh!

github-actions bot commented Apr 14, 2026

Uh oh!

github-actions bot commented Apr 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

asifabashar commented Apr 14, 2026 •

edited

Loading

github-actions bot commented Apr 14, 2026 •

edited

Loading

(Review updated until commit `1858a98`)

github-actions bot commented Apr 14, 2026 •

edited

Loading