[Bug] [connector-jdbc] CatalogTable.partitionKeys is empty in JdbcSource #10597

@zhangqs0205

Search before asking

  • I have searched the existing issues and found no similar ones.

What happened

In JdbcSource, the CatalogTable built from JDBC metadata always has empty partitionKeys.

I noticed this while retrieving table metadata, but the issue is more general: JdbcSource does not preserve partition key metadata in the CatalogTable it generates.

Even for a partitioned table, the resulting metadata looks like:

CatalogTable{..., partitionKeys=[], ...}

I checked the JDBC catalog-building code and found that partitionKeys is explicitly set to an empty list in the current implementation.

In AbstractJdbcCatalog#getTable, the CatalogTable is created with an empty partition key list (the fourth argument):

  return CatalogTable.of(
          tableIdentifier,
          tableSchemaBuilder.build(),
          buildConnectorOptions(tablePath),
          Collections.emptyList(),
          "",
          catalogName);

In CatalogUtils, the CatalogTable built from query metadata also gets an empty partition key list:

  return CatalogTable.of(
          tableIdentifier,
          tableSchema,
          new HashMap<>(),
          new ArrayList<>(),
          "",
          catalogName);

Because of this, JdbcSource loses partition metadata in the resulting CatalogTable.
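
One possible direction for the Hive case, as a minimal sketch rather than a proposed patch: Hive's DESCRIBE output lists the partition columns again under a "# Partition Information" marker row, so the key names could be recovered from those rows. The class name, method name, and the representation of rows as String[] {col_name, data_type, comment} are assumptions made for illustration; none of this is existing SeaTunnel API, and other databases would need their own lookup.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of SeaTunnel.
class HivePartitionKeySketch {

    // rows: the result of "DESCRIBE <table>", one String[] per row,
    // assumed to be laid out as {col_name, data_type, comment}.
    public static List<String> partitionKeysFromDescribe(List<String[]> rows) {
        List<String> keys = new ArrayList<>();
        boolean inPartitionSection = false;
        for (String[] row : rows) {
            String name = (row == null || row.length == 0 || row[0] == null)
                    ? ""
                    : row[0].trim();
            if (name.startsWith("# Partition Information")) {
                inPartitionSection = true; // partition columns follow this marker
                continue;
            }
            if (!inPartitionSection || name.isEmpty() || name.startsWith("# col_name")) {
                continue; // regular columns, blank separator rows, or the header row
            }
            if (name.startsWith("#")) {
                break; // another section started (e.g. in DESCRIBE FORMATTED output)
            }
            keys.add(name);
        }
        return keys;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[] {"id", "int", ""},
                new String[] {"name", "string", ""},
                new String[] {"dt", "string", ""},
                new String[] {"", null, null},
                new String[] {"# Partition Information", null, null},
                new String[] {"# col_name", "data_type", "comment"},
                new String[] {"dt", "string", ""});
        System.out.println(partitionKeysFromDescribe(rows)); // prints [dt]
    }
}
```

The returned list could then be passed as the fourth argument of CatalogTable.of in place of Collections.emptyList(). For a non-partitioned table the marker row never appears and the result stays empty, which matches the current behavior.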

In summary, the problem lies in how JdbcSource generates metadata:

  • JdbcSource obtains a CatalogTable from the JDBC catalog
  • the partitionKeys in that CatalogTable are always empty
  • so partition metadata is lost on the JDBC source side

Expected behavior:

  • for partitioned source tables, CatalogTable.partitionKeys should contain the actual partition key columns

Actual behavior:

  • CatalogTable.partitionKeys is empty in JdbcSource
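
To make the expected behavior concrete, here is a small sketch of a guard a fix might apply before filling partitionKeys: keep only discovered keys that name a column in the table schema, in discovery order and without duplicates. The class and method names are hypothetical, and exact (case-sensitive) name matching is an assumption; this is not existing SeaTunnel code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of SeaTunnel.
class PartitionKeyCheckSketch {

    // Returns the discovered keys that are also schema columns,
    // preserving discovery order and dropping duplicates.
    public static List<String> keepKnownColumns(
            List<String> schemaColumns, List<String> discoveredKeys) {
        Set<String> known = new HashSet<>(schemaColumns);
        List<String> result = new ArrayList<>();
        for (String key : discoveredKeys) {
            if (known.contains(key) && !result.contains(key)) {
                result.add(key);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> columns = Arrays.asList("id", "name", "dt");
        List<String> discovered = Arrays.asList("dt", "dt", "unknown_col");
        System.out.println(keepKnownColumns(columns, discovered)); // prints [dt]
    }
}
```

With such a guard, a partitioned table would yield a non-empty partitionKeys list, while a table without partitions would still yield an empty one.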

SeaTunnel Version

2.3.9

SeaTunnel Config

env {
  parallelism = 1
  job.mode = "BATCH"
}

source {
  Jdbc {
    url = "jdbc:hive2://localhost:10000/default"
    driver = "org.apache.hive.jdbc.HiveDriver"
    user = "test"
    password = "test"
    table_path = "default.partitioned_table"
  }
}

sink {
  Console {}
}

Running Command

./bin/seatunnel.sh --config config/jdbc_hive_test.conf

Error Exception

No exception is thrown, but the CatalogTable used by JdbcSource contains:

  CatalogTable{..., partitionKeys=[], ...}

Zeta or Flink or Spark Version

Flink 1.18

Java or Scala Version

Java 8

Screenshots

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct
