[Improve][Zeta] Improve engine-server local IDE startup experience (missing Gson dependency) #10730

@nzw921rx

Search before asking

  • I have searched the existing feature requests and found no similar one.

Description

What are you trying to achieve

Enable developers to run and debug SeaTunnelServerStarter directly from IntelliJ (using the module classpath of seatunnel-engine-server) without hitting avoidable NoClassDefFoundError failures.

Currently, local IDE startup fails with two sequential errors before any job is submitted. These errors do not occur when starting via seatunnel-cluster.sh, which uses a full lib/* classpath.


Problem 1: Missing com.google.gson dependency

After configuring localfile checkpoint storage and adding -Dseatunnel.config as a VM option, starting SeaTunnelServerStarter fails immediately:

ERROR com.hazelcast.spi.impl.servicemanager.impl.ServiceManagerImpl - Error while initializing service: com/google/gson/GsonBuilder
java.lang.NoClassDefFoundError: com/google/gson/GsonBuilder
    at org.apache.seatunnel.engine.server.rest.servlet.PendingJobsServlet.<clinit>(PendingJobsServlet.java:60)
    at org.apache.seatunnel.engine.server.JettyService.createJettyServer(JettyService.java:185)
    at org.apache.seatunnel.engine.server.SeaTunnelServer.init(SeaTunnelServer.java:179)
    ...
Caused by: java.lang.ClassNotFoundException: com.google.gson.GsonBuilder
    at java.net.URLClassLoader.findClass(URLClassLoader.java:387)

Root cause: the seatunnel-engine-server main code imports com.google.gson.* (in PendingJobsServlet, BaseServlet, and RestHttpGetCommandProcessor), but gson is not declared as a direct dependency in seatunnel-engine-server/pom.xml. In distribution builds it arrives transitively; on the IDE module classpath it is missing.

Local workaround: Manually add com.google.code.gson:gson:2.8.9 to the IDE run configuration classpath.
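A quick way to confirm which classes the IDE run configuration is missing is to probe for them before starting the server. The following is a minimal sketch, not part of SeaTunnel: the class name ClasspathProbe is hypothetical, and the two probed class names are taken from the stack traces in this issue.

```java
public final class ClasspathProbe {
    private ClasspathProbe() {}

    /** Returns true if the named class can be located; the class is not initialized. */
    public static boolean isPresent(String className) {
        try {
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class names copied from the NoClassDefFoundError traces in this issue.
        String[] required = {
            "com.google.gson.GsonBuilder",
            "org.apache.hadoop.fs.FileSystem"
        };
        for (String name : required) {
            System.out.println(name + " -> " + (isPresent(name) ? "present" : "MISSING"));
        }
    }
}
```

Running main with the same module classpath as SeaTunnelServerStarter should report which of the two classes cannot be resolved in that run configuration.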


Problem 2: Unconditional Hadoop class loading in SeaTunnelServer#init

After resolving Gson, a second startup failure occurs:

ERROR com.hazelcast.spi.impl.servicemanager.impl.ServiceManagerImpl - Error while initializing service: org/apache/hadoop/fs/FileSystem$Statistics
java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FileSystem$Statistics
    at org.apache.seatunnel.engine.server.SeaTunnelServer.init(SeaTunnelServer.java:184)
    ...
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.FileSystem$Statistics
    at java.net.URLClassLoader.findClass(URLClassLoader.java:387)

Root cause: SeaTunnelServer.java lines 182-184 unconditionally load a Hadoop class:

// a trick way to fix StatisticsDataReferenceCleaner thread class loader leak.
// see https://issues.apache.org/jira/browse/HADOOP-19049
FileSystem.Statistics statistics = new FileSystem.Statistics("SeaTunnel");

This executes on every master init, regardless of whether checkpoint storage is localfile or hdfs. It forces loading of org.apache.hadoop.fs.FileSystem, which requires the Hadoop jars. The IDE module classpath typically does not include seatunnel-hadoop3-3.1.4-uber (declared with provided scope in the pom).

Local workaround: Comment out this line:

// FileSystem.Statistics statistics = new FileSystem.Statistics("SeaTunnel");

With this line commented out, engine-server starts successfully in the IDE with localfile checkpoint storage.


Proposed fix

Fix 1: Add explicit Gson dependency

Add gson as a direct compile dependency in seatunnel-engine-server/pom.xml:

<dependency>
    <groupId>com.google.code.gson</groupId>
    <artifactId>gson</artifactId>
</dependency>

This aligns the pom declaration with actual imports and ensures IDE module classpath is self-contained.

Fix 2: Make HADOOP-19049 workaround conditional based on checkpoint storage type

The SeaTunnelServer#init method already has access to seaTunnelConfig. The checkpoint storage type can be read via:

seaTunnelConfig.getEngineConfig().getCheckpointConfig().getStorage().getStorage()

which returns "localfile" or "hdfs" (default is "localfile").

The fix should only execute the HADOOP-19049 workaround when the configured storage type actually requires Hadoop:

// a trick way to fix StatisticsDataReferenceCleaner thread class loader leak.
// see https://issues.apache.org/jira/browse/HADOOP-19049
String storageType = seaTunnelConfig.getEngineConfig()
        .getCheckpointConfig().getStorage().getStorage();
if (!"localfile".equalsIgnoreCase(storageType)) {
    FileSystem.Statistics statistics = new FileSystem.Statistics("SeaTunnel");
}

  • When checkpoint storage is hdfs (production / distribution): workaround executes as before, no behavior change.
  • When checkpoint storage is localfile (local IDE debugging): skips Hadoop class loading, engine-server starts without Hadoop jars.
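The decision described above can be isolated into a small helper so it is easy to unit test. This is only a sketch under stated assumptions: HadoopWorkaroundGuard, requiresHadoop, and applyIfNeeded are hypothetical names that do not exist in SeaTunnel. The reflective call is an alternative to the direct `new FileSystem.Statistics(...)` proposed in this issue, shown here because it avoids any compile-time link to Hadoop in the calling class. It assumes the public FileSystem.Statistics(String) constructor.

```java
public final class HadoopWorkaroundGuard {
    private HadoopWorkaroundGuard() {}

    /** Mirrors the proposed condition: anything other than "localfile" is treated as needing Hadoop. */
    public static boolean requiresHadoop(String storageType) {
        return !"localfile".equalsIgnoreCase(storageType);
    }

    /**
     * Applies the HADOOP-19049 workaround reflectively when the storage type requires Hadoop.
     * Returns true if the Statistics instance was created, false if skipped or Hadoop is absent.
     */
    public static boolean applyIfNeeded(String storageType) {
        if (!requiresHadoop(storageType)) {
            return false; // localfile: never touch Hadoop classes
        }
        try {
            Class<?> stats = Class.forName("org.apache.hadoop.fs.FileSystem$Statistics");
            stats.getConstructor(String.class).newInstance("SeaTunnel");
            return true;
        } catch (ReflectiveOperationException | LinkageError e) {
            // Hadoop is not on the classpath or failed to link; a real implementation
            // would likely log this rather than ignore it silently.
            return false;
        }
    }
}
```

Note one design difference from the snippet in this issue: the reflective variant returns false instead of failing when Hadoop is missing under hdfs storage, which may or may not be desirable for production startup.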

Benefits

  • Fast local startup: Developers can run and debug SeaTunnelServerStarter from IDE without the full distribution classpath or Hadoop uber jars
  • Correct dependency declaration: gson as direct dependency matches actual imports, eliminates flaky transitive-only resolution

Environment

Item Value
Checkpoint type: localfile

Usage Scenario

No response

Related issues

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!
