
[Feature][Zeta] Add Master and Worker split mode deployment #6947

Open
wants to merge 8 commits into base: dev
Conversation

@EricJoy2048 (Member) commented Jun 5, 2024

Purpose of this pull request

The Zeta Master service is separated from the Worker service, so each role can be deployed on its own nodes.

Does this PR introduce any user-facing change?

Yes. Since this PR, users can start a Zeta cluster node with `-r <role>`, where the role can be `master_and_worker`, `master`, or `worker`. More information can be found in the documentation added in this PR.

How was this patch tested?

Check list

@EricJoy2048 changed the title from "240329 test split master worker" to "[Feature][Zeta] Add Master and Worker split mode deployment" on Jun 5, 2024
@EricJoy2048 force-pushed the 240329_test_split_master_worker branch 13 times, most recently from 3cb8283 to 47f5f24 on June 11, 2024 13:09
@EricJoy2048 force-pushed the 240329_test_split_master_worker branch 16 times, most recently from ea34249 to c631ef7 on June 15, 2024 06:33
@EricJoy2048 force-pushed the 240329_test_split_master_worker branch 2 times, most recently from 1d1a58b to 94ff17c on June 15, 2024 13:24
@EricJoy2048 force-pushed the 240329_test_split_master_worker branch 4 times, most recently from 3c1378f to 324972a on June 18, 2024 11:57
@EricJoy2048 added this to the 2.3.6 milestone on Jun 18, 2024
@gitfortian (Contributor) left a comment:

great

@EricJoy2048 force-pushed the 240329_test_split_master_worker branch from 12b212c to 7b6882f on June 27, 2024 02:43
@dailai (Contributor) left a comment:

LGTM if CI passes.

@@ -0,0 +1,70 @@
---

The Master service and Worker service of SeaTunnel Engine run mixed in the same process, and every node can both run jobs and participate in the election to become master; that is, the master node also runs tasks at the same time. In this mode, the IMap data (which stores task state information to support task fault tolerance) is distributed across all nodes.

Usage Recommendation: It is recommended to use the [separated cluster mode](separated-cluster-deployment.md). In the hybrid cluster mode, the Master node also runs tasks, and when the task scale is large this affects the stability of the Master node. Once the Master node crashes or its heartbeat times out, a master switch occurs, and the switch forces all running jobs to perform fault tolerance, further increasing the load on the cluster. Therefore, we recommend using the [separated cluster mode](separated-cluster-deployment.md).
Member left a comment:

Can we mark the separated cluster mode as an experimental feature? When it is production-ready, we can change it to the recommended mode.

# Other configurations
```

### 4.2_slot configuration
Member left a comment:

Suggested change
### 4.2_slot configuration
### 4.2 Slot Configuration

@@ -38,6 +38,12 @@ public class ServerCommandArgs extends CommandArgs {
description = "The cluster daemon mode")
private boolean daemonMode = false;

@Parameter(
names = {"-r", "--rule"},
Member left a comment:

Suggested change
names = {"-r", "--rule"},
names = {"-r", "--role"},

@@ -0,0 +1,295 @@
/*
Member left a comment:

Can we add some comment blocks to quickly locate the code we have modified, so that we will not lose these changes when upgrading in the future?
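
The reviewer's concern suggests the new file copies an upstream class. One common convention for such markers (illustrative only; the field between the markers is a placeholder, not the PR's actual change) is to bracket every divergence from the copied upstream code:

```java
// ---- SeaTunnel modification begin (#6947: master/worker split mode) ----
// Any added or edited lines live between these markers, so a diff against the
// upstream class shows exactly what must be re-applied on upgrade.
private boolean splitModeEnabled = false;
// ---- SeaTunnel modification end ----
```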

@@ -37,5 +37,11 @@ hazelcast:
hazelcast.invocation.max.retry.count: 20
hazelcast.tcp.join.port.try.count: 30
hazelcast.logging.type: log4j2
hazelcast.operation.generic.thread.count: 50
hazelcast.operation.generic.thread.count: 100
Member left a comment:

Why change the default value of the thread count? Threads in the operation thread pool are never released, so if there is no need, don't increase it.

@@ -450,6 +463,8 @@ private boolean prepareRestorePipeline() {
reset();
jobMaster.getCheckpointManager().reportedPipelineRunning(pipelineId, false);
jobMaster.getPhysicalPlan().addPipelineEndCallback(this);
log.info("Wait {}s and then restore the pipeline ", pipelineRestoreIntervalSeconds);
Member left a comment:

Please add the job id and pipeline id to this log message.
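
A sketch of what the enriched log line could look like; `jobMaster.getJobId()` is an assumed accessor and may differ from the real API:

```java
// Hypothetical form of the requested log line; getJobId() is an assumption.
log.info(
        "Job {} pipeline {}: wait {}s and then restore the pipeline",
        jobMaster.getJobId(),
        pipelineId,
        pipelineRestoreIntervalSeconds);
```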

Thread.sleep(pipelineRestoreIntervalSeconds);
Member left a comment:

Suggested change
Thread.sleep(pipelineRestoreIntervalSeconds);
Thread.sleep(pipelineRestoreIntervalSeconds * 1000);
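
The underlying bug is that `Thread.sleep` takes milliseconds while `pipelineRestoreIntervalSeconds` is configured in seconds, so the original line slept for 20 ms where 20 s was intended. Besides multiplying by 1000 as suggested, an equivalent sketch (not necessarily what the PR ended up with) makes the unit explicit with `java.util.concurrent.TimeUnit`:

```java
import java.util.concurrent.TimeUnit;

// TimeUnit performs the seconds-to-milliseconds conversion internally and
// makes the intended unit explicit at the call site.
TimeUnit.SECONDS.sleep(pipelineRestoreIntervalSeconds);
```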
