Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow running Driver threads with SCHED_BATCH policy on Linux #23053

Merged
merged 2 commits into from
Jun 27, 2024

Conversation

arhimondr
Copy link
Member

Description

Use SCHED_BATCH scheduling policy for Driver thread pool

Motivation and Context

With blocking IO used by connectors (such as Hive connector) often it is necessary to set the number of threads to the number higher than the number of available cores.

This is needed to avoid slowdowns for IO heavy workload.

However for CPU intensive queries it may create unnecessary thread contention and may cause stability problems when communication threads are delayed.

Threads scheduled with SCHED_BATCH policy are run with slightly lower priority giving a green light for communication threads. These can run for longer resulting in less cache flushes if only other batch threads are waiting for execution.

More information about different scheduling policies in Linux can be found here: https://man7.org/linux/man-pages/man7/sched.7.html

Impact

Improves efficiency and stability for certain cluster configurations

Test Plan

  • Set driver.threads-batch-scheduling-enabled=true
  • Find a Driver thread
  • Run chrt -p <driver thread id>

Result:

chrt -p 3238430
pid 3238430's current scheduling policy: SCHED_BATCH
pid 3238430's current scheduling priority: 0

Contributor checklist

  • Please make sure your submission complies with our development, formatting, commit message, and attribution guidelines.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== NO RELEASE NOTE ==

@arhimondr arhimondr requested a review from a team as a code owner June 21, 2024 21:05
@arhimondr arhimondr requested a review from xiaoxmeng June 21, 2024 21:05
Copy link
Contributor

@xiaoxmeng xiaoxmeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arhimondr thanks % minors.

presto-native-execution/presto_cpp/main/PrestoServer.h Outdated Show resolved Hide resolved
@@ -401,8 +421,8 @@ void PrestoServer::run() {
destination,
queue,
pool,
driverExecutor_.get(),
exchangeHttpExecutor_.get(),
exchangeHttpCpuExecutor_.get(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we see improvement with a dedicated cpu executor for this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a Driver executor is busy doing computation it may result in increase in Exchange request latency, slowing down data exchange. The idea is that having a separate executor should decouple Exchange communication from execution potentially improving throughput.

if (systemConfig->driverThreadsBatchSchedulingEnabled()) {
#ifdef __linux__
threadFactory = std::make_shared<BatchThreadFactory>("Driver");
#else
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we log a warning instead of fail the server start if the non-linux platform doesn't this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The property is false by default and shouldn't be set on systems other than Linux. Failing loud when somebody is trying to set it on systems other than Linux seems to less error prone.

presto-native-execution/presto_cpp/main/PrestoServer.cpp Outdated Show resolved Hide resolved
presto-native-execution/presto_cpp/main/PrestoServer.cpp Outdated Show resolved Hide resolved
presto-native-execution/presto_cpp/main/PrestoServer.cpp Outdated Show resolved Hide resolved
return folly::NamedThreadFactory::newThread(
[func_2 = std::move(func)]() mutable {
sched_param param;
param.sched_priority = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have we tuned on this? Thanks!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only supported value by SCHED_BATCH. It fails when I set it to any different value.

@arhimondr
Copy link
Member Author

Thanks for the review @xiaoxmeng , updated

Copy link
Contributor

@xiaoxmeng xiaoxmeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arhimondr thanks!

@xiaoxmeng xiaoxmeng merged commit dcb4ed5 into prestodb:master Jun 27, 2024
58 of 59 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants