Allow running Driver threads with SCHED_BATCH policy on Linux #23053
@arhimondr thanks % minors.
@@ -401,8 +421,8 @@ void PrestoServer::run() {
destination,
queue,
pool,
driverExecutor_.get(),
exchangeHttpExecutor_.get(),
exchangeHttpCpuExecutor_.get(),
Do we see improvement with a dedicated cpu executor for this?
When the Driver executor is busy with computation, Exchange request latency may increase, slowing down data exchange. The idea is that a separate executor decouples Exchange communication from execution, potentially improving throughput.
if (systemConfig->driverThreadsBatchSchedulingEnabled()) {
#ifdef __linux__
threadFactory = std::make_shared<BatchThreadFactory>("Driver");
#else
Can we log a warning instead of failing the server start if a non-Linux platform doesn't support this?
The property is false by default and shouldn't be set on systems other than Linux. Failing loudly when somebody tries to set it on a non-Linux system seems less error prone.
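The fail-loud behavior described above could look roughly like the following minimal sketch. The function name `resolveBatchScheduling` and the error message are hypothetical and are not taken from the PR; only the property name and the `__linux__` guard come from the discussion.

```cpp
#include <stdexcept>

// Hypothetical stand-in for the server's startup check (name is illustrative).
// Returns true when Driver threads should be created with SCHED_BATCH.
inline bool resolveBatchScheduling(bool enabledInConfig) {
  if (!enabledInConfig) {
    // Default configuration: nothing to do on any platform.
    return false;
  }
#ifdef __linux__
  return true;
#else
  // Fail loudly instead of silently ignoring an unsupported setting.
  throw std::runtime_error(
      "driver.threads-batch-scheduling-enabled is only supported on Linux");
#endif
}
```

With this shape, a misconfigured non-Linux deployment stops at startup rather than running with a setting that has no effect.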
return folly::NamedThreadFactory::newThread(
[func_2 = std::move(func)]() mutable {
sched_param param;
param.sched_priority = 0;
Have we tuned this value? Thanks!
This is the only value supported by SCHED_BATCH. It fails when I set it to any other value.
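As a rough, Linux-only illustration of the point above, the sketch below switches a thread to `SCHED_BATCH` using the POSIX `pthread_setschedparam` call with `sched_priority = 0`. It uses `std::thread` rather than folly, and the helper names (`setCurrentThreadBatchPolicy`, `currentThreadPolicy`, `startBatchThread`) are hypothetical; this is not the PR's `BatchThreadFactory`.

```cpp
#include <pthread.h>
#include <sched.h>

#include <functional>
#include <thread>
#include <utility>

// Switches the calling thread to SCHED_BATCH (Linux-only policy).
// Returns 0 on success or an errno-style error code on failure.
// Normal (non-realtime) policies require sched_priority == 0.
inline int setCurrentThreadBatchPolicy(int priority = 0) {
  sched_param param{};
  param.sched_priority = priority;
  return pthread_setschedparam(pthread_self(), SCHED_BATCH, &param);
}

// Returns the calling thread's scheduling policy, or -1 if the query fails.
inline int currentThreadPolicy() {
  int policy = -1;
  sched_param param{};
  if (pthread_getschedparam(pthread_self(), &policy, &param) != 0) {
    return -1;
  }
  return policy;
}

// Hypothetical helper: starts a thread that first tries to switch itself to
// SCHED_BATCH, then runs func, passing it the result code of that attempt.
inline std::thread startBatchThread(std::function<void(int)> func) {
  return std::thread([f = std::move(func)]() mutable {
    const int rc = setCurrentThreadBatchPolicy();
    f(rc);
  });
}
```

A caller would typically check the result code passed to `func` and log or abort if it is nonzero, since some sandboxed environments may reject the policy change.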
To avoid competing with Driver tasks
Thanks for the review @xiaoxmeng, updated
@arhimondr thanks!
Description
Use SCHED_BATCH scheduling policy for Driver thread pool
Motivation and Context
Because connectors (such as the Hive connector) use blocking IO, it is often necessary to set the number of Driver threads higher than the number of available cores.
This is needed to avoid slowdowns for IO-heavy workloads.
However, for CPU-intensive queries it may create unnecessary thread contention and may cause stability problems when communication threads are delayed.
Threads scheduled with the SCHED_BATCH policy run at a slightly lower priority, giving communication threads the green light. Batch threads can also run for longer, resulting in fewer cache flushes when only other batch threads are waiting for execution.
More information about different scheduling policies in Linux can be found here: https://man7.org/linux/man-pages/man7/sched.7.html
Impact
Improves efficiency and stability for certain cluster configurations
Test Plan
Set driver.threads-batch-scheduling-enabled=true, then check the scheduling policy of a Driver thread:
chrt -p <driver thread id>
Result: