[KYUUBI #6368] Flink engine supports user impersonation #6383
base: master
Conversation
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
##           master    #6383      +/-  ##
============================================
- Coverage    58.37%    0.00%   -58.38%
============================================
  Files          656      675       +19
  Lines        40267    41674     +1407
  Branches      5498     5689      +191
============================================
- Hits         23507        0    -23507
- Misses       14259    41674    +27415
+ Partials      2501        0     -2501

☔ View full report in Codecov by Sentry.
kyuubi-common/src/main/scala/org/apache/kyuubi/config/KyuubiConf.scala
@@ -68,12 +68,10 @@ class HadoopFsDelegationTokenProvider extends HadoopDelegationTokenProvider with
    }.toMap

    try {
      // Renewer is not needed. But setting a renewer can avoid potential NPE.
      val renewer = UserGroupInformation.getCurrentUser.getUserName
this change is worth a dedicated PR.
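For context, the pattern in the diff above (passing a renewer when collecting filesystem delegation tokens) looks roughly like the following sketch; the `hadoopConf`, `fs`, and `creds` names are illustrative, not the actual Kyuubi code:

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.security.{Credentials, UserGroupInformation}

// Sketch only: obtain HDFS delegation tokens for the current user.
// Some token implementations dereference the renewer string, so passing
// the current user's name (rather than null) avoids a potential NPE.
val hadoopConf = new Configuration()
val fs = FileSystem.get(hadoopConf)
val renewer = UserGroupInformation.getCurrentUser.getUserName
val creds = new Credentials()
fs.addDelegationTokens(renewer, creds) // no-op on an insecure cluster
```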
@@ -58,14 +61,30 @@ class FlinkProcessBuilder(
    // flink.execution.target are required in Kyuubi conf currently
    val executionTarget: Option[String] = conf.getOption("flink.execution.target")

    private lazy val proxyUserEnable: Boolean = {
      doAsEnabled && conf.get(ENGINE_FLINK_DOAS_ENABLED) &&
        conf.getOption(s"$FLINK_CONF_PREFIX.$FLINK_SECURITY_KEYTAB_KEY").isEmpty &&
In Spark, when doAs is enabled, we actually have a constraint to ensure that the principal is always the session user.
The implementation of the Flink proxy user is different from Spark's built-in `--proxy-user`. It relies on `HADOOP_PROXY_USER` and has some limitations; it is difficult for us to control the behavior of the client well.
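For readers unfamiliar with this mechanism: Hadoop's `UserGroupInformation` honors the `HADOOP_PROXY_USER` environment variable, so impersonation can be achieved by exporting it in the environment of the spawned Flink client process. A minimal sketch, with illustrative values, assuming the server's Kerberos identity is authorized to impersonate the session user via the cluster's `hadoop.proxyuser.*` settings:

```scala
// Sketch: launch the flink submit process with HADOOP_PROXY_USER set,
// so Hadoop clients inside the Flink CLI act on behalf of sessionUser.
val sessionUser = "alice" // illustrative session user
val pb = new ProcessBuilder("flink", "run-application", "-t", "yarn-application")
pb.environment().put("HADOOP_PROXY_USER", sessionUser)
// val proc = pb.start() // the child's Hadoop clients now proxy as sessionUser
```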
we should unify the concept on the Kyuubi layer as much as possible.
kyuubi-server/src/main/scala/org/apache/kyuubi/engine/flink/FlinkProcessBuilder.scala
    private def generateTokenFile(): Option[(String, String)] = {
      // We disabled the `hadoopfs` token service, which may cause the yarn client to miss the hdfs token.
      // So we generate a hadoop token file to pass kyuubi engine tokens to the submit process.
      // TODO: Remove this after FLINK-35525; delegation tokens will be passed by the `kyuubi` provider
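The token-file approach discussed here presumably writes the engine's credentials into a Hadoop token storage file and points the child process at it. A hedged sketch of that mechanism, with illustrative paths (not the actual Kyuubi implementation):

```scala
import java.io.File
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.security.UserGroupInformation

// Sketch: dump the current user's credentials to a token file that the
// submit process can read.
val hadoopConf = new Configuration()
val creds = UserGroupInformation.getCurrentUser.getCredentials
val tokenFile = new File("/tmp/kyuubi-engine.token") // illustrative path
creds.writeTokenStorageFile(new Path(tokenFile.toURI), hadoopConf)
// Hadoop clients in the child process pick the tokens up when the
// HADOOP_TOKEN_FILE_LOCATION environment variable points at this file.
```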
I'm hesitant to mix the two approaches up; maybe we need a switch.
FLINK-35525 has not yet been introduced in a stable version; the hadoop token file way is currently the only effective one.
Flink 1.20 is on the way.
We can simply make the decision based on the Flink version, but considering vendors are likely to backport patches to their internal distributions, an explicit switch is preferred.
> an explicit switch is preferred.
Do you mean add a configuration to control this behavior?
Yes, a new configuration. Given the current approach is hacky and will be eventually removed, we can mark the configuration as internal.
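Such an internal switch could follow the existing `KyuubiConf` builder pattern; the entry name and version below are hypothetical, not the ones actually merged:

```scala
// Hypothetical config entry, modeled on KyuubiConf conventions.
// Marked internal because the token-file approach is an interim
// workaround to be removed once FLINK-35525 is usable.
val ENGINE_FLINK_TOKEN_FILE_ENABLED: ConfigEntry[Boolean] =
  buildConf("kyuubi.engine.flink.tokenFile.enabled")
    .internal
    .doc("Whether to pass engine delegation tokens to the submit process " +
      "via a generated hadoop token file.")
    .version("1.10.0")
    .booleanConf
    .createWithDefault(true)
```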
kyuubi-server/src/main/scala/org/apache/kyuubi/engine/flink/FlinkProcessBuilder.scala
...c/main/java/org/apache/kyuubi/engine/flink/security/token/KyuubiDelegationTokenProvider.java
<plugin>
  <groupId>net.alchim31.maven</groupId>
  <artifactId>scala-maven-plugin</artifactId>
unnecessary
val ENGINE_FLINK_DOAS_ENABLED: ConfigEntry[Boolean] =
  buildConf("kyuubi.engine.flink.doAs.enabled")
    .doc("Whether to enable using hadoop proxy user to run flink engine. Only takes effect" +
      s" in kerberos environment and when `${ENGINE_DO_AS_ENABLED.key}` is set to `true`.")
This configuration should be independent of `kyuubi.engine.doAs.enabled`, and we should fail the Flink engine bootstrap when the user enables the configuration in a non-Kerberized environment.
The docs might be: "When enabled, the session user is used as the proxy user to launch the Flink engine, otherwise, the server user. Note, due to the limitation of Apache Flink, it can only be enabled on Kerberized environment."
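The suggested fail-fast behavior could look like the following check during engine bootstrap; the exception type and exact placement are illustrative, not prescribed by this thread:

```scala
import org.apache.hadoop.security.UserGroupInformation

// Sketch: reject doAs on non-Kerberized clusters instead of silently ignoring it.
if (conf.get(ENGINE_FLINK_DOAS_ENABLED) && !UserGroupInformation.isSecurityEnabled) {
  throw new IllegalStateException(
    s"${ENGINE_FLINK_DOAS_ENABLED.key} requires a Kerberized environment " +
      "due to the limitation of Apache Flink")
}
```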
🔍 Description
Issue References 🔗
This pull request fixes #6368
Describe Your Solution 🔧
Support impersonation mode for the Flink SQL engine.
Types of changes 🔖
Test Plan 🧪
Behavior Without This Pull Request ⚰️
Behavior With This Pull Request 🎉
Tested in the hadoop-testing environment.
Connection:
SQL:
Result:
Launch engine command:
Launch engine log:
JobManager job:
TaskManager log:
Related Unit Tests
Checklist 📝