Adding a timeout to IFunctionProvider.GetFunctionMetadataAsync #10249

Open
wants to merge 4 commits into
base: dev

@kshyju kshyju commented Jun 25, 2024

resolves #10219

The fix wraps the GetFunctionMetadataAsync call in a timeout task: if a provider's GetFunctionMetadataAsync does not return within the timeout period, we log a message and return an empty array for that provider.
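
The wrapping described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the helper name `GetOrEmptyAsync`, the use of `string` as a stand-in for `FunctionMetadata`, and the console logging are all assumptions made to keep the example self-contained.

```csharp
using System;
using System.Collections.Immutable;
using System.Threading.Tasks;

public static class MetadataTimeoutSketch
{
    // Hypothetical helper: returns the provider's result, or an empty array
    // when the provider fails or does not complete within the timeout.
    public static async Task<ImmutableArray<string>> GetOrEmptyAsync(
        Func<Task<ImmutableArray<string>>> getMetadata, TimeSpan timeout)
    {
        Task<ImmutableArray<string>> providerTask = getMetadata();
        Task delayTask = Task.Delay(timeout);

        // WhenAny completes as soon as either task finishes.
        Task completed = await Task.WhenAny(providerTask, delayTask);

        if (completed == providerTask && providerTask.Status == TaskStatus.RanToCompletion)
        {
            return providerTask.Result;
        }

        if (providerTask.IsFaulted)
        {
            // Reading Exception marks the failure as observed.
            Console.WriteLine($"Provider failed: {providerTask.Exception?.Flatten().Message}");
        }
        else
        {
            Console.WriteLine("Provider timed out or was canceled.");
        }

        return ImmutableArray<string>.Empty;
    }

    public static async Task Main()
    {
        // A provider that completes immediately.
        var fast = await GetOrEmptyAsync(
            () => Task.FromResult(ImmutableArray.Create("FunctionA")),
            TimeSpan.FromSeconds(1));

        // A provider that never completes, simulating the deadlock scenario.
        var never = new TaskCompletionSource<ImmutableArray<string>>();
        var stuck = await GetOrEmptyAsync(() => never.Task, TimeSpan.FromMilliseconds(100));

        Console.WriteLine($"fast: {fast.Length}, stuck: {stuck.Length}");
    }
}
```

Note that in this sketch the provider task keeps running after a timeout; it is only abandoned, not canceled.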

IMPORTANT: Currently, changes must be backported to the in-proc branch to be included in Core Tools and non-Flex deployments.

  • Backporting to the in-proc branch is not required
    • Otherwise: Link to backporting PR : Will follow.
  • My changes do not require documentation changes
    • Otherwise: Documentation issue linked to PR
  • My changes should not be added to the release notes for the next release
    • Otherwise: I've added my notes to release_notes.md
  • My changes do not need to be backported to a previous version
    • Otherwise: Backport tracked by issue/PR #issue_or_pr - will follow
  • My changes do not require diagnostic events changes
    • Otherwise: I have added/updated all related diagnostic events and their documentation (Documentation issue linked to PR)
  • I have added all required tests (Unit tests, E2E tests)

…t if a provider does not return, it will not cause a deadlock state.
@kshyju kshyju requested a review from a team as a code owner June 25, 2024 23:14
@kshyju kshyju changed the title from "Shkr/gh 10219 deadlock" to "Adding a timeout to IFunctionProvider.GetFunctionMetadataAsync" Jun 25, 2024
…h can be set to a different value than default, from the tests.
@kshyju kshyju requested a review from jviau June 27, 2024 16:56
@@ -17,3 +17,4 @@
- Ordered invocations are now the default (#10201)
- Skip worker description if none of the profile conditions are met (#9932)
- Fixed incorrect function count in the log message.(#10220)
- Adding a timeout to `GetFunctionMetadataAsync` to prevent deadlocks (#10219)

Can we update the note here so it's more end-user friendly (i.e., doesn't require understanding of the codebase)? Something along the lines of "Adding a timeout when retrieving function metadata from metadata providers".

if (getFunctionMetadataFromProviderTask.IsFaulted)
{
// Task completed but with an error
_logger.LogWarning($"Failure in retrieving metadata from '{functionProvider.GetType().FullName}': {getFunctionMetadataFromProviderTask.Exception?.Flatten().ToString()}");

Pass the values as arguments into your log call:

Example:

_logger.LogWarning("Failure in retrieving metadata from '{typeName}': {exception}", functionProvider.GetType().FullName, getFunctionMetadataFromProviderTask.Exception?.Flatten());

Also, should this be an error instead?

}
}
// Timeout case. getFunctionMetadataFromProviderTask was not the one that completed
_logger.LogWarning($"Timeout or failure in retrieving metadata from '{functionProvider.GetType().FullName}'.");

Same comment about logging arguments.

Also, should this be an error?

{
// Task completed but with an error
_logger.LogWarning($"Failure in retrieving metadata from '{functionProvider.GetType().FullName}': {getFunctionMetadataFromProviderTask.Exception?.Flatten().ToString()}");
return ImmutableArray<FunctionMetadata>.Empty;

Was the behavior difference considered here? Prior to this change, this would throw an exception that would bubble up in the metadata loading path (failing initialization). This will now log and move on, initializing with an empty list of functions.
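
To make the difference concrete, here is a hedged sketch of one way to time out on a hung provider while still letting provider exceptions bubble up as before. It is not what the PR does; the helper name `GetOrEmptyOnTimeoutAsync` and the `string` stand-in for `FunctionMetadata` are assumptions for illustration only.

```csharp
using System;
using System.Collections.Immutable;
using System.Threading.Tasks;

public static class FailFastTimeoutSketch
{
    // Hypothetical helper: an empty array is returned only on timeout.
    // A provider failure is rethrown to the caller.
    public static async Task<ImmutableArray<string>> GetOrEmptyOnTimeoutAsync(
        Func<Task<ImmutableArray<string>>> getMetadata, TimeSpan timeout)
    {
        Task<ImmutableArray<string>> providerTask = getMetadata();
        Task completed = await Task.WhenAny(providerTask, Task.Delay(timeout));

        if (completed == providerTask)
        {
            // Awaiting an already-completed task returns its result
            // or rethrows the exception it faulted with.
            return await providerTask;
        }

        Console.WriteLine("Provider timed out; continuing with no functions from it.");
        return ImmutableArray<string>.Empty;
    }

    public static async Task Main()
    {
        try
        {
            await GetOrEmptyOnTimeoutAsync(
                () => Task.FromException<ImmutableArray<string>>(
                    new InvalidOperationException("provider failed")),
                TimeSpan.FromSeconds(1));
        }
        catch (InvalidOperationException ex)
        {
            // The failure reaches the caller instead of being swallowed.
            Console.WriteLine($"Caught: {ex.Message}");
        }
    }
}
```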

var getFunctionMetadataFromProviderTask = functionProvider.GetFunctionMetadataAsync();
var delayTask = Task.Delay(TimeSpan.FromSeconds(MetadataProviderTimeoutInSeconds));

var completedTask = Task.WhenAny(getFunctionMetadataFromProviderTask, delayTask).ContinueWith(t =>

It's a bit painful that this lives in WebJobs.Script (and that it targets .NET Standard). WaitAsync would be a cleaner approach.
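
For comparison, a minimal sketch of the `WaitAsync` approach mentioned here, assuming a target of .NET 6 or later where `Task<TResult>.WaitAsync(TimeSpan)` is available. The helper name `GetOrEmptyAsync` and the `string[]` stand-in for the metadata array are hypothetical.

```csharp
using System;
using System.Threading.Tasks;

public static class WaitAsyncSketch
{
    // Hypothetical helper: returns the provider's result, or an empty array on timeout.
    public static async Task<string[]> GetOrEmptyAsync(Task<string[]> providerTask, TimeSpan timeout)
    {
        try
        {
            // WaitAsync throws TimeoutException if the task has not completed in time.
            return await providerTask.WaitAsync(timeout);
        }
        catch (TimeoutException)
        {
            // Caveat: a TimeoutException thrown by the provider itself also lands here.
            Console.WriteLine("Provider timed out.");
            return Array.Empty<string>();
        }
    }

    public static async Task Main()
    {
        // A provider task that never completes.
        var never = new TaskCompletionSource<string[]>();
        string[] result = await GetOrEmptyAsync(never.Task, TimeSpan.FromMilliseconds(100));
        Console.WriteLine($"Entries returned: {result.Length}");
    }
}
```

Compared with the `Task.WhenAny` plus `Task.Delay` pattern, this avoids managing a separate delay task and a continuation.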

Successfully merging this pull request may close these issues.

Deadlock in GetFunctionMetadataAsync when IFunctionMetadataProvider.GetFunctionMetadataAsync does not return
3 participants