Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for xDS based EndpointGroup #5450

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

jrhee17
Copy link
Contributor

@jrhee17 jrhee17 commented Feb 7, 2024

Motivation:

Explain why you're making this change and what problem you're trying to solve.

Modifications:

  • List the modifications you've made in detail.

Result:

  • Closes #. (If this resolves the issue.)
  • Describe the consequences that a user will face after this PR is merged.

Copy link

codecov bot commented Feb 13, 2024

Codecov Report

Attention: Patch coverage is 84.50185% with 168 lines in your changes are missing coverage. Please review.

Project coverage is 74.18%. Comparing base (b8eb810) to head (bd4968f).
Report is 203 commits behind head on main.

❗ Current head bd4968f differs from pull request most recent head 30dfecb. Consider uploading reports for the commit 30dfecb to get more accurate results

Files Patch % Lines
...p/armeria/xds/client/endpoint/XdsEndpointUtil.java 63.09% 17 Missing and 14 partials ⚠️
...rp/armeria/xds/client/endpoint/ClusterManager.java 70.40% 22 Missing and 7 partials ⚠️
...ria/xds/client/endpoint/ZoneAwareLoadBalancer.java 72.82% 14 Missing and 11 partials ⚠️
...rmeria/xds/client/endpoint/SubsetLoadBalancer.java 86.27% 13 Missing and 8 partials ⚠️
...a/xds/client/endpoint/ZoneAwareLbStateFactory.java 92.77% 7 Missing and 6 partials ⚠️
.../xds/client/endpoint/SubsetPrioritySetBuilder.java 73.68% 8 Missing and 2 partials ⚠️
...corp/armeria/xds/client/endpoint/MetadataUtil.java 85.71% 4 Missing and 5 partials ⚠️
.../linecorp/armeria/xds/client/endpoint/HostSet.java 86.20% 6 Missing and 2 partials ⚠️
...corp/armeria/xds/client/endpoint/EndpointUtil.java 88.37% 2 Missing and 3 partials ⚠️
...t/endpoint/WeightedRandomDistributionSelector.java 91.83% 1 Missing and 3 partials ⚠️
... and 5 more
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #5450      +/-   ##
============================================
+ Coverage     73.95%   74.18%   +0.22%     
- Complexity    20115    20996     +881     
============================================
  Files          1730     1821      +91     
  Lines         74161    77501    +3340     
  Branches       9465     9901     +436     
============================================
+ Hits          54847    57492    +2645     
- Misses        14837    15342     +505     
- Partials       4477     4667     +190     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@github-actions github-actions bot requested review from minwoox and trustin March 4, 2024 15:29
@jrhee17 jrhee17 force-pushed the poc/xds-endpoints branch 2 times, most recently from a320ce5 to 72788df Compare March 6, 2024 13:56
@jrhee17 jrhee17 changed the title Modify XdsEndpointGroup to be a CompositeEndpointGroup Add support for xDS based EndpointGroup Mar 7, 2024
@jrhee17 jrhee17 force-pushed the poc/xds-endpoints branch 2 times, most recently from 43f442b to 66cde21 Compare March 13, 2024 01:52
minwoox pushed a commit that referenced this pull request Mar 22, 2024
…5502)

Motivation:

More classes will be added related to client endpoint, so I propose that we move `XdsEndpointGroup` from the package `com.linecorp.armeria.xds` to `com.linecorp.armeria.xds.client.endpoint`.

Although this PR is mostly refactoring, a new API `EndpointGroup#of(ClusterSnapshot)` is added since `ConfigSourceClient` uses `GrpcService` to decide on the endpoint which will be used.

POC: #5450

Modifications:

- Move `XdsEndpointGroup`, `XdsConstants` to `com.linecorp.armeria.xds.client.endpoint`
- Introduce a new API `EndpointGroup#of(ClusterSnapshot)`
- Introduce `XdsEndpointUtil` and move `client.endpoint` related utility methods from `XdsConverterUtil`

Result:

- Preparation for adding new features to `XdsEndpointGroup` is done

<!--
Visit this URL to learn more about how to write a pull request description:
https://armeria.dev/community/developer-guide#how-to-write-pull-request-description
-->
minwoox pushed a commit that referenced this pull request Mar 22, 2024
…5501)

Motivation:

`EdfLoadBalancer` is a basic load balancer which distributes entries based on weight using [Earliest Deadline First Scheduling](https://en.wikipedia.org/wiki/Earliest_deadline_first_scheduling).

ref: https://github.com/envoyproxy/envoy/blob/b7818b0df716af47ec22982c5a1cbbace5f2ae15/source/common/upstream/load_balancer_impl.h#L508

This is used not only for request load balancing, but also load balancing based on `Locality`

ref: https://github.com/envoyproxy/envoy/blob/b7818b0df716af47ec22982c5a1cbbace5f2ae15/source/common/upstream/upstream_impl.cc#L625

Within our codebase, it seems like we could use `WeightedRandomDistributionEndpointSelector` easily for this purpose. I propose that we extract a `WeightedRandomDistributionSelector` and make it publicly available.

See the following for a sample use-case in #5450
ref: https://github.com/line/armeria/blob/bd4968fda63089b2c309583b21201dff1a1119bf/xds/src/main/java/com/linecorp/armeria/xds/client/endpoint/HostSet.java#L38-L39

POC: #5450

Modifications:

- Extract `WeightedRandomDistributionSelector` and move it to the `internal` package.

Result:

- `WeightedRandomDistributionSelector` is now ready to be reused.

<!--
Visit this URL to learn more about how to write a pull request description:
https://armeria.dev/community/developer-guide#how-to-write-pull-request-description
-->
jrhee17 added a commit that referenced this pull request Apr 2, 2024
…int pt 3) (#5503)

This PR attempts to support `DNS`, Health Checked clusters. This is done
by converting `ClusterSnapshot` to an `EndpointGroup` via
`XdsEndpointUtil#convertEndpoints`. The rest of the changes are simply
refactoring.

Motivation:

There are three major points of change:

1. `XdsEndpointUtil#convertEndpointGroup` has been introduced which
converts a `ClusterSnapshot` into an `EndpointGroup`.

This method essentially converts a `ClusterSnapshot` into a
`DnsAddressEndpointGroup` or a `HealthCheckedEndpointGroup`. Note that
only basic functionality is supported at this stage.

During this implementation, I found it convenient to refer to a
`LbEndpoint` or `LocalityLbEndpoints` for a particular `Endpoint`. For
this reason, `XdsAttributeKeys`, `XdsAttributeAssigningEndpointGroup`
has been introduced so that xDS resource information can be retrieved
easily. This is useful later as well, in order to fetch the locality,
priority, health, etc. of a particular endpoint.

2. New data structures have been introduced to match that of envoy's.
Some include:
- `LoadBalancer`:
https://github.com/envoyproxy/envoy/blob/d5bfc06f110647127edc5ddc3bd6cfda78f2760b/source/common/upstream/load_balancer_impl.h#L83
- `ClusterManager`:
https://github.com/envoyproxy/envoy/blob/d5bfc06f110647127edc5ddc3bd6cfda78f2760b/source/common/upstream/cluster_manager_impl.h#L245
- `ClusterEntry`:
https://github.com/envoyproxy/envoy/blob/d5bfc06f110647127edc5ddc3bd6cfda78f2760b/source/common/upstream/cluster_manager_impl.h#L578
- `PrioritySet`:
https://github.com/envoyproxy/envoy/blob/d5bfc06f110647127edc5ddc3bd6cfda78f2760b/envoy/upstream/upstream.h#L533
- `SubsetLoadBalancer`:
https://github.com/envoyproxy/envoy/blob/d5bfc06f110647127edc5ddc3bd6cfda78f2760b/source/extensions/load_balancing_policies/subset/subset_lb.h#L135

The nuances are slightly different due to a difference in existing APIs,
difference in threading model and lifecycles. However, an effort has
been made to keep the implementation relatively similarly looking.

Note that the logic inside `SubsetLoadBalancer` is changed minimally to
keep the changeset small.

3. The way updates are propagated has changed for `XdsEndpointGroup`.

`ClusterManager` now receives updates on the `ListenerSnapshot` by
listening to the provided `ListenerRoot`. On each snapshot update, a
`ClusterEntry` is created for each `ClusterSnapshot`. `ClusterEntry`
constructs the `EndpointGroup` based on the provided `ClusterSnapshot`
and endpoint updates (dns/health checked `EndpointGroup`s may update
multiple times).

Whenever 1) A `ClusterEntry#endpointGroup`'s endpoints are updated or 2)
`ListenerSnapshot` is updated, then `ClusterManager#notifyListeners`
calls `XdsEndpointGroup#setEndpoints`. This will eventually trigger
`XdsEndpointSelector#selectNow`. This will subsequently trigger
`ClusterManager#selectNow` and `LoadBalancer#selectNow`.

**Miscellaneous changes**
- `XdsEndpointGroup` implements `EndpointGroup` directly since custom
comparison logic is required for `XdsEndpointGroup#updateState`
- I think we will eventually need to turn `Endpoint` into an interface,
and supported various types of endpoints (e.g. `HealthCheckedEndpoint`,
`XdsEndpoint`, etc..). For now, I propose to use attributes for
simplicity.
- Because `ClusterManager#notifyListeners` is called for any update
(i.e. update to the snapshot, `DnsAddressEndpointGroup` updates, etc..)
there is a good chance that empty endpoints will be set. This isn't
ideal based on the normal use case where users usually wait for initial
endpoints via `EndpointGroup#whenComplete`. I've set
`allowEmptyEndpoints = false` to be the default.

POC: #5450

Modifications:

- Introduce `XdsEndpointUtil#convertEndpointGroup`, which creates an
`EndpointGroup` based on the `ClusterSnapshot`. `DNS`, `HealthCheck`
capabilities are partially supported.
- `LoadBalancer`, `ClusterManager`, `ClusterEntry`, `PrioritySet`,
`SubsetLoadBalancer` are newly introduced.
- `ClusterManager` listens for any updates regarding `Endpoint`s or
`Snapshot`s and notifies `EndpointGroup` if there are any changes. Every
time this occurs, `LoadBalancer#selectNow` is called.

Result:

- `XdsEndpointGroup` now partially supports DNS and health check.

<!--
Visit this URL to learn more about how to write a pull request
description:

https://armeria.dev/community/developer-guide#how-to-write-pull-request-description
-->

---------

Co-authored-by: minux <[email protected]>
@trustin
Copy link
Member

trustin commented Jun 6, 2024

I guess you're going to update this PR again once #5610 is merged, right, @jrhee17?

minwoox pushed a commit that referenced this pull request Jun 10, 2024
Motivation:

This pull request attempts to implement the default [EdfLoadBalancerBase](https://github.com/envoyproxy/envoy/blob/bdff856f461c2261305b8303285e83d6f2cec627/source/common/upstream/load_balancer_impl.h#L508) which is the default load balancer used in envoy. Most load balancer implementations in envoy derive from this base class.

The basic load balancer supports [locality weighted load balancing](https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/upstream/load_balancing/locality_weight) and [priority levels](https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/upstream/load_balancing/priority).

Note that not all functionality (namely zone aware load balancing, etc..) have been implemented for simplicity.
Additionally, [SubsetLoadBalancer](https://github.com/envoyproxy/envoy/blob/bdff856f461c2261305b8303285e83d6f2cec627/source/extensions/load_balancing_policies/subset/subset_lb.h#L135) has not been implemented in this PR for easier reviewing. The previous `SubsetLoadBalancer` will be replaced in the next (and final) pull request.

This pull request mostly focuses on updating the `LoadBalancer` state when a `ClusterSnapshot` is updated.
The flow is as follows:
1. `ClusterEntry#accept` is called, which indicates a `ClusterSnapshot` has been updated
2. A [PriorityStateManager](https://github.com/envoyproxy/envoy/blob/bdff856f461c2261305b8303285e83d6f2cec627/source/common/upstream/upstream_impl.h#L1347) computes endpoints per priority and locality.
  - https://github.com/envoyproxy/envoy/blob/bdff856f461c2261305b8303285e83d6f2cec627/source/common/upstream/upstream_impl.cc#L2075
3. A [PrioritySet](https://github.com/envoyproxy/envoy/blob/bdff856f461c2261305b8303285e83d6f2cec627/envoy/upstream/upstream.h#L512) is created. This `PrioritySet` contains a map of `Integer -> HostSet` where a [HostSet](https://github.com/envoyproxy/envoy/blob/bdff856f461c2261305b8303285e83d6f2cec627/envoy/upstream/upstream.h#L382) contains host information for each health/locality.
  - https://github.com/envoyproxy/envoy/blob/bdff856f461c2261305b8303285e83d6f2cec627/source/common/upstream/upstream_impl.cc#L2099
  - https://github.com/envoyproxy/envoy/blob/bdff856f461c2261305b8303285e83d6f2cec627/source/common/upstream/upstream_impl.cc#L610
4. A `DefaultLbStateFactory` is created which creates a `LbState` for the `DefaultLoadBalancer`. A `LbState` is a convenience object used to avoid potential race conditions. The `LbState` is replaced atomically from the perspective of `DefaultLoadBalancer`. 
  - In terms of `envoy` source, each load balancer registers a callback which listens for `HostSet` updates. Once a `HostSet` is updated, the load balancer state is updated.
  - https://github.com/envoyproxy/envoy/blob/bdff856f461c2261305b8303285e83d6f2cec627/source/common/upstream/load_balancer_impl.cc#L149-L154
  - Essentially, a range from 1~100 is assigned to each priority where healthy endpoints are prioritized.
5. Lastly, the load balancer chooses a `HostSource`, and subsequently a host from the selected `HostSet`
  - https://github.com/envoyproxy/envoy/blob/bdff856f461c2261305b8303285e83d6f2cec627/source/common/upstream/load_balancer_impl.cc#L1173
  - https://github.com/envoyproxy/envoy/blob/bdff856f461c2261305b8303285e83d6f2cec627/source/common/upstream/load_balancer_impl.cc#L891-L892

ref: #5450

Modifications:

- `ClusterEntry` now creates a `PrioritySet` for each `ClusterSnapshot` update
  - In order to create a `PrioritySet`, utility classes `PriorityStateManager`, `PriorityState`, `HostSet`, `UpdateHostsParam` have been created.
  - Utility classes `EndpointUtil`, `EndpointGroupUtil` have been created
- `DefaultLoadBalancer` has been introduced. The functionality of this `LoadBalancer` is equivalent to `EdfLoadBalancer`.
- `DefaultLbStateFactory` has been introduced to facilitate creating a state for `DefaultLoadBalancer`. The `LbState` created by `DefaultLbStateFactory` is intended to be updated atomically from the perspective of `DefaultLoadBalancer`.

Result:

- Priority and locality based load balancing is now available for `XdsEndpointGroup`

<!--
Visit this URL to learn more about how to write a pull request description:
https://armeria.dev/community/developer-guide#how-to-write-pull-request-description
-->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants