
Plan for clustering and streaming #277

Open · tiedotguy opened this issue Nov 11, 2019 · 0 comments

tiedotguy (Collaborator) commented:
I've had several in-person conversations about the direction I'm heading with the HTTP refactoring work, but those aren't useful for people outside Atlassian who aren't aware of the approach being taken. This issue documents the roadmap (not the timeline), to explain my thinking and the dependencies between steps.

Step 1: unify HTTP clients. Until recently, every component of the system made its own HTTP client, which resulted in a lot of copying and pasting and some inconsistency. This is done (#260); a rough sketch of the idea is below.
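To make that concrete, this is roughly what "one place builds the clients" looks like. The names and settings here are illustrative, not the actual API in the repo:

```go
// Sketch only: a single constructor that every component calls,
// instead of each one rolling its own *http.Client.
package httppool

import (
	"net"
	"net/http"
	"time"
)

// NewClient returns an HTTP client with shared settings, so
// transports, timeouts, and keep-alives stay consistent everywhere.
func NewClient(timeout time.Duration) *http.Client {
	transport := &http.Transport{
		DialContext: (&net.Dialer{
			Timeout:   5 * time.Second,
			KeepAlive: 30 * time.Second,
		}).DialContext,
		MaxIdleConns:        50,
		IdleConnTimeout:     90 * time.Second,
		TLSHandshakeTimeout: 5 * time.Second,
	}
	return &http.Client{
		Transport: transport,
		Timeout:   timeout,
	}
}
```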

Step 2: consolidate HTTP logic. This abstracts away lower-level concerns such as retry logic, and turns "make an HTTP request" into "send a message". The first PR is in draft as #272, but it will need another PR to actually consume it across the code base.
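The shape I have in mind is something like the following. The interface and the retry policy here are placeholders, not the API from #272:

```go
// Sketch only: callers hand over a payload; the transport owns the
// HTTP details, including retries.
package transport

import (
	"bytes"
	"context"
	"fmt"
	"net/http"
	"time"
)

// Sender is the "send a message" abstraction.
type Sender interface {
	Send(ctx context.Context, body []byte) error
}

type httpSender struct {
	client  *http.Client
	url     string
	retries int
}

func (s *httpSender) Send(ctx context.Context, body []byte) error {
	var lastErr error
	for attempt := 0; attempt <= s.retries; attempt++ {
		req, err := http.NewRequestWithContext(ctx, http.MethodPost, s.url, bytes.NewReader(body))
		if err != nil {
			return err
		}
		resp, err := s.client.Do(req)
		if err == nil && resp.StatusCode < 300 {
			resp.Body.Close()
			return nil
		}
		if err == nil {
			lastErr = fmt.Errorf("unexpected status %d", resp.StatusCode)
			resp.Body.Close()
		} else {
			lastErr = err
		}
		// Naive linear backoff; a real policy would be configurable.
		time.Sleep(time.Duration(attempt+1) * 100 * time.Millisecond)
	}
	return lastErr
}
```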

Step 3: add clustering. This means receiving a single message in the aggregation layer and splitting it into messages destined for other aggregation hosts, or processing it locally. It sits behind steps 1 and 2 because I don't want to add yet another HTTP client that will inevitably need cleanup.
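Very roughly, the split could look like this. Everything here is hypothetical, including the hashing scheme:

```go
// Sketch only: hash each metric to an aggregation host, keep the
// local ones, and batch the rest per destination host.
package cluster

import "hash/fnv"

type Metric struct {
	Name string
}

// Route partitions incoming metrics into a local slice and per-host
// batches for the rest of the cluster. self is this host's index.
func Route(metrics []Metric, hosts []string, self int) (local []Metric, remote map[string][]Metric) {
	remote = make(map[string][]Metric)
	for _, m := range metrics {
		h := fnv.New32a()
		h.Write([]byte(m.Name))
		idx := int(h.Sum32() % uint32(len(hosts)))
		if idx == self {
			local = append(local, m)
		} else {
			remote[hosts[idx]] = append(remote[hosts[idx]], m)
		}
	}
	return local, remote
}
```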

Step 4: add InfluxDB. This is fairly trivial, but again, I don't want another HTTP client adding to the tech debt.

Step 5: add output streams. Once step 2 is complete and components are sending messages to an HTTP client/transport created by a central system, those messages can be processed in an alternate form. The exact plan isn't decided yet, but it could be something like specifying a URL of kafka://somethingsomething, or specifying a URL of https://somethingsomething and having a roundtripper which submits it to Kafka.
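As a sketch of the second option, the roundtripper could look something like this. Publisher is a stand-in here, not any particular Kafka library's API:

```go
// Sketch only: a custom http.RoundTripper that diverts request
// bodies to a stream instead of the network.
package streamrt

import (
	"io"
	"net/http"
	"strings"
)

// Publisher is a placeholder for a real producer client.
type Publisher interface {
	Publish(topic string, payload []byte) error
}

type kafkaRoundTripper struct {
	pub   Publisher
	topic string
}

// RoundTrip consumes the request body (this flow is always a POST
// with a body), publishes it, and fabricates a 200 response so the
// calling backend is none the wiser.
func (rt *kafkaRoundTripper) RoundTrip(req *http.Request) (*http.Response, error) {
	body, err := io.ReadAll(req.Body)
	if err != nil {
		return nil, err
	}
	req.Body.Close()
	if err := rt.pub.Publish(rt.topic, body); err != nil {
		return nil, err
	}
	return &http.Response{
		StatusCode: http.StatusOK,
		Status:     "200 OK",
		Body:       io.NopCloser(strings.NewReader("")),
		Request:    req,
	}, nil
}
```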

This sidesteps the issue of coming up with an output format, because the format is defined by the caller of the client. If the caller is the HTTP forwarder, then it's effectively pre-aggregated raw metrics; if the caller is the Datadog backend, then it's the Datadog JSON format; and so on.

Step 6: add input streams for raw data. This is still vague and needs some planning, but it's essentially taking the output from the forwarder, sending it to a stream, and having another instance read that stream and aggregate it. There may be some other steps before this, specifically around handling timestamps.
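Purely as a speculative shape, the consuming side might be a loop like this. Both interfaces are hypothetical:

```go
// Sketch only: pump raw pre-aggregated payloads off a stream and
// into a local aggregator.
package ingest

import "context"

type Stream interface {
	// Next blocks until a raw payload (forwarder output) is available.
	Next(ctx context.Context) ([]byte, error)
}

type Aggregator interface {
	// Ingest parses and aggregates one payload; timestamp handling
	// (the open question noted above) would live behind this.
	Ingest(payload []byte) error
}

// Run feeds the stream into the aggregator until the context is done
// or either side errors.
func Run(ctx context.Context, s Stream, a Aggregator) error {
	for {
		payload, err := s.Next(ctx)
		if err != nil {
			return err
		}
		if err := a.Ingest(payload); err != nil {
			return err
		}
	}
}
```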

tiedotguy self-assigned this Nov 11, 2019