Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(iroh-net): Implement websocket protocol upgrade in iroh-relay #2387

Merged
merged 28 commits into from
Jun 28, 2024

Conversation

matheus23
Copy link
Contributor

@matheus23 matheus23 commented Jun 19, 2024

Implements the websocket protocol upgrade in iroh-relay and changes iroh-net clients to connect with websocket instead of iroh derp http.

State of this is:

  • cargo test -p iroh-net runs green
  • websocket on the relay is supported
  • backwards compatibility is tested
  • the client also supports connecting using websockets
    • (but we're forced to use tungstenites connection establishment, no custom MaybeTlsStream.)
    • It decides this via the URL scheme (ws(s):// vs. http(s)://)

TODO:

  • Enumify the client so it's possible to keep using the old relay protocol
  • Cleanup!
  • Some perf TODOs in the server, less copying.
  • Update documentation (e.g. local_addr having another case being None)
  • Add metrics for websocket-accept & derp-accept counts
  • Snapshot tests (using parse_hexdump) for "new" wire format

Description

  • Supports clients connecting to iroh-relay with websockets & running the derp protocol over websocket Binary msgs.
  • Adds support for the relay answering Upgrade: websocket headers instead of Upgrade: iroh derp http appropriately.
  • Adds support for ClientBuilder to dial via websockets, if the relay URL is set with a ws/wss URL scheme.

Breaking Changes

  • Not in the protocol: The old HTTP upgrade & protocol should still be supported. There's tests to ensure this.
  • Client::local_addr will now also return None, if one connected using websockets.
  • iroh_net::relay::http::ClientError: Added a variant WebsocketError for errors when establishing a connection using websockets.

Notes & open questions

Closes #2370

Change checklist

  • Self-review.
  • Documentation updates if relevant.
  • Tests if relevant.
  • All breaking changes documented.

@matheus23 matheus23 changed the title feat: Implement websocket protocol upgrade in iroh-relay feat(iroh-net): Implement websocket protocol upgrade in iroh-relay Jun 19, 2024
@matheus23
Copy link
Contributor Author

FWIW, I'm still hacking on stuff over here. There's some module visibility changes that are probably not necessary. I'll do a pass over the diff at the end - making sure it's minimal.
Just working in the open - that's why the PRs here (and thanks for the comments already - that's why I opened it :) ).

@dignifiedquire dignifiedquire added this to the v0.20.0 milestone Jun 20, 2024
@matheus23 matheus23 self-assigned this Jun 21, 2024
@matheus23 matheus23 marked this pull request as ready for review June 21, 2024 14:59
@matheus23
Copy link
Contributor Author

matheus23 commented Jun 21, 2024

Rebased this on main & flipped this to ready for review.


Seeking Feedback

I'm still not happy with some things in this PR:

  1. We have both Frame::Ping and tungstenite::Message::Ping, so sometimes you ping-pong on the websocket level, sometimes on the "derp" level. That's werid.
    • this is fixable, but would require quite a bit of work. The Ping messages don't match exactly - and we'd need to think about how to handle this in the client's loop (how does it decide to send websocket ping messages instead of derp pings? and similar issues with close frames).
  2. I'm using tokio_tungstenite_wasm::WebSocketStream in the client, in preparation for making this browser-ready. However, this means two things:
    • we lose control over the connection handshake, i.e. we're not using our own MaybeTlsStreamReader etc., which in turn means we can't mock the WebSocketStream in tests the same way, and the HTTP proxy support doesn't work (perhaps the proxy stuff is fixable?)
    • we have an annoying BiLock somewhere inside our SplitSink and SplitStream wrappers, that split the WebSocketStream into parts. Instead, it'd be nice to eventually just separate the sink and stream completely, but that means we'll need to reimplement some parts of the websocket protocol (when to respond to pings & when to create close frames), but also, this is completely irrelevant for the browser case, since there it's not fixable.

Would love feedback on these things.


Also: There's the cargo semver check failure. How do I fix that? Do I bump the version within the PR?

@dignifiedquire
Copy link
Contributor

  1. We have both Frame::Ping and tungstenite::Message::Ping

My understanding is that these are for different reasons, websocket pings are keep alive pings, where as the Frame::Ping is mostly used for testing/measuring

@dignifiedquire
Copy link
Contributor

Also: There's the cargo semver check failure. How do I fix that? Do I bump the version within the PR?

please make it a separate PR

Cargo.lock Outdated Show resolved Hide resolved
Cargo.lock Outdated Show resolved Hide resolved
@matheus23
Copy link
Contributor Author

On what I wrote above about seeking feedback:

  1. Seems like it's fine to keep both.
  2. Yes, we want to do this, but I'll do it in a PR afterwards - bringing back some more testing & proxy support, tracked in Support HTTP proxies in clients connecting to relays over websockets #2418

And it seems like I can just let the semver check sit there on failing. I've documented the breaking changes.

@matheus23 matheus23 requested a review from ramfox June 26, 2024 15:44
Copy link
Contributor

@divagant-martian divagant-martian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have been trying to move away from the DERP naming in general, but I guess coming up with a better name than "custom relaying protocol", which quite correctly is used here to describe what it is, requires some bikeshedding

iroh-net/src/relay/client.rs Outdated Show resolved Hide resolved
iroh-net/src/relay/client.rs Outdated Show resolved Hide resolved
Comment on lines 285 to 315
pub(crate) fn into_ws_message(self) -> std::io::Result<tungstenite::Message> {
Ok(tungstenite::Message::binary(self.into_ws_vec()))
}

/// Serializes this to bytes and wraps it in a websocket binary message for use in wasm.
pub(crate) fn into_wasm_ws_message(self) -> std::io::Result<tokio_tungstenite_wasm::Message> {
Ok(tokio_tungstenite_wasm::Message::binary(self.into_ws_vec()))
}

/// Unwraps any binary messages received via websockets and parses them into `Frame`s.
///
/// Ignores any non-binary websocket messages with a warning.
pub(crate) fn from_ws_message(
msg: tungstenite::Result<tungstenite::Message>,
) -> Option<anyhow::Result<Self>> {
match msg {
Ok(tungstenite::Message::Binary(vec)) => Some(Self::from_ws_vec(vec)),
Ok(msg) => {
tracing::warn!(?msg, "Got websocket message of unsupported type, skipping.");
None
}
Err(e) => Some(Err(e.into())),
}
}

/// Unwraps any binary messages received via websockets (in wasm) and parses them into `Frame`s.
///
/// Ignores any non-binary websocket messages with a warning.
pub(crate) fn from_wasm_ws_message(
msg: tokio_tungstenite_wasm::Result<tokio_tungstenite_wasm::Message>,
) -> Option<anyhow::Result<Self>> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are these four not TryFrom impls?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm used to just not do *From traits except for errors.
I agree this is one of the cases where using TryFrom really is appropriate 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this again - the from_ws_message and from_wasm_ws_message functions actually return an Option<Result<...>>, so you can't implement TryFrom for them. We could write From functions, but IMO that's now getting really far away from how From and TryFrom are used usually.

Perhaps it makes sense to bikeshed the function names?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given they are fallible, maybe call them try_from_ws_message

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meh - ended up moving these helpers out of Frame and just inline their logic into the Stream and Sink implementations.
That also uncovered some weird, unnecessary ? from the into_ws_message call-site. I think the code just belongs closer to the Stream implementation, the logic was quite intertwined anyways (effectively some nested matches, this is now flattened).

Instead, it's only two new functions on Frame remaining: Frame::encode_for_ws_msg and Frame::decode_from_ws_msg.

Feels way better IMHO.

iroh-net/src/relay/codec.rs Outdated Show resolved Hide resolved
iroh-net/src/relay/http/client.rs Show resolved Hide resolved
iroh-net/src/relay/http/server.rs Outdated Show resolved Hide resolved
iroh-net/src/relay/http/server.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@ramfox ramfox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great. Wooo!

My only note, which diva mentioned, is I don't love doubling down on the name "derp", since we have been trying to step away from calling the protocol "derp" since we've technically broken away from it multiple times now 😂

However, I think this is a problem best left for our "wire protocol" review in anticipation of our mythical "one final breaking change" plan. Gotta love that we still ping the /derp endpoint and use 'iroh derp http' in our headers 🤣

So I say merge without worrying about the derp naming, and leave that to a more general "relay protocol" follow up later.

iroh-net/src/relay/codec.rs Show resolved Hide resolved
@divagant-martian divagant-martian added this pull request to the merge queue Jun 28, 2024
Merged via the queue into main with commit 17c654e Jun 28, 2024
24 of 25 checks passed
@divagant-martian divagant-martian deleted the matheus23/relay-websockets branch June 28, 2024 16:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Add websocket HTTP upgrade/framing support to iroh-relay
4 participants