NIP-DC: Direct Connect #2075
|
@chakany pinging you to ask what you think about this refactoring of your original nip, and if you are ok with being credited as an author |
|
NIPs tend to do better when they are specific, not generalistic. Where are you going to apply this protocol? For instance, we don't want the "presence" to be used by both a text-only chat client that doesn't support video and a video call client that doesn't support chats. Additionally, a voice-only client should not use the same protocol as those other two. Clients that only share files need other clients that can interpret file-sharing, etc. Otherwise, people will know they are in the same place, but their clients cannot see or talk to each other, even though they implement this NIP. |
|
Thanks for your feedback. I’m using an earlier iteration of this for the netcode in ngengine. The idea is that every P2P app needs a way for peers to discover each other and coordinate the initial connection. Typically, this is handled by centralized servers or by a DHT, as in the case of Holepunch’s Hyperswarm and other decentralized protocols. This NIP standardizes a way for apps to do this on Nostr. It doesn’t make apps inherently interoperable, but it can be a base layer for other NIPs. I think it is similar to NIP-90 in this regard. I can think of two ways to use this nip:
Maybe the "Standard protocols" section can be rewritten as an "example", so as not to tie this NIP specifically to some undefined use of WebRTC. |
Sure, but the point of this repository is to make things interoperable. The way NIP-90 solves this is by having a separate repo with a separate event kind list and documentation procedure for each type of DVM, fully documented. For this PR, I guess it would mean that each game would have to document its game protocol to allow other clients to code and run the same game from start to finish. Also, keep in mind that NIP-90 is a terrible NIP. There have been multiple efforts to completely rewrite it because right now it is so general that it doesn't really help anyone. None of those efforts are moving forward because nobody agrees on anything, which is the worst thing that can happen to a repo whose sole purpose is to bring implementers together, not push them further apart. |
|
I’ve been experimenting with NIP-90 and I get where you’re coming from, but this NIP works at a much lower level. The signaling layer is independent of the app protocol itself, which means it can be implemented in standard Nostr libraries and be interoperable. The actual data exchange layer is a separate concern that’s beyond the scope of this PR, except for a few general notes on how it might be facilitated. If you pair this signaling NIP with any UDP-based transport, NAT traversal, and a STUN server, you already have everything you need for a P2P connection. WebRTC conveniently handles transport and NAT traversal and is widely available. What’s left is the signaling, which this NIP provides, and STUN, which could also be delivered via relays. This leads to a standard way for developers to build P2P apps on Nostr without reinventing the wheel for every app. I hope this clarifies my point of view, which is about making the signaling part of P2P protocols interoperable, not the apps built on top of them. If you still disagree, that’s fine; I think this is a complex issue and there are many valid perspectives. Anecdotally, I was initially more drawn to Holepunch because their modules make building P2P apps easy. That's one of the things I am working to improve, because I think we can achieve the same on Nostr, with the added benefit of full browser support, using NIPs like this and their corresponding implementations in standard libraries, which is why I’m bundling WebRTC support in nostr4j. An added bonus is that relays and clients will recognize (or ignore) these events as likely P2P signaling, rather than having to handle apps embedding signaling data in DMs or other unrelated or conflicting event kinds. |
I think it is a big assumption that every P2P app needs the same signaling layer. At least, from some early tests I did 2 years ago, only the "presence" message was needed, and it didn't need a public key or expiration. When you say things like "The offer format is protocol specific", it means that if I am coding this, I will get offers all the time in formats I don't know and thus cannot support. That seems like wasted resources. My client should only be downloading the events it can parse in the application layer. So, to me, all of these other message types are application dependent and should probably be using event kinds that only their app downloads. Otherwise, if this is successful and everybody uses it for their game, clients are going to connect to a relay and download thousands of events they cannot parse before the first one they can. For instance, if a client uses this but not for Shouldn't your "disconnect" message link to the respective "connect" one? Otherwise, if a client (or multiple clients) is connected to 3 protocols/games and disconnects from 2, how would the others know which one is being disconnected? Or can I not connect with the same room in 2 clients at the same time? |
Presence is not enough. You need to negotiate a connection with the other peers, as they will have different network conditions:
etc. This NIP implements the WebRTC handshake, which can also update the ICE candidates as they come (using the routes event). Why do you think you need more or less than this? The expiration is used to know when to consider a peer "lost" when the connection drops without a disconnection packet. The public key (I suppose the room public key?) is a convenient ID to group peers that want to connect to each other (see response below), while also giving them a way to prove they are authorized to do so (they must have the room private key).
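A minimal sketch of the expiration rule described above, assuming each presence event carries a Unix expiration timestamp and that peers are identified by their signaling pubkey. The `PeerTable` class and its method names are hypothetical and not taken from the NIP text:

```python
import time
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PeerTable:
    """Tracks the peers seen in one room, keyed by signaling pubkey."""

    # pubkey -> Unix timestamp after which the peer is considered lost
    expires_at: Dict[str, float] = field(default_factory=dict)

    def on_presence(self, pubkey: str, expiration: float) -> None:
        # A fresh presence event replaces any earlier expiration for that peer.
        self.expires_at[pubkey] = expiration

    def on_disconnect(self, pubkey: str) -> None:
        # An explicit disconnect removes the peer immediately.
        self.expires_at.pop(pubkey, None)

    def live_peers(self, now: Optional[float] = None) -> List[str]:
        now = time.time() if now is None else now
        # Peers whose presence expired without a disconnect are dropped here.
        self.expires_at = {k: v for k, v in self.expires_at.items() if v > now}
        return sorted(self.expires_at)
```

A peer that crashes never sends a disconnect, so it simply ages out of `live_peers()` once its last announced expiration has passed.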
You won’t be flooded with irrelevant signaling events, since this is scoped to the specific room you connect to. Also, you shouldn’t need to subscribe to the kind as a whole, because starting the signaling phase already requires the room private key, which is obtained through discovery or sharing (this is outside the scope of this NIP). You only subscribe to the P tags matching the room public key. In the case of a game, the room would typically be shared via some form of matchmaking. An app could provide a clickable link. I suppose NIP-53 could be used for this too.
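As a rough illustration of that scoping, the sketch below builds a NIP-01 `REQ` message whose filter matches only events carrying a `P` tag equal to the room public key. The kind number is a placeholder, since the thread does not state which kind the NIP uses, and `room_subscription` is a hypothetical helper name:

```python
import json

# Placeholder only: the thread does not say which kind number the NIP uses.
SIGNALING_KIND = 29999  # hypothetical value


def room_subscription(sub_id: str, room_pubkey: str, since: int) -> str:
    """Build a NIP-01 REQ message limited to one room's signaling events."""
    filt = {
        "kinds": [SIGNALING_KIND],
        "#P": [room_pubkey],  # tag filter: only events tagged with this room
        "since": since,       # skip signaling older than this Unix timestamp
    }
    return json.dumps(["REQ", sub_id, filt])
```

With a filter like this, a relay would only return events tagged for the given room, so events from other rooms or apps never reach the client.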
Good point. Disconnecting from 3 games simultaneously isn’t possible for the reasons mentioned above, but the current spec would allow two instances of the same app, using the same keypair and connected to the same room, to be disconnected at the same time by a single disconnect packet. To prevent this, I think it would be better to enforce the use of throwaway keypairs for signaling. That way, the two apps cannot eavesdrop on each other’s traffic. EDIT: Thanks for the feedback, I made some changes and added a |
How does this work? Nothing in the spec requires the private key for the room. Anyone can just spam any
Because I coded a thin WebRTC layer some years ago and didn't need a room definition. Users would just post a "presence"-like event attesting that they are online, which already includes the coordinates to connect to them directly. The concept of a "room" is quite weird. Maybe it's necessary for games, but regular P2P nostr or voice calls, for instance, don't actually require any room. I assume you want to use TURN servers for the non-nostr-event payloads, which I wasn't using at the time, and it is not P2P anymore, which means that any of the server-in-the-middle libraries out there can be used. IMO, TURN servers defeat the purpose of any P2P stack. But with them, you will need some parts of these other message types. Still, the Offer/Answer protocol is just a way to describe an infinite number of features to be used by each app. On Nostr, each NIP generally just picks a preferred configuration to avoid forcing clients to support multiple options in the flow. For instance, we only use secp256k1 for signing messages, not any cryptographic curve or algo. We only use AES-GCM for encryptions in NIP-17. We only use ChaCha for private chats. There are no options. Which means there is no need to make a protocol to understand and choose options. It would be great if we could pre-choose options for this NIP, too. Options work great when the same company is designing both sides of the call (WhatsApp only talks to WhatsApp). But it is terrible when you have 100 clients, all coded from scratch, trying to support multiple paths in the exact same way without any shared codebase. At the time, my biggest hope was to make an IPv6-only P2P protocol so that none of these NAT/Firewall issues could get in the way and complicate things. I am not sure if we are there now, but at the time, most Amethyst instances already had access to IPv6. |
The private key requirement is in the encryption section. Of course, anyone can spam a public relay, but without the room key their messages are meaningless to peers in the room.
Publishing everyone's IPs on Nostr and connecting blindly isn’t ideal. Without a room, every peer can try to connect to everyone. A room provides a clean way to scope discovery and ensures that only selected peers can see each other. With the offer-answer handshake you know who you’re talking to BEFORE attempting a direct connection. You also know who is trying to connect to you, which can be used to attempt NAT traversal from the other side. Without the offer-answer handshake you will also be forced to do a more complex handshake after the connection, because you still need to know which packet is from whom.
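The point about knowing the remote peer before connecting can be sketched as a small per-peer state table. This is an illustrative simplification, not the NIP's actual event flow: the `Signaling` class and its states are hypothetical, and it does not handle both sides sending offers at once:

```python
from enum import Enum, auto
from typing import Dict


class State(Enum):
    OFFER_SENT = auto()  # we sent an offer and are waiting for an answer
    READY = auto()       # offer and answer exchanged; connection may start


class Signaling:
    """Per-peer offer/answer bookkeeping, keyed by the peer's pubkey."""

    def __init__(self) -> None:
        self.states: Dict[str, State] = {}

    def send_offer(self, peer: str) -> None:
        self.states[peer] = State.OFFER_SENT

    def on_offer(self, peer: str) -> None:
        # The sender is identified by pubkey before any direct connection.
        # This sketch assumes we answer every offer immediately.
        self.states[peer] = State.READY

    def on_answer(self, peer: str) -> bool:
        # An answer is only accepted from a peer we actually sent an offer to.
        if self.states.get(peer) is not State.OFFER_SENT:
            return False
        self.states[peer] = State.READY
        return True

    def may_connect(self, peer: str) -> bool:
        return self.states.get(peer) is State.READY
```

Because every direct connection attempt is preceded by an entry in this table, incoming traffic can be attributed to a known pubkey instead of being identified after the fact.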
I’d argue there’s always some notion of a room/topic, whether for a group chat, a call, or a game. It’s just the mechanism that ensures only the intended peers are discovered.
TURN is only used as an optional fallback when direct P2P isn’t possible. It’s not a replacement for P2P. The handshake step can also carry other useful info: protocol version, session metadata, acceptance/rejection, etc.
That would be ideal, but IMO not realistic today. Many consumer networks are still IPv4, and even IPv6 devices might sit behind firewalls and NATs. Even in a pure IPv6 world, you’d still need signaling to know who to connect to and how; you’d just skip NAT traversal. |
It doesn't need to be public. There are many crypto schemes that a given app may need to use. The event is just a marker that a client is online. In my case, the presence kind was per NIP. So a Voice Call NIP would use a presence event kind just for itself and would specify that the follow-up would transfer IP via direct NIP-44 encryption between the two Forcing every type of application to use a fixed shared secret scheme via rooms is not ideal, IMO. What if users don't trust each other to share that secret? What if the application is doing a better MLS-based design where the secrets are based on hierarchical trees, such that each subgroup is a separate secret? What if applications need a public IP instead?
You will always know before attempting a direct connection, regardless of which protocol you use.
I don't think that is true. Yes, the application will have to do some handshake. But that will not be "more complex" in any way because each NIP should define a handshake that removes everything that is not needed for that particular application and adds more information that is exclusive to this application flow, without having to bother the other types of apps out there. Some apps will have a very simple handshake, others will have a multi-round one. It's unfair to ask apps that just need a simple handshake to defensively implement all possible handshakes just to "comply" with this NIP. |
|
All I am saying is that if you focus this NIP on the needs of the gaming engine handshake, you can reduce the complexity this abstraction is creating, specify more parts of the flow to streamline implementations, and get to a better level of interoperability between the apps you care about. Then the other WebRTC (or even broader P2P) handshakes can all define their own NIPs with their own handshakes as well. It's a win-win. |
|
If I’m understanding correctly, what you’ve described here is an offer-answer flow, similar to the one described in this nip, but you filter presence by kind instead of I still believe standardizing signaling is possible and beneficial with some iteration of this event set, but if there’s no consensus on that, I think it would at least be worth reserving a kind for app signaling, similar to what NIP-78 does. Anyhow, thank you again for taking the time to share your feedback. |
I didn't say everybody should use that protocol. So much so that I didn't even create a NIP PR for it. What I am saying is that each app type should probably use their own event kinds with its own structures and flows that maximize for their unique needs. Making all apps use just one fixed abstraction layer blocks them from exploring more efficient protocols... that might only work for them, but that's ok. Sometimes these generalistic protocols are necessary. But here it does feel like every app could benefit from the added freedom. Either way, I look forward to seeing this working with the complete flow for the engine. |
Based on #363 - mostly a refactoring with some generalization
This NIP describes how Nostr relays can be used for the signaling required to establish direct peer-to-peer connections between two or more participants.
The first part defines a generic abstraction of the signaling events, and the second part specifies an implementation for WebRTC data channels (i.e., generic binary packets).