NIP-77: Negentropy syncing #1494

fiatjaf · 2024-09-12T15:58:55Z

This is a copy of @hoytech's https://github.com/hoytech/strfry/blob/nextneg/docs/negentropy.md with some small tweaks in the text (not in the protocol) just to have it here somewhere.

https://github.com/nostr-protocol/nips/blob/negentropy/77.md

vitorpamplona · 2024-09-12T16:32:00Z

IMHO, In the long run, sync will have to become a mandatory part of the protocol. I just don't see Nostr taking over everything without robust syncing operations. Yes, we can sync via REQs but that is so bad that I don't actually consider it a viable solution.

That means that the sync protocol must be extremely well defined such that we can actually have 100s of different implementations in all possible languages, with all possible database options, working together. We need to simplify and clarify this NIP to the extreme. It needs to become so good that it is fine if we add it to 01.md

Libraries can help, but the ability to code it from near scratch is a MUST to me.

kehiy · 2024-09-12T17:39:07Z

77.md

+
+```jsonc
+[
+    "NEG-ERR",


These new message types must be added to readme as well.

Let's not do it for now.

jb55 · 2024-09-18T16:40:14Z

sync will have to become a mandatory part of the protocol

I hope so

braydonf · 2024-09-18T22:03:03Z

A great one to add, this will be great for mobile clients and for relay to relay syncing. Lots of potential. Let's go! :)

staab · 2024-09-27T16:09:21Z

It would be great to merge this so that we can start checking supported_nips for support rather than targeting negentropy software versions.

braydonf · 2024-09-27T21:58:46Z

Something that could be helpful for these types of protocol extensions is a consistent machine readable response that indicates it isn't supported. Currently I'm getting several different NOTICE responses (not intended to be machine readable) from relays, such as:

"ERROR: bad msg: negentropy disabled"
"ERROR: negentropy error: negentropy query missing elements"
"could not parse command"
"ERROR: bad msg: invalid message: {'message_type': ['Invalid enum value NEG-OPEN']}
"error: bad message"
"could not parse command"
"bad message type"
"invalid: "value" does not match any of the allowed types
"Command unrecognized"

fiatjaf · 2024-10-27T02:44:01Z

https://github.com/fiatjaf/khatru supports this as of now, https://github.com/nbd-wtf/go-nostr has supported it from the client side for some weeks.

fiatjaf/khatru@1dc12e5

You were interested in trying it, @gzuuus @barrydeen, so once you did let me know if it's broken.

fiatjaf · 2024-10-27T02:50:29Z

@hoytech what do you think of getting rid of the CLOSED and TOO_MANY_RESULTS error messages and reusing the other format that is already common on Nostr of an arbitrary string with a machine-readable prefix, like closed: you took too long to respond!, blocked: this query is too big (just the part before the colon matters), then we could also use other common responses, like auth-required: _ for when a relay only accepts negentropying in these conditions, for example, and a generic error: _, for internal errors and other stuff.

hoytech · 2024-10-29T18:50:17Z

@fiatjaf - On the error message format: Good idea, this would be better. I will push a change shortly.

bezysoftware · 2024-11-01T14:09:48Z

As someone who just finished implementing the base negentropy protocol in C# (https://github.com/bezysoftware/negentropy.net) I have a few notes:

I think there is no point in attaching the base spec in this NIP (appendix)
- Most relay implementations will use a dedicated negentropy package for their language
- Devs wanting to implement it will have to look at @hoytech's repo anyway, as it's more descriptive, has more details and, more importantly, cross-implementation tests
- We don't have sha256 or others described in NIP-01, it's just a basic assumption it's something standard. I think negentropy should be treated the same
I would appreciate more platform-independent tests (e.g. written in Gherkin)
- The test suite in @hoytech's repo is great, but the items size is too huge to debug
- But those should go to that repo, not here.
I agree with streamlining the error messages

alltheseas · 2024-11-04T18:49:09Z

Added placeholder negentropy tracker nostrability/nostrability#125

vitorpamplona · 2024-11-05T14:09:18Z

I am coding in Kotlin at the moment and I am not going to lie. I hate this VarInt and the Query encoding part. Yeah, it's very optimized, but there is so much byte crunching and type overflows that we are going to have bugs everywhere on this. I wish we just made it a regular JSON object when transmitting messages between nodes to keep the Nostr theme of "not really the most optimal, but really easy to code without bugs"

bezysoftware · 2024-11-05T16:04:58Z

I am coding in Kotlin at the moment and I am not going to lie. I hate this VarInt and the Query encoding part. Yeah, it's very optimized, but there is so much byte crunching and type overflows that we are going to have bugs everywhere on this. I wish we just made it a regular JSON object when transmitting messages between nodes to keep the Nostr theme of "not really the most optimal, but really easy to code without bugs"

That was my thought too, all the VarInt and Bound encoding (and especially relative timestamps) is non trivial, but given the complexity of the algorithm overall, I think it's all or nothing. Either adopt Negentropy as it is, or not at all and use something like #1027

I still think it's worth it - invest in base libraries and you'll save a ton in bandwidth

vitorpamplona · 2024-11-05T16:49:10Z

invest in base libraries and you'll save a ton in bandwidth

I don't think varint is saving a lot. Similar to the delta timestamps in the same message, which turns the algo stateful. Maybe I am wrong, but I think @hoytech could have used longs for everything and this would be much simpler to implement at the expense of a few extra bytes in each message.

And yes, we desperately need static test cases.

alltheseas · 2024-11-05T16:59:03Z

If a nostr app, or a nostr relay implements negentropy, is it backwards compatible?

vitorpamplona · 2024-11-05T23:52:33Z

Kotlin PR is up: hoytech/negentropy#15

bezysoftware · 2024-11-06T21:23:57Z

invest in base libraries and you'll save a ton in bandwidth

I don't think varint is saving a lot. Similar to the delta timestamps in the same message, which turns the algo stateful. Maybe I am wrong, but I think @hoytech could have used longs for everything and this would be much simpler to implement at the expense of a few extra bytes in each message.

And yes, we desperately need static test cases.

That depends on the size of the diff. For fun I gutted one of @hoytech's tests with around 40k items on both sides (with 9k mismatches both ways); once with delta timestamps + varints and once without those (numbers+dates encoded as longs - 8 bytes). The non-optimized query was 2x as large in the first round.
The combination of delta timestamp+varint save space especially if the timestamps are close to each other when you can encode them using only 1 or 2 bytes,

vitorpamplona · 2024-11-06T23:29:39Z

That's the issue with the test cases... They generate created_ats on the same day (1000s of events in the same day), which is not very realistic.

But it's ok. We can keep it like that.

vitorpamplona · 2024-11-08T21:09:16Z

I finally got parity with the C++ and JavaScript implementations on hoytech/negentropy#15

Pros:

Extremely efficient in the number of messages and rounds to sync.
"Set and forget" protocol

Cons:

It's finicky. Requires absolute byte-parity on all implementations. If somebody misses an = in a bound filter, the whole thing goes crazy and the implementation might not realize it's talking oranges to a client speaking apples. This can be used to DDoS relays and clients.
It's stateful. Requires keeping the storage for Negentropy objects (which must be separate from the DB) live to complete sync and since Nostr is famous for delays, reconnections, and clients/relays straight out going away, there will be some memory costs in keeping so many filters in memory if this gets adopted for real.
No caching. The sync bands are dynamically defined which means no pre-caching of the hashes can be done. If clients and servers are syncing the same filters, caching on a static band protocol could significantly reduce computational costs.

Meanwhile, something like #826 is:

Pros:

Bugs are less problematic. It's harder to get different implementations because the bug surface is so much smaller. And if different implementations exist, a simple rule of syncing twice and giving up if it can't, can solve everything.
Cache. Hashes can be cached for syncing in repeat filters. Old data won't change so much on nostr, so caching could be an effective tool.
Stateless. Clients and relays might lose connection at any point and they will be fine. There is no added memory requirements. Each side can keep changing their DB in the middle of the sync.

Cons:

Requires 10x more rounds to complete a sync in a similar way Negentropy would do and thus it is way less efficient in time (not necessarily in processing power/data consumption).
Requires Nostr devs to make their own protocol for how to sync and what to ask for.

hoytech · 2024-11-08T21:21:14Z

Thank you for this @vitorpamplona! I pretty much agree with your assessment, although I'll make a couple comments. I'm going to find a moment to merge your PR, hopefully this weekend.

Requires absolute byte-parity on all implementations

The protocol is actually relatively forgiving of implementation differences. Yes, the test-suite is much stricter and does verify that they exactly conform, but this is mostly just to help developers detect bugs, and I could relax this if needed.

It's stateful

It depends. In theory, you could throw away the set of IDs after generating each message and re-run the query for each received message. The protocol is tolerant of changes to the ID set while messages are in flight. This would make it stateless at the expense of much higher CPU usage. Personally I think keeping state is the right tradeoff in most cases.

For implementations with btree storage like strfry, queries for specific pre-selected filters (plus their variants with since/until) can be performed entirely statelessly. I think this is most useful for full-DB syncs (ie {}) or partial but common requests (ie {"kinds":[0,3]}).

And yes, we desperately need static test cases.

Yes, this would be wonderful. I will add this to my TODO list and will make some (eventually!).

vitorpamplona · 2024-11-08T21:42:37Z

The protocol is actually relatively forgiving of implementation differences.

I ran into many issues of "forever syncing" when fingerprints/bands didn't perfectly match (because of the bugs in my implementation). I do think a malicious actor can use the sync SPEC to keep relays busy, but my real concern is if there are too many slightly-off implementations of the protocol such that nothing would work and people just give up because of the lack of consistent support across the network. Libraries help, but the risk is still there.

Let me know if you need any help running the kotlin/java stuff.

hoytech · 2024-11-12T23:09:43Z

@fiatjaf - I made a PR to update the error code format as per your suggestion: #1579

I'm updating strfry now. I will check to see if any other relays have their own implementation that needs updating (but I don't think so). AFAIK no clients depend on these exact errors at this time either.

bezysoftware · 2024-11-26T16:43:05Z

77.md

+
+Error reasons are the same format as in NIP-01. They should begin with a machine-readable single-word prefix, followed by a `:` and then a human-readable message with more information.
+
+The current suggested error reasons are


Should we also add something like invalid: unknown subscription id? AFAIK there is no equivalent in NIP-01, since it's not needed. In case of CLOSE message the client is not expecting any reply, so it's fine the relay doesn't send back an error, but in case of MSG the client is actually waiting for a response, so it makes send to send appropriate error.

Good question. Currently strfry sends closed: unknown subscription handle in this case. It doesn't bother trying to differentiate between a subscription ID that was previously open but got closed (by a NEG-CLOSE message, timeout, etc) and one that was never opened in the first place.

bezysoftware · 2024-11-29T23:07:48Z

The relay portion of this is now implemented in Netstr running here: https://relay.netstr.io/
It's internally using the Negentropy.net package

NIP-77: Negentropy syncing

8790dab

kehiy reviewed Sep 12, 2024

View reviewed changes

mikedilger mentioned this pull request Sep 20, 2024

add negentropy mikedilger/nostr-types#15

Open

flesh out NIP-77 (negentropy syncing)

d473f2c

bezysoftware mentioned this pull request Sep 27, 2024

Negentropy to sync events bezysoftware/netstr#25

Closed

alltheseas mentioned this pull request Nov 4, 2024

Negentropy sync NIP-77 placeholder nostrability/nostrability#125

Open

make negentropy error message format match NIP-01

bde7d1c

fiatjaf mentioned this pull request Nov 25, 2024

Question: relay synchronization? fiatjaf/khatru#22

Open

bezysoftware reviewed Nov 26, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

NIP-77: Negentropy syncing #1494

NIP-77: Negentropy syncing #1494

fiatjaf commented Sep 12, 2024 •

edited

Loading

vitorpamplona commented Sep 12, 2024 •

edited

Loading

kehiy Sep 12, 2024

fiatjaf Sep 12, 2024

kehiy Sep 12, 2024

jb55 commented Sep 18, 2024

braydonf commented Sep 18, 2024

staab commented Sep 27, 2024

braydonf commented Sep 27, 2024

fiatjaf commented Oct 27, 2024 •

edited

Loading

fiatjaf commented Oct 27, 2024

hoytech commented Oct 29, 2024

bezysoftware commented Nov 1, 2024

alltheseas commented Nov 4, 2024

vitorpamplona commented Nov 5, 2024 •

edited

Loading

bezysoftware commented Nov 5, 2024 •

edited

Loading

vitorpamplona commented Nov 5, 2024

alltheseas commented Nov 5, 2024

vitorpamplona commented Nov 5, 2024

bezysoftware commented Nov 6, 2024

vitorpamplona commented Nov 6, 2024 •

edited

Loading

vitorpamplona commented Nov 8, 2024 •

edited

Loading

hoytech commented Nov 8, 2024

vitorpamplona commented Nov 8, 2024

hoytech commented Nov 12, 2024

bezysoftware Nov 26, 2024

hoytech Nov 27, 2024

bezysoftware commented Nov 29, 2024


		Error reasons are the same format as in NIP-01. They should begin with a machine-readable single-word prefix, followed by a `:` and then a human-readable message with more information.

		The current suggested error reasons are

NIP-77: Negentropy syncing #1494

Are you sure you want to change the base?

NIP-77: Negentropy syncing #1494

Conversation

fiatjaf commented Sep 12, 2024 • edited Loading

vitorpamplona commented Sep 12, 2024 • edited Loading

kehiy Sep 12, 2024

Choose a reason for hiding this comment

fiatjaf Sep 12, 2024

Choose a reason for hiding this comment

kehiy Sep 12, 2024

Choose a reason for hiding this comment

jb55 commented Sep 18, 2024

braydonf commented Sep 18, 2024

staab commented Sep 27, 2024

braydonf commented Sep 27, 2024

fiatjaf commented Oct 27, 2024 • edited Loading

fiatjaf commented Oct 27, 2024

hoytech commented Oct 29, 2024

bezysoftware commented Nov 1, 2024

alltheseas commented Nov 4, 2024

vitorpamplona commented Nov 5, 2024 • edited Loading

bezysoftware commented Nov 5, 2024 • edited Loading

vitorpamplona commented Nov 5, 2024

alltheseas commented Nov 5, 2024

vitorpamplona commented Nov 5, 2024

bezysoftware commented Nov 6, 2024

vitorpamplona commented Nov 6, 2024 • edited Loading

vitorpamplona commented Nov 8, 2024 • edited Loading

hoytech commented Nov 8, 2024

vitorpamplona commented Nov 8, 2024

hoytech commented Nov 12, 2024

bezysoftware Nov 26, 2024

Choose a reason for hiding this comment

hoytech Nov 27, 2024

Choose a reason for hiding this comment

bezysoftware commented Nov 29, 2024

fiatjaf commented Sep 12, 2024 •

edited

Loading

vitorpamplona commented Sep 12, 2024 •

edited

Loading

fiatjaf commented Oct 27, 2024 •

edited

Loading

vitorpamplona commented Nov 5, 2024 •

edited

Loading

bezysoftware commented Nov 5, 2024 •

edited

Loading

vitorpamplona commented Nov 6, 2024 •

edited

Loading

vitorpamplona commented Nov 8, 2024 •

edited

Loading