-
-
Notifications
You must be signed in to change notification settings - Fork 2.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
crash due to "mismatched follow" #472
Comments
I can confirm this issue, with a similar stack trace. |
Like taziden, I had the same error and a similar stack trace this afternoon. I'm running etherpad-lite commit 309e3b0 on node v0.6.11. Do either of you have logs from the crashes you experienced? The baserev property of the USER_CHANGES messages sent from the users normally increase by one each time. In this case, the baserev of the USER_CHANGES message immediately proceeding the crash went down by one. By itself, I think that's okay. It just means the change is based on a previous revision (not the latest). But I also see a client-side JS error logged less than 10 seconds earlier. It's a shot in the dark, but logs in other cases where this error occurred may help us determine if there is any relation. I still have to do some reading up on changesets. |
I replaced PadMessageHandler.js line 431 (in the commit I'm running) changeset = Changeset.follow(c, changeset, false, apool); with try
{
changeset = Changeset.follow(c, changeset, false, apool);
}
catch(e)
{
console.warn("Can't apply USER_CHANGES "+changeset+", possibly because of mismatched follow error");
client.json.send({disconnect:"badChangeset"});
return;
} Explanation: Changeset.follow() runs assert() which throws an error if things don't match properly. The try-catch block disconnects the user submitting the mismatch. There's a very similar block about 30 lines above, which is what I based this on. Feedback is appreciated. |
EDIT - I would ignore this comment as it may have been premature. The issue here seems to be unrelated to the mismatched follow error and the code in my previous comment appears to solve that problem on its own. At the very least, further exploration is required.
|
Did we ever see a pull request with this fix in? |
No, @johnyma22. I have not created a pull request for two reasons: (a) my lack of familiarity with git and pull requests, and (b) potential impact on other moving parts of Etherpad Lite. If you or others with experience evaluate the code and determine it to be valid, I have no problem with it being added to the project. |
I started seeing this crash frequently with about 10 users, some using older browsers. The two code changes above abolished the crashes. Thank you very much ! A minor side effect is authors sometimes disappear for a short interval and then reappear again. |
Hi @iamer, I'm glad to know the code works for you! You said you applied two changes, but as I wrote above, you should ignore the second. (I've actually just edited the comment to apply strikethrough text so no one else gets confused.) The correct code has been running for me without issue for quite a while. We're not seeing authors disappear and reappear, so I wonder what happens if you undo the code change that is (now) crossed out. |
I've reverted the spurious change, will keep an eye on the logs and report back. Thanks for your reply. |
I have added a task for myself to review this and issue a pull request after testing. |
Now with 30 users I am getting a similar crash, but not sure if it is the same. [2012-09-11 14:44:03.357] [WARN] socket.io - client not handshaken 'client should reconnect' |
Just a quick reply for now....
Have you tried turning DEBUG level logging on? Perhaps it will isolate what is leading to the |
I will try to write a testing script that replays HTTP traffic and reproduce the bug with debugging turned on. |
We haven't seen the mismatched follow error since applying the pull request from #1023, but I think the underlying problem hasn't been fixed. The "mismatched follow" would mean that the clients were sending bad changesets and we haven't found an explanation for that. |
@mrneil : Your patch work... if your problem is "how to do a pull request with git", i gladly help you :) |
Will crash Etherpad after a while. We now have a method to reproduce so should be good to debug.
|
Cool! This could be either very interesting or very boring :) If it's a bug in the stresstest client, then it doesn't help us much. But if the clients are sending correct changesets but the server somehow munges them then we really have the root cause. Either way, though, the server shouldn't crash no matter what the clients send. So some version of mrneil's patch is needed. |
So, this is @mrneil's approach to fix the problem: |
The above fix does not help here (latest version 7a0ad32, applied in PadMessageHandler.js around line 530). After server start, a pad content is delivered, then we have in the log file: The last line repeats a thousand times, then: ESC[32m[2013-02-26 21:24:17.941] [INFO] message - ESC[39mto iBoQF9yURV8-2OVyLU0m: {"disconnect":"userdup"} |
@rasos can you replicate or is it a one off? Also any chance you can clean your paste up? |
Error can be replicated, we currently loop: RESTART! |
Can you please extract the pad using
replacing %padID% with the padId of the pad that keeps failing Once you have the .db file of the pad (it will be in your folder) put it on the web and share the link here so we can investigate further |
Mmh, the size of the padID.db file is 0, replacing the padID with the pad name. I can read in the logfile that somebody uses the pad to exchange even passwords, so I probably should not post that in the web. Any way I can send that to you in pm? Or any hint to disactivate that pad? |
email me it john splat mclear.co.uk |
The fix I posted above will not directly solve the issue experienced by @rasos because the log does not indicate it is being caused by a mismatched follow error. However, the report reminds me of issue #800. That one occurred after several hundred |
patch documented here ether#472 ...
Yea @rasos went off topic, basically I can close this but I'm mindful stress testing may throw some exceptions we need to debug. |
This upgrade solves the high-severity vulnerabilities regarding https-proxy-agent that were still present in 8e6bca4. The output of `npm audit` goes from this: found 29 vulnerabilities (3 low, 26 high) in 13338 scanned packages run `npm audit fix` to fix 4 of them. 1 vulnerability requires semver-major dependency updates. 24 vulnerabilities require manual review. See the full report for details. To this: found 5 vulnerabilities (3 low, 2 high) in 13338 scanned packages 1 vulnerability requires semver-major dependency updates. 4 vulnerabilities require manual review. See the full report for details. Changelog: - https://github.com/npm/cli/releases 6.13.1 (2019-11-18) BUG FIXES 938d6124d #472 fix(fund): support funding string shorthand (@ruyadorno) b49c5535b #471 should not publish tap-snapshot folder (@ruyadorno) 3471d5200 #253 Add preliminary WSL support for npm and npx (@infinnie) 3ef295f23 #486 print quick audit report for human output (@isaacs) TESTING dbbf977ac #278 added workflow to trigger and run benchmarks (@mikemimik) b4f5e3825 #457 feat(docs): adding tests and updating docs to reflect changes in registry teams API. (@nomadtechie) 454c7dd60 #456 fix git configs for git 2.23 and above (@isaacs) DEPENDENCIES 661d86cd2 [email protected] (@claudiahdz) 6.13.0 (2019-11-05) NEW FEATURES 4414b06d9 #273 add fund command (@ruyadorno) BUG FIXES e4455409f #281 delete ps1 files on package removal (@NoDocCat) cd14d4701 #279 update supported node list to remove v6.0, v6.1, v9.0 - v9.2 (@ljharb) DEPENDENCIES a37296b20 [email protected] d3cb3abe8 [email protected] TESTING 688cd97be #272 use github actions for CI (@JasonEtco) 9a2d8af84 #240 Clean up some flakiness and inconsistency (@isaacs)
Crash happened, causing the server to restart (and clients to be disconnected), stack trace:
[2012-02-20 09:48:23.623] [ERROR] console - Error: exports: mismatched follow
at Object.error (/Changeset.js:34:11)
at Object.assert (/Changeset.js:41:13)
at Object.follow (/Changeset.js:1797:11)
at Object.callback (/srv/etherpad-lite/node/handler/PadMessageHandler.js:431:35)
at /srv/etherpad-lite/node_modules/ueberDB/CloneAndAtomicLayer.js:120:17
at /srv/etherpad-lite/node_modules/ueberDB/CacheAndBufferLayer.js:345:7
at [object Object].get (/srv/etherpad-lite/node_modules/ueberDB/CacheAndBufferLayer.js:142:5)
at [object Object].getSub (/srv/etherpad-lite/node_modules/ueberDB/CacheAndBufferLayer.js:317:8)
at [object Object].doOperation as operatorFunction
at [object Object].emit (/srv/etherpad-lite/node_modules/ueberDB/node_modules/channels/channels.js:38:11)
RESTART!
The text was updated successfully, but these errors were encountered: