Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clean-up work when process quit #2364

Closed
garyyu opened this issue Jan 13, 2019 · 4 comments
Closed

clean-up work when process quit #2364

garyyu opened this issue Jan 13, 2019 · 4 comments

Comments

@garyyu
Copy link
Contributor

garyyu commented Jan 13, 2019

Recap the Gitter discussion here: https://gitter.im/grin_community/dev?at=5c3b123d95e17b45254415c3

Gary Yu @garyyu 18:26
In PR #2354, I add a new commit (4a8bd44) try to clean-up the remaining (last 30 minutes) txhashset zip files when grin process exit.
According to the usage of Drop: https://doc.rust-lang.org/1.30.0/book/second-edition/ch15-03-drop.html
I use a Drop function to do some clean-up (for those remaining txhashset zip files) when grin exit:
impl Drop for Chain {
fn drop(&mut self) {
...
}
}
But unfortunately, I tested and found this Drop function seems not called when we kill grin process, and the remaining zip files (last 30 minutes) still there. Somebody have idea on this? I’m stuck here

Casey Rodarmor @casey 18:33
@garyyu Does the grin process exit by returning from main, or are process::exit or process::abort being called? process::{exit,abort} will terminate the process immediately, so drop implementations won't be called.

Gary Yu @garyyu 18:38
@casey we use ctrlc crate to handle the kill signal and call server::stop():
https://github.com/mimblewimble/grin/blob/master/src/bin/cmd/server.rs#L75 and https://github.com/mimblewimble/grin/blob/master/servers/src/grin/server.rs#L455
we don't use something like process::abort / exit.
it’s a strange thing that the Drop can’t be called automatically, perhaps that’s a risk.
But at this moment, i would just explicitly call a chain::clean_up_zip_files here in this stop function, as a workaround for this problem.
If anyone found the root cause of this strange issue, please let us know :-)

Casey Rodarmor @casey 18:43
Yeah, drop gets called automatically by the compiler
@garyyu I'm just guessing, but perhaps there are other threads holding on to an Arc, so drop is never getting called

Gary Yu @garyyu 18:46
@casey Yes, that could be the root cause! then it make sense here.
it’s still a risk for process exit, I will submit an issue to remind us this remaining issue.
thanks @casey

I create this issue to remind us there's a clean-up (on process exit) issue to be fixed.

@RyanKung
Copy link

While server dropping, there is still 5 strong ref and 10 week ref.

impl Drop for Server {
    fn drop(&mut self) {
	warn!("Dropping Server");
        warn!("strong {:?}", Arc::strong_count(&self.chain));
        warn!("weak {:?}", Arc::weak_count(&self.chain));
    }
}
20190125 17:37:49.028 WARN grin_servers::grin::server - strong 5
20190125 17:37:49.028 WARN grin_servers::grin::server - weak 10

BTW, If user use the kill -9 signal, it will still cause issues.
So maybe a better solution is to setup a lock flag, and do some clean up works after restart?

@antiochp
Copy link
Member

@RyanKung We already do some level of cleanup/rewinding on restart, based on existing state of the various MMR files. You can see some of this in chain::init().
That said it does look like there are some edge cases that we still do not fully account for.

@RyanKung
Copy link

RyanKung commented Jan 25, 2019

Hi @antiochp , I just checked chain::init(). what I got is, if stop_lock.is_stooped(), it will throw out a panic! finally?

On chain::init()

			if stop_lock.is_stopped() {
				return Err(ErrorKind::Stopped.into());
			}

On Server::new

		let shared_chain = Arc::new(chain::Chain::init(
			config.db_root.clone(),
			db_env,
			chain_adapter.clone(),
			genesis.clone(),
			pow::verify_size,
			verifier_cache.clone(),
			archive_mode,
			stop_state.clone(),
		)?);

On Server::start:

		let serv = Arc::new(Server::new(config)?);


And in fn start_server_tui(config: servers::ServerConfig)

		servers::Server::start(config, |serv: Arc<servers::Server>| {
			let running = Arc::new(AtomicBool::new(true));
			let r = running.clone();
			ctrlc::set_handler(move || {
				r.store(false, Ordering::SeqCst);
			})
			.expect("Error setting handler for both SIGINT (Ctrl+C) and SIGTERM (kill)");
			while running.load(Ordering::SeqCst) {
				thread::sleep(Duration::from_secs(1));
			}
		    warn!("Received SIGINT (Ctrl+C) or SIGTERM (kill).");

		    serv.stop();
		})
		.unwrap();

It's simply unwrap() and will panic!

@0xmichalis
Copy link
Contributor

Graceful shutdown is added at #2812.

I think we can close this issue now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants