
4.1+ systemd.service needs to be updated in admin/deployment docs. #451

Closed
troy2914 opened this issue Dec 16, 2020 · 1 comment

troy2914 (Member) commented Dec 16, 2020

irrd.service does not specify Requires= or After= for redis-server.service or postgresql@11-main.service.
With a large database and random systemd startup timing, we see:

2020-12-14 14:49:44,106 irrd[2082]: [irrd.storage.preload#CRITICAL] Updating preload store failed, retrying in 5s, traceback follows: (psycopg2.OperationalError) FATAL:  the database system is starting up
2020-12-14 14:49:49,114 irrd[2082]: [irrd.storage.preload#CRITICAL] Updating preload store failed, retrying in 5s, traceback follows: (psycopg2.OperationalError) FATAL:  the database system is starting up
2020-12-14 14:49:54,122 irrd[2082]: [irrd.storage.preload#CRITICAL] Updating preload store failed, retrying in 5s, traceback follows: (psycopg2.OperationalError) FATAL:  the database system is starting up

followed by about 30 [irrd-whois-serv] processes.
In general this should be handled better: either fail to start, or start but only return errors.

The service file fix is to add a Requires= line and extend the After= line:

Requires=redis-server.service postgresql@11-main.service
After=basic.target network.target redis-server.service postgresql@11-main.service

Note that the 11-main part is the PostgreSQL cluster name, which may vary per installation.
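
For installations running the packaged unit file, the same dependencies could also be applied as a drop-in override instead of editing irrd.service directly. A minimal sketch, assuming the unit is named irrd.service and the PostgreSQL cluster is 11-main:

# Created with: systemctl edit irrd.service
# (written to /etc/systemd/system/irrd.service.d/override.conf)
[Unit]
Requires=redis-server.service postgresql@11-main.service
After=redis-server.service postgresql@11-main.service

Requires= and After= accumulate across drop-ins, so the override only needs the new dependencies; run systemctl daemon-reload afterwards.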

Seen with 4.1.1 (though this dependency probably dates back to 4.0).

mxsasha added a commit that referenced this issue Dec 30, 2020
If the whois or HTTP workers fail to connect to the database
or preloader on startup, they will send a termination signal
to the main process, terminating IRRd. This prevents starting
up in a broken state.

This does not change behaviour afterwards: if the connection
is lost later, the workers will fail the current query after one
reconnection attempt. They will again attempt reconnection
on future queries. This allows IRRd to recover from a
PostgreSQL restart.
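
As an illustration of the fail-fast pattern this commit describes (a hypothetical sketch, not IRRd's actual code), a worker process that cannot reach its backend on startup might signal the main process like this:

import os
import signal
import sys

def connect_backends():
    # Placeholder for the real database/preloader connection setup;
    # assumed to raise on failure here.
    raise ConnectionError("database not reachable")

def worker_startup():
    try:
        connect_backends()
    except ConnectionError as exc:
        print(f"worker startup failed: {exc}", file=sys.stderr)
        # Ask the main (parent) process to shut down, so the daemon
        # does not keep running in a broken state.
        os.kill(os.getppid(), signal.SIGTERM)
        sys.exit(1)

if __name__ == "__main__":
    worker_startup()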
mxsasha (Collaborator) commented Apr 6, 2021

I think this is resolved.
