
ACTUALLY corrects the behavior in #297 #303

Merged: 7 commits, Mar 9, 2021

Conversation

@ferricoxide (Member):

The previous fix for #297 turned out not to actually fix the "restarted too many times too soon" behavior. This fix appears to address it properly.

@ferricoxide requested a review from a team on March 4, 2021 15:09
@lorengordon (Member) left a comment:

I'd like to first just try changing:

- file: file_{{ stig_id }}-{{ cfgFile }}

to:

- file: {{ cfgFile }}

in each file, to see if that gives Salt enough of a clue to de-dupe the service.running states.

If that doesn't work, then we can proceed with moving the service.running to a separate sls, but we'll still need to modify the STIG sls files to include the restart_sshd sls, remove the onchanges from the service.running state, and use onchanges_in in the file.replace states to point at the state service: service_sshd_restart.

The include directive should be within the jinja else block, so the service state does not run if the stig_id is skipped.
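
For context, here is a minimal sketch of the kind of per-STIG service.running state this suggestion applies to. The state ID follows the listener_service_ pattern that appears in the log output later in this thread, but the exact contents are an assumption, not the repository's actual state definition:

# Illustrative only: a per-STIG restart listener, with the requisite
# rewritten as suggested above to reference the managed config file
# rather than the per-STIG state ID.
listener_service_{{ stig_id }}-{{ cfgFile }}:
  service.running:
    - name: sshd
    - onchanges:
      - file: {{ cfgFile }}

Because requisites can match a state's name as well as its ID, - file: {{ cfgFile }} still resolves to the same file.replace state; whether that alone lets Salt collapse the duplicate service.running states is exactly what the test below checks.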

@ferricoxide (Member Author):

> I'd like to first just try changing:
>
> - file: file_{{ stig_id }}-{{ cfgFile }}
>
> to:
>
> - file: {{ cfgFile }}
>
> in each file, to see if that gives Salt enough of a clue to de-dupe the service.running states.
>
> If that doesn't work, then we can proceed with moving the service.running to a separate sls, but we'll still need to modify the STIG sls files to include the restart_sshd sls, remove the onchanges from the service.running state, and use onchanges_in in the file.replace states to point at the state service: service_sshd_restart.
>
> The include directive should be within the jinja else block, so the service state does not run if the stig_id is skipped.

Lemme make a quick branch to try it out.

@ferricoxide (Member Author):

> I'd like to first just try changing:
>
> - file: file_{{ stig_id }}-{{ cfgFile }}
>
> to:
>
> - file: {{ cfgFile }}
>
> in each file, to see if that gives Salt enough of a clue to de-dupe the service.running states.
>
> If that doesn't work, then we can proceed with moving the service.running to a separate sls, but we'll still need to modify the STIG sls files to include the restart_sshd sls, remove the onchanges from the service.running state, and use onchanges_in in the file.replace states to point at the state service: service_sshd_restart.
>
> The include directive should be within the jinja else block, so the service state does not run if the stig_id is skipped.

Ok, so, they've updated SaltStack so that using file: {{ cfgFile }} no longer produces an error. However, changing the states (back) to referencing the config file – directly or via the previous state-ID method – still results in errors:

2021-03-09 16:42:50,815 P2466 [INFO]    2021-03-09 16:42:50,318 [watchmaker.workers.base.SaltLinux][ERROR][3082]: Command stderr: b'To force a start use "systemctl reset-failed sshd.service" followed by "systemctl start sshd.service" again.'
2021-03-09 16:42:50,815 P2466 [INFO]    2021-03-09 16:42:50,562 [watchmaker.workers.base.SaltLinux][DEBUG][3082]: Command retcode: 2
2021-03-09 16:42:50,815 P2466 [INFO]    2021-03-09 16:42:50,622 [watchmaker.workers.base.SaltLinux][INFO ][3082]: Setting selinux back to enforcing mode
2021-03-09 16:42:50,815 P2466 [INFO]    2021-03-09 16:42:50,623 [watchmaker.workers.base.SaltLinux][DEBUG][3082]: Command: setenforce enforcing
2021-03-09 16:42:50,815 P2466 [INFO]    2021-03-09 16:42:50,634 [watchmaker.workers.base.SaltLinux][DEBUG][3082]: Command retcode: 0
2021-03-09 16:42:50,816 P2466 [INFO]    2021-03-09 16:42:50,635 [watchmaker.Client][CRITICAL][3082]: Execution of the workers cadence has failed.
2021-03-09 16:42:50,816 P2466 [INFO]    2021-03-09 16:42:50,635 [watchmaker][CRITICAL][3082]:
2021-03-09 16:42:50,816 P2466 [INFO]    Traceback (most recent call last):
2021-03-09 16:42:50,816 P2466 [INFO]      File "/usr/local/bin/watchmaker", line 8, in <module>
2021-03-09 16:42:50,816 P2466 [INFO]        sys.exit(main())
2021-03-09 16:42:50,816 P2466 [INFO]      File "/usr/local/lib/python3.6/site-packages/click/core.py", line 829, in __call__
2021-03-09 16:42:50,816 P2466 [INFO]        return self.main(*args, **kwargs)
2021-03-09 16:42:50,816 P2466 [INFO]      File "/usr/local/lib/python3.6/site-packages/click/core.py", line 782, in main
2021-03-09 16:42:50,816 P2466 [INFO]        rv = self.invoke(ctx)
2021-03-09 16:42:50,816 P2466 [INFO]      File "/usr/local/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
2021-03-09 16:42:50,816 P2466 [INFO]        return ctx.invoke(self.callback, **ctx.params)
2021-03-09 16:42:50,816 P2466 [INFO]      File "/usr/local/lib/python3.6/site-packages/click/core.py", line 610, in invoke
2021-03-09 16:42:50,816 P2466 [INFO]        return callback(*args, **kwargs)
2021-03-09 16:42:50,816 P2466 [INFO]      File "/usr/local/lib/python3.6/site-packages/watchmaker/cli.py", line 110, in main
2021-03-09 16:42:50,817 P2466 [INFO]        sys.exit(watchmaker_client.install())
2021-03-09 16:42:50,817 P2466 [INFO]      File "/usr/local/lib/python3.6/site-packages/watchmaker/__init__.py", line 452, in install
2021-03-09 16:42:50,817 P2466 [INFO]        workers_manager.worker_cadence()
2021-03-09 16:42:50,817 P2466 [INFO]      File "/usr/local/lib/python3.6/site-packages/watchmaker/managers/worker_manager.py", line 59, in worker_cadence
2021-03-09 16:42:50,817 P2466 [INFO]        worker.install()
2021-03-09 16:42:50,817 P2466 [INFO]      File "/usr/local/lib/python3.6/site-packages/watchmaker/workers/salt.py", line 828, in install
2021-03-09 16:42:50,817 P2466 [INFO]        self.process_states(self.salt_states, self.exclude_states)
2021-03-09 16:42:50,817 P2466 [INFO]      File "/usr/local/lib/python3.6/site-packages/watchmaker/workers/salt.py", line 659, in process_states
2021-03-09 16:42:50,817 P2466 [INFO]        indent=4
2021-03-09 16:42:50,817 P2466 [INFO]    watchmaker.exceptions.WatchmakerException: Salt state execution failed:
2021-03-09 16:42:50,817 P2466 [INFO]        listener_service_RHEL-07-040670-/etc/ssh/sshd_config:
2021-03-09 16:42:50,817 P2466 [INFO]            __id__: listener_service_RHEL-07-040670-/etc/ssh/sshd_config
2021-03-09 16:42:50,817 P2466 [INFO]            __run_num__: 653
2021-03-09 16:42:50,817 P2466 [INFO]            __sls__: ash-linux.el7.STIGbyID.cat2.RHEL-07-040670
2021-03-09 16:42:50,818 P2466 [INFO]            changes: {}
2021-03-09 16:42:50,818 P2466 [INFO]            comment: 'Running scope as unit run-29592.scope.
2021-03-09 16:42:50,818 P2466 [INFO]
2021-03-09 16:42:50,818 P2466 [INFO]                Job for sshd.service failed because start of the service was attempted
2021-03-09 16:42:50,818 P2466 [INFO]                too often. See "systemctl status sshd.service" and "journalctl -xe" for
2021-03-09 16:42:50,818 P2466 [INFO]                details.
2021-03-09 16:42:50,818 P2466 [INFO]
2021-03-09 16:42:50,818 P2466 [INFO]                To force a start use "systemctl reset-failed sshd.service" followed by
2021-03-09 16:42:50,818 P2466 [INFO]                "systemctl start sshd.service" again.'
2021-03-09 16:42:50,818 P2466 [INFO]            duration: 93.73
2021-03-09 16:42:50,818 P2466 [INFO]            name: sshd
2021-03-09 16:42:50,818 P2466 [INFO]            result: false
2021-03-09 16:42:50,818 P2466 [INFO]            start_time: '16:42:49.822611'
2021-03-09 16:42:50,819 P2466 [INFO]        listener_service_RHEL-07-040680-/etc/ssh/sshd_config:
2021-03-09 16:42:50,819 P2466 [INFO]            __id__: listener_service_RHEL-07-040680-/etc/ssh/sshd_config
2021-03-09 16:42:50,819 P2466 [INFO]            __run_num__: 654
2021-03-09 16:42:50,819 P2466 [INFO]            __sls__: ash-linux.el7.STIGbyID.cat2.RHEL-07-040680
2021-03-09 16:42:50,819 P2466 [INFO]            changes: {}
2021-03-09 16:42:50,819 P2466 [INFO]            comment: 'Running scope as unit run-29597.scope.
2021-03-09 16:42:50,819 P2466 [INFO]
2021-03-09 16:42:50,819 P2466 [INFO]                Job for sshd.service failed because start of the service was attempted
2021-03-09 16:42:50,819 P2466 [INFO]                too often. See "systemctl status sshd.service" and "journalctl -xe" for
2021-03-09 16:42:50,819 P2466 [INFO]                details.
2021-03-09 16:42:50,819 P2466 [INFO]
2021-03-09 16:42:50,819 P2466 [INFO]                To force a start use "systemctl reset-failed sshd.service" followed by
2021-03-09 16:42:50,819 P2466 [INFO]                "systemctl start sshd.service" again.'
2021-03-09 16:42:50,819 P2466 [INFO]            duration: 136.291
2021-03-09 16:42:50,820 P2466 [INFO]            name: sshd
2021-03-09 16:42:50,820 P2466 [INFO]            result: false
2021-03-09 16:42:50,820 P2466 [INFO]            start_time: '16:42:49.916974'
2021-03-09 16:42:50,820 P2466 [INFO]        listener_service_RHEL-07-040690-/etc/ssh/sshd_config:
2021-03-09 16:42:50,820 P2466 [INFO]            __id__: listener_service_RHEL-07-040690-/etc/ssh/sshd_config
2021-03-09 16:42:50,820 P2466 [INFO]            __run_num__: 655
2021-03-09 16:42:50,820 P2466 [INFO]            __sls__: ash-linux.el7.STIGbyID.cat2.RHEL-07-040690
2021-03-09 16:42:50,820 P2466 [INFO]            changes: {}
2021-03-09 16:42:50,820 P2466 [INFO]            comment: 'Running scope as unit run-29603.scope.
2021-03-09 16:42:50,820 P2466 [INFO]
2021-03-09 16:42:50,820 P2466 [INFO]                Job for sshd.service failed because start of the service was attempted
2021-03-09 16:42:50,820 P2466 [INFO]                too often. See "systemctl status sshd.service" and "journalctl -xe" for
2021-03-09 16:42:50,820 P2466 [INFO]                details.
2021-03-09 16:42:50,821 P2466 [INFO]
2021-03-09 16:42:50,821 P2466 [INFO]                To force a start use "systemctl reset-failed sshd.service" followed by
2021-03-09 16:42:50,821 P2466 [INFO]                "systemctl start sshd.service" again.'
2021-03-09 16:42:50,821 P2466 [INFO]            duration: 131.175
2021-03-09 16:42:50,821 P2466 [INFO]            name: sshd
2021-03-09 16:42:50,821 P2466 [INFO]            result: false
2021-03-09 16:42:50,821 P2466 [INFO]            start_time: '16:42:50.053755'
2021-03-09 16:42:50,821 P2466 [INFO]        listener_service_RHEL-07-040700-/etc/ssh/sshd_config:
2021-03-09 16:42:50,821 P2466 [INFO]            __id__: listener_service_RHEL-07-040700-/etc/ssh/sshd_config
2021-03-09 16:42:50,821 P2466 [INFO]            __run_num__: 656
2021-03-09 16:42:50,821 P2466 [INFO]            __sls__: ash-linux.el7.STIGbyID.cat2.RHEL-07-040700
2021-03-09 16:42:50,821 P2466 [INFO]            changes: {}
2021-03-09 16:42:50,821 P2466 [INFO]            comment: 'Running scope as unit run-29608.scope.
2021-03-09 16:42:50,821 P2466 [INFO]
2021-03-09 16:42:50,822 P2466 [INFO]                Job for sshd.service failed because start of the service was attempted
2021-03-09 16:42:50,822 P2466 [INFO]                too often. See "systemctl status sshd.service" and "journalctl -xe" for
2021-03-09 16:42:50,822 P2466 [INFO]                details.
2021-03-09 16:42:50,822 P2466 [INFO]
2021-03-09 16:42:50,822 P2466 [INFO]                To force a start use "systemctl reset-failed sshd.service" followed by
2021-03-09 16:42:50,822 P2466 [INFO]                "systemctl start sshd.service" again.'
2021-03-09 16:42:50,822 P2466 [INFO]            duration: 132.083
2021-03-09 16:42:50,822 P2466 [INFO]            name: sshd
2021-03-09 16:42:50,822 P2466 [INFO]            result: false
2021-03-09 16:42:50,822 P2466 [INFO]            start_time: '16:42:50.185436'
2021-03-09 16:42:50,822 P2466 [INFO]
2021-03-09 16:42:50,822 P2466 [INFO] ------------------------------------------------------------
2021-03-09 16:42:50,823 P2466 [ERROR] Exited with error code 1

@ferricoxide (Member Author):

@lorengordon @eemperor

Any other methods you want to test before accepting the PR?

@lorengordon (Member) left a comment:

The current implementation is still incomplete. It relies on someone invoking the .cat2 sls to include the service restart. If someone calls RHEL-07-040660 directly, there is no restart.

To address that, each of the RHEL-07-xxxxxx.sls files changed in this PR can implement the following pattern with include and onchanges_in:

{%- else %}
include:
  - ash-linux.el7.STIGbyID.cat2.restart_sshd

file_{{ stig_id }}-{{ cfgFile }}:
  file.replace:
    - name: '{{ cfgFile }}'
    - pattern: '^\s*{{ parmName }} .*$'
    - repl: '{{ parmName }} {{ parmValu }}'
    - append_if_not_found: True
    - not_found_content: |-
        # Inserted per STIG {{ stig_id }}
        {{ parmName }} {{ parmValu }}
    - onchanges_in:
      - service: service_sshd_restart
{%- endif %}

Then:

  • Remove ash-linux.el7.STIGbyID.cat2.restart_sshd from cat2/init.sls, as it is no longer necessary
  • Delete the onchanges directive from cat2/restart_sshd.sls, since each state sets the requisite with onchanges_in (see the sketch after this list)
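
For reference, a hedged sketch of what cat2/restart_sshd.sls might reduce to once those two bullets are applied; only the service_sshd_restart ID and the sshd service name come from the pattern above, the rest is an assumption:

# Assumed shape of ash-linux/el7/STIGbyID/cat2/restart_sshd.sls after the
# change: no onchanges of its own, because each STIG state that edits the
# sshd config attaches itself to this state via onchanges_in.
service_sshd_restart:
  service.running:
    - name: sshd

Since every STIG sls includes this one file instead of declaring its own service.running state, the restart should fire at most once per highstate run.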

@ferricoxide (Member Author) commented Mar 9, 2021:

Seems like that would still cause each state that has the include/onchanges_in content to attempt to individually restart the sshd service (resulting in the "too many times" error)?

@lorengordon (Member):

I don't believe so, but it's a bit difficult to say for sure without studying the rendered highstate data structure.
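
One way to reason about it: Salt only processes an included sls once, no matter how many files include it, and each onchanges_in is flipped into an onchanges entry on the single target state. The net effect should be conceptually equivalent to declaring something like the following (a hand-written illustration, not captured from an actual run; the state IDs are assembled from the file_{{ stig_id }}-{{ cfgFile }} pattern above and the STIG IDs in the earlier log):

# Hypothetical rendered result: one service state carrying all of the
# onchanges requisites contributed by the individual STIG states.
service_sshd_restart:
  service.running:
    - name: sshd
    - onchanges:
      - file: file_RHEL-07-040670-/etc/ssh/sshd_config
      - file: file_RHEL-07-040680-/etc/ssh/sshd_config
      - file: file_RHEL-07-040690-/etc/ssh/sshd_config
      - file: file_RHEL-07-040700-/etc/ssh/sshd_config

If it renders that way, sshd is restarted once when any of those file states reports a change, which matches the successful test run described below.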

@ferricoxide (Member Author):

Running a test now to validate: just because things feel like you're circling over the same terrain doesn't actually mean you are (and, even if you are, it doesn't mean that doing it on an ATV rather than a dirtbike won't produce different results).

@ferricoxide (Member Author) commented Mar 9, 2021:

Ok. The first run-through seems not to have triggered the "too many restarts" error ...but, because I'd branched off of master, my skip-logic for the readonly TMOUT= login-profile state was no longer present in the testing branch's code. Going to re-run with skipping actually activated for that one. If it runs a second time through the sshd stuff without incident, I'll merge the code changes back into this PR's contents.

(git cherry-pick will be useful, here 😛)

@ferricoxide (Member Author):

Looks good:

[…elided…]
2021-03-09 19:26:18,432 P2405 [INFO] ------------------------------------------------------------
2021-03-09 19:26:18,432 P2405 [INFO] Completed successfully.
2021-03-09 19:26:18,436 P2405 [INFO] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
2021-03-09 19:26:18,436 P2405 [INFO] Config finalize
2021-03-09 19:26:18,437 P2405 [INFO] ============================================================
2021-03-09 19:26:18,437 P2405 [INFO] Command 10-signal-success
2021-03-09 19:26:19,329 P2405 [INFO] Completed successfully.
2021-03-09 19:26:19,333 P2405 [INFO] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
2021-03-09 19:26:19,333 P2405 [INFO] Config reboot
2021-03-09 19:26:19,334 P2405 [INFO] ============================================================
2021-03-09 19:26:19,334 P2405 [INFO] Command 10-reboot
2021-03-09 19:26:19,405 P2405 [INFO] -----------------------Command Output-----------------------
2021-03-09 19:26:19,406 P2405 [INFO]    Shutdown scheduled for Tue 2021-03-09 19:27:19 UTC, use 'shutdown -c' to cancel.
2021-03-09 19:26:19,406 P2405 [INFO] ------------------------------------------------------------
2021-03-09 19:26:19,406 P2405 [INFO] Completed successfully.

Will push the mods shortly.
