-
Notifications
You must be signed in to change notification settings - Fork 59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Added support for creating shared LVM setups #388
feat: Added support for creating shared LVM setups #388
Conversation
468e363
to
592f745
Compare
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #388 +/- ##
==========================================
- Coverage 13.67% 13.65% -0.03%
==========================================
Files 8 8
Lines 1733 1736 +3
Branches 79 79
==========================================
Hits 237 237
- Misses 1496 1499 +3
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
ping - any updates? |
592f745
to
a6ff2fe
Compare
New blivet version has been released today. I have incorporated all suggestions into the code and uncommented the test. |
9e201d3
to
8dc721a
Compare
8dc721a
to
988c73f
Compare
With the suggested fixes, I can run the test up until here on centos-9:
def _create(self):
if not self._device:
members = self._manage_encryption(self._create_members())
try:
pool_device = self._blivet.new_vg(name=self._pool['name'], parents=members, shared=self._pool['shared'])
except Exception as e:
raise BlivetAnsibleError("failed to set up pool '%s': %s" % (self._pool['name'], str(e))) what version of blivet has the support for |
d6181b4
to
7003e18
Compare
I have added the switch that skips the test if needed based on blivet version as per vtrefny #388 (comment) |
meta: end_host | ||
when: inventory_hostname == "localhost" | ||
|
||
- name: Gather package facts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- name: Gather package facts | |
- name: Run the role to install blivet | |
include_role: | |
name: linux-system-roles.storage | |
vars: | |
storage_pools: [] | |
storage_volumes: [] | |
- name: Gather package facts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
otherwise, the task Set blivet package name
fails, as blivet is not installed.
Are you able to run this test locally on your laptop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved the initial storage role (which installs blivet) run before the check.
While running the test I also noticed that in HA cluster role test_setup.yml there is this task:
- name: Set node name to 'localhost' for single-node clusters
set_fact:
inventory_hostname: localhost # noqa: var-naming
when: ansible_play_hosts_all | length == 1
I am not sure what its purpose is in the role but it messed up the test when I tried to run it on single remote node. I replaced it with task that changes inventory name from 'localhost' to '127.0.0.1' and it seems to do the trick.
7003e18
to
0ed33f2
Compare
- feature requested by GFS2 - adds support for creating shared VGs - shared LVM setup needs lvmlockd service with dlm lock manager to be running - to test this change ha_cluster system role is used to set up degenerated cluster on localhost - the test will be skipped if run locally due to an issue with underlying services - requires blivet version with shared LVM setup support (storaged-project/blivet#1123)
0ed33f2
to
79b1520
Compare
ok - but - is there some platform that has the correct version of blivet? Alternately - if you have some copr blivet build that you are using, can you attach the log output from running the test with the right version of blivet? |
I am running the test (not skipped) on Fedora 38 with the latest blivet package ( |
[citest] |
library/blivet.py
Outdated
@@ -1527,7 +1527,7 @@ def _create(self): | |||
if not self._device: | |||
members = self._manage_encryption(self._create_members()) | |||
try: | |||
pool_device = self._blivet.new_vg(name=self._pool['name'], parents=members) | |||
pool_device = self._blivet.new_vg(name=self._pool['name'], parents=members, shared=self._pool['shared']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need some sort of logic here to avoid using the shared
parameter if not supported
if self._blivet.new_vg supports 'shared' parameter:
pool_device = self._blivet.new_vg(name=self._pool['name'], parents=members, shared=self._pool['shared'])
else:
pool_device = self._blivet.new_vg(name=self._pool['name'], parents=members)
There's probably some way to use introspection to see if the new_vg
method supports shared
or some other way to dynamically construct the new_vg
arguments e.g. new_vg_args = {}
the pass like new_vg(**new_vg_args)
This is what is causing some of the test failures
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have modified the condition so the shared
parameter is not used when its value is default (false
). This should fix the tests.
Older versions of blivet do not support 'shared' parameter. This resulted in failures in tests unrelated to shared VGs. This change fixes that behavior as well as fixes minor condition error in a test.
[citest] |
Looks like fedora 39 has the right version of blivet. When I try your latest like this:
The problem is that runqemu uses the file name of the qcow2 file as the hostname. |
If I add this to the test: - name: Set up test environment for the ha_cluster role
include_role:
name: fedora.linux_system_roles.ha_cluster
tasks_from: test_setup.yml
- name: Create cluster
... Then I get much farther, until here:
I guess somewhere in the blivet module or blivet library it manages lvm.conf? I think we need to change https://github.com/linux-system-roles/ha_cluster/blob/main/tasks/test_setup.yml#L9 to make it more generally applicable. - name: Set node name to 'localhost' for single-node clusters
set_fact:
inventory_hostname: localhost # noqa: var-naming
when: ansible_play_hosts_all | length == 1 @tomjelinek @spetrosi I think the intention of this code is - "If - name: Set node name to 'localhost' for single-node clusters
set_fact:
inventory_hostname: localhost # noqa: var-naming
when:
- ansible_play_hosts_all | length == 1
- not ha_cluster_test_use_given_hostname | d(false) Then
wdyt? |
@richm You got the intention absolutely right. Adding the proposed flag works for me. It would be nice if it can be tested (@japokorn ?) before merging it in the ha_cluster role. And a comment explaining the flag is meant for other roles and thus must be kept in place even though it's not used anywhere in ha_cluster role would be helpful. Feel free to open a PR after testing or let me know to do it myself. |
@tomjelinek there's also an issue with lvmlockd -
how to choose the lock manager? What additional configuration is required by
@japokorn where/how is this done? seems like something the storage role/blivet should do?
this seems like something the
This also seems like something the
the storage role does this
@japokorn this seems like something the storage role should do?
This also seems like something the storage role should do
@tomjelinek this says ". . . may be automated using systemd or a cluster resource manager/agents." - is this something that the |
Well, the documentation says that dlm should be used if corosync is in use. HA cluster uses corosync.
I'm not aware of any configuration options in corosync related to dlm. And I'm not aware of any required dlm configuration, just run with the defaults.
It means: create cluster resources. So you just need to instruct the ha_cluster role to create the appropriate resources, ocf:pacemaker:controld and ocf:heartbeat:lvmlockd. |
@tomjelinek afaict the test is setting the appropriate parameters/resources - https://github.com/linux-system-roles/storage/pull/388/files#diff-2892843b9952fe8a2e8f5867b7f5092369acfd8ae20990b1689a366c01b1584cR68-R82 Then maybe the reason it is working in Jan's testing is because he has a "real" hostname and a real IP address, but in the baseos ci and local qemu testing, the |
@richm Yes, the variables look good. I have verified that the cluster is able to start dlm and lvmlockd resources with no issues with such settings, if it uses a real node name. If the cluster is set up with the 'localhost' node, dlm times out on start. I'm not sure why that happens. I already tried debugging this back in October but I was unable to get any useful info from dlm debug logs. |
Enhancement:
Support for creating shared VGs
Reason:
Requested by GFS2
Result: