Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add "else" operator to augeas path-filter expressions (priority selector) #692

Merged
merged 2 commits into from
Oct 1, 2021

Conversation

georgehansper
Copy link
Member

This Pull Request adds a new binary operator to path-filter expressions.

The new operator is currently called else.

The existing binary operators and and or effectively act on 'sets' of nodes.

Given a filter-expression of the form:

*[nodeset_a or nodeset_b]

eg.

/files/etc/hosts/*[alias=~glob('local*') or ipaddr=~glob('192.168.*')]

nodeset_a and nodeset_b refer to a subset of *
* represents all nodes at this level of the tree.
The or operator forms a union of nodeset_a and nodeset_b

Given a filter-expression:

*[nodeset_a and nodeset_b]

The and operator forms an intersection of nodeset_a and nodeset_b

The new binary operator else also deals with nodesets.

Given a filter-expression:

*[nodeset_a else nodeset_b]

The else operator acts as a 'selector' or 'order-of-priority' between alternate nodesets.
nodeset_a will be selected as long as it is not empty. If nodeset_a is empty, nodeset_b is selected.

A typical application for this would be in a command such as set that requires exactly one node to match.

For the case where both nodeset_a and nodeset_b exist, contain a single node, and are not the same node:

  • If the or operator is used this will result in a nodeset with 2 nodes.
  • If the else operator is used, only nodeset_a will be selected, which results in a nodeset with 1 node.

@raphink
Copy link
Member

raphink commented Aug 12, 2020

Wouldn't that be a xor operator then?

@raphink raphink requested a review from lutter August 12, 2020 09:59
@georgehansper
Copy link
Member Author

xor is slightly different - an xor function results in 0 if both inputs are 1.

For this else operator, if nodes exist on both sides, the first (left hand side) nodeset is selected.
xor would result in an empty nodeset.

Also, the else operators can be chained, like this:

 *[ns1 else ns2 else ns3 else ns4]

In which case the 1st non-empty nodeset is selected.
It becomes a list of 'priorities': 1st preference, 2nd preference, 3rd preference etc.

Admittedly, that's getting pretty complex for a path-filter, but the logic supports it out-of-the-box.

@lutter
Copy link
Member

lutter commented Aug 31, 2020

I am not sure that else is needed, at least as a boolean operator it adds no functionality: both and and or are boolean operators, i.e., they take two boolean values and return a boolean. It's just that there are various ways to turn a value into a boolean, e.g., a nodeset is true if it is not empty. In terms of boolean arithmetic, else would be exactly the same as or.

Furthermore inside a [..] predicate, we only care if the expression in the predicate winds up being true or false, and again, inside [..] or and else are indistibguishable.

If you want 'take the first thing that matches' functionality, you'd want an else that mirrors union; union has very low precedence so that pretty much anything you write on its left or right will be evaluated before the union itself is formed. In terms of implementation, I think eval_else should follow eval_union closely, and would look something like

static void eval_else(struct state *state) {
    value_ind_t vind = make_value(T_NODESET, state);
    struct value *r = pop_value(state);
    struct value *l = pop_value(state);
    struct nodeset *res = NULL;

    assert(l->tag == T_NODESET);
    assert(r->tag == T_NODESET);

    RET_ON_ERROR;

    if (l->nodeset->used >0) {
      res = l;
    } else {
      res = r;
    }
    state->value_pool[vind].nodeset = res;
    push_value(vind, state);
 error:
    ns_clear_added(res);
}

That evaluates both the left and right operand of else, but I don't think that's a huge deal. If you already know how many nodes you want to get back (for example, to say 'give me the first alias or the ip address for an entry in /etc/hosts'), you can already do the following:

augtool> match '(/files/etc/hosts/1/alias | /files/etc/hosts/1/ipaddr)[1]'
/files/etc/hosts/1/alias[1] = localhost

What else would add is the ability to get either all aliases or just the ip address if we allowed something like

augtool> match '(/files/etc/hosts/1/alias else /files/etc/hosts/1/ipaddr)'

You can write that down already, too, but it's not entirely straightforward:

augtool> match '(/files/etc/hosts/1/alias | /files/etc/hosts/1/ipaddr[count(../alias) = 0])'

So, I am not convinced that else is really needed. But maybe I am missing a use for it that can't be expressed already.

@georgehansper
Copy link
Member Author

The else operator introduces an or operator that works like the or in most programming languages.
In most programming environments, the left-hand-side of the or operator is evaluated first, and if 'true', the right-hand-side is not considered.

More practically, it would be used in those situtations where exactly one node is required - such as the set statement.

eg. an expression:

augtool> match /files/etc/hosts/*[ ipaddr='127.0.0.1' or ipaddr='::1' ]
/files/etc/hosts/1 = (none)
/files/etc/hosts/2 = (none)
augtool>

matches both nodes and return a nodeset of 2 nodes.

By contrast, when else is used:

augtool> match /files/etc/hosts/*[ ipaddr='127.0.0.1' else ipaddr='::1' ]
/files/etc/hosts/1 = (none)
augtool>

matches only 127.0.0.1 and return a nodeset of 1 node, even though ::1 exists
So or and else are distinctly different inside of [..]

I have to admit that I had not considered your approach of appending another predicate [1] to an expression, like this:

augtool> match /files/etc/hosts/*[ ipaddr='127.0.0.1' or ipaddr='::1' ][1]
/files/etc/hosts/1 = (none)
augtool>

Even this is not quite the same as else because it does not define a priority.
The nodes in the filter could appear in any order, and simply selecting the first one does not guarantee that our 1st preference is used.

For a practical example (albeit contrived), consider the case where I want to set my default-route on a network device.
My first preference would be to use the wired interface, if available, otherwise I would use the Wireless device.

I could do this with else:

set /files/etc/sysconfig/network-scripts/*[ TYPE='Ethernet' else TYPE='Wireless' ]/DEFROUTE 'yes'

The key difference between else and or-with-[1] is that the devices may appear in an arbitrary order.
else defines the priority, where as [1] takes the first node in order-of-appearance.

I note that it is possible to construct this using other logic, as you have demonstrated.

I had not considered using the union operator for this, as I was unaware of its possible usage due to the lack of documentation.
After seeing your example, I would agree that being able to apply else to an entire path in the fashion of union makes it a far more practical tool.

I will update this pull-request to allow else to be used in the fashion of union.

At the moment, this is giving rise to some issues around the locpath_trace functionality.

@georgehansper
Copy link
Member Author

I have updated this pull-request with a modified else operator that can return a T_NODESET after the fashion of union
This would allow things like:

match '(/files/etc/logrotate.d/syslog/rule/compress else /files/etc/logrotate.conf/compress)'

This allows alternate paths to be chosen, depending on what is already there, as may be dictated by the filesystem structure.

I have made some changes to eval_binary() around the handling of the locpath_trace variable, so else can be used with set statements:

set '(/files/etc/logrotate.d/syslog/rule/compress else /files/etc/logrotate.conf/compress)' nocompress

The code around eval_union() has also be modified, such that the following statement results in a consistant error, rather than an abort

set (/files/etc/hosts/1/ipaddr|/files/etc/hosts/2/ipaddr) 127.0.0.1
error: Too many matches for path expression

Although it may be possible to construct path-expressions which give similar results to the else operator, the else operator:

  • allows nodes to be prioritised, one over another
  • is easy to use and understand
  • makes the intent of the path-expression clear to the reader

I very much appreciate your feedback and comments, which have already made this pull-request more usable than the original.

@georgehansper
Copy link
Member Author

If there are no ongoing objections, can this Pull Request please be accepted?

@raphink
Copy link
Member

raphink commented Dec 1, 2020

Sorry for the slow reaction time. @lutter do you want to validate this PR?

Copy link
Member

@lutter lutter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am convinced, I can see how this makes some path manipulations possible, and cleaner/easier to write.

@georgehansper georgehansper merged commit e01f60b into hercules-team:master Oct 1, 2021
@georgehansper georgehansper deleted the or_else branch October 1, 2021 20:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants