Support for $and operand #69

Open
schenksj opened this issue Jan 30, 2023 · 20 comments
Labels
enhancement New feature or request

Comments

@schenksj
Contributor

What is your idea?

It would be great to allow an $and operand (overcoming JSON "last-one-in-wins" semantics for fields) so that multiple conditions can be attached to a given field. I think this may be possible given how I interpret the implementation of the $or operand. For example, the rule below would match only values that do not start with "abc", do not end with "123", and are neither "notmyvalue" nor "alsonotmyvalue".

{ "$and": [
       {"myField": [{"anything-but": ["notmyvalue", "alsonotmyvalue"]}]},
       {"myField": [{"anything-but": {"suffix": "123"}}]},
       {"myField": [{"anything-but": {"prefix": "abc"}}]}
   ]
}
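
As a rough illustration of the semantics being requested, the sketch below models each anything-but condition as a predicate and requires all of them to accept the field value. It is plain Java and does not use event-ruler; the class `SameFieldAnd` and its helper methods are hypothetical names chosen for this example.

```java
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical model of the requested semantics; not event-ruler code.
final class SameFieldAnd {
    // {"anything-but": [...]}: true when the value is none of the listed strings.
    static Predicate<String> anythingBut(Set<String> excluded) {
        return value -> !excluded.contains(value);
    }

    // {"anything-but": {"prefix": ...}}: true when the value does not start with the prefix.
    static Predicate<String> anythingButPrefix(String prefix) {
        return value -> !value.startsWith(prefix);
    }

    // {"anything-but": {"suffix": ...}}: true when the value does not end with the suffix.
    static Predicate<String> anythingButSuffix(String suffix) {
        return value -> !value.endsWith(suffix);
    }

    // The three conditions from the example rule, all attached to "myField".
    static final List<Predicate<String>> MY_FIELD = List.of(
            anythingBut(Set.of("notmyvalue", "alsonotmyvalue")),
            anythingButSuffix("123"),
            anythingButPrefix("abc"));

    // "$and": every condition must accept the value.
    static boolean matchesAll(String value) {
        return MY_FIELD.stream().allMatch(condition -> condition.test(value));
    }

    public static void main(String[] args) {
        System.out.println(matchesAll("hello"));      // true
        System.out.println(matchesAll("abcdef"));     // false: starts with "abc"
        System.out.println(matchesAll("xyz123"));     // false: ends with "123"
        System.out.println(matchesAll("notmyvalue")); // false: explicitly excluded
    }
}
```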

Would you be willing to make the change?

Maybe


@schenksj schenksj added the enhancement New feature or request label Jan 30, 2023
@baldawar
Collaborator

baldawar commented Jan 30, 2023

Makes sense. I've heard some murmurs about needing this elsewhere as well. As with $or, we'd have to:

  • add a new benchmark test ($or has some performance concerns with arrays, and $and will likely be slower when the rule or the event contains an array with lots of elements).
  • support $and at different depths and across multiple fields, for example:
{ "topField" : {
    "$and": [
          {"myField": [{"anything-but": ["notmyvalue", "alsonotmyvalue"]}]},
          {"anotherField": [{"anything-but": {"suffix": "123"}}]},
          {"lastField": [{"anything-but": {"prefix": "abc"}}]}
      ]
   } 
}

Some relevant files based on $or

@schenksj
Contributor Author

@baldawar - I suspect the above example doesn't need the $and operator in that the fields all have different names (existing engine already handles that)... The only gap I can find that needs addressing is to allow multiple rules on the SAME field, specifically in the suffix/prefix/anything-but-[prefix|suffix]. Do you disagree?

@baldawar @timbray ... Curious about your thoughts. From what I can gather, the $or implementation essentially just builds "n" separate rules with the same name, one for each permutation of the "or" children. In your view, what is the right way to build the $and operator to allow multiple conditions on the same key? Things I can think of:

  1. Modify the parser and the state machines to allow multiple rules on a single input field
  2. Modify the parser to create a separately named secondary rule (think parent-child) for each "and" condition, link it to the parent rule, and do a post-order cleanup after executing to reconcile the parent-child relationship (redact the children before returning to the caller, and redact the parent if not all of its children are present). Based on his comments on my last PR, I'm guessing @timbray would see this as a hack, though it's less impactful/risky to the overall codebase.
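
The post-match cleanup described in option 2 could look roughly like the sketch below. It assumes the matcher returns a list of matched rule names and that the parser has recorded which child rules belong to which parent; the class `AndReconciler`, the map `childrenByParent`, and the `ruleA#and0` naming scheme are all hypothetical and not part of event-ruler.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical post-processing step; not event-ruler code.
final class AndReconciler {
    // matched: rule names returned by the matcher for one event.
    // childrenByParent: parent rule name -> child rules created for its "$and" branches.
    static Set<String> reconcile(List<String> matched, Map<String, List<String>> childrenByParent) {
        Set<String> matchedSet = new HashSet<>(matched);
        Set<String> allChildren = new HashSet<>();
        childrenByParent.values().forEach(allChildren::addAll);

        Set<String> visible = new LinkedHashSet<>();
        for (String name : matched) {
            if (allChildren.contains(name)) {
                continue; // children are internal and never returned to the caller
            }
            List<String> children = childrenByParent.getOrDefault(name, List.of());
            if (matchedSet.containsAll(children)) {
                visible.add(name); // keep a parent only when every child also matched
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        Map<String, List<String>> children =
                Map.of("ruleA", List.of("ruleA#and0", "ruleA#and1"));
        // ruleA is redacted because ruleA#and1 did not match.
        System.out.println(reconcile(List.of("ruleA", "ruleA#and0", "ruleB"), children)); // [ruleB]
        System.out.println(reconcile(List.of("ruleA", "ruleA#and0", "ruleA#and1"), children)); // [ruleA]
    }
}
```

Rules without any recorded children pass through unchanged, so the extra step only affects rules that use the new operator.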

@schenksj
Contributor Author

I suppose if route 2 was taken, it could similarly support a $not operand, but this might be straining the state-machine architecture a bit too much?

@timbray
Collaborator

timbray commented Jan 31, 2023

In general, over the history of Ruler, we basically haven't added any feature unless we had some group yelling "we really need this". Who really needs this? What's the scenario?

@schenksj
Contributor Author

@timbray An "$and" use case: in a detection system, false-positive reductions often revolve around defining lists of exclusions. For example, in a signature where you're trying to detect an individual user doing something naughty (abusing a capability intended for system processes), you might have a list of exclusions for the known accounts that are entitled to use said functionality. This could take the form of accounts running administrative tools (exact matches) plus accounts with particular suffixes (like AD designators, identifiers for system users, etc.).

I suppose the fundamental issue here is that the anything-but operators only allow either a list of explicit matches OR a single prefix or suffix, rather than a mix of explicit matches and multiple prefixes/suffixes. The same issue is at the root of the "$not" idea.

@timbray
Collaborator

timbray commented Jan 31, 2023

Not bad. But I meant actual groups with actual concrete problems they need solutions for right now.

In my career, I've had really bad luck guessing what people need.

@schenksj
Contributor Author

I agree. This is a real use case for me. I can fairly easily create a layer on top of event-ruler to perform these functions in my context, but it seemed high-likelihood that someone else would benefit from the functionality as well. Either way I appreciate the engagement.

@NickMoores

Just came across this by chance when researching how to combine "prefix" and "suffix" on a field.

Eg "key": [{ "prefix": "landing/", "suffix": ".xml" }]

I think $and would help solve this?

@timbray
Collaborator

timbray commented Feb 1, 2023

Depends what you mean by "solve".

I think this syntax is nice and expressive and we can probably figure out how to build a state machine to implement it:

"key": [{ "prefix": "landing/", "suffix": ".xml" }]

It's a bit of a departure from current Rule syntax and raises questions about what you could combine with what, but it's a direction that's worth investigating, and probably less kludgy than $and?

@baldawar
Collaborator

baldawar commented Feb 2, 2023

Thinking about how this would be interpreted for keys with sub-key elements, the syntax isn't that bad (it looks simpler than using an $and matcher):

"key" : [
  { "exists" : true },
  "sub-key": [
    { "prefix" : "landing/" },
    { "suffix" : ".xml" }
  ]
]

It's definitely a big departure from the current query syntax, but in line with the rest of Ruler's existing AND behaviour today. No new matcher for users to learn, which is a big plus for me.

We'd need to make changes to how we compile rules, starting from this line:

final Map<String, List<Patterns>> rule = new HashMap<>();

It would also potentially open the door to addressing the caveat with dots in the future.

In any case, I definitely don't see any reason to support $and.

@schenksj
Contributor Author

schenksj commented Feb 2, 2023

@baldawar - One potential point of confusion with the syntax above is that a rule like the one below already matches if any one of the listed patterns matches, not only when all of them match. So re-using this syntax might be a breaking change. For example:

{
  "message": [ 
     { "equals-ignore-case": "A" },
     { "suffix": "b" },
     { "prefix": "c" },
     {"exists": false}
  ]
}

{"message": "a"} - matches on rule 1
{"message": "b"} - matches on rule 2
{"message": "c"} - matches on rule 3
{} - matches rule 4
{"message": "d"} - no match
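
To make the any-one-pattern behaviour above concrete, here is a minimal sketch in plain Java that mimics it without using event-ruler. The class `AnyOfPatterns` is a hypothetical name, and a `null` argument stands in for an event that has no "message" field.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model of the existing "array of patterns" behaviour; not event-ruler code.
final class AnyOfPatterns {
    // One predicate per pattern in the rule; null models a missing "message" field.
    static final List<Predicate<String>> MESSAGE_PATTERNS = List.of(
            value -> value != null && value.equalsIgnoreCase("A"), // equals-ignore-case
            value -> value != null && value.endsWith("b"),         // suffix
            value -> value != null && value.startsWith("c"),       // prefix
            value -> value == null);                               // exists: false

    // The rule matches as soon as any single pattern accepts the value.
    static boolean matchesAny(String message) {
        return MESSAGE_PATTERNS.stream().anyMatch(pattern -> pattern.test(message));
    }

    public static void main(String[] args) {
        System.out.println(matchesAny("a"));  // true
        System.out.println(matchesAny("b"));  // true
        System.out.println(matchesAny("c"));  // true
        System.out.println(matchesAny(null)); // true
        System.out.println(matchesAny("d"));  // false
    }
}
```

Switching `anyMatch` to `allMatch` for the same syntax would make every one of these events stop matching, which is the breaking change being discussed.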

@schenksj
Contributor Author

schenksj commented Feb 2, 2023

There is actually a defect tied into here... if you do this:

{
  "message": [ 
     { "equals-ignore-case": "A" },
     { "suffix": "b" },
     { "prefix": "c" },
     { "anything-but": "b to-the c" },
     { "exists": false }
  ]
}

{"message": "b to-the c"} doesn't match because any anything-but match makes a no-match, vs the any-one-rule behavior without anything-buts. @timbray is this what you were referring to about the anything-but design being a little bit of a hack?

@baldawar
Collaborator

baldawar commented Feb 3, 2023

That's a great catch. Ruler can't push a breaking change.

Looking back at Nick's comment, I may have interpreted it the wrong way. It might just have been an answer to the "Who really needs this?" question, with no objection to implementing it like this:

{
    "message": {
        "$and": [
            { "prefix": "landing/" },
            { "suffix": ".xml" }
        ]
    }
}

@NickMoores can you confirm?

@NickMoores

Sorry for the confusion. Can confirm: my intent was to show support for an $and operator.

My example was an alternative syntax I tried for my use case before arriving here. I think it’s expressive, but I’m not read up enough on Ruler to understand whether it aligns with other design choices, so happy to bow to others with experience. I think $and makes sense given the existence of $or.

@baldawar
Collaborator

baldawar commented Feb 7, 2023

From a requirements standpoint, I have two patterns worth testing here:

Simple AND Matching

{ "$and": [
       {"myField": [{"anything-but": ["notmyvalue", "alsonotmyvalue"]}]},
       {"myField": [{"anything-but": {"suffix": "123"}}]},
       {"myField": [{"anything-but": {"prefix": "abc"}}]}
   ]
}

Complex AND & OR Matching

This is purely to stress the functionality and test whether we hit any limitations.

{
  "$or": [
    {
      "$and": [
        { "prefix": "landing/" },
        { "anything-but": { "suffix": ".xml" } }
      ]
    },
    {
      "$and": [
        { "anything-but": { "prefix": "landing/" } },
        { "suffix": ".xml" }
      ]
    }
  ]
}
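
For reference when writing tests, the expected outcomes of the nested pattern can be worked out with a small plain-Java sketch. It assumes all conditions apply to a single string value (the rule above does not name a field); the class `NestedAndOr` and its combinators are hypothetical and unrelated to event-ruler's internals.

```java
import java.util.Arrays;
import java.util.function.Predicate;

// Hypothetical combinators for reasoning about expected results; not event-ruler code.
final class NestedAndOr {
    @SafeVarargs
    static Predicate<String> and(Predicate<String>... parts) {
        return value -> Arrays.stream(parts).allMatch(part -> part.test(value));
    }

    @SafeVarargs
    static Predicate<String> or(Predicate<String>... parts) {
        return value -> Arrays.stream(parts).anyMatch(part -> part.test(value));
    }

    static Predicate<String> prefix(String expected) {
        return value -> value.startsWith(expected);
    }

    static Predicate<String> suffix(String expected) {
        return value -> value.endsWith(expected);
    }

    // Mirrors the "$or" of two "$and" groups; negate() plays the role of anything-but.
    static final Predicate<String> RULE = or(
            and(prefix("landing/"), suffix(".xml").negate()),
            and(prefix("landing/").negate(), suffix(".xml")));

    public static void main(String[] args) {
        System.out.println(RULE.test("landing/a.txt")); // true: first group
        System.out.println(RULE.test("other/a.xml"));   // true: second group
        System.out.println(RULE.test("landing/a.xml")); // false: both prefix and suffix
        System.out.println(RULE.test("other/a.txt"));   // false: neither
    }
}
```

In effect the rule is an exclusive-or of the prefix and the suffix, so a test suite would need at least these four cases.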

@baldawar
Collaborator

baldawar commented Mar 20, 2023

Adding an interesting ask around $and here for future discussions.

Rule:

{
   "testList": { "$and" : ["a","b","c"] }
}

Event 1 (MATCHES):

{
   "testList": ["a","b","c","d","e"]
}

Event 2 (SHOULD NOT MATCH):

{
   "testList": ["a","c"]
}
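
The intended behaviour for array values can be sketched as a simple containment check. This is plain Java under the assumption that "$and" on an array means "the event array contains every listed value"; the class `ContainsAll` is a hypothetical name and nothing here reflects how Ruler would implement it.

```java
import java.util.HashSet;
import java.util.List;

// Hypothetical model of "$and" over array values; not event-ruler code.
final class ContainsAll {
    // True when every required value appears somewhere in the event's array.
    static boolean matches(List<String> eventValues, List<String> required) {
        return new HashSet<>(eventValues).containsAll(required);
    }

    public static void main(String[] args) {
        List<String> required = List.of("a", "b", "c");
        System.out.println(matches(List.of("a", "b", "c", "d", "e"), required)); // true  (Event 1)
        System.out.println(matches(List.of("a", "c"), required));                // false (Event 2)
    }
}
```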

@jonessha jonessha mentioned this issue Nov 3, 2023
@sridhard

@baldawar any update on $and operator?

Also, can you please tell me whether $or works when every branch uses the same field name?

{ "$or": [
       {"myField": [{"anything-but": ["notmyvalue", "alsonotmyvalue"]}]},
       {"myField": [{"anything-but": {"suffix": "123"}}]},
       {"myField": [{"anything-but": {"prefix": "abc"}}]}
   ]
}

@baldawar
Collaborator

any update on $and operator?

No progress has been made yet, though we plan to look at this in late 2024. We don't have a firm date for this to be picked up. That being said, if anyone needs it sooner, we're happy to help guide them through the change.

Also can you please tell me whether $or works for the same field name.

It does, though in future I would recommend creating a separate issue for unrelated questions, to avoid multiple topics getting mixed up in the same thread.

@jdcaperon

Adding a +1 with a use case: we are looking to use event-ruler in an algebra that supports CONTAINS ALL, which as far as I can tell cannot be expressed within a single rule right now. For example, given a field elements_used, it is not possible to assert that all of the following types are contained:

{
  "elements_used": [
    "shapes", 
    "text", 
    "images"
  ]
}

Moreover, would array support along the lines of CONTAINS EXACTLY cover the totality of set operations (ANY, ALL, EXACTLY)? Although this might be pushing event-ruler beyond its intended use case.

@baldawar
Collaborator

baldawar commented Sep 3, 2024

Use-case wise, supporting $AND fits well with Ruler. CONTAINS EXACTLY/ANY/ANYTHING_BUT/ALL could as well.

In case anyone is curious about progress, there's a branch on this repo where I've backed up the work so far. Ruler can currently parse rules with $AND matchers, but the actual matching is not fully functional yet.

6 participants