pat: add wildcard pattern #356

Merged
merged 3 commits into from
Sep 10, 2024
40 changes: 32 additions & 8 deletions PATTERNS.md
@@ -149,9 +149,9 @@ is not equal to any of the strings in the array.
If a Field in a Pattern contains an Anything-But Pattern,
it **MUST NOT** contain any other values.

### Shellstyle Pattern
### Wildcard Pattern

The Pattern Type of a Shellstyle Pattern is `shellstyle`
The Pattern Type of a Wildcard Pattern is `wildcard`
and its value **MUST** be a string which **MAY** contain
`*` (“star”) characters. The star character
functions exactly as the same character does in
@@ -164,13 +164,37 @@ Consider the following Event:
```json
{"img": "https://example.com/9943.jpg"}
```
The following Shellstyle Patterns would match it:
The following Wildcard Patterns would match it:
```json
{"img": [ {"wildcard": "*.jpg"} ] }
{"img": [ {"wildcard": "https://example.com/*"} ] }
{"img": [ {"wildcard": "https://example.com/*.jpg"} ] }
{"img": [ {"wildcard": "https://example.*/*.jpg"} ] }
```
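
As a reading aid for the star semantics described above, here is a minimal Go sketch of `*` matching, where each star stands for any run of zero or more bytes. The function name `matchStar` is made up for this note, escaping is deliberately left out, and the backtracking approach is only an illustration: Quamina itself compiles patterns into finite automata rather than matching this way.

```go
package main

import "fmt"

// matchStar reports whether s matches pattern, where each '*' in pattern
// matches any run of zero or more bytes. Escapes are not handled.
// Hypothetical helper for illustration; not part of Quamina.
func matchStar(pattern, s string) bool {
	p, i := 0, 0          // current positions in pattern and s
	starP, starI := -1, 0 // last '*' seen in pattern, and where in s it started matching
	for i < len(s) {
		switch {
		case p < len(pattern) && pattern[p] == '*':
			// Remember the star and first try matching it against nothing.
			starP, starI = p, i
			p++
		case p < len(pattern) && pattern[p] == s[i]:
			p++
			i++
		case starP >= 0:
			// Mismatch: let the last '*' absorb one more byte and retry.
			starI++
			p, i = starP+1, starI
		default:
			return false
		}
	}
	// Any pattern bytes left over must all be '*'.
	for p < len(pattern) && pattern[p] == '*' {
		p++
	}
	return p == len(pattern)
}

func main() {
	url := "https://example.com/9943.jpg"
	for _, pat := range []string{
		"*.jpg",
		"https://example.com/*",
		"https://example.com/*.jpg",
		"https://example.*/*.jpg",
	} {
		fmt.Printf("%-28s %v\n", pat, matchStar(pat, url))
	}
}
```

The four patterns in `main` are the ones listed in the JSON block above, applied to the `img` value of the example Event.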

To match a literal "*" character, escape it with a backslash, "\".
For example, consider the following Event:

```json
{"img": [ {"shellstyle": "*.jpg"} ] }
{"img": [ {"shellstyle": "https://example.com/*"} ] }
{"img": [ {"shellstyle": "https://example.com/*.jpg"} ] }
{"img": [ {"shellstyle": "https://example.*/*.jpg"} ] }
{"example-regex": "a**\\.b"}
```

The following Wildcard Pattern would match it:

```json
{"example-regex": [ {"wildcard": "a\\*\\*\\\\.b"}]}
```

Note that each "\" must be doubled in the JSON text, because backslash
is an escape character in JSON as well as in Quamina.

After a "\", the appearance of any character other than "*" or "\" is an error.
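
The two layers of escaping can be traced with a short Go sketch: first JSON decoding, then the wildcard escape rules stated above. `parseWildcard` and `segment` are hypothetical names invented for this note and are not Quamina's `readWildcardSpecial` or `makeWildcardFA`. Rejecting adjacent stars is an assumption drawn from `a**b` appearing among the invalid patterns in `pattern_test.go`.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// segment is one piece of a parsed wildcard pattern: either a run of
// literal bytes or a single unescaped '*'. Hypothetical type for this sketch.
type segment struct {
	star    bool
	literal string
}

// parseWildcard splits an already JSON-decoded pattern into segments.
// After a backslash only '*' or '\' is accepted; anything else is an error.
func parseWildcard(pattern string) ([]segment, error) {
	var segs []segment
	var lit []byte
	flush := func() {
		if len(lit) > 0 {
			segs = append(segs, segment{literal: string(lit)})
			lit = nil
		}
	}
	for i := 0; i < len(pattern); i++ {
		switch c := pattern[i]; c {
		case '\\':
			i++
			if i == len(pattern) {
				return nil, errors.New("pattern ends with a lone backslash")
			}
			if pattern[i] != '*' && pattern[i] != '\\' {
				return nil, fmt.Errorf("invalid escape \\%c", pattern[i])
			}
			lit = append(lit, pattern[i]) // escaped character is taken literally
		case '*':
			flush()
			if len(segs) > 0 && segs[len(segs)-1].star {
				return nil, errors.New("adjacent '*' characters") // assumed rule
			}
			segs = append(segs, segment{star: true})
		default:
			lit = append(lit, c)
		}
	}
	flush()
	return segs, nil
}

func main() {
	// The pattern value exactly as written in the JSON text above.
	jsonText := `"a\\*\\*\\\\.b"`
	var decoded string
	if err := json.Unmarshal([]byte(jsonText), &decoded); err != nil {
		panic(err)
	}
	fmt.Println("after JSON decoding:", decoded)
	segs, err := parseWildcard(decoded)
	fmt.Println("segments:", segs, "error:", err)
}
```

Running the decoded pattern through `parseWildcard` yields a single literal segment, which is why it matches the `example-regex` value in the Event only when that value contains the stars and the backslash verbatim.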

### Shellstyle Pattern

This is an earlier version of the Wildcard Pattern, differing only in that
\-escaping of the "*" and "\" characters is not supported.

### Equals-Ignore-Case Pattern

The Pattern Type of an Equals-Ignore-Case pattern is `equals-ignore-case`
@@ -192,6 +216,6 @@ the AWS EventBridge service, as documented in
As of release 1.0, Quamina supports Exists and
Anything-But Patterns, but does not yet support any other
EventBridge patterns. Note that a
Shellstyle Pattern with a trailing `*` is equivalent
Wildcard Pattern with a trailing `*` is equivalent
to a `prefix` pattern.
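
The prefix equivalence noted above can be sketched in Go. `prefixOf` is a hypothetical helper written for this note: it recognizes only patterns of the form `<literal>*`, with no other stars and no backslash escapes, and for those a match reduces to `strings.HasPrefix`.

```go
package main

import (
	"fmt"
	"strings"
)

// prefixOf returns the literal prefix when pattern has the form "<literal>*"
// with no other '*' and no '\' escapes; ok is false for any other shape.
// Hypothetical helper for illustration; not part of Quamina.
func prefixOf(pattern string) (prefix string, ok bool) {
	if !strings.HasSuffix(pattern, "*") {
		return "", false
	}
	prefix = strings.TrimSuffix(pattern, "*")
	if strings.ContainsAny(prefix, `*\`) {
		return "", false
	}
	return prefix, true
}

func main() {
	prefix, ok := prefixOf("https://example.com/*")
	fmt.Println(prefix, ok)
	// For such a pattern, matching is just a prefix test on the field value.
	fmt.Println(strings.HasPrefix("https://example.com/9943.jpg", prefix))
}
```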

10 changes: 5 additions & 5 deletions README.md
@@ -89,23 +89,23 @@ The following Patterns would match it:
```json
{
"Image": {
"Thumbnail": { "Url": [ { "shellstyle": "*9943" } ] }
"Thumbnail": { "Url": [ { "wildcard": "*9943" } ] }
}
}
```
```json
{
"Image": {
"Thumbnail": { "Url":
[ { "shellstyle": "http://www.example.com/*" } ] }
[ { "wildcard": "http://www.example.com/*" } ] }
}
}
```
```json
{
"Image": {
"Thumbnail": { "Url":
[ { "shellstyle": "http://www.example.*/*9943" } ] }
[ { "wildcard": "http://www.example.*/*9943" } ] }
}
}
```
@@ -298,7 +298,7 @@ I used to say that the performance of
`MatchesForEvent` was O(1) in the number of
Patterns. That’s probably a reasonable way to think
about it, because it’s *almost* right, except in the
case where a very large number of `shellstyle` patterns
case where a very large number of `wildcard` patterns
have been added; this is discussed in the next section.

To be correct, the performance is a little worse than
@@ -342,7 +342,7 @@ So, adding a new Pattern that only mentions fields which are
already mentioned in previous Patterns is effectively free,
i.e. O(1) in terms of run-time performance.

### Quamina instances with large numbers of `shellstyle` Patterns
### Quamina instances with large numbers of `wildcard` Patterns

A study of the theory of finite automata reveals that processing
regular-expression constructs such as `*` increases the complexity of
3 changes: 3 additions & 0 deletions pattern.go
@@ -21,6 +21,7 @@ const (
anythingButType
prefixType
monocaseType
wildcardType
)

// typedVal represents the value of a field in a pattern, giving the value and the type of pattern.
@@ -198,6 +199,8 @@ func readSpecialPattern(pb *patternBuild, valsIn []typedVal) (pathVals []typedVa
pathVals, err = readExistsSpecial(pb, pathVals)
case "shellstyle":
pathVals, err = readShellStyleSpecial(pb, pathVals)
case "wildcard":
pathVals, err = readWildcardSpecial(pb, pathVals)
case "prefix":
pathVals, err = readPrefixSpecial(pb, pathVals)
case "equals-ignore-case":
40 changes: 39 additions & 1 deletion pattern_test.go
@@ -69,6 +69,10 @@ func TestPatternFromJSON(t *testing.T) {
`{"abc": [ {"prefix": - "a" }, "foo" ] }`,
`{"abc": [ {"prefix": "a" {, "foo" ] }`,
`{"abc": [ {"equals-ignore-case":23}, "foo" ] }`,
`{"abc": [ {"wildcard":"15", "x", 1} ] }`,
`{"abc": [ {"wildcard":"a**b"}, "foo" ] }`,
`{"abc": [ {"wildcard":"a\\b"}, "foo" ] }`, // after JSON parsing, code sees `a\b`
`{"abc": [ {"wildcard":"a\\"}, "foo" ] }`, // after JSON parsing, code sees `a\`
"{\"a\": [ { \"anything-but\": { \"equals-ignore-case\": [\"1\", \"2\" \"3\"] } } ] }", // missing ,
"{\"a\": [ { \"anything-but\": { \"equals-ignore-case\": [1, 2, 3] } } ] }", // no numbers
"{\"a\": [ { \"anything-but\": { \"equals-ignore-case\": [\"1\", \"2\" } } ] }", // missing ]
@@ -93,6 +97,10 @@ func TestPatternFromJSON(t *testing.T) {
`{"abc": [ {"shellstyle":"a*b"}, "foo" ] }`,
`{"abc": [ {"shellstyle":"a*b*c"} ] }`,
`{"x": [ {"equals-ignore-case":"a*b*c"} ] }`,
`{"abc": [ 3, {"wildcard":"a*b"} ] }`,
`{"abc": [ {"wildcard":"a*b"}, "foo" ] }`,
`{"abc": [ {"wildcard":"a*b*c"} ] }`,
`{"abc": [ {"wildcard":"a*b\\*c"} ] }`,
}
w1 := []*patternField{{path: "x", vals: []typedVal{{vType: numberType, val: "2"}}}}
w2 := []*patternField{{path: "x", vals: []typedVal{
@@ -156,7 +164,37 @@ func TestPatternFromJSON(t *testing.T) {
},
},
}
wanted := [][]*patternField{w1, w2, w3, w4, w5, w6, w7, w8, w9}
w10 := []*patternField{
{
path: "abc", vals: []typedVal{
{vType: numberType, val: "3"},
{vType: wildcardType, val: `"a*b"`},
},
},
}
w11 := []*patternField{
{
path: "abc", vals: []typedVal{
{vType: wildcardType, val: `"a*b"`},
{vType: stringType, val: `"foo"`},
},
},
}
w12 := []*patternField{
{
path: "abc", vals: []typedVal{
{vType: wildcardType, val: `"a*b*c"`},
},
},
}
w13 := []*patternField{
{
path: "abc", vals: []typedVal{
{vType: wildcardType, val: `"a*b\*c"`},
},
},
}
wanted := [][]*patternField{w1, w2, w3, w4, w5, w6, w7, w8, w9, w10, w11, w12, w13}

for i, good := range goods {
fields, err := patternFromJSON([]byte(good))
4 changes: 3 additions & 1 deletion prettyprinter.go
@@ -58,7 +58,9 @@ func (pp *prettyPrinter) tableLabel(t *smallTable) string {

func (pp *prettyPrinter) labelTable(table *smallTable, label string) {
pp.tableLabels[table] = label
pp.tableSerials[table] = uint(pp.randInts.Int63()%500 + 500)
newSerial := pp.randInts.Int63()%500 + 500
//nolint:gosec
pp.tableSerials[table] = uint(newSerial)
}

func (pp *prettyPrinter) printNFA(t *smallTable) string {
12 changes: 12 additions & 0 deletions value_matcher.go
@@ -116,6 +116,9 @@ func (m *valueMatcher) addTransition(val typedVal, printer printer) *fieldMatche
case shellStyleType:
newFA, nextField = makeShellStyleFA(valBytes, printer)
fields.isNondeterministic = true
case wildcardType:
newFA, nextField = makeWildcardFA(valBytes, printer)
fields.isNondeterministic = true
case prefixType:
newFA, nextField = makePrefixFA(valBytes)
case monocaseType:
@@ -156,6 +159,12 @@ func (m *valueMatcher) addTransition(val typedVal, printer printer) *fieldMatche
fields.isNondeterministic = true
m.update(fields)
return nextField
case wildcardType:
newAutomaton, nextField := makeWildcardFA(valBytes, printer)
fields.startTable = newAutomaton
fields.isNondeterministic = true
m.update(fields)
return nextField
case prefixType:
newFA, nextField := makePrefixFA(valBytes)
fields.startTable = newFA
@@ -194,6 +203,9 @@ func (m *valueMatcher) addTransition(val typedVal, printer printer) *fieldMatche
case shellStyleType:
newFA, nextField = makeShellStyleFA(valBytes, printer)
fields.isNondeterministic = true
case wildcardType:
newFA, nextField = makeWildcardFA(valBytes, printer)
fields.isNondeterministic = true
case prefixType:
newFA, nextField = makePrefixFA(valBytes)
case monocaseType:
72 changes: 72 additions & 0 deletions value_matcher_test.go
@@ -347,3 +347,75 @@ func TestMakeFAFragment(t *testing.T) {
}
}
}
func TestExerciseSingletonReplacement(t *testing.T) {
cm := newCoreMatcher()
err := cm.addPattern("x", `{"x": [ "a"]}`)
if err != nil {
t.Error("AP: " + err.Error())
}
err = cm.addPattern("x", `{"x": [1]}`)
if err != nil {
t.Error("AP: " + err.Error())
}
events := []string{`{"x": 1}`, `{"x": "a"}`}
for _, e := range events {
matches, err := cm.matchesForJSONEvent([]byte(e))
if err != nil {
t.Error("m4: " + err.Error())
}
if len(matches) != 1 || matches[0] != "x" {
t.Error("match failed on: " + e)
}
}
events = []string{`{"x": 1}`, `{"x": "a"}`}
for _, e := range events {
matches, err := cm.matchesForJSONEvent([]byte(e))
if err != nil {
t.Error("m4: " + err.Error())
}
if len(matches) != 1 || matches[0] != "x" {
t.Error("match failed on: " + e)
}
}
cm = newCoreMatcher()
err = cm.addPattern("x", `{"x": ["x"]}`)
if err != nil {
t.Error("AP: " + err.Error())
}
err = cm.addPattern("x", `{"x": [ {"wildcard": "x*y"}]}`)
if err != nil {
t.Error("AP: " + err.Error())
}
events = []string{`{"x": "x"}`, `{"x": "x..y"}`}
for _, e := range events {
matches, err := cm.matchesForJSONEvent([]byte(e))
if err != nil {
t.Error("m4: " + err.Error())
}
if len(matches) != 1 || matches[0] != "x" {
t.Error("match failed on: " + e)
}
}
}

func TestMergeNfaAndNumeric(t *testing.T) {
cm := newCoreMatcher()
err := cm.addPattern("x", `{"x": [{"wildcard":"x*y"}]}`)
if err != nil {
t.Error("AP: " + err.Error())
}
err = cm.addPattern("x", `{"x": [3]}`)
if err != nil {
t.Error("AP: " + err.Error())
}
events := []string{`{"x": 3}`, `{"x": "xasdfy"}`}
for _, e := range events {
matches, err := cm.matchesForJSONEvent([]byte(e))
if err != nil {
t.Error("M4: " + err.Error())
}
if len(matches) != 1 || matches[0] != "x" {
t.Error("Match failed on " + e)
}
}
}