Request for Length Wildcard #337

Open
I-am-Orion opened this issue Mar 9, 2020 · 4 comments

Comments

I-am-Orion commented Mar 9, 2020

I would like a length wildcard feature in RiveScript, so that triggers can match the user's input more precisely by limiting how many words a wildcard may capture.
For example:
EXACT LENGTH WILDCARD:

+ hello *2

Matches Hello John Doe
Does not match Hello John

VARIABLE LENGTH WILDCARD:

+ hello *~2
- That is crazy!

Matches Hello!
Matches Hello John!
Matches Hello John Doe
Does not match Hello John George Doe

MIN-MAX WILDCARD:

+ hello *(2-4)

Matches Hello John Doe and Hello John Dorian Doe
Does not match Hello John
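
As a rough illustration of the three forms above, here is a minimal sketch of how such a trigger could be turned into a plain JavaScript regular expression. Everything in it is an assumption made for illustration: `lengthTriggerToRegExp`, `escapeRegExp` and `normalize` are hypothetical helpers, not part of RiveScript, and `normalize` is only a crude stand-in for RiveScript's own message cleanup. It handles a single length wildcard at the end of a trigger and ignores every other trigger feature.

```javascript
// Hypothetical helpers for illustration only; none of this is RiveScript API.

// Escape regex metacharacters in the literal part of the trigger.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Crude stand-in for message normalization: lowercase, drop punctuation,
// collapse runs of whitespace into single spaces.
function normalize(message) {
  return message.toLowerCase().replace(/[^a-z0-9\s]+/g, "").replace(/\s+/g, " ").trim();
}

// Turn a trigger ending in *N, *~N or *(A-B) into a RegExp.
function lengthTriggerToRegExp(trigger) {
  const m = /^(.*?)\s*\*(?:(\d+)|~(\d+)|\((\d+)-(\d+)\))$/.exec(trigger);
  if (!m) throw new Error("no length wildcard at end of trigger: " + trigger);
  const [, prefix, exact, upTo, lo, hi] = m;
  let min, max;
  if (exact !== undefined) {
    min = max = Number(exact);          // *N: exactly N words
  } else if (upTo !== undefined) {
    min = 0; max = Number(upTo);        // *~N: zero to N words
  } else {
    min = Number(lo); max = Number(hi); // *(A-B): between A and B words
  }
  // A "word" here is one space followed by a run of non-space characters,
  // so zero words means the message ends right after the prefix.
  return new RegExp("^" + escapeRegExp(prefix) + "((?: \\S+){" + min + "," + max + "})$");
}

const exactTwo = lengthTriggerToRegExp("hello *2");
const upToTwo = lengthTriggerToRegExp("hello *~2");
const twoToFour = lengthTriggerToRegExp("hello *(2-4)");

console.log(exactTwo.test(normalize("Hello John Doe")));
console.log(upToTwo.test(normalize("Hello!")));
console.log(twoToFour.test(normalize("Hello John")));
```

Tying the leading space to each word keeps the zero-word case ("Hello!") from requiring a trailing space. A real implementation would also have to coexist with optionals, alternations and the other wildcards, which this sketch does not attempt.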

Contributor

dcsan commented May 26, 2020

RiveScript uses a subset of regex to make it easier for non-technical people to write scripts. There has been some discussion of using full regex syntax, but it has never been finalized.

Most of the discussion moved here:
aichaos/rivescript-wd#6

But there are other related issues, e.g.
#253
https://github.com/aichaos/rivescript-js/issues?q=is%3Aissue+regex+is%3Aclosed

Contributor

dcsan commented May 26, 2020

#256

Actually, there's a PR here where you can use the ~ trigger syntax for individual triggers.
This doesn't provide full regex features, but perhaps you can extend it based on that PR.

Contributor

gleuch commented May 27, 2020

You can run the wildcard through an object macro and branch on the length of the captured string. Basic example, YMMV.

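> object checkHelloWildcard javascript
  // rs is the RiveScript instance; str is the first argument passed to <call>
  var [rs, [str]] = arguments;
  // Re-dispatch an internal message for the "reply hello with ..." triggers below
  return rs.reply(rs.currentUser(), `reply hello with ${str.length}`);
< object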

+ hello *
- <call>checkHelloWildcard "<star1>"</call>

+ reply hello with 2
- I said hello with 2 characters!

+ reply hello with *
- I said hello with <star1> characters!

You can get more advanced by referencing and returning from topics to handle these cases.
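
Note that `str.length` in the example above counts characters, while the original request is about the number of words. A minimal sketch of a word-based variant, written as plain JavaScript so it can be tried outside RiveScript, might look like this. `countWords` and `helloDispatchMessage` are hypothetical names, and wiring them into the object macro (for example by passing the result to `rs.reply`) is an assumption, not something tested against the library here.

```javascript
// Hypothetical helper: counts whitespace-separated words in the captured text.
function countWords(str) {
  const trimmed = String(str).trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Builds the internal message that the "reply hello with ..." triggers would
// match, using the word count instead of the character count.
function helloDispatchMessage(star) {
  return "reply hello with " + countWords(star);
}

console.log(helloDispatchMessage("john doe"));
```

With this variant, a trigger such as `+ reply hello with 2` would correspond to a two-word wildcard, and the reply text would need to say "words" rather than "characters".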

Member

kirsle commented May 27, 2020

This has been asked for a few times and I haven't wanted to try and dig into what the regular expression for this would look like.

I know SuperScript.js (a fork of Rivescript) has syntax like described in the OP and I went sleuthing through their code, but didn't find a regexp that I could take from that and add to RiveScript.

RiveScript's "simplified regexp" system for triggers ends up creating some rather gnarly raw regular expressions to support all the features RiveScript has. For example the "[optionals]" syntax in RiveScript expands out to a regexp that looks like:

+ what is your [phone|office|home] number
('^what is your(?:(?:\s|\b)+phone(?:\s|\b)+|(?:\s|\b)+office(?:\s|\b)+|(?:\s|\b)+home(?:\s|\b)+|(?:\s|\b))number$')

A lot is going on in this example. The regexp needs to behave like "what is your (phone|office|home) number", treating the optionals as a regular alternatives group, but it also has to match messages that contain none of those words. So the spaces on either side need to be optional, so that you can just say "what is your number", while at least one space is still required, so that "what is yournumber" does not match (no space where the optionals would go). Additionally, it needs to support the optional being at the beginning or the end of the trigger, and in those cases the extra space characters on either side must not be required for matching. The word-boundary metacharacter \b helps it anchor on "word boundary (spaces) or start/end of string"... and this brings with it a new set of problems: Unicode symbols have difficulty matching because \b only considers ASCII letters, digits and the underscore to be "word characters", not umlauts or other non-ASCII letters.
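
To make the behaviour described above concrete, the following sketch copies the expanded pattern into a JavaScript regex literal and tries it against a few already-normalized messages. The variable names are made up for illustration. The last two lines show the \b limitation with a non-ASCII letter.

```javascript
// The expanded pattern quoted above, as a JavaScript regex literal.
const optionals = /^what is your(?:(?:\s|\b)+phone(?:\s|\b)+|(?:\s|\b)+office(?:\s|\b)+|(?:\s|\b)+home(?:\s|\b)+|(?:\s|\b))number$/;

const results = {
  // One of the optional words is present.
  withOptional: optionals.test("what is your phone number"),
  // None of the optional words is present, but a space still separates the words.
  withoutOptional: optionals.test("what is your number"),
  // No space where the optionals would go, so this should be rejected.
  missingSpace: optionals.test("what is yournumber"),
};

// \b depends on \w, which in JavaScript covers only [A-Za-z0-9_], so a word
// starting with a non-ASCII letter has no boundary in front of it.
const asciiBoundary = /\bapfel\b/.test("apfel");
const umlautBoundary = /\bäpfel\b/.test("äpfel");

console.log(results, asciiBoundary, umlautBoundary);
```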

All this to say... extending the regexp engine further to support a "number of words wildcard", while keeping it compatible with all the existing complexity that RiveScript's triggers currently support, may be a tricky task. If done incorrectly, RiveScript may assemble regular expressions that are invalid and have syntax errors, or that cause matching to fail in certain use cases, and so introduce more bugs into the library than it solves.

If you want to take a stab at this, figure out a regexp and send me a pull request, feel free! I can then port that change to the other editions of RiveScript, too (i.e. the Python, Java and Go versions). But I personally haven't felt motivated enough to do this, and nobody who's asked me for this feature seems willing to try it themselves either.
