-
Notifications
You must be signed in to change notification settings - Fork 91
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(spans): Support lowercase redis commands #4235
Conversation
relay-event-normalization/src/normalize/span/description/redis.rs
Outdated
Show resolved
Hide resolved
Also might be fun to look into |
I also considered using a trie for this, e.g. https://docs.rs/prefix-trie/latest/prefix_trie/. |
… and everything already depends on it anyway, right? |
Yes! Trie's are very cool data structures.
yep^^ |
But tbf, nothing wrong with a good old binary search! |
94f5637
to
48e6077
Compare
|
||
/// Returns a redis command if it is a case-insensitive prefix of `seek`. | ||
pub fn matching_redis_command(seek: &str) -> Option<&str> { | ||
let m = REDIS_COMMAND_MATCHER.find(&format!("{seek} "))?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only suboptimal part of this solution is that it adds a string allocation for every search string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be possible with two matchers, one with spaces at the end the other one without, if the matcher without spaces matches (after the one without failed) it must be the full command with no trailing data.
Alternative option, you can find_iter{overlapping}
to get all matches then just select the longest one which has a space or end of input following. That seems pretty straight forward.
@Dav1dde restored my Aho-Corasick approach now. Only thing I don't like is the extra string allocation. |
relay-event-normalization/src/normalize/span/description/redis.rs
Outdated
Show resolved
Hide resolved
/// Returns a redis command if it is a case-insensitive prefix of `seek`. | ||
pub fn matching_redis_command(needle: &str) -> Option<&str> { | ||
let mut longest_match = None; | ||
for command in REDIS_COMMANDS { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems quite expensive, could also build a simple Regex in a LazyLock
(or aho-corasick, which the regex probably compiles to).
relay-event-normalization/src/normalize/span/description/mod.rs
Outdated
Show resolved
Hide resolved
None | Some("") => command.to_string(), | ||
Some(_other) => format!("{command} *"), | ||
}, | ||
None => "*".to_string(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
None => "*".to_string(), | |
None => "*".to_owned(), |
@@ -0,0 +1,598 @@ | |||
use aho_corasick::{AhoCorasick, AhoCorasickBuilder, MatchKind}; | |||
use once_cell::sync::Lazy; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The std lib has a LazyLock
now.
|
||
/// Returns a redis command if it is a case-insensitive prefix of `seek`. | ||
pub fn matching_redis_command(seek: &str) -> Option<&str> { | ||
let m = REDIS_COMMAND_MATCHER.find(&format!("{seek} "))?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be possible with two matchers, one with spaces at the end the other one without, if the matcher without spaces matches (after the one without failed) it must be the full command with no trailing data.
Alternative option, you can find_iter{overlapping}
to get all matches then just select the longest one which has a space or end of input following. That seems pretty straight forward.
relay-event-normalization/src/normalize/span/description/redis.rs
Outdated
Show resolved
Hide resolved
relay-event-normalization/src/normalize/span/description/redis.rs
Outdated
Show resolved
Hide resolved
/// | ||
/// See <https://redis.io/docs/latest/commands/>. | ||
static COMMAND_REGEX: LazyLock<Regex> = LazyLock::new(|| { | ||
Regex::new(r"(?i)(?-u:\s*)(?P<command>ZUNIONSTORE|ZUNION|ZSCORE|ZSCAN|ZREVRANK|ZREVRANGEBYSCORE|ZREVRANGEBYLEX|ZREVRANGE|ZREMRANGEBYSCORE|ZREMRANGEBYRANK|ZREMRANGEBYLEX|ZREM|ZRANK|ZRANGESTORE|ZRANGEBYSCORE|ZRANGEBYLEX|ZRANGE|ZRANDMEMBER|ZPOPMIN|ZPOPMAX|ZMSCORE|ZMPOP|ZLEXCOUNT|ZINTERSTORE|ZINTERCARD|ZINTER|ZINCRBY|ZDIFFSTORE|ZDIFF|ZCOUNT|ZCARD|ZADD|XTRIM|XSETID|XREVRANGE|XREADGROUP|XREAD|XRANGE|XPENDING|XLEN|XINFO STREAM|XINFO HELP|XINFO GROUPS|XINFO CONSUMERS|XINFO|XGROUP SETID|XGROUP HELP|XGROUP DESTROY|XGROUP DELCONSUMER|XGROUP CREATECONSUMER|XGROUP CREATE|XGROUP|XDEL|XCLAIM|XAUTOCLAIM|XADD|XACK|WATCH|WAITAOF|WAIT|UNWATCH|UNSUBSCRIBE|UNLINK|TYPE|TTL|TS.REVRANGE|TS.RANGE|TS.QUERYINDEX|TS.MREVRANGE|TS.MRANGE|TS.MGET|TS.MADD|TS.INFO|TS.INCRBY|TS.GET|TS.DELETERULE|TS.DEL|TS.DECRBY|TS.CREATERULE|TS.CREATE|TS.ALTER|TS.ADD|TOUCH|TOPK.RESERVE|TOPK.QUERY|TOPK.LIST|TOPK.INFO|TOPK.INCRBY|TOPK.COUNT|TOPK.ADD|TIME|TDIGEST.TRIMMED_MEAN|TDIGEST.REVRANK|TDIGEST.RESET|TDIGEST.RANK|TDIGEST.QUANTILE|TDIGEST.MIN|TDIGEST.MERGE|TDIGEST.MAX|TDIGEST.INFO|TDIGEST.CREATE|TDIGEST.CDF|TDIGEST.BYREVRANK|TDIGEST.BYRANK|TDIGEST.ADD|SYNC|SWAPDB|SUNSUBSCRIBE|SUNIONSTORE|SUNION|SUBSTR|SUBSCRIBE|STRLEN|SSUBSCRIBE|SSCAN|SREM|SRANDMEMBER|SPUBLISH|SPOP|SORT_RO|SORT|SMOVE|SMISMEMBER|SMEMBERS|SLOWLOG RESET|SLOWLOG LEN|SLOWLOG HELP|SLOWLOG GET|SLOWLOG|SLAVEOF|SISMEMBER|SINTERSTORE|SINTERCARD|SINTER|SHUTDOWN|SETRANGE|SETNX|SETEX|SETBIT|SET|SELECT|SDIFFSTORE|SDIFF|SCRIPT LOAD|SCRIPT KILL|SCRIPT HELP|SCRIPT FLUSH|SCRIPT EXISTS|SCRIPT DEBUG|SCRIPT|SCARD|SCAN|SAVE|SADD|RPUSHX|RPUSH|RPOPLPUSH|RPOP|ROLE|RESTORE-ASKING|RESTORE|RESET|REPLICAOF|REPLCONF|RENAMENX|RENAME|READWRITE|READONLY|RANDOMKEY|QUIT|PUNSUBSCRIBE|PUBSUB SHARDNUMSUB|PUBSUB SHARDCHANNELS|PUBSUB NUMSUB|PUBSUB NUMPAT|PUBSUB HELP|PUBSUB CHANNELS|PUBSUB|PUBLISH|PTTL|PSYNC|PSUBSCRIBE|PSETEX|PING|PFSELFTEST|PFMERGE|PFDEBUG|PFCOUNT|PFADD|PEXPIRETIME|PEXPIREAT|PEXPIRE|PERSIST|OBJECT REFCOUNT|OBJECT IDLETIME|OBJECT HELP|OBJECT FREQ|OBJECT ENCODING|OBJECT|MULTI|MSETNX|MSET|MOVE|MONITOR|MODULE UNLOAD|MODULE LOADEX|MODULE LOAD|MODULE LIST|MODULE HELP|MODULE|MIGRATE|MGET|MEMORY USAGE|MEMORY STATS|MEMORY PURGE|MEMORY MALLOC-STATS|MEMORY HELP|MEMORY DOCTOR|MEMORY|LTRIM|LSET|LREM|LRANGE|LPUSHX|LPUSH|LPOS|LPOP|LOLWUT|LMPOP|LMOVE|LLEN|LINSERT|LINDEX|LCS|LATENCY RESET|LATENCY LATEST|LATENCY HISTORY|LATENCY HISTOGRAM|LATENCY HELP|LATENCY GRAPH|LATENCY DOCTOR|LATENCY|LASTSAVE|KEYS|JSON.TYPE|JSON.TOGGLE|JSON.STRLEN|JSON.STRAPPEND|JSON.SET|JSON.RESP|JSON.OBJLEN|JSON.OBJKEYS|JSON.NUMMULTBY|JSON.NUMINCRBY|JSON.MSET|JSON.MGET|JSON.MERGE|JSON.GET|JSON.FORGET|JSON.DEL|JSON.DEBUG MEMORY|JSON.DEBUG HELP|JSON.DEBUG|JSON.CLEAR|JSON.ARRTRIM|JSON.ARRPOP|JSON.ARRLEN|JSON.ARRINSERT|JSON.ARRINDEX|JSON.ARRAPPEND|INFO|INCRBYFLOAT|INCRBY|INCR|HVALS|HTTL|HSTRLEN|HSETNX|HSET|HSCAN|HRANDFIELD|HPTTL|HPEXPIRETIME|HPEXPIREAT|HPEXPIRE|HPERSIST|HMSET|HMGET|HLEN|HKEYS|HINCRBYFLOAT|HINCRBY|HGETALL|HGET|HEXPIRETIME|HEXPIREAT|HEXPIRE|HEXISTS|HELLO|HDEL|GETSET|GETRANGE|GETEX|GETDEL|GETBIT|GET|GEOSEARCHSTORE|GEOSEARCH|GEORADIUS_RO|GEORADIUSBYMEMBER_RO|GEORADIUSBYMEMBER|GEORADIUS|GEOPOS|GEOHASH|GEODIST|GEOADD|FUNCTION STATS|FUNCTION RESTORE|FUNCTION LOAD|FUNCTION LIST|FUNCTION KILL|FUNCTION HELP|FUNCTION FLUSH|FUNCTION DUMP|FUNCTION DELETE|FUNCTION|FT._LIST|FT.TAGVALS|FT.SYNUPDATE|FT.SYNDUMP|FT.SUGLEN|FT.SUGGET|FT.SUGDEL|FT.SUGADD|FT.SPELLCHECK|FT.SEARCH|FT.PROFILE|FT.INFO|FT.EXPLAINCLI|FT.EXPLAIN|FT.DROPINDEX|FT.DICTDUMP|FT.DICTDEL|FT.DICTADD|FT.CURSOR READ|FT.CURSOR DEL|FT.CREATE|FT.CONFIG SET|FT.CONFIG HELP|FT.CONFIG GET|FT.ALTER|FT.ALIASUPDATE|FT.ALIASDEL|FT.ALIASADD|FT.AGGREGATE|FLUSHDB|FLUSHALL|FCALL_RO|FCALL|FAILOVER|EXPIRETIME|EXPIREAT|EXPIRE|EXISTS|EXEC|EVAL_RO|EVALSHA_RO|EVALSHA|EVAL|ECHO|DUMP|DISCARD|DEL|DECRBY|DECR|DEBUG|DBSIZE|COPY|CONFIG SET|CONFIG REWRITE|CONFIG RESETSTAT|CONFIG HELP|CONFIG GET|CONFIG|COMMAND LIST|COMMAND INFO|COMMAND HELP|COMMAND GETKEYSANDFLAGS|COMMAND GETKEYS|COMMAND DOCS|COMMAND COUNT|COMMAND|CMS.QUERY|CMS.MERGE|CMS.INITBYPROB|CMS.INITBYDIM|CMS.INFO|CMS.INCRBY|CLUSTER SLOTS|CLUSTER SLAVES|CLUSTER SHARDS|CLUSTER SETSLOT|CLUSTER SET-CONFIG-EPOCH|CLUSTER SAVECONFIG|CLUSTER RESET|CLUSTER REPLICATE|CLUSTER REPLICAS|CLUSTER NODES|CLUSTER MYSHARDID|CLUSTER MYID|CLUSTER MEET|CLUSTER LINKS|CLUSTER KEYSLOT|CLUSTER INFO|CLUSTER HELP|CLUSTER GETKEYSINSLOT|CLUSTER FORGET|CLUSTER FLUSHSLOTS|CLUSTER FAILOVER|CLUSTER DELSLOTSRANGE|CLUSTER DELSLOTS|CLUSTER COUNTKEYSINSLOT|CLUSTER COUNT-FAILURE-REPORTS|CLUSTER BUMPEPOCH|CLUSTER ADDSLOTSRANGE|CLUSTER ADDSLOTS|CLUSTER|CLIENT UNPAUSE|CLIENT UNBLOCK|CLIENT TRACKINGINFO|CLIENT TRACKING|CLIENT SETNAME|CLIENT SETINFO|CLIENT REPLY|CLIENT PAUSE|CLIENT NO-TOUCH|CLIENT NO-EVICT|CLIENT LIST|CLIENT KILL|CLIENT INFO|CLIENT ID|CLIENT HELP|CLIENT GETREDIR|CLIENT GETNAME|CLIENT CACHING|CLIENT|CF.SCANDUMP|CF.RESERVE|CF.MEXISTS|CF.LOADCHUNK|CF.INSERTNX|CF.INSERT|CF.INFO|CF.EXISTS|CF.DEL|CF.COUNT|CF.ADDNX|CF.ADD|BZPOPMIN|BZPOPMAX|BZMPOP|BRPOPLPUSH|BRPOP|BLPOP|BLMPOP|BLMOVE|BITPOS|BITOP|BITFIELD_RO|BITFIELD|BITCOUNT|BGSAVE|BGREWRITEAOF|BF.SCANDUMP|BF.RESERVE|BF.MEXISTS|BF.MADD|BF.LOADCHUNK|BF.INSERT|BF.INFO|BF.EXISTS|BF.CARD|BF.ADD|AUTH|ASKING|APPEND|ACL WHOAMI|ACL USERS|ACL SETUSER|ACL SAVE|ACL LOG|ACL LOAD|ACL LIST|ACL HELP|ACL GETUSER|ACL GENPASS|ACL DRYRUN|ACL DELUSER|ACL CAT|ACL)(?-u:\s|$)").unwrap() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can disable unicode for the regex with (?-u)
in the beginning, that's what globset does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It complained because I did not escape dots in redis commands. Fixed now.
Not all redis integrations send uppercase redis commands, which results in redis spans being scrubbed as *. See
relay/relay-event-normalization/src/normalize/span/description/mod.rs
Lines 1142 to 1143 in c6d9110
This PR removes the redis regex and introduces a list of redis command that can be prefix-searched.
Fixes #3643