annetteshajan
Contributor

Description:

This commit fixes the issue described in #4101.
It tightens the regex used for checking access tokens in LarkSuite.
Currently, "\b" only marks the boundary between word characters (letters, numbers, underscore) and non-word characters. A dash is a non-word character, so "\b" still matches right next to one, and dashes are common in URLs and other naming conventions.
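
To illustrate the behavior described above, here is a minimal sketch. It assumes only the token part of the old pattern (the `detectors.PrefixRegex` keyword prefix is left out), and the input string and the `findToken` helper are hypothetical, made up for this example:

```go
package main

import (
	"fmt"
	"regexp"
)

// oldTenantPat mirrors the token part of the pattern before this change.
// The detectors.PrefixRegex keyword prefix is omitted (assumption: it does
// not affect the boundary behavior shown here).
var oldTenantPat = regexp.MustCompile(`\b(t-[a-z0-9A-Z_.]{14,50})\b`)

// findToken is a hypothetical helper: it returns the first captured token,
// or "" when the pattern does not match.
func findToken(re *regexp.Regexp, s string) string {
	m := re.FindStringSubmatch(s)
	if m == nil {
		return ""
	}
	return m[1]
}

func main() {
	// A dash-separated identifier, not a real token. Because '-' is a
	// non-word character, \b matches on both sides of the middle segment.
	input := "client-t-abcdefghijklmnop-secret"
	fmt.Printf("%q\n", findToken(oldTenantPat, input))
}
```

With `\b` alone, the middle segment of the dash-separated string is reported as a token, which is the kind of poor match the issue describes.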

I have also included a test which checks for invalid tokens.

Checklist:

  • [ ✔️ ] Tests passing (make test-community)?
  • [ ✔️ ] Lint passing (make lint; this requires golangci-lint)?

@annetteshajan annetteshajan requested a review from a team as a code owner May 30, 2025 15:36
@CLAassistant

CLAassistant commented May 30, 2025

CLA assistant check
All committers have signed the CLA.

@amanfcp
Contributor

amanfcp commented Jun 2, 2025

@annetteshajan Thank you for contributing. Please sign the CLA first so that we can move forward with the review.

@@ -34,9 +34,9 @@ const (
var (
defaultClient = common.SaneHttpClient()
tokenPats = map[tokenType]*regexp.Regexp{
TenantAccessToken: regexp.MustCompile(detectors.PrefixRegex([]string{"lark", "larksuite", "tenant"}) + `\b(t-[a-z0-9A-Z_.]{14,50})\b`),
Contributor

Hey @annetteshajan! Thanks for updating this. I have a small suggestion. We could remove the \b word boundaries and use negative lookbehind and lookahead instead in order to avoid potential backtracking and improve regex efficiency.
Something like:

(?<![-\w])(t-[a-zA-Z0-9_.]{14,50})(?![-\w])

Contributor Author

I tried this initially, but found that Go's standard regexp package doesn't support lookahead or lookbehind. I will have to think of another workaround.
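
A quick way to confirm this is to try compiling the suggested pattern. The sketch below uses `regexp.Compile` rather than `regexp.MustCompile` so the failure comes back as an error instead of a panic; the `compiles` helper is a hypothetical name introduced for this example:

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles is a hypothetical helper that reports whether Go's standard
// regexp package accepts the given pattern.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// The lookaround version suggested in the review comment.
	lookaround := `(?<![-\w])(t-[a-zA-Z0-9_.]{14,50})(?![-\w])`
	if _, err := regexp.Compile(lookaround); err != nil {
		// The regexp package implements RE2 syntax, which has no lookaround.
		fmt.Println("lookaround pattern rejected:", err)
	}

	// The \b version compiles without error.
	fmt.Println(compiles(`\b(t-[a-zA-Z0-9_.]{14,50})\b`))
}
```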

Contributor Author

@amanfcp Hi, it is done!

Contributor

@annetteshajan Ah, thanks for clarifying. In that case the current regex works.

@nabeelalam nabeelalam linked an issue Jun 2, 2025 that may be closed by this pull request
Comment on lines 37 to 39
TenantAccessToken: regexp.MustCompile(detectors.PrefixRegex([]string{"lark", "larksuite", "tenant"}) + `(?:^|[^-])\b(t-[a-z0-9A-Z_.]{14,50})\b(?:^|[^-])`),
UserAccessToken: regexp.MustCompile(detectors.PrefixRegex([]string{"lark", "larksuite", "user"}) + `(?:^|[^-])\b(u-[a-z0-9A-Z_.]{14,50})\b(?:^|[^-])`),
AppAccessToken: regexp.MustCompile(detectors.PrefixRegex([]string{"lark", "larksuite", "app"}) + `(?:^|[^-])\b(a-[a-z0-9A-Z_.]{14,50})\b(?:^|[^-])`),
Contributor

The current regex won't match valid patterns at the end of input because the final (?:^|[^-]) requires either start-of-string (impossible at end of match) or a non-hyphen character after the pattern.

Current:
(?:^|[^-])\b(a-[a-z0-9A-Z_.]{14,50})\b(?:^|[^-])

Should be:
(?:^|[^-])\b(a-[a-z0-9A-Z_.]{14,50})\b(?:[^-]|$)

Issue: A string like "text a-example123456789" (ending with the target pattern) won't match with the current regex.

Fix: Replace final (?:^|[^-]) with (?:[^-]|$) to handle end-of-string cases.
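
The difference can be checked with a short sketch. It assumes only the token part of the app access token pattern (the `detectors.PrefixRegex` keyword prefix is omitted), and the variable names, the `findToken` helper, and the dash-wrapped input are hypothetical:

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Pattern under review: the trailing group needs start-of-string or a
	// non-hyphen character, so it cannot succeed at end of input.
	beforeFix = regexp.MustCompile(`(?:^|[^-])\b(a-[a-z0-9A-Z_.]{14,50})\b(?:^|[^-])`)
	// Suggested fix: accept a non-hyphen character or end of input.
	afterFix = regexp.MustCompile(`(?:^|[^-])\b(a-[a-z0-9A-Z_.]{14,50})\b(?:[^-]|$)`)
)

// findToken is a hypothetical helper: it returns the first captured token,
// or "" when the pattern does not match.
func findToken(re *regexp.Regexp, s string) string {
	m := re.FindStringSubmatch(s)
	if m == nil {
		return ""
	}
	return m[1]
}

func main() {
	input := "text a-example123456789"
	fmt.Printf("before fix: %q\n", findToken(beforeFix, input)) // no match
	fmt.Printf("after fix:  %q\n", findToken(afterFix, input))  // token found

	// A token wrapped in dashes is still rejected after the fix.
	fmt.Printf("dashed:     %q\n", findToken(afterFix, "x-a-example123456789-y"))
}
```

Note that the non-capturing groups consume the neighboring character, which is why the token is read from capture group 1 rather than from the whole match.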

Contributor Author

Hi @amanfcp
Thanks for the comments!
I have made the changes.

Contributor

@amanfcp amanfcp left a comment

Looks good to me.

@amanfcp
Contributor

amanfcp commented Jun 2, 2025

@nabeelalam Can you review it again please?

@nabeelalam nabeelalam merged commit 1484992 into trufflesecurity:main Jun 3, 2025
22 checks passed
Development

Successfully merging this pull request may close these issues.

Larksuite detector: poor access token matches
4 participants