Skip to content

Add secrets search and bump DotUtils.MsBuild.BinlogRedactor.SensitiveDataDetector version#829

Merged
KirillOsenkov merged 10 commits intoKirillOsenkov:mainfrom
YuliiaKovalova:dev/ykovalova/add_secrets_search
Nov 1, 2024
Merged

Add secrets search and bump DotUtils.MsBuild.BinlogRedactor.SensitiveDataDetector version#829
KirillOsenkov merged 10 commits intoKirillOsenkov:mainfrom
YuliiaKovalova:dev/ykovalova/add_secrets_search

Conversation

@YuliiaKovalova
Copy link
Copy Markdown
Contributor

@YuliiaKovalova YuliiaKovalova commented Oct 30, 2024

Fixes

#823

Context

This change adds a support of a new search key word "$secret" that allows to detect all the suspicious tree entries, based on the capabilities of this library MSBuild.BinlogRedactor.

MSBuild.BinlogRedactor is already used for secrets reduction functionality, but it this context it's possible to check the presence of the secrets on fly
{4CD40D35-CD93-4195-B83B-30873E35C1D2}

$secret not(SensitiveDataKind) statement is supported here:
{E743D083-2878-4655-AACF-711D9A8E6610}

It's possible to use the new keyword on 2 tabs:
"Search Log"
{12CCFBFD-AC63-4A06-B1F7-86948D7B721A}

"Find in Files"
{ED9BC1B4-EBC7-4979-B8E2-097B9A01AB53}

Changes made

A new SecretsSearch class was implemented based on ISearchExtension interface.
To make this functionality available on "Find in Files" tab, the method was extended

private object FindInFiles(string searchText, int maxResults, CancellationToken cancellationToken)
. There was an idea to use factory for handing this case, but we agreed with to postpone this and get back on demand.
msbuild_logWithFalseSecrets.zip

Comment thread src/StructuredLogger.Utils/SecretsSearch.cs Outdated
Comment thread src/StructuredLogger.Utils/SecretsSearch.cs Outdated
Comment thread src/StructuredLogger.Utils/SecretsSearch.cs Outdated
@YuliiaKovalova YuliiaKovalova changed the title Add secrets search Add secrets search and bump DotUtils.MsBuild.BinlogRedactor.SensitiveDataDetector version Oct 31, 2024
@YuliiaKovalova YuliiaKovalova marked this pull request as ready for review October 31, 2024 12:48
Copy link
Copy Markdown
Collaborator

@JanKrivanek JanKrivanek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love this!

Comment thread src/StructuredLogViewer/Controls/BuildControl.xaml.cs Outdated
Comment thread src/StructuredLogger.Utils/SecretsSearch.cs
@JanKrivanek
Copy link
Copy Markdown
Collaborator

Can this be added to list of sample searches?

private static string[] searchExamples = new[]
{
"Copying file from ",
"Resolved file path is ",
"There was a conflict",
"Encountered conflict between",
"Building target completely ",
"is newer than output ",
"Property reassignment: $(",
"out-of-date",
"$task $time",
"$message CompilerServer failed",
"will be compiled because",
};

image

Comment thread src/StructuredLogViewer/Controls/BuildControl.xaml.cs Outdated
var haystack = file.Value;
var resultsInFile = haystack.Find(searchText);
if (resultsInFile.Count > 0)
if (!string.IsNullOrEmpty(searchText) && searchText.StartsWith("$secret"))
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this condition will run for every file, and it doesn't change. How about we extract a bool above the foreach and just check it here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved

@KirillOsenkov
Copy link
Copy Markdown
Owner

Very nice! I left a couple small comments but otherwise it's good to go!

Comment thread src/StructuredLogViewer/Controls/BuildControl.xaml.cs Outdated
@KirillOsenkov
Copy link
Copy Markdown
Owner

Oh, I forgot. When there are no secrets found, we should return something, otherwise you type $secret and nothing happens. For a regular search we return
image

{
var results = new List<(string, IEnumerable<(int, string)>)>();

NodeQueryMatcher notQueryMatcher = new NodeQueryMatcher(searchText);
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you mean nodeQueryMatcher?

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed it in main

var results = new List<(string, IEnumerable<(int, string)>)>();

NodeQueryMatcher notQueryMatcher = new NodeQueryMatcher(searchText);
bool isSecretsSearch = !string.IsNullOrEmpty(searchText) && searchText.StartsWith("$secret");
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can also check for nodeQueryMatcher.TypeKeyword == "secret"

Copy link
Copy Markdown
Owner

@KirillOsenkov KirillOsenkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice, thanks!

@KirillOsenkov KirillOsenkov merged commit 49aaad1 into KirillOsenkov:main Nov 1, 2024
@JanKrivanek
Copy link
Copy Markdown
Collaborator

Works like a charm!
Plus the new patterns are now available in the binlogtool redact as well:

image

@JanKrivanek
Copy link
Copy Markdown
Collaborator

@KirillOsenkov
Copy link
Copy Markdown
Owner

One todo would be to investigate adding parallelism to secret detection because right now it's really slow on real-life binlogs, even moderately sized. Takes over two minutes on a 5 MB binlog I use often. I think a judicious use of Task.Run() will really make things better here. You can maybe spawn one task per string, and one task per file, then just foreach the tasks and do task.Wait() on each of them. No need for async as only one thread will be blocked (the one where the computation happens)

@KirillOsenkov
Copy link
Copy Markdown
Owner

I filed #832

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants