Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add concurrent DNS lookup support #48

Merged
merged 2 commits into from
Apr 14, 2024
Merged

Conversation

moorereason
Copy link
Contributor

Add a limited worker pool to do concurrent DNS lookups. Add a lookup_limit config parameter to control pool size (default=50).

I'm using github.com/sourcegraph/conc to simplify the goroutine management. Since conc uses generics, I updated the minimum Go version in go.mod to 1.18.

Updates #10

Add a limited worker pool to do concurrent DNS lookups.  Add a
`lookup_limit` config parameter to control pool size (default=50).

I'm using `github.com/sourcegraph/conc` to simplify the goroutine
management.  Since `conc` uses generics, I updated the minimum Go
version in `go.mod` to 1.18.

Updates tierpod#10
@tierpod
Copy link
Owner

tierpod commented Apr 13, 2024

Hi @moorereason!

Thank you for PR, this feature will significantly improve execution time for cases with many ip addresses in reports. I haven't heard about conc before, looks great.

I only have 1 concern. You moved DNS lookup function from package pkg/dmarc/dmarc.go to command line tool cmd/dmart-report-converter/files.go. I would prefer to keep it in first file, because it could be useful to have such feature builtin into package. It will require adding argument lookupLimit to functions signatures that have lookupAddr argument ( Report.Parse, Report.ReadParse*), but it's OK.

Could you please do this? I completely understand if you don't have time for this. In that case, let me now, and I'll try to do this using your code as reference.

@tierpod tierpod self-assigned this Apr 13, 2024
@moorereason
Copy link
Contributor Author

@tierpod,
I've refactored this code and pushed up another commit.

Two reasons why I didn't do it this way to begin with:

  1. I didn't want to break the dmarc package API, but that doesn't seem to be a concern. 😁
  2. By having the concurrency in the dmarc package, we end up dealing with a single report at a time instead of gathering all reports and running them all through a lookup pool at once.

Either way, this is much faster than the existing code.

I can squash this PR into a single commit before you merge if you want.

@tierpod
Copy link
Owner

tierpod commented Apr 14, 2024

Amazing! You did a great job, appreciate it.

@tierpod tierpod merged commit 8caa163 into tierpod:master Apr 14, 2024
moorereason added a commit to moorereason/dmarc-report-converter that referenced this pull request Apr 14, 2024
Update tests since the data model changed after PR tierpod#48 was merged.
@moorereason moorereason deleted the iss10 branch May 6, 2024 20:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants