Skip to content

Conversation

@leviramsey
Copy link
Contributor

References #31905

Future callbacks in AsyncDnsResolver will, in situations where multiple resolution attempts for a request are being made (e.g. failing over to a different DNS server or if there are multiple search domains), close over actor state which is not otherwise synchronized, thus breaking the actor model's single threaded illusion.

This change moves such resolution attempts to explicit actor messages.


import AsyncDnsResolver._

implicit val ec: ExecutionContextExecutor = context.dispatcher
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do kind of feel that having an implicit execution context in an actor is a bad sign, by decreasing the cognitive cost of adding a future callback within the actor.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💯 agree

log.debug(s"{} resolved {}", mode, resolved)
resolved
val replyTo = sender()
resolveWithResolvers(name, mode, resolvers).onComplete {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Broad idea is to limit the future callbacks inside the actor. Combining the cache updates with piping preserves the sequencing in .map { x => sideEffect(); x }.pipeTo(sender()) and allows use of parasitic for performance without making that the default context.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe pipeTo is better .

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, +100 for parasitic

ipv6Recs.map { v6 =>
DnsProtocol.Resolved(name, v4.rrs ++ v6.rrs, v4.additionalRecs ++ v6.additionalRecs)
}(ExecutionContexts.parasitic)
}(ExecutionContexts.parasitic))
Copy link
Contributor Author

@leviramsey leviramsey Apr 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, personal rule to never implicit val ec = ExecutionContexts.parasitic...

val minTtl = (positiveCachePolicy +: resolved.records.map(_.ttl)).min
cache.put((name, mode), resolved, minTtl) // thread-safe structure, safe in callback
} else if (negativeCachePolicy != Never) {
cache.put((name, mode), resolved, negativeCachePolicy) // thread-safe structure, safe in callback
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it was going to just be a Map, definitely. In this case, since we're going through the expense of using a thread-safe structure, may as well exercise that :)

This does have the benefit of quickly allowing a successful resolution to immediately update the cache for requests which might be in the mailbox now (without having to do a custom mailbox). Since the only cache.get is right when we receive a DnsProtocol.Resolve, it's not much of a win (it might be a slight win under heavy load to check the cache when making follow-up queries).


import AsyncDnsResolver._

implicit val ec: ExecutionContextExecutor = context.dispatcher
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💯 agree

}

private def resolveWithSearch(
private def resolvePromise(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do I understand it correctly that the only reason for going back to the actor via Internal.Resolve is that the nextId() is needed here, or is there something else that requires this to be running in the actor?

If that is the case, would it be better to change the requestId counter to an a thread safe counter (AtomicInteger) and stay in the Future land in the companion object methods?

Copy link
Contributor

@octonato octonato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a few comments, but need to think more about it.

Maybe we need to explore other strategies.

@leviramsey
Copy link
Contributor Author

Trying a different approach:

  • move requestId responsibility to a dedicated actor
  • the progression through search names and different resolvers is now the responsibility of a per-request actor. This does allow for some optimizations (e.g. only building a list of names with search domains appended once), at the expense of not guaranteeing that the cache update (if any) happens before the requestor receives the response

}
}

private class DnsResolutionActor(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me to have this per request actor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I was consciously thinking about this, but I like the separation of AsyncDnsResolver is primarily managing the cache, the per-request actor is managing the process of preparing requests to resolvers, and the injector is keeping the requestIds sane.

The per-request actor is as pasta-like as the future code was (maybe the invariants are clearer?), but is a bit more encapsulated.

@leviramsey
Copy link
Contributor Author

Test failure looks to be unrelated (actorsystem logging): https://github.com/akka/akka/actions/runs/4723246264/jobs/8378944101#step:5:1074

retriggering

@He-Pin
Copy link
Contributor

He-Pin commented Apr 17, 2023

As only one place to generate the next id, if only from a performance perspective, i don't think this would performant better than a simple atomic integer with updateAndGet.

But still ♡(´∀`)人(´∀`)♡

@johanandren
Copy link
Contributor

close-reopen to retrigger ci

@johanandren johanandren reopened this Apr 18, 2023
@leviramsey leviramsey changed the title fix: Refactor future callbacks in AsyncDnsResolver fix: Refactor concurrency in AsyncDnsResolver Apr 18, 2023
Copy link
Contributor

@johanandren johanandren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much easier to follow now, good.

Copy link
Contributor

@johanandren johanandren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@johanandren johanandren added the 2 - pick next Used to mark issues which are next up in the queue to be worked on. The tag is non-binding label Apr 20, 2023
Copy link
Contributor

@patriknw patriknw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@patriknw patriknw merged commit ef33e71 into akka:main Apr 21, 2023
@leviramsey leviramsey deleted the async-dns-concurrency branch April 21, 2023 14:23
andreezy777 pushed a commit to andreezy777/akka that referenced this pull request May 18, 2023
* Alternative approach: per-request actor
* Restore synchronous 'resolution' for situations where the actor doesn't need to immediately spawn

(cherry picked from commit ef33e71)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

2 - pick next Used to mark issues which are next up in the queue to be worked on. The tag is non-binding

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants