Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support DNS SRV record by RFC2782 #2496

Closed
wants to merge 5 commits into from
Closed

Conversation

bootjp
Copy link
Contributor

@bootjp bootjp commented Jul 14, 2019

Which issue(s) this PR fixes:
Fixes : None

What this PR does / why we need it:
https://groups.google.com/forum/#!topic/fluentd/Lxx0-rVtsfo

Support for RFC2782 in out_forward allows for more flexible fluentd operations
Specifically, public cloud service discovery

Docs Changes:
It will be necessary when this request is merged.
I will send a pull request to the document separately.

Release Note:

To the reviewer
SRV#available? may be a better implementation.
For example, fluent/compat/socket_util might seem like a smarter way to implement it, but fluentd's code is too complex for my technical capabilities to figure out how to do it.
I will correct the code better if you give me advice.

@bootjp bootjp force-pushed the master branch 2 times, most recently from adb4c61 to c0231ca Compare July 15, 2019 04:12
bootjp added 2 commits July 16, 2019 13:44
Signed-off-by: ぶーと / Yoshiaki Ueda <[email protected]>
Signed-off-by: ぶーと / Yoshiaki Ueda <[email protected]>
@repeatedly
Copy link
Member

@ganmacs How about this? You are now working on pluggable SD on out_forward.

@ganmacs
Copy link
Member

ganmacs commented Jul 17, 2019

You are now working on pluggable SD on out_forward.

Right. but anyway, I'll take a look this later.


# @param host string
# @return [SRV]
def resolve_srv(host)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SRV records should be cached for performance.
And specifying the timing of purging cache for users is better.

@@ -620,6 +620,69 @@ def read_ack_from_sock(sock, unpacker)
assert{ logs.any?{|log| log.include?("no response from node. regard it as unavailable.") } }
end


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test which checks if the SRV record service discovery works?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, should this use a real domain name?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. mocking or stubbing seems enough.

Copy link
Member

@ganmacs ganmacs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think updating caches should be invoked in another thread since performance issue.
Creating EvenTime and comparing it every write isn't efficient. And Node#resolve_dns is called by multiple threads, right? This means that if there is the only Node which uses this feature and multiple threads try to write (and the cache is expired), every thread tries to resolve the SRV record.

end

# swap config host to srv response host.
@srv_host_mutex.synchronize do
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If thread1(t1) and thread2(t2) which use the same Node try to write, thread1 and thread2 can use the same target.
Is it intentional?

  1. t1 does L918(and get target1)
  2. t2 does L918(and get target2)
  3. t1 does L930~L938,(and set @host and @port target1)
  4. context switch from t1 to t2 (t1 is still L939)
  5. t2 does L930~L938,(and set @host and @port target2)
  6. t1 and t2 send message to target2

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to lock other places where node uses @host and @port(e.g. https://github.com/fluent/fluentd/pull/2496/files#diff-40c0e8c88efdc9a297a9fc63bc2bc69fR948).
This code can be a race condition.

  1. t1 tries to do this line and evaluates arguments @host
  2. context switch happens to t2
  3. t2 does L930~L938,(and set @host and @port to another value)
  4. context switch happens to t1
  5. @port is invalid for evaluating @host at step 1 since @port has changed by t2

@bootjp
Copy link
Contributor Author

bootjp commented Oct 8, 2019

Close because it seems to be implemented in #2541 .

@bootjp bootjp closed this Oct 8, 2019
@repeatedly
Copy link
Member

We are now considering DNS SRV record based SD plugin in the core.
If you are interested in it, patches are welcome :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants