Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Auto-detect compact vs. binary Thrift formats in jaeger-agent UDP servers #1596

Closed
yurishkuro opened this issue Jun 11, 2019 · 12 comments
Closed
Labels
enhancement good first issue Good for beginners help wanted Features that maintainers are willing to accept but do not have cycles to implement

Comments

@yurishkuro
Copy link
Member

yurishkuro commented Jun 11, 2019

Requirement - what kind of business use case are you trying to solve?

Only open one UDP port on jaeger-agent, to minimize attack surface.

Problem - what in Jaeger blocks you from solving the requirement?

Agent currently uses distinct ports (jaegertracing/documentation#262) for different Thrift formats:

Port Protocol Function
6831 UDP accept jaeger.thrift in compact Thrift protocol used by most current Jaeger clients
6832 UDP accept jaeger.thrift in binary Thrift protocol used by Node.js Jaeger client (because thriftrw npm package does not support compact protocol)

Proposal - what do you suggest to solve the problem or improve the existing situation?

Perhaps it's possible to detect the format by inspecting a few bytes in the header of the message.

@chandresh-pancholi
Copy link
Contributor

@yurishkuro could you please help me out here where to start?

@yurishkuro
Copy link
Member Author

The bytes received from udp are decoded here: https://github.com/jaegertracing/jaeger/blob/master/cmd/agent/app/processors/thrift_processor.go

This processor is designed to work with a single thrift format (TProtocol), so will require some creative refactoring. But before that it would be good to just experiment with sending compact and binary payloads (eg using go and Node clients) and see if it's possible to tell which is which from the packet header.

@chandresh-pancholi
Copy link
Contributor

chandresh-pancholi commented Aug 6, 2019

It was very difficult for me to find out protocol from binary payload in binary protocol.
As for compact, the first byte 81 is a compact protocol id.
I suggest putting the if-else check.

if(byte == 82) {
// compact protocol
} else {
// binary protocol
}

@yurishkuro what's your suggestion?

@yurishkuro
Copy link
Member Author

we need to know how the first byte is interpreted in the binary protocol, i.e. is there a chance that it may be also equal to 82

@chandresh-pancholi
Copy link
Contributor

We can figure out Protocols by reading bytes. Below is the code.
In the binary protocol, I am reading the first 4 bytes which are significant to understand the protocol.
In compact, first byte is good enough to figure out the protocol.

func (s *ThriftProcessor) isBinaryProtocol(payload []byte)  {
	buf := payload[0:4]

	value := int32(binary.BigEndian.Uint32(buf))
	version := int64(int64(value) & thrift.VERSION_MASK)
	if version == thrift.VERSION_1{
		s.logger.Info("binary protocol," zap.Any("version", version))
	}

}

func (s *ThriftProcessor) isCompactProtocol(payload []byte) int  {
	buf := bytes.NewBuffer(payload[0:1])

	protocolId, err := buf.ReadByte()
	if err != nil{
		s.logger.Error("read byte buffer failed",  zap.Error(err))
	}

	if protocolId == thrift.COMPACT_PROTOCOL_ID {
		s.logger.Info("protocolId", zap.Any("proto", protocolId))
	}

	return thrift.COMPACT_PROTOCOL_ID
}

@yurishkuro
Copy link
Member Author

Did you check if we're using "strict writes" in binary? because in non-strict mode binary writes struct name without any prefix.

If strict write is confirmed, then the check seems to be even simpler, binary starts with 0x80 (first byte of VERSION_1 = 0x80010000), compact with 0x82 (COMPACT_PROTOCOL_ID).

@chandresh-pancholi
Copy link
Contributor

chandresh-pancholi commented Aug 20, 2019

Yes, Jaeger binary protocol uses thrift.NewTBinaryProtocolFactoryDefault() which default sets strictWrite true

builder.go L64

var (
	protocolFactoryMap = map[Protocol]thrift.TProtocolFactory{
		compactProtocol: thrift.NewTCompactProtocolFactory(),
		binaryProtocol:  thrift.NewTBinaryProtocolFactoryDefault(),
	}
)
func NewTBinaryProtocolFactoryDefault() *TBinaryProtocolFactory {
	return NewTBinaryProtocolFactory(false, true)
}

@chandresh-pancholi
Copy link
Contributor

@yurishkuro Could you suggest a way for going forward? We can use the above logic to run it on 1 port. We can deduce the protocol type if run it in single port (e.g. 6831).

What do you think?

@yurishkuro
Copy link
Member Author

Yes, it looks like we can do that. Do you want to create a pull request?

We already integration-test clients with the backend:

- go
- node
- java
- python
- zipkin-brave-thrift

@yurishkuro
Copy link
Member Author

I just had a look at the agent code, the refactoring to support multiple Thrift protocols on the same port is going to be pretty invasive. It might look like this:

  • introduce a notion of protocol sniffer that will inspect the payload and return the Protocol constant
    type Protocol string
  • there is a nasty business with metrics which are currently labeled by the protocol, but that assumption won't hold if 2 protocols are supported by a server/processor combo. Suggestion:
    • server metrics should be labeled by something other than the protocol, e.g. by its port number
    • each processor would have to maintain >1 group of metrics, one for each protocol
  • instead of resolving the protocol factory in
    protoFactory, ok := protocolFactoryMap[cfg.Protocol]
    pass the whole map into GetThriftProcessor() along with the sniffer
  • The ThritfProcessor would need to keep N sync pools (N=2 in practice) for each of the protocol factories, and two groups of metrics
  • The logic in processBuffer() would be
    func (s *ThriftProcessor) processBuffer() {
    • invoke sniffer to determine the protocol of the messag
    • look up sync.Pool of factories in a map[Protocol]*sync.Pool
    • look up metrics container
    • continue as usual with ^ these

@jpkrohling
Copy link
Contributor

I think this brings little value for the complexity it brings.

@yurishkuro
Copy link
Member Author

UDP servers are deprecated

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement good first issue Good for beginners help wanted Features that maintainers are willing to accept but do not have cycles to implement
Projects
None yet
Development

No branches or pull requests

3 participants