Bug of Extension Config Discovery Service #19628

Closed
johnlanni opened this issue Jan 20, 2022 · 4 comments
Labels
area/configuration, bug, stale

johnlanni commented Jan 20, 2022

Title: Bug of Extension Config Discovery Service

Description:
Envoy crashes with a segmentation fault in FilterConfigSubscription::onConfigUpdate() when Istio's WasmPlugin resource is created and deleted multiple times.

Repro steps:
Istio version: 1.12.1
Envoy version: 1.20.1 (the istio-proxy build)

Step 1. Create a WasmPlugin resource like this:

# wasmplugin.yml
apiVersion: extensions.istio.io/v1alpha1
kind: WasmPlugin
metadata:
  name: wasmtest
  namespace: istio-system
spec:
  selector:
    matchLabels:
      istio: ingressgateway
  url: oci://registry.cn-hangzhou.aliyuncs.com/ztygw/gowasm:0.1

kubectl create -f wasmplugin.yml

Step 2. Delete the WasmPlugin:

kubectl delete -f wasmplugin.yml

Step 3. Change the OCI image tag and create the resource again:

# wasmplugin.yml
apiVersion: extensions.istio.io/v1alpha1
kind: WasmPlugin
metadata:
  name: wasmtest
  namespace: istio-system
spec:
  selector:
    matchLabels:
      istio: ingressgateway
  url: oci://registry.cn-hangzhou.aliyuncs.com/ztygw/gowasm:0.3

kubectl create -f wasmplugin.yml

Repeat steps 1-3 until the crash appears.
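
The repeated create/delete cycle could be automated roughly as follows. This is only a sketch under assumptions: the helper names (render_manifest, run_round, repro) are hypothetical, the manifest is piped to kubectl on stdin instead of being read from wasmplugin.yml, and each round deletes the resource before the next create so that the two image tags alternate.

```python
import subprocess

# Manifest from the steps above, with the image tag left as a placeholder.
MANIFEST_TEMPLATE = """\
apiVersion: extensions.istio.io/v1alpha1
kind: WasmPlugin
metadata:
  name: wasmtest
  namespace: istio-system
spec:
  selector:
    matchLabels:
      istio: ingressgateway
  url: oci://registry.cn-hangzhou.aliyuncs.com/ztygw/gowasm:{tag}
"""


def render_manifest(tag):
    """Return the WasmPlugin manifest for one image tag."""
    return MANIFEST_TEMPLATE.format(tag=tag)


def run_round(tag, run=subprocess.run):
    """Create the WasmPlugin for `tag`, then delete it again."""
    manifest = render_manifest(tag)
    for verb in ("create", "delete"):
        # "-f -" makes kubectl read the manifest from stdin.
        run(["kubectl", verb, "-f", "-"], input=manifest, text=True, check=True)


def repro(rounds, tags=("0.1", "0.3"), run=subprocess.run):
    """Alternate between the image tags for the given number of rounds."""
    for i in range(rounds):
        run_round(tags[i % len(tags)], run=run)
```

Calling repro(50) against a test cluster would run 50 create/delete rounds. How many rounds are needed to hit the crash is not stated in this report.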

Logs:

2022-01-20T02:55:54.060734Z	info	xdsproxy	connected to upstream XDS server: istiod.istio-system.svc:15012
2022-01-20T02:56:05.742356Z	critical	envoy backtrace	Caught Segmentation fault, suspect faulting address 0xffffffffffffff58
2022-01-20T02:56:05.742372Z	critical	envoy backtrace	Backtrace (use tools/stack_decode.py to get line numbers):
2022-01-20T02:56:05.742374Z	critical	envoy backtrace	Envoy version: e6f45abcf874983fbff384459d70b28c072f68b5/1.20.1/Clean/RELEASE/BoringSSL
2022-01-20T02:56:05.742553Z	critical	envoy backtrace	#0: __restore_rt [0x7f2ef5cc93c0]
2022-01-20T02:56:05.746260Z	critical	envoy backtrace	#1: Envoy::Filter::FilterConfigSubscription::onConfigUpdate() [0x563a40ed25af]
2022-01-20T02:56:05.749998Z	critical	envoy backtrace	#2: Envoy::Config::GrpcSubscriptionImpl::onConfigUpdate() [0x563a40f67cb9]
2022-01-20T02:56:05.753451Z	critical	envoy backtrace	#3: Envoy::Config::GrpcMuxImpl::onDiscoveryResponse() [0x563a40f6db27]
2022-01-20T02:56:05.756899Z	critical	envoy backtrace	#4: Envoy::Grpc::AsyncStreamCallbacks<>::onReceiveMessageRaw() [0x563a40f6f976]
2022-01-20T02:56:05.760341Z	critical	envoy backtrace	#5: Envoy::Grpc::AsyncStreamImpl::onData() [0x563a40f86365]
2022-01-20T02:56:05.763789Z	critical	envoy backtrace	#6: Envoy::Http::AsyncStreamImpl::encodeData() [0x563a40f8b722]
2022-01-20T02:56:05.767228Z	critical	envoy backtrace	#7: Envoy::Router::UpstreamRequest::decodeData() [0x563a40fc06e1]
2022-01-20T02:56:05.770651Z	critical	envoy backtrace	#8: Envoy::Http::ResponseDecoderWrapper::decodeData() [0x563a40daa29b]
2022-01-20T02:56:05.774082Z	critical	envoy backtrace	#9: Envoy::Http::Http2::ConnectionImpl::onFrameReceived() [0x563a40f396a5]
2022-01-20T02:56:05.777518Z	critical	envoy backtrace	#10: Envoy::Http::Http2::ConnectionImpl::Http2Callbacks::Http2Callbacks()::$_21::__invoke() [0x563a40f42120]
2022-01-20T02:56:05.780955Z	critical	envoy backtrace	#11: nghttp2_session_on_data_received [0x563a412827dc]
2022-01-20T02:56:05.784392Z	critical	envoy backtrace	#12: nghttp2_session_mem_recv [0x563a4128484f]
2022-01-20T02:56:05.787827Z	critical	envoy backtrace	#13: Envoy::Http::Http2::ConnectionImpl::dispatch() [0x563a40f37fb0]
2022-01-20T02:56:05.791261Z	critical	envoy backtrace	#14: Envoy::Http::Http2::ConnectionImpl::dispatch() [0x563a40f38c85]
2022-01-20T02:56:05.794691Z	critical	envoy backtrace	#15: Envoy::Http::CodecClient::onData() [0x563a40e16b00]
2022-01-20T02:56:05.798117Z	critical	envoy backtrace	#16: Envoy::Http::CodecClient::CodecReadFilter::onData() [0x563a40e17c85]
2022-01-20T02:56:05.801557Z	critical	envoy backtrace	#17: Envoy::Network::FilterManagerImpl::onContinueReading() [0x563a411923df]
2022-01-20T02:56:05.804990Z	critical	envoy backtrace	#18: Envoy::Network::ConnectionImpl::onReadReady() [0x563a4118baa6]
2022-01-20T02:56:05.808486Z	critical	envoy backtrace	#19: Envoy::Network::ConnectionImpl::onFileEvent() [0x563a411894ef]
2022-01-20T02:56:05.811924Z	critical	envoy backtrace	#20: std::__1::__function::__func<>::operator()() [0x563a4116f651]
2022-01-20T02:56:05.815362Z	critical	envoy backtrace	#21: Envoy::Event::FileEventImpl::assignEvents()::$_1::__invoke() [0x563a41170b5c]
2022-01-20T02:56:05.818809Z	critical	envoy backtrace	#22: event_process_active_single_queue [0x563a4129a680]
2022-01-20T02:56:05.822248Z	critical	envoy backtrace	#23: event_base_loop [0x563a41299091]
2022-01-20T02:56:05.825663Z	critical	envoy backtrace	#24: Envoy::Server::InstanceImpl::run() [0x563a40b7939b]
2022-01-20T02:56:05.829037Z	critical	envoy backtrace	#25: Envoy::MainCommonBase::run() [0x563a3f1ade54]
2022-01-20T02:56:05.832440Z	critical	envoy backtrace	#26: Envoy::MainCommon::main() [0x563a3f1ae6c6]
2022-01-20T02:56:05.835803Z	critical	envoy backtrace	#27: main [0x563a3f1aa8cc]
2022-01-20T02:56:05.835859Z	critical	envoy backtrace	#28: __libc_start_main [0x7f2ef5ae90b3]
AsyncClient 0x563a44392580, stream_id_: 11328871343484645825
&stream_info_:
  StreamInfoImpl 0x563a44392750, upstream_connection_id_: 2164527, protocol_: 1, response_code_: null, response_code_details_: null, attempt_count_: 1, health_check_request_: 0, route_name_:
Http2::ConnectionImpl 0x563a442de3d0, max_headers_kb_: 60, max_headers_count_: 100, per_stream_buffer_limit_: 268435456, allow_metadata_: 0, stream_error_on_invalid_http_messaging_: 0, is_outbound_flood_monitored_control_frame_: 0, skip_encoding_empty_trailers_: 1, dispatching_: 1, raised_goaway_: 0, pending_deferred_reset_streams_.size(): 0
&protocol_constraints_:
  ProtocolConstraints 0x563a442de430, outbound_frames_: 0, max_outbound_frames_: 10000, outbound_control_frames_: 0, max_outbound_control_frames_: 1000, consecutive_inbound_frames_with_empty_payload_: 0, max_consecutive_inbound_frames_with_empty_payload_: 1, opened_streams_: 1, inbound_priority_frames_: 0, max_inbound_priority_frames_per_stream_: 100, inbound_window_update_frames_: 6, outbound_data_frames_: 9, max_inbound_window_update_frames_per_data_frame_sent_: 10
Number of active streams: 1, current_stream_id_: 1 Dumping current stream:
stream:
  ConnectionImpl::StreamImpl 0x563a442de1e0, stream_id_: 1, unconsumed_bytes_: 0, read_disable_count_: 0, local_end_stream_: 0, local_end_stream_sent_: 0, remote_end_stream_: 0, data_deferred_: 1, received_noninformational_headers_: 1, pending_receive_buffer_high_watermark_called_: 0, pending_send_buffer_high_watermark_called_: 0, reset_due_to_messaging_error_: 0, cookies_:   pending_trailers_to_encode_:   null
  absl::get<ResponseHeaderMapPtr>(headers_or_trailers_):   null
Dumping corresponding downstream request for upstream stream 1:
  UpstreamRequest 0x563a442d1200
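
To work with the frame addresses in a backtrace like the one above, the frame lines can be extracted with a small parser. This is a minimal sketch that assumes the line layout shown in this log ("#N: symbol [0xaddr]" at the end of each line). The name parse_frames is hypothetical.

```python
import re

# Matches the trailing "#N: symbol [0xaddr]" part of a backtrace line.
FRAME_RE = re.compile(r"#(\d+): (.+) \[(0x[0-9a-f]+)\]\s*$")


def parse_frames(log_text):
    """Return (frame_number, symbol, address) tuples for each backtrace line."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            number, symbol, address = match.groups()
            frames.append((int(number), symbol, int(address, 16)))
    return frames
```

Lines that do not end in a frame entry, such as the "Envoy version" line or the state dump after the backtrace, are skipped.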

Call Stack:

I traced the address 0x563a40ed25af, reported for Envoy::Filter::FilterConfigSubscription::onConfigUpdate(), to this line of code:

bool is_terminal_filter = factory.isTerminalFilterByProto(*message, factory_context_);

This is how I found it.

First, find the static address of main in the binary:

objdump -Cd usr/local/bin/envoy | fgrep '<main>' -A 20

Then compute the static address corresponding to 0x563a40ed25af:

python -c 'print(hex(0x557971ab15af-0x55796fd898cc+0x15d08c7))'
# result: 0x32f85aa

Finally, use addr2line to get the source line:

addr2line -Ce /usr/local/bin/envoy  0x32f85aa
# result: /proc/self/cwd/external/envoy/source/common/filter/config_discovery_impl.cc:131
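
The address arithmetic above can be written as a small helper. This sketch assumes the binary is loaded at one uniform offset, so the distance between two frames is the same at runtime and in the file. The name static_address is hypothetical. The runtime addresses in the python -c command differ from those in the log above, but the offset between the faulting frame and main is the same for both pairs, so both give the same result.

```python
def static_address(runtime_addr, runtime_ref, static_ref):
    """Map a runtime address to a static one via a reference known in both spaces."""
    return runtime_addr - runtime_ref + static_ref


# Address pair used in the python -c command above.
from_command = static_address(0x557971AB15AF, 0x55796FD898CC, 0x15D08C7)

# Frames #1 and #27 from the log in this issue.
from_log = static_address(0x563A40ED25AF, 0x563A3F1AA8CC, 0x15D08C7)

print(hex(from_command), hex(from_log))  # 0x32f85aa 0x32f85aa
```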

johnlanni added the bug and triage labels Jan 20, 2022

@bianpengyuan
Contributor

Same issue as #14930 which @kyessenov has been working on.

@kyessenov
Contributor

Yeah, I have a fix but it's quite tricky to validate.

lizan added the area/configuration label and removed the triage label Jan 20, 2022

@github-actions

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.

github-actions bot added the stale label Feb 20, 2022

@github-actions

This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted" or "no stalebot". Thank you for your contributions.
