Hubble loses the FQDN reference after a few moments after it is resolved #1438

Open
rooque opened this issue Apr 3, 2024 · 5 comments

rooque commented Apr 3, 2024

I'm seeing some strange behavior with FQDN destinations in Hubble. I have two FQDNs that two pods connect to:

redis.sandbox.whitelabel.com.br
db.sandbox.whitelabel.com.br

At first, the FQDNs appear correctly in both the service map and the flow list. After a few moments, Hubble starts treating the FQDNs' IPs as "world" (see images).

Images: five screenshots of the Hubble UI (service map and flow list, captured 2024-03-28) illustrating the behavior described above; not reproduced here.

The problem is not just in the UI; the CLI shows the same thing:

Apr  2 15:56:45.180: services/poc-bff-66f46654c-7565m:45148 (ID:103419) -> redis.sandbox.whitelabel.com.br:6379 (ID:16777220) to-stack FORWARDED (TCP Flags: ACK)
Apr  2 15:56:45.180: services/poc-bff-66f46654c-7565m:45148 (ID:103419) <- redis.sandbox.whitelabel.com.br:6379 (ID:16777220) to-endpoint FORWARDED (TCP Flags: ACK)
Apr  2 15:56:49.724: services/poc-microservice-595c9fb49b-gcwvh:58090 (ID:82048) -> redis.sandbox.whitelabel.com.br:6379 (ID:16777220) to-stack FORWARDED (TCP Flags: ACK)
Apr  2 15:56:49.724: services/poc-microservice-595c9fb49b-gcwvh:58090 (ID:82048) <- redis.sandbox.whitelabel.com.br:6379 (ID:16777220) to-endpoint FORWARDED (TCP Flags: ACK)
Apr  2 15:57:05.276: services/poc-bff-66f46654c-7565m:45148 (ID:103419) -> redis.sandbox.whitelabel.com.br:6379 (ID:16777220) to-stack FORWARDED (TCP Flags: ACK)
Apr  2 15:57:05.276: services/poc-bff-66f46654c-7565m:45148 (ID:103419) <- redis.sandbox.whitelabel.com.br:6379 (ID:16777220) to-endpoint FORWARDED (TCP Flags: ACK)

But after some time (30 seconds):

Apr  2 15:57:40.028: services/poc-microservice-595c9fb49b-gcwvh:58090 (ID:82048) -> 10.6.132.252:6379 (ID:16777220) to-stack FORWARDED (TCP Flags: ACK)
Apr  2 15:57:40.028: services/poc-bff-66f46654c-7565m:45148 (ID:103419) -> 10.6.132.252:6379 (ID:16777220) to-stack FORWARDED (TCP Flags: ACK)
Apr  2 15:57:40.028: services/poc-microservice-595c9fb49b-gcwvh:58090 (ID:82048) <- 10.6.132.252:6379 (ID:16777220) to-endpoint FORWARDED (TCP Flags: ACK)
Apr  2 15:57:40.028: services/poc-bff-66f46654c-7565m:45148 (ID:103419) <- 10.6.132.252:6379 (ID:16777220) to-endpoint FORWARDED (TCP Flags: ACK)
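
For completeness, the same thing can be checked by filtering flows directly from the CLI (assuming the --to-fqdn, --to-ip and --last flags available in recent hubble releases; adjust if your version differs). Once the lookup has expired, presumably only the IP-based filter still matches these flows:

hubble observe --to-fqdn 'redis.sandbox.whitelabel.com.br' --last 20
hubble observe --to-ip 10.6.132.252 --last 20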

NetworkPolicy

apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
 name: bff-rule
 namespace: services
spec:
 endpointSelector:
   matchLabels:
     app: poc-bff
 ingress:
   - fromEndpoints:
       - {}
   - fromEndpoints:
       - matchLabels:
           io.kubernetes.pod.namespace: envoy-gateway-system
 egress:
   - toEndpoints:
       - {}
   - toEndpoints:
       - matchLabels:
           io.kubernetes.pod.namespace: kube-system
           k8s-app: kube-dns
     toPorts:
       - ports:
           - port: "53"
             protocol: ANY
         rules:
           dns:
             - matchPattern: "*"
   - toEndpoints:
       - matchLabels:
           io.kubernetes.pod.namespace: observability
   - toFQDNs:
       - matchPattern: "*.sandbox.whitelabel.com.br"

The FQDNs always show up in this cache list, even after Hubble starts showing them as "world":

# cilium-dbg fqdn cache list
Endpoint   Source       FQDN                               TTL   ExpirationTime             IPs            
1117       connection   redis.sandbox.whitelabel.com.br.   0     2024-04-02T14:16:09.553Z   10.6.132.252   
1459       connection   redis.sandbox.whitelabel.com.br.   0     2024-04-02T14:16:09.553Z   10.6.132.252   
1459       connection   db.sandbox.whitelabel.com.br.      0     2024-04-02T14:16:09.553Z   10.45.48.3 

Also: every time I restart the pods, Hubble shows the FQDNs for a very short time... then it starts showing them as "world" again.

Cilium Config

Client: 1.15.2 7cf57829 2024-03-13T15:34:43+02:00 go version go1.21.8 linux/amd64
Daemon: 1.15.2 7cf57829 2024-03-13T15:34:43+02:00 go version go1.21.8 linux/amd64

  agent-not-ready-taint-key: node.cilium.io/agent-not-ready
  arping-refresh-period: 30s
  auto-direct-node-routes: 'false'
  bpf-lb-acceleration: disabled
  bpf-lb-external-clusterip: 'false'
  bpf-lb-map-max: '65536'
  bpf-lb-sock: 'false'
  bpf-map-dynamic-size-ratio: '0.0025'
  bpf-policy-map-max: '16384'
  bpf-root: /sys/fs/bpf
  cgroup-root: /run/cilium/cgroupv2
  cilium-endpoint-gc-interval: 5m0s
  cluster-id: '1'
  cluster-name: gke-1
  cni-exclusive: 'true'
  cni-log-file: /var/run/cilium/cilium-cni.log
  controller-group-metrics: write-cni-file sync-host-ips sync-lb-maps-with-k8s-services
  custom-cni-conf: 'false'
  debug: 'false'
  debug-verbose: ''
  dnsproxy-enable-transparent-mode: 'true'
  egress-gateway-reconciliation-trigger-interval: 1s
  enable-auto-protect-node-port-range: 'true'
  enable-bgp-control-plane: 'false'
  enable-bpf-clock-probe: 'false'
  enable-endpoint-health-checking: 'true'
  enable-endpoint-routes: 'true'
  enable-envoy-config: 'true'
  enable-external-ips: 'false'
  enable-health-check-loadbalancer-ip: 'true'
  enable-health-check-nodeport: 'true'
  enable-health-checking: 'true'
  enable-host-port: 'false'
  enable-hubble: 'true'
  enable-hubble-open-metrics: 'true'
  enable-ipv4: 'true'
  enable-ipv4-big-tcp: 'false'
  enable-ipv4-masquerade: 'true'
  enable-ipv6: 'false'
  enable-ipv6-big-tcp: 'false'
  enable-ipv6-masquerade: 'true'
  enable-k8s-networkpolicy: 'true'
  enable-k8s-terminating-endpoint: 'true'
  enable-l2-neigh-discovery: 'true'
  enable-l7-proxy: 'true'
  enable-local-redirect-policy: 'false'
  enable-masquerade-to-route-source: 'false'
  enable-metrics: 'true'
  enable-node-port: 'false'
  enable-policy: default
  enable-remote-node-identity: 'true'
  enable-sctp: 'false'
  enable-svc-source-range-check: 'true'
  enable-vtep: 'false'
  enable-well-known-identities: 'false'
  enable-wireguard: 'true'
  enable-xt-socket-fallback: 'true'
  external-envoy-proxy: 'true'
  hubble-disable-tls: 'false'
  hubble-export-file-max-backups: '5'
  hubble-export-file-max-size-mb: '10'
  hubble-listen-address: ':4244'
  hubble-metrics: >-
    dns drop tcp flow port-distribution icmp
    httpV2:exemplars=true;labelsContext=source_ip,source_namespace,source_workload,destination_ip,destination_namespace,destination_workload,traffic_direction
  hubble-metrics-server: ':9965'
  hubble-socket-path: /var/run/cilium/hubble.sock
  hubble-tls-cert-file: /var/lib/cilium/tls/hubble/server.crt
  hubble-tls-client-ca-files: /var/lib/cilium/tls/hubble/client-ca.crt
  hubble-tls-key-file: /var/lib/cilium/tls/hubble/server.key
  identity-allocation-mode: crd
  identity-gc-interval: 15m0s
  identity-heartbeat-timeout: 30m0s
  install-no-conntrack-iptables-rules: 'false'
  ipam: kubernetes
  ipam-cilium-node-update-rate: 15s
  ipv4-native-routing-cidr: 10.0.0.0/18
  k8s-client-burst: '20'
  k8s-client-qps: '10'
  kube-proxy-replacement: 'false'
  kube-proxy-replacement-healthz-bind-address: ''
  loadbalancer-l7: envoy
  loadbalancer-l7-algorithm: round_robin
  loadbalancer-l7-ports: ''
  max-connected-clusters: '255'
  mesh-auth-enabled: 'true'
  mesh-auth-gc-interval: 5m0s
  mesh-auth-queue-size: '1024'
  mesh-auth-rotated-identities-queue-size: '1024'
  monitor-aggregation: medium
  monitor-aggregation-flags: all
  monitor-aggregation-interval: 5s
  node-port-bind-protection: 'true'
  nodes-gc-interval: 5m0s
  operator-api-serve-addr: 127.0.0.1:9234
  operator-prometheus-serve-addr: ':9963'
  policy-cidr-match-mode: ''
  preallocate-bpf-maps: 'false'
  procfs: /host/proc
  prometheus-serve-addr: ':9962'
  proxy-connect-timeout: '2'
  proxy-max-connection-duration-seconds: '0'
  proxy-max-requests-per-connection: '0'
  remove-cilium-node-taints: 'true'
  routing-mode: native
  service-no-backend-response: reject
  set-cilium-is-up-condition: 'true'
  set-cilium-node-taints: 'true'
  sidecar-istio-proxy-image: cilium/istio_proxy
  skip-cnp-status-startup-clean: 'false'
  synchronize-k8s-nodes: 'true'
  tofqdns-dns-reject-response-code: refused
  tofqdns-enable-dns-compression: 'true'
  tofqdns-endpoint-max-ip-per-hostname: '50'
  tofqdns-idle-connection-grace-period: 0s
  tofqdns-max-deferred-connection-deletes: '10000'
  tofqdns-proxy-response-max-delay: 100ms
  unmanaged-pod-watcher-interval: '15'
  vtep-cidr: ''
  vtep-endpoint: ''
  vtep-mac: ''
  vtep-mask: ''
  wireguard-persistent-keepalive: 0s
  write-cni-conf-when-ready: /host/etc/cni/net.d/05-cilium.conflist
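
(Note: tofqdns-min-ttl is not set above. If I understand that option correctly, it enforces a minimum lifetime for lookup entries in the FQDN cache, so raising it might keep the name-to-IP mapping around for longer; whether that would change what Hubble displays here is only an assumption. It would look something like this in the config:)

  tofqdns-min-ttl: '3600'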

SysDump

cilium-sysdump-20240403-174046.zip


saintdle commented Apr 4, 2024

Hi, I spent some time with some of the cilium maintainers to understand this issue better. Here's what has been found so far.

The FQDN cache keeps two kinds of records: lookups, which record a name being resolved to an IP by an external DNS server and carry a TTL, and connections, which track active connections from pods in the datapath and do not have a TTL set.

In your FQDN cache it seems like there are no lookups because the TTL has expired; however, you still have connections because the application is still communicating externally. We expect that your application is keeping a long-lived connection open and not making a subsequent lookup: we couldn't see any DNS lookups from the pods in the Hubble flows from the sysdump, and there are no SYN flows to those identities either.

Below is a quick capture from my lab showing lookup and connection records in the FQDN cache.

 k exec -n kube-system cilium-7hdqd -it -- cilium-dbg fqdn cache list
Defaulted container "cilium-agent" out of: cilium-agent, config (init), mount-bpf-fs (init), clean-cilium-state (init), install-cni-binaries (init)
Endpoint   Source       FQDN                                                                     TTL   ExpirationTime             IPs              
746        connection   jobs-app-kafka-brokers.tenant-jobs.svc.cluster.local.                    0     0001-01-01T00:00:00.000Z   10.244.2.48      
1279       lookup       api.github.com.                                                          2     2024-04-04T15:48:23.486Z   140.82.121.5     
1279       connection   loader.tenant-jobs.svc.cluster.local.                                    0     0001-01-01T00:00:00.000Z   10.109.69.105    
1279       connection   api.github.com.                                                          0     0001-01-01T00:00:00.000Z   140.82.121.6     
602        connection   elasticsearch-master.tenant-jobs.svc.cluster.local.                      0     0001-01-01T00:00:00.000Z   10.102.40.97     
350        connection   jobs-app-zookeeper-client.tenant-jobs.svc.cluster.local.                 0     0001-01-01T00:00:00.000Z   10.104.222.238   
350        connection   jobs-app-kafka-0.jobs-app-kafka-brokers.tenant-jobs.svc.cluster.local.   0     0001-01-01T00:00:00.000Z   10.244.2.48      

We looked into the Hubble code that pulls the IP/FQDN from the cache.

At the moment it seems like the behaviour is working as expected, or rather as coded. However, we think there is an opportunity to improve this behaviour.

Using the lookup entry looks like the right thing to do: with a connection entry there is no guarantee that the FQDN and IP remain the same throughout a long-lived connection, since the DNS record could be updated to a new IP address during that time.

However, because of that, the situation you have logged arises. We could fall back to using the connection item when there is no lookup item, and flag this in the hubble observe output in some way, so that you know it's a best-effort FQDN printout.

That would look something like this (example with an *):

Apr  2 15:56:45.180: services/poc-bff-66f46654c-7565m:45148 (ID:103419) -> redis.sandbox.whitelabel.com.br*:6379 (ID:16777220) to-stack FORWARDED (TCP Flags: ACK)

In this scenario where the TTL has expired for the lookup item in the cache, what would you like to happen?


rooque commented Apr 17, 2024

Hello @saintdle !

What I expect is to see those FQDNs in Hubble, not "world", even for long-lived connections. If the lookup has expired but there is still a connection, it should show the FQDN rather than "world".

Does that make sense?

PS: sorry for the delay.

@saintdle

Yes, sure. In that case, when the FQDN shown comes from a connection record because the lookup TTL has expired, it would be best to mark it as such.

@macmiranda

Hey @rooque, just wondering if you have any issues with egress toFQDNs policies because of this. I see the same behavior in Hubble, and when I tried to create a CNP using matchName I started getting dropped packets even though the hostname was allowed by the policy. Not saying it's related, but it sort of makes sense that Cilium would drop reserved:world packets.
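
A quick way to check whether those are actually policy drops (the verdict and IP filter flags should exist in recent hubble releases; <destination-ip> is a placeholder for the backend IP in question):

hubble observe --verdict DROPPED --to-ip <destination-ip> --last 50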

@saintdle

@rooque I stumbled on this in the docs; maybe it's useful for this use case for now:
https://docs.cilium.io/en/latest/contributing/development/debugging/#unintended-dns-policy-drops
