@alecholmes alecholmes commented Aug 14, 2025

Problem

The fleet plugin polls a backend for the latest config. Because the backend consists of multiple processes, each with its own in-memory fleet config cache, the plugin may receive configs out of order until the caches converge.

Example:

  • Config stored in server db is X
  • User updates config to Y (saved in server db)
  • Fleet client polls backend, gets X back (it hit process P1 where X is cached)
  • Fleet client polls backend, gets Y back (it hit process P2 where X is not cached)
  • Fleet client polls backend, gets X back (it hit process P1 where X is still cached)
  • Server cache timeout period passes, so all future calls to the backend return Y
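
The sequence above can be reproduced with a small model. This is only an illustrative sketch, not the backend's actual code: the `BackendProcess` class, the 60-second TTL, and the poll times are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BackendProcess:
    """Hypothetical backend process with its own in-memory config cache."""
    db: dict                       # shared "database" holding the current config
    ttl: float                     # assumed cache lifetime in seconds
    cached: Optional[str] = None   # config value held in this process's cache
    cached_at: float = 0.0         # time the cache entry was filled

    def get_config(self, now: float) -> str:
        # Refill the cache from the db only when it is empty or expired.
        if self.cached is None or now - self.cached_at >= self.ttl:
            self.cached = self.db["config"]
            self.cached_at = now
        return self.cached


db = {"config": "X"}
p1 = BackendProcess(db, ttl=60.0)
p2 = BackendProcess(db, ttl=60.0)

p1.get_config(now=0.0)   # P1 caches X
db["config"] = "Y"       # user updates the config in the db

responses = [
    p1.get_config(now=5.0),    # hits P1: stale X
    p2.get_config(now=10.0),   # hits P2: nothing cached, reads Y
    p1.get_config(now=15.0),   # hits P1 again: still X
    p1.get_config(now=61.0),   # P1's cache has expired: Y
]
print(responses)
```

From the client's point of view the config appears to go from Y back to X, which is the out-of-order behavior described above.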

Change

This updates the fleet config fetch logic to ignore a received config whose timestamp is at or before that of the newest locally saved config. This is fairly straightforward since the server sends a last-modified timestamp and our local config file naming scheme is timestamp-based.
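
A minimal sketch of that check follows. It is written in Python for brevity and is not the plugin's C implementation. The helper names `newest_local_timestamp` and `should_save` are hypothetical, and the sketch assumes that saved configs appear in the fleet directory as entries named by Unix timestamp (for example `1755312617`), optionally with a single extension.

```python
import os
import re
import tempfile

# Assumed naming scheme: "<unix timestamp>" with an optional single extension.
TIMESTAMP_NAME = re.compile(r"(\d+)(?:\.[^.]+)?", re.ASCII)


def newest_local_timestamp(config_dir):
    """Return the largest timestamp among entry names in config_dir, or None."""
    newest = None
    for name in os.listdir(config_dir):
        match = TIMESTAMP_NAME.fullmatch(name)
        if match:
            ts = int(match.group(1))
            if newest is None or ts > newest:
                newest = ts
    return newest


def should_save(config_dir, server_last_modified):
    """Save only when the server's config is strictly newer than every local one."""
    newest = newest_local_timestamp(config_dir)
    return newest is None or server_last_modified > newest


with tempfile.TemporaryDirectory() as config_dir:
    os.mkdir(os.path.join(config_dir, "1755312617"))         # existing local config
    open(os.path.join(config_dir, "cur.conf"), "w").close()  # non-timestamp entry, ignored
    decisions = [
        should_save(config_dir, 1755312000),  # older than local: skip
        should_save(config_dir, 1755312617),  # same as local: skip
        should_save(config_dir, 1755312700),  # newer than local: save
    ]
print(decisions)
```

The comparison is strict, so a config with the same timestamp as an existing one is also skipped, which is the "at or before" behavior described above.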


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change, please submit the following in a comment:

  • Example configuration file for the change
  • Debug log output from testing the change
  • Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • Run local packaging test showing all targets (including any new ones) build.
  • Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • Documentation required for this feature

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0; by submitting this pull request I understand that this code will be released under the terms of that license.

@pwhelan pwhelan left a comment

LGTM

@alecholmes

Valgrind output. The "Conditional jump or move depends on uninitialised value(s)" error is a longstanding issue unrelated to my change.

==1156907== Memcheck, a memory error detector
==1156907== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==1156907== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==1156907== Command: /home/alec/dev/fluent-bit/build/bin/fluent-bit --config local.yaml
==1156907==
Fluent Bit v4.1.0
* Copyright (C) 2015-2025 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

______ _                  _    ______ _ _             ___  _____
|  ___| |                | |   | ___ (_) |           /   ||  _  |
| |_  | |_   _  ___ _ __ | |_  | |_/ /_| |_  __   __/ /| || |/' |
|  _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / / /_| ||  /| |
| |   | | |_| |  __/ | | | |_  | |_/ / | |_   \ V /\___  |\ |_/ /
\_|   |_|\__,_|\___|_| |_|\__| \____/|_|\__|   \_/     |_(_)___/


[2025/08/15 12:51:00.522259185] [ info] [fluent bit] version=4.1.0, commit=420ab74d0c, pid=1156907
[2025/08/15 12:51:00.562978471] [ info] [custom:calyptia:calyptia.0] read UUID (be1831c6-4b82-4323-90e3-3a06741ab925) from file: /tmp/calyptia-fleet/machine-id.conf
[2025/08/15 12:51:00.736185947] [ info] [custom:calyptia:calyptia.0] custom initialized!
[2025/08/15 12:51:00.741206298] [ info] [storage] ver=1.5.3, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2025/08/15 12:51:00.741862978] [ info] [simd    ] disabled
[2025/08/15 12:51:00.742316654] [ info] [cmetrics] version=1.0.5
[2025/08/15 12:51:00.742730538] [ info] [ctraces ] version=0.6.6
[2025/08/15 12:51:00.755799261] [ info] [input:fluentbit_metrics:fluentbit_metrics.0] initializing
[2025/08/15 12:51:00.756752573] [ info] [input:fluentbit_metrics:fluentbit_metrics.0] storage_strategy='memory' (memory only)
[2025/08/15 12:51:00.898149027] [ info] [input:calyptia_fleet:calyptia_fleet.1] initializing
[2025/08/15 12:51:00.898389366] [ info] [input:calyptia_fleet:calyptia_fleet.1] storage_strategy='memory' (memory only)
[2025/08/15 12:51:00.899364302] [ info] [input:calyptia_fleet:calyptia_fleet.1] initializing calyptia fleet input.
[2025/08/15 12:51:00.920911531] [ info] [input:calyptia_fleet:calyptia_fleet.1] loading configuration from /tmp/calyptia-fleet/4803d408a0d43a9281ef4ae7720dfd7a23003204d98a4a4ffedca2f036db7206/hello/cur.conf.
[2025/08/15 12:51:00.922307685] [ info] [input:calyptia_fleet:calyptia_fleet.1] changing to config dir: /tmp/calyptia-fleet/4803d408a0d43a9281ef4ae7720dfd7a23003204d98a4a4ffedca2f036db7206/hello/1755312617
[2025/08/15 12:51:00.963190598] [ info] [sp] stream processor started
[2025/08/15 12:51:00.966329204] [ info] [engine] Shutdown Grace Period=5, Shutdown Input Grace Period=2
[2025/08/15 12:51:05] [engine] caught signal (SIGHUP)
[2025/08/15 12:51:06.983456566] [ info] reloading instance pid=1156907 tid=0x4852020
[2025/08/15 12:51:07.11055501] [ info] [reload] stop everything of the old context
[2025/08/15 12:51:07.13383715] [ warn] [engine] service will shutdown when all remaining tasks are flushed
[2025/08/15 12:51:07.14515821] [ info] [input] pausing fluentbit_metrics.0
[2025/08/15 12:51:07.15371381] [ info] [input] pausing calyptia_fleet.1
[2025/08/15 12:51:07.536786678] [ info] [engine] service has stopped (0 pending tasks)
[2025/08/15 12:51:07.537155310] [ info] [input] pausing fluentbit_metrics.0
[2025/08/15 12:51:07.537370148] [ info] [input] pausing calyptia_fleet.1
[2025/08/15 12:51:07.588988446] [ info] [reload] start everything
[2025/08/15 12:51:07.590673855] [ info] [fluent bit] version=4.1.0, commit=420ab74d0c, pid=1156907
[2025/08/15 12:51:07.591565665] [ info] [custom:calyptia:calyptia.0] read UUID (be1831c6-4b82-4323-90e3-3a06741ab925) from file: /tmp/calyptia-fleet/machine-id.conf
[2025/08/15 12:51:07.599324864] [ info] [custom:calyptia:calyptia.0] custom initialized!
[2025/08/15 12:51:07.599471159] [ info] [storage] ver=1.5.3, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2025/08/15 12:51:07.599563411] [ info] [simd    ] disabled
[2025/08/15 12:51:07.599644163] [ info] [cmetrics] version=1.0.5
[2025/08/15 12:51:07.599724123] [ info] [ctraces ] version=0.6.6
[2025/08/15 12:51:07.601054483] [ info] [input:dummy:dummy.0] initializing
[2025/08/15 12:51:07.601177652] [ info] [input:dummy:dummy.0] storage_strategy='memory' (memory only)
[2025/08/15 12:51:07.613014601] [ info] [input:fluentbit_metrics:fluentbit_metrics.1] initializing
[2025/08/15 12:51:07.613159770] [ info] [input:fluentbit_metrics:fluentbit_metrics.1] storage_strategy='memory' (memory only)
[2025/08/15 12:51:07.626139617] [ info] [input:calyptia_fleet:calyptia_fleet.2] initializing
[2025/08/15 12:51:07.626313121] [ info] [input:calyptia_fleet:calyptia_fleet.2] storage_strategy='memory' (memory only)
[2025/08/15 12:51:07.626462957] [ info] [input:calyptia_fleet:calyptia_fleet.2] initializing calyptia fleet input.
[2025/08/15 12:51:07.639953189] [ info] [input:calyptia_fleet:calyptia_fleet.2] fleet collector initialized with interval: 5 sec 0 nsec
[2025/08/15 12:51:07.706010323] [ info] [output:stdout:stdout.0] worker #0 started
[2025/08/15 12:51:08.456849487] [ info] [output:calyptia:calyptia.1] connected to Calyptia, agent_id='1207ac86-e1d7-48ce-8b1c-359b3b4e2c38'
[2025/08/15 12:51:08.469031318] [ info] [output:calyptia:calyptia.1] agent registration successful
[2025/08/15 12:51:08.470697060] [ info] [sp] stream processor started
[2025/08/15 12:51:08.470929607] [ info] [engine] Shutdown Grace Period=5, Shutdown Input Grace Period=2
[0] dummy.0: [[1755287468.536862529, {}], {"message"=>"dummy-v2"}]
[0] dummy.0: [[1755287469.538228531, {}], {"message"=>"dummy-v2"}]
[0] dummy.0: [[1755287470.535299113, {}], {"message"=>"dummy-v2"}]
[0] dummy.0: [[1755287471.535813806, {}], {"message"=>"dummy-v2"}]
[2025/08/15 12:51:12.976106639] [ info] [input:calyptia_fleet:calyptia_fleet.2] not creating file with timestamp 1755312617 since it is not newer than existing files
[0] dummy.0: [[1755287472.731787678, {}], {"message"=>"dummy-v2"}]
==1156907== Thread 2 flb-pipeline:
==1156907== Conditional jump or move depends on uninitialised value(s)
==1156907==    at 0x28283C: output_pre_cb_flush (flb_output.h:682)
==1156907==    by 0x152C453: co_switch (aarch64.c:133)
==1156907==    by 0xFFFFFFFFFFFFFFFF: ???
==1156907==  Uninitialised value was created by a heap allocation
==1156907==    at 0x4865058: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==1156907==    by 0x2C6073: flb_malloc (flb_mem.h:80)
==1156907==    by 0x2C7087: flb_task_create (flb_task.c:422)
==1156907==    by 0x2C1B8F: flb_engine_dispatch (flb_engine_dispatch.c:311)
==1156907==    by 0x2BAA07: flb_engine_flush (flb_engine.c:175)
==1156907==    by 0x2BC4EF: flb_engine_handle_event (flb_engine.c:580)
==1156907==    by 0x2BC4EF: flb_engine_start (flb_engine.c:1003)
==1156907==    by 0x23A6F3: flb_lib_worker (flb_lib.c:835)
==1156907==    by 0x50DD5B7: start_thread (pthread_create.c:442)
==1156907==    by 0x5145EDB: thread_start (clone.S:79)
==1156907==
[0] dummy.0: [[1755287473.540568894, {}], {"message"=>"dummy-v2"}]
[0] dummy.0: [[1755287474.536181321, {}], {"message"=>"dummy-v2"}]
[0] dummy.0: [[1755287475.535880748, {}], {"message"=>"dummy-v2"}]
[0] dummy.0: [[1755287476.535456797, {}], {"message"=>"dummy-v2"}]
[2025/08/15 12:51:17.765647822] [ info] [input:calyptia_fleet:calyptia_fleet.2] not creating file with timestamp 1755312617 since it is not newer than existing files
[0] dummy.0: [[1755287477.537935031, {}], {"message"=>"dummy-v2"}]
[0] dummy.0: [[1755287478.534827317, {}], {"message"=>"dummy-v2"}]
[0] dummy.0: [[1755287479.535325343, {}], {"message"=>"dummy-v2"}]
[0] dummy.0: [[1755287480.536359714, {}], {"message"=>"dummy-v2"}]
[0] dummy.0: [[1755287481.536223894, {}], {"message"=>"dummy-v2"}]
[2025/08/15 12:51:22.751935791] [ info] [input:calyptia_fleet:calyptia_fleet.2] not creating file with timestamp 1755312617 since it is not newer than existing files
[0] dummy.0: [[1755287482.547349928, {}], {"message"=>"dummy-v2"}]
[0] dummy.0: [[1755287483.535864628, {}], {"message"=>"dummy-v2"}]
[0] dummy.0: [[1755287484.536138941, {}], {"message"=>"dummy-v2"}]
[0] dummy.0: [[1755287485.535178771, {}], {"message"=>"dummy-v2"}]
[0] dummy.0: [[1755287486.536445355, {}], {"message"=>"dummy-v2"}]
[2025/08/15 12:51:27.783271259] [ info] [input:calyptia_fleet:calyptia_fleet.2] not creating file with timestamp 1755312617 since it is not newer than existing files
[0] dummy.0: [[1755287487.544901501, {}], {"message"=>"dummy-v2"}]
[0] dummy.0: [[1755287488.535402366, {}], {"message"=>"dummy-v2"}]
[0] dummy.0: [[1755287489.535115919, {}], {"message"=>"dummy-v2"}]
[0] dummy.0: [[1755287490.535504901, {}], {"message"=>"dummy-v2"}]
[0] dummy.0: [[1755287491.535992594, {}], {"message"=>"dummy-v2"}]
[2025/08/15 12:51:33.97791186] [ info] [input:calyptia_fleet:calyptia_fleet.2] not creating file with timestamp 1755312617 since it is not newer than existing files
^C[2025/08/15 12:51:33] [engine] caught signal (SIGINT)
[0] dummy.0: [[1755287492.537632435, {}], {"message"=>"dummy-v2"}]
[2025/08/15 12:51:33.402123491] [ warn] [engine] service will shutdown in max 5 seconds
[2025/08/15 12:51:33.404681418] [ info] [input] pausing dummy.0
[2025/08/15 12:51:33.405047634] [ info] [input] pausing fluentbit_metrics.1
[2025/08/15 12:51:33.405203012] [ info] [input] pausing calyptia_fleet.2
[2025/08/15 12:51:33.534630515] [ info] [engine] service has stopped (0 pending tasks)
[2025/08/15 12:51:33.534878646] [ info] [input] pausing dummy.0
[2025/08/15 12:51:33.535092733] [ info] [input] pausing fluentbit_metrics.1
[2025/08/15 12:51:33.535291029] [ info] [input] pausing calyptia_fleet.2
[2025/08/15 12:51:33.537725287] [ info] [output:stdout:stdout.0] thread worker #0 stopping...
[2025/08/15 12:51:33.540782224] [ info] [output:stdout:stdout.0] thread worker #0 stopped
==1156907==
==1156907== HEAP SUMMARY:
==1156907==     in use at exit: 0 bytes in 0 blocks
==1156907==   total heap usage: 44,180 allocs, 44,180 frees, 22,466,535 bytes allocated
==1156907==
==1156907== All heap blocks were freed -- no leaks are possible
==1156907==
==1156907== For lists of detected and suppressed errors, rerun with: -s
==1156907== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

@alecholmes

@pwhelan I'm going to close this so I can reopen the PR against the OSS repo.
