From c8f16a376913018ee07d13ad39d4554f048047a0 Mon Sep 17 00:00:00 2001
From: Daijiro Fukuda
Date: Fri, 25 Mar 2022 16:25:20 +0900
Subject: [PATCH 1/3] Fix wrong description of total retry time

See also: https://github.com/fluent/fluentd/pull/3649

Before the fix and https://github.com/fluent/fluentd/pull/3640, both the
retry behavior and the document were wrong.

Actual behavior: `c + c + cb^1 + ... + cb^(k-1)` (k+1 retries in total)
Total calculation: `cb^(-1) + c + cb^1 + ... + cb^(k-2)`

Signed-off-by: Daijiro Fukuda
---
 buffer/README.md                | 2 +-
 configuration/buffer-section.md | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/buffer/README.md b/buffer/README.md
index b46fa687..a33f47f8 100644
--- a/buffer/README.md
+++ b/buffer/README.md
@@ -41,7 +41,7 @@ A chunk can fail to be written out to the destination for a number of reasons. T
 By default, Fluentd increases the wait interval exponentially for each retry attempt. For example, assuming that the initial wait interval is set to 1 second and the exponential factor is 2, each attempt occurs at the following time points:

 ```text
-1 2   4       8               16
+0 1   3       7               15
 x-x---x-------x---------------x-------------------------
 │ │   │       │               └─ 4th retry (wait = 8s)
 │ │   │       └───────────────── 3th retry (wait = 4s)
diff --git a/configuration/buffer-section.md b/configuration/buffer-section.md
index b33f90fc..6da89cb5 100644
--- a/configuration/buffer-section.md
+++ b/configuration/buffer-section.md
@@ -546,7 +546,8 @@ With `exponential_backoff`, `retry_wait` interval will be calculated as below:
 * c: constant factor, `@retry_wait`
 * b: base factor, `@retry_exponential_backoff_base`
 * k: number of retry times
-* total retry time: `c + c * b^1 + (...) + c*b^k = c*b^(k+1) - 1`
+* total retry time: `c + c*b^1 + (...) + c*b^(k-1) = c*(b^k - 1) / (b - 1)`
+  * = `2^k - 1` by default

 If this article is incorrect or outdated, or omits critical information, please [let us know](https://github.com/fluent/fluentd-docs-gitbook/issues?state=open).
 [Fluentd](http://www.fluentd.org/) is an open-source project under [Cloud Native Computing Foundation \(CNCF\)](https://cncf.io/). All components are available under the Apache 2 License.

From a6aa7e95eb7444b9adaf4dc8fb788bb92427a7e9 Mon Sep 17 00:00:00 2001
From: Daijiro Fukuda
Date: Thu, 31 Mar 2022 14:10:36 +0900
Subject: [PATCH 2/3] Fix wrong number of retry times

See also: https://github.com/fluent/fluentd/pull/3649

Before the fix, the number was mistakenly 19. This fixes it to 18.
(The document was also wrong.)

The n-th retry is triggered `2^n - 1` seconds after the first flush,
so the 18th retry would be at 262143 seconds, which exceeds 72 hours
(259200 seconds).
If the next retry is going to exceed this time limit, the last retry
will be made at exactly this time limit, so the last retry is the 18th.

Signed-off-by: Daijiro Fukuda
---
 output/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/output/README.md b/output/README.md
index ec2fbb53..b21b911a 100644
--- a/output/README.md
+++ b/output/README.md
@@ -204,7 +204,7 @@ Default: `72` \(hours\)

 #### `retry_max_times`

-The maximum number of times to retry to flush while failing. If `retry_timeout` is the default, the number is 17 with exponential backoff.
+The maximum number of times to retry to flush while failing. If `retry_timeout` is the default, the number is 18 with exponential backoff.

 Default: `nil`

From 06de24a5ef755e4bd2f0b0dbe56718938d450973 Mon Sep 17 00:00:00 2001
From: Daijiro Fukuda
Date: Thu, 31 Mar 2022 14:12:12 +0900
Subject: [PATCH 3/3] Add description about last retry and add examples

Signed-off-by: Daijiro Fukuda
---
 configuration/buffer-section.md |  6 ++++-
 output/README.md                | 48 +++++++++++++++++++++++++++++++--
 2 files changed, 51 insertions(+), 3 deletions(-)

diff --git a/configuration/buffer-section.md b/configuration/buffer-section.md
index 6da89cb5..8ec5a7bf 100644
--- a/configuration/buffer-section.md
+++ b/configuration/buffer-section.md
@@ -494,7 +494,11 @@ Following are the flushing parameters for chunks to optimize performance \(laten
   * Default: 72h
   * The maximum time \(seconds\) to retry to flush again the failed chunks,

-    until the plugin discards the buffer chunks
+    until the plugin discards the buffer chunks.
+
+    If the next retry is going to exceed this time limit, the last retry
+
+    will be made at exactly this time limit.
 * `retry_forever` \[bool\]
   * Default: `false`
   * If true, plugin will ignore `retry_timeout` and `retry_max_times`
diff --git a/output/README.md b/output/README.md
index b21b911a..98af696d 100644
--- a/output/README.md
+++ b/output/README.md
@@ -181,6 +181,18 @@ If the bottom chunk write out fails, it will remain in the queue and Fluentd wil

 Writing out the bottom chunk is considered to be a failure if `Output#write` or `Output#try_write` method throws an exception.

+The retry timings of `retry_timeout: 100s`.
+
+| N-th retry | Elapsed |
+| :--- | :--- |
+| 1st | 1s |
+| 2nd | 3s |
+| 3rd | 7s |
+| 4th | 15s |
+| 5th | 31s |
+| 6th | 63s |
+| 7th | 100s |
+
 #### `retry_type`

 Specifies how to wait for the next retry to flush buffer.
@@ -198,9 +210,9 @@ Default: `false`

 #### `retry_timeout`

-The maximum seconds to retry to flush while failing, until the plugin discards the buffer chunks.
+The maximum seconds to retry to flush while failing, until the plugin discards the buffer chunks. If the next retry is going to exceed this time limit, the last retry will be made at exactly this time limit.

-Default: `72` \(hours\)
+Default: `72h` \(72 hours\)

 #### `retry_max_times`

@@ -268,5 +280,37 @@ This example sends logs to Elasticsearch using a file buffer `/var/log/td-agent/

 NOTE: `` plugin receives the primary's buffer chunk directly. So, you need to check if your secondary plugin works with the primary setting.

+The retry timings of `retry_timeout: 100s` with the secondary.
+
+| N-th retry | Elapsed | Output plugin |
+| :--- | :--- | :--- |
+| 1st | 1s | primary |
+| 2nd | 3s | primary |
+| 3rd | 7s | primary |
+| 4th | 15s | primary |
+| 5th | 31s | primary |
+| 6th | 63s | primary |
+| 7th | 80s | secondary |
+| 8th | 81s | secondary |
+| 9th | 83s | secondary |
+| 10th | 87s | secondary |
+| 11th | 95s | secondary |
+| 12th | 100s | secondary |
+
+The retry timings of `retry_max_times: 10` with the secondary.
+
+| N-th retry | Elapsed | Output plugin |
+| :--- | :--- | :--- |
+| 1st | 1s | primary |
+| 2nd | 3s | primary |
+| 3rd | 7s | primary |
+| 4th | 15s | primary |
+| 5th | 31s | primary |
+| 6th | 63s | primary |
+| 7th | 127s | primary |
+| 8th | 255s | primary |
+| 9th | 511s | primary |
+| 10th | 818s | secondary |
+
 If this article is incorrect or outdated, or omits critical information, please [let us know](https://github.com/fluent/fluentd-docs-gitbook/issues?state=open).
 [Fluentd](http://www.fluentd.org/) is an open-source project under [Cloud Native Computing Foundation \(CNCF\)](https://cncf.io/). All components are available under the Apache 2 License.
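
Review note: the timings documented in this series can be cross-checked with a small model. The sketch below is not Fluentd code. The names `retry_schedule` and `closed_form` are hypothetical helpers, and the model assumes no randomized jitter, no maximum retry interval, and no secondary output. Under those assumptions it reproduces the `retry_timeout: 100s` table without the secondary and the count of 18 retries for the default 72-hour timeout.

```python
def retry_schedule(retry_timeout, retry_wait=1.0, base=2.0):
    """Elapsed seconds (since the first failed flush) at which each retry fires.

    Simplified model, not Fluentd's implementation: no jitter,
    no maximum interval, no secondary output.
    """
    schedule = []
    elapsed = 0.0
    wait = retry_wait
    while True:
        elapsed += wait
        if elapsed >= retry_timeout:
            # The next retry would pass the limit, so the last retry
            # is made at exactly the limit.
            schedule.append(float(retry_timeout))
            return schedule
        schedule.append(elapsed)
        wait *= base  # exponential backoff: c, c*b, c*b^2, ...


def closed_form(k, retry_wait=1.0, base=2.0):
    """c * (b^k - 1) / (b - 1): elapsed time of the k-th retry (needs base != 1)."""
    return retry_wait * (base ** k - 1) / (base - 1)


timings = retry_schedule(retry_timeout=100)
print(timings)  # [1.0, 3.0, 7.0, 15.0, 31.0, 63.0, 100.0]
print(len(retry_schedule(retry_timeout=72 * 3600)))  # 18
```

For every retry before the capped one, the loop and the closed form `c*(b^k - 1) / (b - 1)` agree, which is the identity PATCH 1/3 corrects in the documentation.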