Add initial version #1

Merged
merged 1 commit into from
Oct 31, 2022

Conversation

stevehipwell
Collaborator

This PR adds the initial version of the Fluentd Aggregator based on stevehipwell/fluentd-aggregator.

Before this can be merged the following needs to be completed.

  • Add Docker credentials to repo settings
  • Configure GHCR repo?

After this has been merged the following actions need to be completed.

  • Template version updates
  • Improve docs
  • Use Fluentd Docker image as the base (when it is a multi-arch image)

@stevehipwell
Collaborator Author

@patrick-stephens the initial offering for discussion.

Collaborator

@patrick-stephens patrick-stephens left a comment

Looks decent to me although a few things need adding:

  • Linting using Hadolint, Shellcheck, ActionLint, MarkdownLint
  • Some more documentation
  • Dependabot for actions and dockerfile

Unless you're planning on adding images other than Alpine I would simplify the CI and push all files up to the repo root.

Can you raise an issue if you don't want to do a big bang PR for these?

.github/workflows/ci.yaml Outdated
.github/workflows/publish.yaml Outdated
.github/workflows/publish.yaml Outdated
README.md
v1.14/alpine/Dockerfile Outdated
v1.14/alpine/fluent.conf Outdated
.github/workflows/publish.yaml Outdated
.github/workflows/ci.yaml Outdated
.editorconfig Outdated
README.md Outdated
@stevehipwell
Collaborator Author

@patrick-stephens I need to review how the workflow for this is designed; what I've got here doesn't actually make sense. As you alluded to in a comment above, I don't have a good solution for the versioning. I am against the Fluentd Docker versioning strategy (-1.0 etc) as it isn't SemVer 2.

I'm now thinking that maybe this repo tracks the latest Fluentd minor release and bumps the major version when this changes (e.g. v1.14 -> v1 and v1.15 -> v2) which will allow Fluentd or plugin updates/additions to be represented by minor or patch versions. It would then be possible to patch old versions if ever required by taking a branch from a release tag and working from that to tag a new version.
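
As a rough sketch of the mapping described above, the image major version could be derived from the Fluentd minor version by a fixed offset. The function name `image_major_for_fluentd` and the offset of 13 are assumptions inferred from the two examples given (v1.14 -> v1, v1.15 -> v2); nothing in this PR defines them.

```shell
#!/bin/sh
# Hypothetical helper: map a Fluentd version (e.g. v1.15 or 1.15.2) to the
# aggregator image major version, assuming v1.14 -> 1 and v1.15 -> 2.
image_major_for_fluentd() {
    version="${1#v}"        # strip an optional leading "v"
    minor="${version#*.}"   # drop the major component
    minor="${minor%%.*}"    # drop any patch component
    echo $((minor - 13))    # assumed offset: Fluentd 1.14 is image major 1
}
```

Under this reading, Fluentd or plugin updates within the same Fluentd minor release would only ever change the image's minor or patch number.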

Linting using Hadolint, Shellcheck, ActionLint, MarkdownLint

Do you have an example?

Dependabot for actions and dockerfile

Isn't this set up at the repo level in the UI?

Trivy + Dockle scans

Do you have an example?

@stevehipwell
Collaborator Author

@patrick-stephens we probably want a Ruby expert to take a look at this as I suspect we might get better security scanning if we added the plugins via a different method. What would really make this easier would be if the fluentd-docker-image CI was updated for multi-arch builds then the quantity of code in the repo would drop significantly.

@patrick-stephens
Collaborator

I'm happy if you want to land this just to get things moving and we tackle the rest as issues. We can make sure it's not GA or a major release until then.

@stevehipwell
Collaborator Author

@patrick-stephens sorry I've been crazy busy but could you take a quick look at this again? It's now architected as a single rolling release with AMD64 and ARM64 builds for Alpine and Debian. I'll take a look at your previous comments but the idea would be to land it and release v0.0.1 before iterating on it.

@patrick-stephens
Collaborator

Sounds good, I'll try to have a look tomorrow if possible

@patrick-stephens
Collaborator

Do you want to mark as ready for review?

Collaborator

@patrick-stephens patrick-stephens left a comment

Looks fine to me, some comments but all can be done later.

@@ -0,0 +1,3 @@
* text=auto eol=lf
*.{cmd,[cC][mM][dD]} text eol=crlf
*.{bat,[bB][aA][tT]} text eol=crlf
Collaborator

Probably want to add Linux shell scripts as well as they don't like the wrong line ending either

Collaborator Author

Line 1 says everything will use lf unless explicitly overridden, so only file types needing crlf need specifying.

.github/workflows/pull-request.yaml
.github/workflows/merge.yaml Outdated
.github/workflows/release.yaml Outdated
@stevehipwell stevehipwell force-pushed the initial-version branch 4 times, most recently from 41a48b0 to 52f1b26 Compare September 15, 2022 12:53
@stevehipwell stevehipwell force-pushed the initial-version branch 2 times, most recently from 4a6fc5d to 5014a5f Compare October 4, 2022 13:44
@stevehipwell stevehipwell marked this pull request as ready for review October 4, 2022 13:56
@stevehipwell
Collaborator Author

@patrick-stephens this is ready for review now and should be a lot more fully featured than the previous attempt; it's based on the latest version of my Fluentd Aggregator which I'm currently in the process of testing.

If there is a Ruby expert working on the Fluentd project who could provide the correct configuration to use Bundler from the Dockerfile instead of the inline script we could have Dependabot watch for updates of the Ruby Gems.

The failing Grype code scanning results need addressing; I'd suggest setting the repo failure level in the UI and then either ignoring blocking issues in the UI or mitigating them (probably upstream of this repo). I suspect that any attempt to be more secure than the Fluentd Docker image is likely to be painful.

@patrick-stephens
Collaborator

@cosmo0920 might be a good person to ask about the Fluentd bundling, I'm afraid I try to avoid Ruby :)

Collaborator

@patrick-stephens patrick-stephens left a comment

Looks fine to me, only a few queries really.

One thing I would check is whether it is easier to combine those workflows to simplify future updates. There appears to be some duplication across commit, pull request and tags which means any update must be done to all three so I've had issues in the past with missing one out. I don't think this should block though.

.github/workflows/commit.yaml
@@ -0,0 +1,60 @@
FROM alpine:3.16
LABEL maintainer "Fluentd developers <[email protected]>"
LABEL description="Fluentd aggregator OCI image based on Fluentd v1.15.2" version="0.0.1"
Collaborator

Do you want the base version of Fluentd to be configurable via a build arg?

Note that the various actions can overwrite labels, and indeed set a load of the default OCI ones so it may be worth checking them.

Collaborator Author

I'm already using an action to do this, I'd just forgotten to remove the labels from the Dockerfiles. I've pushed up the change to do this now.

Collaborator Author

Do you want the base version of Fluentd to be configurable via a build arg?

I don't think we can do this as all of the other package versions need changing in coordination with this (it's also no longer in the description anyway).

@stevehipwell
Collaborator Author

One thing I would check is whether it is easier to combine those workflows to simplify future updates. There appears to be some duplication across commit, pull request and tags which means any update must be done to all three so I've had issues in the past with missing one out. I don't think this should block though.

As YAML isn't a programming language I'd choose duplication over the complexity of supporting the different workflows in a single file. For DRYing these out I'd recommend breaking the common patterns out into composite actions rather than trying to use a single workflow.

@patrick-stephens
Collaborator

One thing I would check is whether it is easier to combine those workflows to simplify future updates. There appears to be some duplication across commit, pull request and tags which means any update must be done to all three so I've had issues in the past with missing one out. I don't think this should block though.

As YAML isn't a programming language I'd choose duplication over the complexity of supporting the different workflows in a single file. For DRYing these out I'd recommend breaking the common patterns out into composite actions rather than trying to use a single workflow.

Or a reusable workflow but yeah. Whatever works for you I think: as I say doesn't really block anything here - my concern would be that you know it but anyone submitting a PR doesn't so it has to be covered in review.

@stevehipwell
Collaborator Author

Or a reusable workflow but yeah. Whatever works for you I think: as I say doesn't really block anything here - my concern would be that you know it but anyone submitting a PR doesn't so it has to be covered in review.

That'd be the next logical progression if other repos were using the same automation, and something which I'm likely to do but wouldn't be exclusively for this use case. It's a shame GitHub doesn't allow more granular controls for GH actions sources. RE the review, I'd suggest a CODEOWNERS file so any changes to the workflows need extra approval; IMHO this should be the case even if the actions were super simple.

@patrick-stephens
Collaborator

Yeah that's a good shout @stevehipwell , I don't want to block the review though on it unless you want to quickly add one.

@stevehipwell
Collaborator Author

@patrick-stephens I've added an empty CODEOWNERS file but I don't think we want to dry the automation out until we've got it working and have used it a few times.


@cosmo0920 cosmo0920 left a comment

Looks good, but I added comments about the label naming convention.

fluent.conf Outdated
fluent.conf Outdated
@@ -0,0 +1,18 @@
<source>
  @type forward
  @label @default

We should use uppercase for the label name:

-  @label @default
+  @label @DEFAULT

fluent.yaml Outdated
config:
  - source:
      $type: forward
      $label: '@default'

Ditto.

-      $label: '@default'
+      $label: '@DEFAULT'

fluent.yaml Outdated
@stevehipwell
Collaborator Author

If there is a Ruby expert working on the Fluentd project who could provide the correct configuration to use Bundler from the Dockerfile instead of the inline script we could have Dependabot watch for updates of the Ruby Gems.

@cosmo0920 would you be able to help with this?

@stevehipwell
Collaborator Author

I can help you with using Bundler in your Dockerfile.
But it seems that there is no need to pin specific plugin versions.
So we don't need to introduce any extra complexity, including Bundler.

@cosmo0920 there are a couple of reasons why I think we need specific versions. Firstly by pinning specific versions and using Dependabot we get notified about potential upstream changes, can choose to take them or not, and can then decide when a release is required. Secondly having the explicit versions makes the documentation clearer and tracking down issues much easier. Thirdly, and most importantly, CNCF security best practices (amongst others) say that dependencies should be immutable and captured in a SBOM. This is why I'd like to use Bundler to integrate with other tooling.
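
To illustrate the "explicit versions" point, here is a minimal sketch of a check that could run in CI to flag gems without an exact version pin. It assumes a Gemfile whose entries look like `gem "name", "x.y.z"`; the function name `unpinned_gems` is hypothetical and not part of this PR.

```shell
#!/bin/sh
# Hypothetical check: print the name of every gem declared in a Gemfile
# without an exact "x.y.z" version pin. Prints nothing when all are pinned.
unpinned_gems() {
    gemfile="$1"
    # Keep "gem" lines, drop those whose second argument is a plain version
    # string such as "1.15.2" (no operators like "~>"), then extract the name.
    grep -E '^[[:space:]]*gem[[:space:]]' "$gemfile" \
        | grep -Ev '^[[:space:]]*gem[[:space:]]+"[^"]+",[[:space:]]*"[0-9]+(\.[0-9]+)*"[[:space:]]*$' \
        | sed -E 's/^[[:space:]]*gem[[:space:]]+"([^"]+)".*/\1/'
}
```

A CI step could call `unpinned_gems Gemfile` and fail the build when the output is non-empty.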

@cosmo0920

Thanks for your reply. Hmm, it's reasonable to introduce bundler to manage Fluentd and its plugins' dependencies. 👍

@cosmo0920

cosmo0920 commented Oct 25, 2022

For bundler migration:

First, we have to create a Gemfile by hand to manage the minimal plugin dependencies:

source "https://rubygems.org"

gem "oj", "3.13.19"
gem "json", "2.6.2"
gem "async", "1.30.3"
gem "async-http", "0.56.6"
gem "fluentd", "1.15.2"
gem "fluent-plugin-azure-loganalytics", "0.7.0"
gem "fluent-plugin-azurestorage-gen2", "0.3.3"
gem "fluent-plugin-cloudwatch-logs", "0.14.3"
gem "fluent-plugin-concat", "2.5.0"
gem "fluent-plugin-datadog", "0.14.2"
gem "fluent-plugin-elasticsearch", "5.2.3"
gem "fluent-plugin-grafana-loki", "1.2.18"
gem "fluent-plugin-kafka", "0.18.1"
gem "fluent-plugin-opensearch", "1.0.8"
gem "fluent-plugin-prometheus", "2.0.3"
gem "fluent-plugin-record-modifier", "2.1.1"
gem "fluent-plugin-rewrite-tag-filter", "2.4.0"
gem "fluent-plugin-route", "1.0.0"
gem "fluent-plugin-s3", "1.7.1"
gem "fluent-plugin-sqs", "3.0.0"

Second, generate the complete dependency information (Gemfile.lock) using an Alpine image, or another lightweight image that includes the Ruby binary, for example:

$ docker run --rm -it --mount type=bind,source="$(pwd)"/Gemfile,target=/Gemfile,readonly \
                ruby:alpine sh -c "apk add --no-cache --quiet git && bundle lock --print --remove-platform x86_64-linux-musl --add-platform ruby" > Gemfile.lock

Finally, use the --gemfile option in entrypoint.sh to manage Fluentd and its plugin dependencies:

# ... rest of entrypoint.sh ...
    # Append the default Gemfile unless one was already passed via -g/--gemfile
    if ! echo "$@" | grep -e ' \-g' -e ' \-\-gemfile' ; then
       set -- "$@" --gemfile /path/to/Gemfile
    fi

ref: https://docs.fluentd.org/deployment/plugin-management#gemfile-option
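
The entrypoint step above can also be sketched as a standalone function, which makes the "only add a default when none was given" behaviour easier to see. This is an illustration only: the function name `with_default_gemfile` is hypothetical, it assumes `-g` is the short form of `--gemfile`, and it prints the resulting command line rather than rewriting the positional parameters as a real entrypoint would.

```shell
#!/bin/sh
# Hypothetical helper: print the command line with a default --gemfile
# appended, unless the caller already passed -g/--gemfile.
with_default_gemfile() {
    default_gemfile="$1"
    shift
    for arg in "$@"; do
        case "$arg" in
            -g|--gemfile|--gemfile=*)
                printf '%s\n' "$*"   # caller chose a Gemfile; leave as-is
                return 0
                ;;
        esac
    done
    printf '%s\n' "$* --gemfile $default_gemfile"
}
```

For example, `with_default_gemfile /fluentd/Gemfile fluentd -c /fluentd/etc/fluent.conf` appends the default, while a command line that already contains `--gemfile` is passed through untouched.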

@cosmo0920

For ongoing plugin management, we may want to create a Makefile or take a similar approach to provide a shortcut for updating Fluentd and its plugin dependencies.

@stevehipwell
Collaborator Author

Thanks @cosmo0920, would it be possible to apply the locked Gemfile during the Docker build rather than at runtime? I assume so but could you give me the correct command to do this?

Once I've got my questions answered I'll give the Gemfile a try on my stevehipwell/fluentd-aggregator image which is the basis for this one.

@cosmo0920

cosmo0920 commented Oct 25, 2022

Ah, I forgot to mention the following step:

COPY Gemfile* /fluentd/

@stevehipwell
Collaborator Author

@cosmo0920 I think it'd be best to apply the Gemfile (with lock) during the Docker build so what would be the correct bundler command to do that?

@cosmo0920

Our official image uses these commands:
https://github.com/fluent/fluentd-kubernetes-daemonset/blob/master/docker-image/v1.15/debian-elasticsearch7/Dockerfile#L23-L25
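
For readers who don't follow the link, here is a minimal sketch of what applying a locked Gemfile at build time could look like. It is not a copy of the commands in the linked Dockerfile. The function name `install_locked_gems` is hypothetical, and the sketch assumes the Gemfile and Gemfile.lock have already been copied into one directory and that Bundler is installed; `BUNDLE_GEMFILE` and `BUNDLE_FROZEN` are standard Bundler settings.

```shell
#!/bin/sh
# Hypothetical build step, e.g. called from a Dockerfile RUN instruction after
# "COPY Gemfile* /fluentd/". Installs exactly what Gemfile.lock records.
install_locked_gems() {
    gem_dir="$1"
    if [ ! -f "$gem_dir/Gemfile.lock" ]; then
        echo "missing $gem_dir/Gemfile.lock" >&2
        return 1
    fi
    # BUNDLE_FROZEN makes Bundler fail instead of rewriting the lockfile when
    # Gemfile and Gemfile.lock disagree.
    BUNDLE_GEMFILE="$gem_dir/Gemfile" BUNDLE_FROZEN=true bundle install
}
```

Running the install during the build means the image contents are fixed by the lockfile rather than resolved when the container starts.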

@stevehipwell
Collaborator Author

@cosmo0920 I've tested Bundler in my other repo and have added it to this PR. I do have a couple of questions which you might be able to help me with?

  • Why do we need to install oj, async & async-http when they're development dependencies in the fluentd.gemspec?
  • Same question for json which isn't in the fluentd.gemspec?
  • If we need these dependencies can we update them within the scope of the fluentd.gemspec?
  • For the Debian image how come we're not installing tini & jemalloc from apt?

@cosmo0920

cosmo0920 commented Oct 26, 2022

  • Why do we need to install oj, async & async-http when they're development dependencies in the fluentd.gemspec?

Just for historical reasons. I think that we don't need to handle them in the Gemfile.

Same question for json which isn't in the fluentd.gemspec?

This is because the json gem was previously handled as a default gem on Ruby 2.7. To keep working with Ruby 2.7 while it is still mainline, we don't depend on the json gem explicitly.

If we need these dependencies can we update them within the scope of the fluentd.gemspec?

No, we don't. Ruby 2.7 support is still needed for now.

For the Debian image how come we're not installing tini & jemalloc from apt?

Installing from the apt repository sometimes gives a slightly older version. We need to install the specific version that we want to include in the container.

@ashie
Member

ashie commented Oct 26, 2022

  • Why do we need to install oj, async & async-http when they're development dependencies in the fluentd.gemspec?

In my understanding, oj is optional, but most fluentd distributions bundle it to improve JSON handling performance. If oj isn't available, yajl or json is used as a fallback.

refs:

async & async-http are probably there for a similar reason.
They aren't mandatory, but fluentd tries to load them first when it launches an HTTP server:
https://github.com/fluent/fluentd/blob/5c41401fe51a1823cf86b2dcefad7e199fd576f7/lib/fluent/plugin_helper/http_server.rb#L17-L23

@ashie
Member

ashie commented Oct 26, 2022

In other words, we have confirmed that fluentd works well with async, async-http and oj, but have not confirmed that it works well without them (our CI always runs with them).


@cosmo0920 cosmo0920 left a comment

For the current patch, I'd like to add +1. 👍

@stevehipwell
Collaborator Author

@ashie @cosmo0920 I've added some further questions. Just for context we're only talking about Ruby 3 here and a single up to date Debian or Alpine version at any one time.

In other words, we have confirmed that fluentd works well with async, async-http and oj, but have not confirmed that it works well without them (our CI always runs with them).

That makes sense, so are we OK to update the versions within the constraints set out in the fluentd.gemspec? For example oj to v3.13.21.

What about for async, the fluentd.gemspec limits it to v1.x for development but in a Ruby 3 only scenario such as this can we use v2.x?

For the Debian image how come we're not installing tini & jemalloc from apt?

Installing from the apt repository sometimes gives a slightly older version. We need to install the specific version that we want to include in the container.

I don't think this is the case for tini anymore if we're using an up to date Debian image, as the apt version is newer than the one in the Dockerfile. For jemalloc the apt version is significantly newer (v5.2.1) than the one in the Dockerfile (v4.5.0); does only this version work?

@ashie
Member

ashie commented Oct 27, 2022

That makes sense, so are we OK to update the versions within the constraints set out in the fluentd.gemspec? For example oj to v3.13.21.

Yes.

What about for async, the fluentd.gemspec limits it to v1.x for development but in a Ruby 3 only scenario such as this can we use v2.x?

We don't support async 2.0 yet, even on Ruby 3; it doesn't work well.
I'm now working on it but not yet complete: fluent/fluentd#3842

For jemalloc the apt version is significantly newer (v5.2.1) than the one in the Dockerfile (v4.5.0); does only this version work?

We should use the latest one; jemalloc v5 is recommended.

@stevehipwell stevehipwell force-pushed the initial-version branch 2 times, most recently from 73e2021 to 46c7bff Compare October 27, 2022 11:19
@stevehipwell
Collaborator Author

@ashie @cosmo0920 I've updated the Debian Dockerfile to use the apt version of tini & jemalloc; this looks to work correctly in my local tests.
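
One detail that comes up when switching to the apt packages on a multi-arch (AMD64 and ARM64) image is that the shared library lives in an architecture-specific directory. The sketch below is an assumption about how that could be handled, not what this PR does: it assumes Debian's `libjemalloc2` package and its multiarch layout, that jemalloc is loaded via `LD_PRELOAD`, and a hypothetical function name `jemalloc_path`.

```shell
#!/bin/sh
# Hypothetical helper: print the libjemalloc path for a Docker TARGETARCH
# value, assuming Debian's libjemalloc2 package and multiarch library layout.
jemalloc_path() {
    case "$1" in
        amd64) triplet="x86_64-linux-gnu" ;;
        arm64) triplet="aarch64-linux-gnu" ;;
        *) echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
    echo "/usr/lib/${triplet}/libjemalloc.so.2"
}
```

In a build this might follow something like `apt-get install -y --no-install-recommends tini libjemalloc2`, with the result of `jemalloc_path "$TARGETARCH"` used as the `LD_PRELOAD` value.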

@stevehipwell stevehipwell force-pushed the initial-version branch 2 times, most recently from adbb4ae to 6765a65 Compare October 27, 2022 13:39
Signed-off-by: Steve Hipwell <[email protected]>
@stevehipwell
Collaborator Author

@patrick-stephens what do you want to do about the Grype issues? Some of them might be valid but a lot of them are false positives. I think this is related to anchore/grype#603.

@patrick-stephens
Collaborator

@stevehipwell I've not used Grype and for an initial drop I think it is fine - we should probably make sure we track them and resolve (either fix or ignore as appropriate).

@stevehipwell
Collaborator Author

@patrick-stephens if you're happy I'll merge?

@patrick-stephens
Collaborator

@stevehipwell please do, all fine by me.

@stevehipwell stevehipwell merged commit 935bf56 into main Oct 31, 2022
@stevehipwell stevehipwell deleted the initial-version branch October 31, 2022 16:18
4 participants