out_file/out_secondary_file: Support ${chunk_id} placeholder. fix #1705 #1708

Merged
merged 3 commits into from
Oct 13, 2017
4 changes: 2 additions & 2 deletions lib/fluent/plugin/out_file.rb
@@ -142,7 +142,7 @@ def configure(conf)
dummy_record = Hash[dummy_record_keys.zip(['data'] * dummy_record_keys.size)]

test_meta1 = metadata_for_test(dummy_tag, Fluent::Engine.now, dummy_record)
- test_path = extract_placeholders(@path_template, test_meta1)
+ test_path = extract_placeholders(@path_template.gsub(CHUNK_ID_PLACEHOLDER_PATTERN, 'test'), test_meta1)
unless ::Fluent::FileUtil.writable_p?(test_path)
raise Fluent::ConfigError, "out_file: `#{test_path}` is not writable"
end
@@ -178,7 +178,7 @@ def format(tag, time, record)
end

def write(chunk)
- path = extract_placeholders(@path_template, chunk.metadata)
+ path = extract_placeholders(@path_template.gsub(CHUNK_ID_PLACEHOLDER_PATTERN, dump_unique_id_hex(chunk.unique_id)), chunk.metadata)
Member

@mururu mururu Oct 4, 2017

It seems better for extract_placeholders to replace CHUNK_ID_PLACEHOLDER_PATTERN, rather than each plugin doing it. For now only these two plugins use this method, but it would be useful if all plugins could get this feature automatically in the future.

Member Author

Yes, that is one option.
To do that, we need to change the extract_placeholders API. There are three ways:

  1. Change the 2nd argument to chunk instead of chunk.metadata

For compatibility, we need to check the type of the 2nd argument.

metadata = chunk.metadata if metadata.is_a?(Chunk)

  2. Add a 3rd argument for chunk

def extract_placeholders(str, metadata, chunk = nil)

metadata comes from chunk, so this API is a little strange.

  3. Add a 3rd argument for additional params

This avoids further API changes.

def extract_placeholders(str, metadata, extras = nil)
  if extras && extras.has_key?(:chunk)
  end
end

I think 1 is better because it doesn't break the existing API and doesn't add another argument.
We can change the Plugin API before the stable announcement.
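
Option 1 could look roughly like the sketch below. It is not Fluentd's actual implementation: `Metadata` and `Chunk` are hypothetical stand-in structs, only `${tag}` and `${chunk_id}` are resolved, and the hex encoding via `unpack1('H*')` is an assumption about what `dump_unique_id_hex` produces.

```ruby
# Hypothetical stand-ins for Fluentd's chunk metadata and buffer chunk.
Metadata = Struct.new(:tag)
Chunk = Struct.new(:metadata, :unique_id)

CHUNK_ID_PLACEHOLDER_PATTERN = /\$\{chunk_id\}/

# Accepts either a Chunk (new style) or a Metadata (old style) as the
# 2nd argument, so existing callers that pass chunk.metadata keep working.
def extract_placeholders(str, chunk_or_metadata)
  if chunk_or_metadata.is_a?(Chunk)
    chunk = chunk_or_metadata
    metadata = chunk.metadata
    # Assumption: the chunk id is rendered as lowercase hex.
    str = str.gsub(CHUNK_ID_PLACEHOLDER_PATTERN, chunk.unique_id.unpack1('H*'))
  else
    metadata = chunk_or_metadata
  end
  # Only ${tag} is handled here; the real method resolves more placeholders.
  str.gsub('${tag}', metadata.tag.to_s)
end

meta = Metadata.new('app.access')
chunk = Chunk.new(meta, "\x0a\x0b".b)

puts extract_placeholders('/logs/${tag}/${chunk_id}.log', chunk)
puts extract_placeholders('/logs/${tag}/${chunk_id}.log', meta)
```

With a chunk, both placeholders are resolved; with plain metadata, `${chunk_id}` is left untouched, which is the backward-compatible behaviour the type check is meant to preserve.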

Member

👍 > 1

Contributor

@cosmo0920 cosmo0920 Oct 6, 2017

I agree with changing the extract_placeholders API.
👍 > 1 and 3. Both of them look good to me.

Contributor

👍 > 1

FileUtils.mkdir_p File.dirname(path), mode: @dir_perm

writer = case
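
The substitution that `write` performs in this commit can be tried in isolation. The following is a minimal sketch, not the plugin code: `hex_id` is a hypothetical stand-in for `dump_unique_id_hex`, and the 16-byte random id is an assumption about the shape of `chunk.unique_id`.

```ruby
require 'securerandom'

# Same pattern the PR adds to Fluent::Plugin::Output.
CHUNK_ID_PLACEHOLDER_PATTERN = /\$\{chunk_id\}/

# Hypothetical stand-in for dump_unique_id_hex: hex-encode a binary id.
def hex_id(unique_id)
  unique_id.unpack1('H*')
end

# Replace every ${chunk_id} in the template with the hex form of the id.
def expand_chunk_id(path_template, unique_id)
  path_template.gsub(CHUNK_ID_PLACEHOLDER_PATTERN, hex_id(unique_id))
end

unique_id = SecureRandom.random_bytes(16) # assumed: a 16-byte binary id
path = expand_chunk_id('/var/log/out_file_${chunk_id}.log', unique_id)
puts path
```

Because the replacement happens before `extract_placeholders` is called, the remaining placeholders such as `${tag}` are still resolved from the chunk metadata as before.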
4 changes: 2 additions & 2 deletions lib/fluent/plugin/out_secondary_file.rb
@@ -71,7 +71,7 @@ def multi_workers_ready?
end

def write(chunk)
- path_without_suffix = extract_placeholders(@path_without_suffix, chunk.metadata)
+ path_without_suffix = extract_placeholders(@path_without_suffix.gsub(CHUNK_ID_PLACEHOLDER_PATTERN, dump_unique_id_hex(chunk.unique_id)), chunk.metadata)
path = generate_path(path_without_suffix)
FileUtils.mkdir_p File.dirname(path), mode: @dir_perm

@@ -106,7 +106,7 @@ def validate_compatible_with_primary_buffer!(path_without_suffix)
raise Fluent::ConfigError, "out_secondary_file: basename or directory has an incompatible placeholder #{ph}, remove tag placeholder, like `${tag}`, from basename or directory"
end

- vars = placeholders.reject { |placeholder| placeholder.match(/tag(\[\d+\])?/) }
+ vars = placeholders.reject { |placeholder| placeholder.match(/tag(\[\d+\])?/) || (placeholder == 'chunk_id') }

if ph = vars.find { |v| !@chunk_keys.include?(v) }
raise Fluent::ConfigError, "out_secondary_file: basename or directory has an incompatible placeholder #{ph}, remove variable placeholder, like `${varname}`, from basename or directory"
3 changes: 2 additions & 1 deletion lib/fluent/plugin/output.rb
@@ -39,6 +39,7 @@ class Output < Base
CHUNK_KEY_PATTERN = /^[-_.@a-zA-Z0-9]+$/
CHUNK_KEY_PLACEHOLDER_PATTERN = /\$\{[-_.@$a-zA-Z0-9]+\}/
CHUNK_TAG_PLACEHOLDER_PATTERN = /\$\{(tag(?:\[\d+\])?)\}/
+ CHUNK_ID_PLACEHOLDER_PATTERN = /\$\{chunk_id\}/

CHUNKING_FIELD_WARN_NUM = 4

@@ -679,7 +680,7 @@ def get_placeholders_tag(str)
end

def get_placeholders_keys(str)
- str.scan(CHUNK_KEY_PLACEHOLDER_PATTERN).map{|ph| ph[2..-2]}.reject{|s| s == "tag"}.sort
+ str.scan(CHUNK_KEY_PLACEHOLDER_PATTERN).map{|ph| ph[2..-2]}.reject{|s| (s == "tag") || (s == 'chunk_id') }.sort
end

# TODO: optimize this code
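
The effect of the `get_placeholders_keys` change can be checked with a standalone sketch. The pattern is copied from the diff; `placeholder_keys` and `RESERVED_PLACEHOLDERS` are hypothetical names used only for this illustration.

```ruby
# Pattern copied from the diff above.
CHUNK_KEY_PLACEHOLDER_PATTERN = /\$\{[-_.@$a-zA-Z0-9]+\}/

# Hypothetical constant: built-in placeholder names that are not chunk keys.
RESERVED_PLACEHOLDERS = %w[tag chunk_id].freeze

# Return the sorted chunk-key names referenced in str, skipping built-ins.
def placeholder_keys(str)
  str.scan(CHUNK_KEY_PLACEHOLDER_PATTERN)
     .map { |ph| ph[2..-2] } # strip the leading "${" and trailing "}"
     .reject { |name| RESERVED_PLACEHOLDERS.include?(name) }
     .sort
end

p placeholder_keys('/logs/${tag}/${host}/${chunk_id}_${app}.log')
# => ["app", "host"]
```

Without the extra `chunk_id` check, `chunk_id` would be returned as if it were a chunk key, and a configuration using `${chunk_id}` would be checked against the buffer's chunk keys.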
29 changes: 29 additions & 0 deletions test/plugin/test_out_file.rb
@@ -557,6 +557,35 @@ def parse_system(text)
check_gzipped_result(path, formatted_lines * 3)
end

test '${chunk_id}' do
time = event_time("2011-01-02 13:14:15 UTC")
formatted_lines = %[2011-01-02T13:14:15Z\ttest\t{"a":1}\n] + %[2011-01-02T13:14:15Z\ttest\t{"a":2}\n]

write_once = ->(){
d = create_driver %[
path #{TMP_DIR}/out_file_chunk_id_${chunk_id}
utc
append true
<buffer>
timekey_use_utc true
</buffer>
]
d.run(default_tag: 'test'){
d.feed(time, {"a"=>1})
d.feed(time, {"a"=>2})
}
d.instance.last_written_path
}

path = write_once.call
if File.basename(path) =~ /out_file_chunk_id_([-_.@a-zA-Z0-9].*).20110102.log/
unique_id = Fluent::UniqueId.hex(Fluent::UniqueId.generate)
assert_equal unique_id.size, $1.size, "chunk_id size is mismatched"
else
flunk "chunk_id is not included in the path"
end
end

test 'symlink' do
omit "Windows doesn't support symlink" if Fluent.windows?
conf = CONFIG + %[
14 changes: 14 additions & 0 deletions test/plugin/test_out_secondary_file.rb
@@ -260,6 +260,20 @@ def create_chunk(primary, metadata, es)
assert_equal "#{TMP_DIR}/dump.bin", path
end

test 'path with ${chunk_id}' do
d = create_driver %[
directory #{TMP_DIR}
basename out_file_chunk_id_${chunk_id}
]
path = d.instance.write(@c)
if File.basename(path) =~ /out_file_chunk_id_([-_.@a-zA-Z0-9].*).0/
unique_id = Fluent::UniqueId.hex(Fluent::UniqueId.generate)
assert_equal unique_id.size, $1.size, "chunk_id size is mismatched"
else
flunk "chunk_id is not included in the path"
end
end

data(
invalid_tag: [/tag/, '${tag}'],
invalid_tag0: [/tag\[0\]/, '${tag[0]}'],