-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add from_encoding parameter to in_tail plugin #1067
Add from_encoding parameter to in_tail plugin #1067
Conversation
Add configuration check new param "from_encoding", and test convert from input source is "Hello world" in japanese Hiragana that is encoded in cp932 to UTF-8.
Add new param "from_encoding". If "encoding" param is only specified, process is same way as ever to keep backword compatibility. If two params, "encoding" and "from_encoding" are specified, process uses ```String.encode!(to, from)```.
@@ -64,14 +64,10 @@ def initialize | |||
config_param :multiline_flush_interval, :time, default: nil | |||
desc 'Enable the additional watch timer.' | |||
config_param :enable_watch_timer, :bool, default: true | |||
desc 'The encoding after conversion of the input.' | |||
config_param :encoding, default: nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Specify type of this parameter.
@tagomoris |
Add test pattern from_encoding is only specified.
end | ||
else | ||
if @from_encoding | ||
$log.warn "'from_encoding' parameter must be specified in conjunction with 'encoding' paramter." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use log
instead of $log
to enable plugin logger (log_level
for each plugin sections).
And I think it's ok to say just must be specified with "encoding" parameter
.
Use log instead of $log. Change the log messages so that the users are easy to understand. Return as soon as possible, if encoding parameters are not specified.
LGTM! |
@repeatedly Please check just once more. |
unless @encoding | ||
if @from_encoding | ||
log.warn "'from_encoding' parameter must be specified with 'encoding' parameter." | ||
log.warn "'from_encoding' is ignored" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This log is confusing.
If from_encoding
must be specified with encoding
, we should raise an ConfigError.
In addition, no need two line logs. 2nd line should be merged into 1st.
I deleted 2nd log message. Changed to raise error if 'encoding' and 'from_encoding' paramters are bad cofiguration.
Corresponding to ConfigError
@@ -251,7 +267,11 @@ def close_watcher_after_rotate_wait(tw) | |||
def flush_buffer(tw) | |||
if lb = tw.line_buffer | |||
lb.chomp! | |||
lb.force_encoding(@encoding) if @encoding | |||
if @encoding && @from_encoding |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is microbenchmark result. I think default should be fast. So f2
implementation is better for almost users.
% ruby b_enc_perf.rb
Rehearsal --------------------------------------------------------
f1 with nil / nil 0.840000 0.000000 0.840000 ( 0.842853)
f2 with nil / nil 0.760000 0.000000 0.760000 ( 0.769146)
f1 with true / true 0.870000 0.000000 0.870000 ( 0.871896)
f2 with true / true 0.920000 0.000000 0.920000 ( 0.917469)
f1 with true / false 0.990000 0.000000 0.990000 ( 0.989552)
f2 with true / false 0.920000 0.010000 0.930000 ( 0.926458)
----------------------------------------------- total: 5.310000sec
user system total real
f1 with nil / nil 0.890000 0.000000 0.890000 ( 0.890333)
f2 with nil / nil 0.820000 0.000000 0.820000 ( 0.822715)
f1 with true / true 0.890000 0.000000 0.890000 ( 0.885648)
f2 with true / true 0.940000 0.000000 0.940000 ( 0.938831)
f1 with true / false 1.000000 0.000000 1.000000 ( 1.003493)
f2 with true / false 0.950000 0.000000 0.950000 ( 0.956331)
Code:
require 'benchmark'
class Foo
def initialize(enc, from_enc)
@enc = enc
@from_enc = from_enc
end
def f1(num)
num.times {
if @enc && @from_enc
1
elsif @enc
2
end
}
end
def f2(num)
num.times {
if @enc
if @from_enc
1
else
2
end
end
}
end
end
n = 20000000
Benchmark.bmbm do |x|
x.report('f1 with nil / nil') {
foo = Foo.new(nil, nil)
foo.f1(n)
}
x.report('f2 with nil / nil') {
foo = Foo.new(nil, nil)
foo.f2(n)
}
x.report('f1 with true / true') {
foo = Foo.new(true, true)
foo.f1(n)
}
x.report('f2 with true / true') {
foo = Foo.new(true, true)
foo.f2(n)
}
x.report('f1 with true / false') {
foo = Foo.new(true, false)
foo.f1(n)
}
x.report('f2 with true / false') {
foo = Foo.new(true, false)
foo.f2(n)
}
end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for microbenchmark result!
For almost users, default setting should be faster than the setting is specified encoding.
Merged. Thank you for contribution! |
* Add test code to check from_encoding param Add configuration check new param "from_encoding", and test convert from input source is "Hello world" in japanese Hiragana that is encoded in cp932 to UTF-8. * Add from_encoding param to in_tail plugin Add new param "from_encoding". If "encoding" param is only specified, process is same way as ever to keep backword compatibility. If two params, "encoding" and "from_encoding" are specified, process uses ```String.encode!(to, from)```. * Specify type of encoding and from_encoding * Fix configuration encoding and from_encoding * Fix test code Add test pattern from_encoding is only specified. * Fix configure_encoding Use log instead of $log. Change the log messages so that the users are easy to understand. Return as soon as possible, if encoding parameters are not specified. * Use ConfigError instead of warnning-log I deleted 2nd log message. Changed to raise error if 'encoding' and 'from_encoding' paramters are bad cofiguration. * Fix test code Corresponding to ConfigError * Change implementation of encode statement For almost users, default setting should be faster than the setting is specified encoding.
* Add test code to check from_encoding param Add configuration check new param "from_encoding", and test convert from input source is "Hello world" in japanese Hiragana that is encoded in cp932 to UTF-8. * Add from_encoding param to in_tail plugin Add new param "from_encoding". If "encoding" param is only specified, process is same way as ever to keep backword compatibility. If two params, "encoding" and "from_encoding" are specified, process uses ```String.encode!(to, from)```. * Specify type of encoding and from_encoding * Fix configuration encoding and from_encoding * Fix test code Add test pattern from_encoding is only specified. * Fix configure_encoding Use log instead of $log. Change the log messages so that the users are easy to understand. Return as soon as possible, if encoding parameters are not specified. * Use ConfigError instead of warnning-log I deleted 2nd log message. Changed to raise error if 'encoding' and 'from_encoding' paramters are bad cofiguration. * Fix test code Corresponding to ConfigError * Change implementation of encode statement For almost users, default setting should be faster than the setting is specified encoding.
Encoding parameter was added in_tail on v.0.12.24 and I used it.
When I reading logs encoded in CP932+LF, fluentd output warn log like a below.