Event sent between two Logstash instances over Lumberjack with accented characters gets all fields escaped. #2

Closed
jordansissel opened this issue May 17, 2015 · 1 comment

@jordansissel

(This issue was originally filed by @plarivee at elastic/logstash#1807)


A log line containing an accented character corrupts all tags and fields.

Setup is host (logstash-forwarder) => logstash (shipper) => logstash (receiver) => redis

So here is the case :

On the server:

logstash-forwarder config: 
#################### 
{
  "network": {
    "servers": [ "XXX.XXX.XXX.254:5043" ],
    "ssl ca": "mycert.crt",
    "timeout": 15
  },

  "files": [
    { "paths": [
        "/var/log/syslog",
        "/var/log/messages",
        "/var/log/*.log"
      ],
      "fields": { "type": "syslog", "host": "app1.mydomain.com" }
    }
  ]
}
######################## 

Sending a test log line:

app1: # logger TEST TEST TEST ÉÉÉÉÉÉÉ ééééééé 

Now on the Shipper receiving the log from the server:

Shipper config :
###############
input {
  lumberjack {
    port => 5043
    ssl_certificate => "mycert.crt"
    ssl_key => "mycert.key"
    add_field => {
      "domain" => "mydomain.com"
      "log-type" => "app-production"
    }
    tags => ["production"]
  }
}
output {
  lumberjack {
    hosts => "xxx.xxx.xxx.xxx"
    port => 5043
    ssl_certificate => "theothercert.crt"
    codec => "json"
  }
}
################

stdout debug:

{
       "message" => "Aug 12 09:32:33 app1 root: TEST TEST TEST ÉÉÉÉÉÉÉ ééééééé",
      "@version" => "1", 
    "@timestamp" => "2014-08-12T13:32:34.386Z", 
          "tags" => [ 
        [0] "production" 
    ], 
        "domain" => "mydomain.com", 
      "log-type" => "app-production", 
          "file" => "/var/log/syslog", 
          "host" => "app1.mydomain.com", 
        "offset" => "13833", 
          "type" => "syslog" 
} 

Now, the shipper sends it to the receiver :
shipper => receiver ( lumberjack (logstash) => lumberjack (logstash) )

Receiver config :

################### 
input {
  lumberjack {
    port => 5043
    ssl_certificate => "receiver01.crt"
    ssl_key => "receiver01.key"
    codec => "json"
  }
}
output {
  redis {
    host => ["XXX.XXX.XXX.101", "XXX.XXX.XXX.102"]
    shuffle_hosts => true
    data_type => "list"
    key => "logstash"
  }
}
###################### 

stdout debug:

{ 
       "message" => "{\"message\":\"Aug 12 09:32:33 app1 root: TEST TEST TEST ÉÉÉÉÉÉÉ ééééééé\",\"@version\":\"1\",\"@timestamp\":\"2014-08-12T13:32:34.386Z\",\"tags\":\"production\"],\"domain\":\"mydomain.com\",\"log-type\":\"app-production\",\"file\":\"/var/log/syslog\",\"host\":\"app1.mydomain.com\",\"offset\":\"13833\",\"t", 
          "@version" => "1", 
        "@timestamp" => "2014-08-12T13:32:35.008Z" 
} 

All fields and tags get their double quotes escaped; they are no longer
treated as fields and instead become part of the message. If the line
contains no 'é', 'É', 'à', etc., everything works perfectly. I also tried
without the codec setting and with json_lines instead.
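
The receiver output above is what one would expect if the JSON payload arrived incomplete and the codec fell back to treating it as plain text. The following Python sketch illustrates that fallback pattern only; `decode_json_event` is a hypothetical stand-in and not the Logstash codec itself, and the sample payload is invented for the illustration.

```python
import json


def decode_json_event(data: str) -> dict:
    """Hypothetical stand-in for a JSON codec with a plain-text fallback."""
    try:
        event = json.loads(data)
    except ValueError:
        # Parsing failed: keep the raw text as the message, so every
        # original field ends up embedded (and escaped) inside it.
        return {"message": data}
    if not isinstance(event, dict):
        # Valid JSON that is not an object is also treated as plain text.
        return {"message": data}
    return event


complete = '{"message":"hello","type":"syslog"}'
cut_short = complete[:-10]  # simulate a payload that lost its tail

print(decode_json_event(complete))   # fields are parsed normally
print(decode_json_event(cut_short))  # whole payload lands in "message"
```

When the payload is cut short, the only field left is `message`, and it contains the raw JSON text, which matches the shape of the receiver's debug output.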

Without codec => json, I lose all fields and tags from the message.
With codec => json_lines, the behavior is the same: the fields get
escaped.

It looks like an encoding problem: accented characters are not being handled correctly.
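
The thread does not state the root cause. One detail in the output is suggestive, though: the received message stops at `\"t`, so the tail `ype":"syslog"}` is missing. That tail is 14 characters long, and the test line contains 14 accented characters, each of which takes two bytes in UTF-8. This is consistent with (but does not prove) a frame length computed in characters rather than bytes. The Python sketch below reproduces that arithmetic under this assumption; the "buggy framing" step is hypothetical and is not taken from the Lumberjack code.

```python
import json

# Rebuild the shipper event from the debug output above.
message = ("Aug 12 09:32:33 app1 root: TEST TEST TEST "
           + "\u00c9" * 7 + " " + "\u00e9" * 7)
event = {
    "message": message,
    "@version": "1",
    "@timestamp": "2014-08-12T13:32:34.386Z",
    "tags": ["production"],
    "domain": "mydomain.com",
    "log-type": "app-production",
    "file": "/var/log/syslog",
    "host": "app1.mydomain.com",
    "offset": "13833",
    "type": "syslog",
}

payload = json.dumps(event, ensure_ascii=False, separators=(",", ":"))
encoded = payload.encode("utf-8")

char_count = len(payload)   # length in characters
byte_count = len(encoded)   # length in bytes on the wire

# Hypothetical buggy framing: the sender announces char_count as the
# frame length, so the receiver reads that many bytes and stops early.
received = encoded[:char_count].decode("utf-8")

print(byte_count - char_count)  # → 14
print(received[-20:])           # tail of what the receiver would see

try:
    json.loads(received)
    parsed = True
except ValueError:
    parsed = False              # truncated JSON no longer parses
print(parsed)
```

With ASCII-only messages the two lengths are equal, which would explain why lines without accented characters pass through untouched.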

@ph

ph commented May 18, 2015

This was fixed and released.

ph closed this as completed May 18, 2015