Fields with length > 32766 bytes #1592

delfer · 2015-11-30T15:15:49Z

I am using the JSON extractor, and one of my fields can be longer than 16383 UTF-8 characters.
Elasticsearch gives me this error:

IllegalArgumentException[Document contains at least one immense term in field="msg.response" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped. Please correct the analyzer to not produce such terms. The prefix of the first immense term is: '[110, 116, 114, 121, 62, 60, 107, 101, 121, 62, 115, 101, 114, 118, 105, 99, 101, 95, 116, 121, 112, 101, 60, 47, 107, 101, 121, 62, 60, 118]...', original message: bytes can be at most 32766 in length; got 38507]; nested: MaxBytesLengthExceededException[bytes can be at most 32766 in length; got 38507];
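
The "prefix of the first immense term" in the error is a list of decimal byte values. As a minimal sketch (the variable names are illustrative), it can be decoded in Python to see what the offending field starts with:

```python
# Byte values copied from the "prefix of the first immense term" in the error.
prefix = [110, 116, 114, 121, 62, 60, 107, 101, 121, 62, 115, 101, 114, 118, 105,
          99, 101, 95, 116, 121, 112, 101, 60, 47, 107, 101, 121, 62, 60, 118]

# Interpret the values as UTF-8 bytes. errors="replace" guards against a
# prefix that was cut in the middle of a multi-byte character.
decoded = bytes(prefix).decode("utf-8", errors="replace")
print(decoded)  # ntry><key>service_type</key><v
```

The decoded prefix suggests that msg.response holds an XML-like payload that is indexed as one single term.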

As a result, the log entries are lost.
There are several possible ways to resolve this:

Do not index fields larger than 32766 bytes
Truncate such fields (perhaps with a dedicated cut extractor)
Split one long field into multiple shorter ones

I found only one 'dummy' solution:

{
  "extractors": [
    {
      "condition_type": "regex",
      "condition_value": "^.{16383,}$",
      "converters": [],
      "cursor_strategy": "cut",
      "extractor_config": {
        "regex_value": "^(.{0,16383})"
      },
      "extractor_type": "regex",
      "order": 0,
      "source_field": "msg.response",
      "target_field": "response0",
      "title": "response cutter 0"
    },
    ...
    {
      "condition_type": "regex",
      "condition_value": "^.{16383,}$",
      "converters": [],
      "cursor_strategy": "cut",
      "extractor_config": {
        "regex_value": "^(.{0,16383})"
      },
      "extractor_type": "regex",
      "order": 0,
      "source_field": "msg.response",
      "target_field": "response9",
      "title": "response cutter 9"
    }
  ],
  "version": "1.2.2 (91c7822)"
}
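
Note that the limit in the error is counted in UTF-8 bytes, not characters. A cut at 16383 characters only stays under 32766 bytes if every character takes at most 2 bytes, and UTF-8 characters can take up to 4. The following is a minimal sketch of a byte-aware split, assuming the message can be pre-processed in Python before it is sent. It is not a Graylog extractor, and `split_utf8`, `to_fields`, and the `response` prefix are hypothetical names chosen to mirror the target fields above:

```python
MAX_TERM_BYTES = 32766  # limit quoted in the Elasticsearch error message


def split_utf8(text, max_bytes=MAX_TERM_BYTES):
    """Split text into chunks whose UTF-8 encoding is at most max_bytes each."""
    if max_bytes < 4:
        # A single UTF-8 character can need up to 4 bytes.
        raise ValueError("max_bytes must be at least 4")
    data = text.encode("utf-8")
    chunks = []
    start = 0
    while start < len(data):
        end = min(start + max_bytes, len(data))
        # If the cut would land on a continuation byte (0b10xxxxxx), back up
        # to the start of that character so no character is split in half.
        while end < len(data) and (data[end] & 0xC0) == 0x80:
            end -= 1
        chunks.append(data[start:end].decode("utf-8"))
        start = end
    return chunks


def to_fields(text, prefix="response"):
    """Map the chunks to numbered field names: response0, response1, ..."""
    return {f"{prefix}{i}": chunk for i, chunk in enumerate(split_utf8(text))}
```

For example, `to_fields("é" * 20000)` yields two fields, `response0` and `response1`, each within the byte limit, and concatenating them restores the original text.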

Please suggest a useful solution. Thank you.


joschi · 2015-11-30T15:30:48Z

Duplicate of #873.

joschi closed this as completed Nov 30, 2015
