Skip to content

[Automatic Import] Safely access non-identifier fields in Painless if context#205220

Merged
ilyannn merged 9 commits intoelastic:mainfrom
ilyannn:auto-import/quote-fields
Jan 3, 2025
Merged

[Automatic Import] Safely access non-identifier fields in Painless if context#205220
ilyannn merged 9 commits intoelastic:mainfrom
ilyannn:auto-import/quote-fields

Conversation

@ilyannn
Copy link
Copy Markdown
Contributor

@ilyannn ilyannn commented Dec 27, 2024

Release Note

Fixes how Automatic Import generates accesses for the field names that are not valid Painless identifiers.

Summary

Closes #205024

We add utility functions to access nested fields in Painless in a safe way and modify the existing ECS generation logic to use them.

This access happens using the object?.get("field") syntax for complex cases, while falling back to the familiar ctx.field for the cases where field is a valid Painless identifier and ctx is known to be non-nullable.

In the future this should be taken care of by the new $('a.b.c', defaultValue) accessor function (elastic/elasticsearch#101274). For now, it's not available:

{
  "error": {
...
    "type": "script_exception",
    "reason": "compile error",
    "processor_type": "set",
    "script": "$('cloud_provider', '') == 'azure'",
    "lang": "painless",
    "caused_by": {
      "type": "illegal_argument_exception",
      "reason": "invalid shortcut [$] for [field]; ensure [field] exists in this context"
    }
  },
  "status": 400
}

This takes care of the compile-time correctness of field accesses. Note that it is still possible for generated pipelines to fail in runtime on unexpected input, e.g. accessing a nested field a.b fails for the document of the form {"a": "string"}.

Testing

The two utility files we add are fully covered with unit tests:
image.

Here's the generated package for logs containing only the @timestamp field:

  - date:
      if: ctx.ai_202501022326__timestamp?.logs?.get("@timestamp") != null
      tag: date_processor_ai_202501022326__timestamp.logs.@timestamp
      field: ai_202501022326__timestamp.logs.@timestamp
      target_field: '@timestamp'
      formats:
        - yyyy-MM-dd'T'HH:mm:ss.SSSXXX
        - ISO8601

Checklist

  • Unit or functional tests were updated or added to match the most common scenarios
  • The PR description includes the appropriate Release Notes section, and the correct release_note:* label is applied per the guidelines

Loading
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Fixes for quality problems that affect the customer experience Feature:AutomaticImport release_note:fix Team:Security-Scalability Security Integrations Scalability Team v8.18.0 v9.0.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Automatic Import] Quote fields in Painless scripts

4 participants