Use of recursion rather than iteration in CompoundProcessor limits ingest pipeline length #84274

@droberts195

A case where somebody tried to import a CSV file with 3000 numeric fields revealed that executing an ingest pipeline with many processors can fail with a StackOverflowError.

The ingest pipeline was ingest_pipeline.json, consisting of a CSV processor to parse the CSV followed by 3000 convert processors to convert the strings parsed from the CSV to numbers.
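
The attachment is not reproduced here, but a pipeline of the same shape could be generated with something like the sketch below. The class and method names, the source field `message`, the target field names `field_1` ... `field_N` and the `long` target type are all assumptions made for illustration; only the overall structure (one `csv` processor followed by one `convert` processor per column) comes from the description above.

```java
public class PipelineJsonSketch {

    /**
     * Builds a pipeline definition with one csv processor followed by
     * fieldCount convert processors. Field names and the "long" type are
     * hypothetical; the real ingest_pipeline.json may differ.
     */
    static String buildPipeline(int fieldCount) {
        StringBuilder targets = new StringBuilder();
        StringBuilder converts = new StringBuilder();
        for (int i = 1; i <= fieldCount; i++) {
            String name = "field_" + i;
            if (i > 1) {
                targets.append(", ");
            }
            targets.append('"').append(name).append('"');
            // One convert processor per parsed column.
            converts.append(",\n    { \"convert\": { \"field\": \"")
                .append(name)
                .append("\", \"type\": \"long\" } }");
        }
        return "{\n  \"processors\": [\n"
            + "    { \"csv\": { \"field\": \"message\", \"target_fields\": [" + targets + "] } }"
            + converts
            + "\n  ]\n}";
    }

    /** Counts non-overlapping occurrences of needle in text. */
    static int countOccurrences(String text, String needle) {
        int count = 0;
        int from = 0;
        while ((from = text.indexOf(needle, from)) >= 0) {
            count++;
            from += needle.length();
        }
        return count;
    }

    public static void main(String[] args) {
        // Print a small instance; the case in this issue would use 3000.
        System.out.println(buildPipeline(3));
    }
}
```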

Executing this pipeline fails with a StackOverflowError:

[2022-02-23T10:16:31,678][INFO ][o.e.c.m.MetadataCreateIndexService] [runTask-0] [test] creating index, cause [api], templates [], shards [1]/[1]
[2022-02-23T10:16:32,541][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [runTask-0] fatal error in thread [elasticsearch[runTask-0][write][T#12]], exiting
java.lang.StackOverflowError: null
        at java.lang.String.startsWith(String.java:2297) ~[?:?]
        at org.elasticsearch.ingest.IngestDocument$FieldPath.<init>(IngestDocument.java:877) ~[elasticsearch-8.2.0-SNAPSHOT.jar:8.2.0-SNAPSHOT]
        at org.elasticsearch.ingest.IngestDocument.getFieldValue(IngestDocument.java:102) ~[elasticsearch-8.2.0-SNAPSHOT.jar:8.2.0-SNAPSHOT]
        at org.elasticsearch.ingest.IngestDocument.getFieldValue(IngestDocument.java:122) ~[elasticsearch-8.2.0-SNAPSHOT.jar:8.2.0-SNAPSHOT]
        at org.elasticsearch.ingest.common.ConvertProcessor.execute(ConvertProcessor.java:185) ~[?:?]
        at org.elasticsearch.ingest.Processor.execute(Processor.java:41) ~[elasticsearch-8.2.0-SNAPSHOT.jar:8.2.0-SNAPSHOT]
        at org.elasticsearch.ingest.CompoundProcessor.innerExecute(CompoundProcessor.java:136) ~[elasticsearch-8.2.0-SNAPSHOT.jar:8.2.0-SNAPSHOT]
        at org.elasticsearch.ingest.CompoundProcessor.lambda$innerExecute$1(CompoundProcessor.java:154) ~[elasticsearch-8.2.0-SNAPSHOT.jar:8.2.0-SNAPSHOT]
        at org.elasticsearch.ingest.Processor.execute(Processor.java:46) ~[elasticsearch-8.2.0-SNAPSHOT.jar:8.2.0-SNAPSHOT]
        at org.elasticsearch.ingest.CompoundProcessor.innerExecute(CompoundProcessor.java:136) ~[elasticsearch-8.2.0-SNAPSHOT.jar:8.2.0-SNAPSHOT]
        at org.elasticsearch.ingest.CompoundProcessor.lambda$innerExecute$1(CompoundProcessor.java:154) ~[elasticsearch-8.2.0-SNAPSHOT.jar:8.2.0-SNAPSHOT]
        at org.elasticsearch.ingest.Processor.execute(Processor.java:46) ~[elasticsearch-8.2.0-SNAPSHOT.jar:8.2.0-SNAPSHOT]
        at org.elasticsearch.ingest.CompoundProcessor.innerExecute(CompoundProcessor.java:136) ~[elasticsearch-8.2.0-SNAPSHOT.jar:8.2.0-SNAPSHOT]
        at org.elasticsearch.ingest.CompoundProcessor.lambda$innerExecute$1(CompoundProcessor.java:154) ~[elasticsearch-8.2.0-SNAPSHOT.jar:8.2.0-SNAPSHOT]
        at org.elasticsearch.ingest.Processor.execute(Processor.java:46) ~[elasticsearch-8.2.0-SNAPSHOT.jar:8.2.0-SNAPSHOT]
        at org.elasticsearch.ingest.CompoundProcessor.innerExecute(CompoundProcessor.java:136) ~[elasticsearch-8.2.0-SNAPSHOT.jar:8.2.0-SNAPSHOT]
        at org.elasticsearch.ingest.CompoundProcessor.lambda$innerExecute$1(CompoundProcessor.java:154) ~[elasticsearch-8.2.0-SNAPSHOT.jar:8.2.0-SNAPSHOT]
        at org.elasticsearch.ingest.Processor.execute(Processor.java:46) ~[elasticsearch-8.2.0-SNAPSHOT.jar:8.2.0-SNAPSHOT]
        at org.elasticsearch.ingest.CompoundProcessor.innerExecute(CompoundProcessor.java:136) ~[elasticsearch-8.2.0-SNAPSHOT.jar:8.2.0-SNAPSHOT]
        at org.elasticsearch.ingest.CompoundProcessor.lambda$innerExecute$1(CompoundProcessor.java:154) ~[elasticsearch-8.2.0-SNAPSHOT.jar:8.2.0-SNAPSHOT]
        at org.elasticsearch.ingest.Processor.execute(Processor.java:46) ~[elasticsearch-8.2.0-SNAPSHOT.jar:8.2.0-SNAPSHOT]
        at org.elasticsearch.ingest.CompoundProcessor.innerExecute(CompoundProcessor.java:136) ~[elasticsearch-8.2.0-SNAPSHOT.jar:8.2.0-SNAPSHOT]
        at org.elasticsearch.ingest.CompoundProcessor.lambda$innerExecute$1(CompoundProcessor.java:154) ~[elasticsearch-8.2.0-SNAPSHOT.jar:8.2.0-SNAPSHOT]
        at org.elasticsearch.ingest.Processor.execute(Processor.java:46) ~[elasticsearch-8.2.0-SNAPSHOT.jar:8.2.0-SNAPSHOT]
.
.
.

(The stack trace continues with the same 3 calls over and over again.)

Could CompoundProcessor.innerExecute be changed to use iteration rather than recursion to avoid this?
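
To illustrate the difference, here is a minimal sketch. It is not the actual CompoundProcessor code: `Step`, `runRecursive`, `runIterative` and the helper methods are hypothetical names. It also assumes that every step calls its handler before `execute` returns; a real change would still need to handle processors that complete asynchronously. In the recursive variant each step's callback starts the next step, so the stack depth grows with the number of steps. In the loop-based variant the depth stays constant.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

public class IterativePipelineSketch {

    /** Hypothetical callback-style step; a stand-in, not the Elasticsearch Processor interface. */
    interface Step {
        void execute(long[] doc, BiConsumer<long[], Exception> handler);
    }

    /** Recursive chaining: each callback starts the next step, adding stack frames per step. */
    static void runRecursive(List<Step> steps, int index, long[] doc, BiConsumer<long[], Exception> handler) {
        if (index == steps.size()) {
            handler.accept(doc, null);
            return;
        }
        steps.get(index).execute(doc, (result, e) -> {
            if (e != null) {
                handler.accept(null, e);
            } else {
                runRecursive(steps, index + 1, result, handler);
            }
        });
    }

    /** Loop-based variant. Assumes every step invokes its handler before execute() returns. */
    static void runIterative(List<Step> steps, long[] doc, BiConsumer<long[], Exception> handler) {
        long[] current = doc;
        for (Step step : steps) {
            long[][] result = new long[1][];
            Exception[] error = new Exception[1];
            boolean[] done = new boolean[1];
            // The callback only records the outcome; it never starts the next step.
            step.execute(current, (r, e) -> {
                result[0] = r;
                error[0] = e;
                done[0] = true;
            });
            if (!done[0]) {
                handler.accept(null, new IllegalStateException("sketch only supports synchronous steps"));
                return;
            }
            if (error[0] != null) {
                handler.accept(null, error[0]);
                return;
            }
            current = result[0];
        }
        handler.accept(current, null);
    }

    /** Builds n steps that each add 1 to doc[0] and complete synchronously. */
    static List<Step> incrementSteps(int n) {
        List<Step> steps = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            steps.add((doc, handler) -> {
                doc[0]++;
                handler.accept(doc, null);
            });
        }
        return steps;
    }

    /** Runs n increment steps iteratively and returns the final counter, or -1 on error. */
    static long runIterativeCount(int n) {
        long[] out = new long[1];
        runIterative(incrementSteps(n), new long[] {0}, (doc, e) -> out[0] = (e == null) ? doc[0] : -1);
        return out[0];
    }
}
```

With synchronous steps, `runIterativeCount` can run through a very long step list without deepening the stack, whereas `runRecursive` over a long enough list can exhaust the thread stack in the same way as the trace above.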

The sample CSV file that goes with the ingest pipeline is test.csv.
