Add hint to errors for missing fields based on Levenshtein distance #14466

@adriangb

Is your feature request related to a problem or challenge?

Internally, we have code that parses `SchemaError::FieldNotFound` and calculates a normalized Levenshtein distance between `field` and each entry in `valid_fields`. It returns a suggestion along the lines of "Did you mean x, y, z?" for the entries in `valid_fields` with a match score above 0.5.
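
The matching step described above could look roughly like the sketch below. This is an illustration only, not the internal code or DataFusion's implementation: the function names (`levenshtein`, `normalized_similarity`, `suggest`, `hint`) are hypothetical, and the normalization (1 minus the distance divided by the longer string's length) is an assumption about what "normalized" means here.

```rust
/// Classic dynamic-programming Levenshtein distance over Unicode scalar values.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] = distance between the first i-1 chars of `a` and the first j chars of `b`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for i in 1..=a.len() {
        // curr[0] = i: turning i chars into the empty string takes i deletions.
        let mut curr = vec![i; b.len() + 1];
        for j in 1..=b.len() {
            let cost = if a[i - 1] == b[j - 1] { 0 } else { 1 };
            curr[j] = (prev[j] + 1) // deletion
                .min(curr[j - 1] + 1) // insertion
                .min(prev[j - 1] + cost); // substitution or match
        }
        prev = curr;
    }
    prev[b.len()]
}

/// Assumed normalization: 1.0 means identical, 0.0 means every position differs.
fn normalized_similarity(a: &str, b: &str) -> f64 {
    let max_len = a.chars().count().max(b.chars().count());
    if max_len == 0 {
        return 1.0;
    }
    1.0 - levenshtein(a, b) as f64 / max_len as f64
}

/// Returns the valid fields scoring strictly above `threshold`, best match first.
fn suggest<'a>(field: &str, valid_fields: &[&'a str], threshold: f64) -> Vec<&'a str> {
    let mut scored: Vec<(f64, &'a str)> = valid_fields
        .iter()
        .map(|v| (normalized_similarity(field, v), *v))
        .filter(|(score, _)| *score > threshold)
        .collect();
    scored.sort_by(|x, y| y.0.total_cmp(&x.0));
    scored.into_iter().map(|(_, v)| v).collect()
}

/// Formats the hint text, or returns None when nothing is close enough.
fn hint(field: &str, valid_fields: &[&str]) -> Option<String> {
    let matches = suggest(field, valid_fields, 0.5);
    if matches.is_empty() {
        return None;
    }
    let quoted: Vec<String> = matches.iter().map(|m| format!("`{m}`")).collect();
    Some(format!("Did you mean {}?", quoted.join(", ")))
}
```

With this sketch, calling `hint("timesamp", &["id", "timestamp", "name"])` yields a hint naming only `timestamp`, since the other two fields fall at or below the 0.5 cutoff.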

Given that DataFusion already has a Levenshtein distance function, it must have a dependency or an internal implementation to calculate it. With #13664 merged, I think it should be possible to make this really nice:

SchemaError: unknown column `timesamp` at row 1 column X.
SELECT id, timesamp
           ^^^^^^^^
           Unknown column `timesamp`. Did you mean `timestamp`?
FROM table

This is something we could upstream if the project wants it; it's only a couple of lines of code.

Labels

enhancement: New feature or request
