Add support for DatetimeMS in MongoDB #3476
Open
Closes #3444
This PR adds support for `datetime` values that cannot be handled by Python.

Before this change, instances of `datetime` that are outside of the `[datetime.min; datetime.max]` range raise an error.

This PR makes use of the `DatetimeConversion.DATETIME_AUTO` conversion setting.

In practice this means that:
- `datetime` objects that can be Python datetimes will be parsed as such and sent to Elasticsearch as datetimes
- if a value cannot be represented as a Python `datetime`, it will be serialised as a `long`

This, though, can be problematic due to internal Elasticsearch type conversion (something that is already a problem with the MongoDB connector to some extent):
Elasticsearch allows conversion of `long` to `datetime`, but does not allow the opposite.

Demonstration:
In practice, this means that if a field in a MongoDB collection contains both valid and invalid datetimes (from Python's perspective), ingestion might fail depending on the order of insertion, because the mappings are dynamic.
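What "valid from Python's perspective" means can be sketched with the standard library alone. The snippet below is only an illustration of the in-range/out-of-range split, not the connector's or pymongo's code; `EPOCH` and `to_es_value` are hypothetical names, and the input is assumed to be a BSON-style count of milliseconds since the Unix epoch.

```python
from datetime import datetime, timedelta, timezone

# Reference point for BSON dates: milliseconds since the Unix epoch (UTC).
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_es_value(millis: int):
    """Return a datetime when Python can represent it, else the raw millis.

    Hypothetical helper: it mirrors the "datetime if possible, long
    otherwise" outcome described in this PR.
    """
    try:
        return EPOCH + timedelta(milliseconds=millis)
    except OverflowError:
        # Raised when the result falls outside [datetime.min; datetime.max]
        # or the offset itself is too large for a timedelta.
        return millis


in_range = to_es_value(0)                     # a datetime (the epoch itself)
out_of_range = to_es_value(253402300800000)   # year 10000, stays an int
```

The same field can therefore end up holding two different value types, which is what makes the order of insertion matter.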
For example, if the first record contains an out-of-range datetime, the field will be inferred as `long`, and the next time an in-range datetime is encountered the ingestion will fail, because instances of `datetime` cannot be converted to `long` in Elasticsearch.

The way to fix this would be to manually define the mapping for the index, which is imperfect.
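The order dependence can be illustrated with a toy model. This is a simplification written for this description, not Elasticsearch code: it assumes only the two rules stated above (the first value fixes the field type; a `long` is accepted by a `date` field but not the reverse). The `ingest` function and its `explicit_type` parameter are hypothetical.

```python
from datetime import datetime


def ingest(values, explicit_type=None):
    """Toy model of dynamic mapping for a single field.

    Returns the resulting field type, or raises ValueError when a value
    cannot be indexed into the already-inferred type.
    """
    field_type = explicit_type
    for value in values:
        value_type = "date" if isinstance(value, datetime) else "long"
        if field_type is None:
            # Dynamic mapping: the first value seen decides the type.
            field_type = value_type
        elif field_type == "long" and value_type == "date":
            raise ValueError("cannot index a date into a long field")
        # A long arriving at a date field is accepted (treated as epoch millis).
    return field_type


in_range = datetime(2024, 1, 1)
out_of_range = 253402300800000  # epoch millis beyond datetime.max

ingest([in_range, out_of_range])                        # "date", both accepted
ingest([out_of_range, in_range], explicit_type="date")  # "date", mapping pre-defined
# ingest([out_of_range, in_range]) raises ValueError in this model
```

In this model, defining the field as `date` up front removes the dependence on insertion order, which corresponds to the manual-mapping workaround mentioned above.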
An alternative solution would be to introduce a flag in the MongoDB connector configuration that exposes the `datetime_conversion` option. Additionally, we could have a flag meaning "treat datetimes as longs" if needed, but that looks like overkill.

Checklists
Pre-Review Checklist
[…] `config.yml.example`)
[…] `v7.13.2`, `v7.14.0`, `v8.0.0`)

Release Note
MongoDB connector: set default `datetime_conversion` to `DatetimeConversion.DATETIME_AUTO` to try to prevent errors when receiving out-of-range `datetime` values from MongoDB. See https://www.mongodb.com/docs/languages/python/pymongo-driver/current/data-formats/dates-and-times/#handling-out-of-range-datetimes for additional information.