[SEDONA-670] Fix GeoJSON reader for DBR #1662

Merged
merged 1 commit into from
Oct 30, 2024

Kontinuation
Member

Did you read the Contributor Guide?

Is this PR related to a JIRA ticket?

What changes were proposed in this PR?

This PR works around an incompatibility in an internal method between open-source Apache Spark and DBR (Databricks Runtime). The readFile method defined by open-source Apache Spark is:

def readFile(
  conf: Configuration,
  file: PartitionedFile,
  parser: JacksonParser,
  schema: StructType): Iterator[InternalRow]

On DBR, the same method takes an extra Option[_] parameter:

def readFile(
  conf: Configuration,
  file: PartitionedFile,
  parser: JacksonParser,
  schema: StructType,
  badRecordsWriter: Option[BadRecordsWriter]): Iterator[InternalRow]

We work around this problem by using reflection to detect the number of parameters of the readFile function and passing the appropriate arguments to it.
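
The arity-based dispatch described above might look roughly like the sketch below. It is not the code from this PR: OssLikeSource, DbrLikeSource, and ReadFileCompat are hypothetical stand-ins that use plain strings in place of Spark's Configuration, PartitionedFile, JacksonParser, and StructType, and passing None for the extra parameter is an assumption made for illustration.

```scala
import java.lang.reflect.Method

// Hypothetical stand-in for the open-source variant: readFile takes 4 parameters.
object OssLikeSource {
  def readFile(conf: String, file: String, parser: String, schema: String): Iterator[String] =
    Iterator(s"oss:$file")
}

// Hypothetical stand-in for the DBR variant: readFile takes an extra Option parameter.
object DbrLikeSource {
  def readFile(
      conf: String,
      file: String,
      parser: String,
      schema: String,
      badRecordsWriter: Option[AnyRef]): Iterator[String] =
    Iterator(s"dbr:$file")
}

object ReadFileCompat {
  // Looks up readFile by name, inspects its parameter count, and invokes it
  // with the argument list that matches the variant found at runtime.
  def call(
      target: AnyRef,
      conf: String,
      file: String,
      parser: String,
      schema: String): Iterator[String] = {
    val method: Method = target.getClass.getMethods
      .find(_.getName == "readFile")
      .getOrElse(throw new NoSuchMethodException("readFile"))

    val args: Seq[AnyRef] = method.getParameterCount match {
      case 4 => Seq(conf, file, parser, schema)
      // Assumption: an empty Option is acceptable for the extra parameter.
      case 5 => Seq(conf, file, parser, schema, None)
      case n => throw new UnsupportedOperationException(s"Unexpected readFile arity: $n")
    }

    method.invoke(target, args: _*).asInstanceOf[Iterator[String]]
  }
}
```

With these stand-ins, ReadFileCompat.call(OssLikeSource, ...) and ReadFileCompat.call(DbrLikeSource, ...) both succeed without the caller knowing which signature is present. Failing loudly on an unexpected parameter count keeps a future signature change from being silently mishandled.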

How was this patch tested?

Existing tests pass, and the change was manually tested on DBR 15.4 LTS.

Did this PR include necessary documentation updates?

  • No, this PR does not affect any public API, so no documentation change is needed.

@Kontinuation Kontinuation marked this pull request as ready for review October 30, 2024 06:31
@jiayuasu jiayuasu linked an issue Oct 30, 2024 that may be closed by this pull request
@jiayuasu jiayuasu added this to the sedona-1.7.0 milestone Oct 30, 2024
@jiayuasu jiayuasu added the bug label Oct 30, 2024
@jiayuasu jiayuasu merged commit bf11a3c into apache:master Oct 30, 2024
39 checks passed
Development

Successfully merging this pull request may close these issues.

Unable to Load Geojson File using Sedona Context in Databricks
2 participants