[SPARK-46568][PYTHON] Make Python data source options a case-insensitive dictionary #44564
0cf2162 to
9075dab
Compare
dongjoon-hyun
left a comment
Please fix the Python flake8 check failure, @allisonwang-db.
flake8 checks failed:
./python/pyspark/sql/datasource.py:19:1: F401 'typing.final' imported but unused
from typing import final, Any, Dict, Iterator, List, Sequence, Tuple, Type, Union, TYPE_CHECKING
dongjoon-hyun
left a comment
+1, LGTM.
linter failure seems related. let's fix up.
@HyukjinKwon fixed!
Merged to master for Apache Spark 4.
…ive dictionary
### What changes were proposed in this pull request?
This PR updates the `options` field to use a case-insensitive dictionary to keep the behavior consistent with the Scala side (which uses `CaseInsensitiveStringMap`). Currently, `options` are stored in a normal Python dictionary which can be confusing to users. For instance:
```python
class MyDataSource(DataSource):
    def __init__(self, options):
        self.api_key = options.get("API_KEY")  # <- This is None

spark.read.format(..).option("API_KEY", my_key).load(...)
```
Here, `options` will not contain the key "API_KEY" because all option keys are converted to lowercase on the Scala side, so a lookup with the original casing misses.
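To illustrate the intended behavior, here is a minimal sketch of a case-insensitive dictionary. It is not the implementation from this PR: the class name `CaseInsensitiveOptions` and the `_normalize` helper are hypothetical, and the sketch assumes that lowercasing string keys on every access is enough to match the Scala side.

```python
from collections import UserDict
from typing import Any


class CaseInsensitiveOptions(UserDict):
    """Hypothetical mapping that lowercases string keys on every access."""

    @staticmethod
    def _normalize(key: Any) -> Any:
        # Only string keys are lowercased; other key types pass through unchanged.
        return key.lower() if isinstance(key, str) else key

    def __setitem__(self, key: Any, value: Any) -> None:
        super().__setitem__(self._normalize(key), value)

    def __getitem__(self, key: Any) -> Any:
        return super().__getitem__(self._normalize(key))

    def __delitem__(self, key: Any) -> None:
        super().__delitem__(self._normalize(key))

    def __contains__(self, key: object) -> bool:
        return super().__contains__(self._normalize(key))


# Keys arrive already lowercased, as described above for the Scala side.
options = CaseInsensitiveOptions({"api_key": "secret"})

# The lookup now succeeds regardless of the casing the user wrote.
api_key = options.get("API_KEY")
```

With a plain `dict`, the same `options.get("API_KEY")` call returns `None`, which is the confusion described above. Subclassing `UserDict` rather than `dict` is a deliberate choice in this sketch, since `UserDict` routes `get`, `update`, and the constructor through the overridden methods.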
### Why are the changes needed?
To improve usability.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
New unit tests
### Was this patch authored or co-authored using generative AI tooling?
No
Closes apache#44564 from allisonwang-db/spark-46568-ds-options.
Authored-by: allisonwang-db <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>