Allow re-writes to metadata JSON files #2392

Open
npatki opened this issue Feb 25, 2025 · 0 comments
Labels
feature:metadata Related to describing the dataset feature request Request for a new feature

Comments


npatki commented Feb 25, 2025

Problem Description

Currently, the metadata.save_to_json function raises an error if you provide a filename that already exists. We did this so that users wouldn't accidentally overwrite existing metadata files.

However, this becomes an issue if you want to read from a file, update the metadata (to correct it), and then write it back to the same file.

from sdv.metadata import Metadata

metadata = Metadata.load_from_json(filepath='metadata.json')
metadata.update_column(
    table_name='guests',
    column_name='guest_email',
    sdtype='email')
metadata.save_to_json(filepath='metadata.json')

ValueError: A file named 'metadata.json' already exists in this folder. Please specify a different filename.
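
Until such a parameter exists, one possible workaround is to save to a fresh path and then swap that file over the original. The sketch below is only an illustration: resave_in_place is a hypothetical helper (not part of SDV), and save_fn stands in for any callable that takes a path and refuses to overwrite an existing file.

```python
import os
import tempfile


def resave_in_place(save_fn, filepath):
    """Write via save_fn to a fresh path, then swap it over filepath.

    save_fn is a hypothetical stand-in: any callable that takes one path
    and raises if that path already exists.
    """
    target_dir = os.path.dirname(os.path.abspath(filepath))
    # A temporary directory next to the target guarantees that the new
    # path does not exist yet and sits on the same filesystem.
    with tempfile.TemporaryDirectory(dir=target_dir) as tmp_dir:
        tmp_path = os.path.join(tmp_dir, os.path.basename(filepath))
        save_fn(tmp_path)
        # os.replace overwrites the destination if it already exists.
        os.replace(tmp_path, filepath)
```

With the metadata object from the snippet above, this could be called as resave_in_place(lambda path: metadata.save_to_json(filepath=path), 'metadata.json').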

Expected behavior

Give users a parameter that bypasses this check. It should be off by default (maintaining the status quo), but users could toggle it to allow re-saving to the same file.

To keep consistency with AI connectors, we can call this parameter mode. Possible values are:

  • (default) 'write': The status quo. Write the metadata into a new file, and raise an error if the file already exists.
  • 'overwrite': Write the metadata into the file. If the file already exists, overwrite it.

metadata.save_to_json(
  filepath='metadata.json',
  mode='overwrite'
)
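
For discussion, here is a minimal stand-alone sketch of how the two mode values could behave. It is not SDV's implementation: the function takes a plain dictionary instead of a Metadata object, and the handling of an unrecognized mode value is an assumption.

```python
import json
import os


def save_to_json(metadata_dict, filepath, mode='write'):
    """Hypothetical stand-alone sketch of the proposed mode parameter."""
    # Assumption: unknown mode values are rejected up front.
    if mode not in ('write', 'overwrite'):
        raise ValueError("Mode must be 'write' or 'overwrite'.")

    # 'write' keeps the status quo: refuse to touch an existing file.
    if mode == 'write' and os.path.exists(filepath):
        raise ValueError(
            f"A file named '{filepath}' already exists in this folder. "
            'Please specify a different filename.'
        )

    # 'overwrite' (or a new file in 'write' mode) falls through to here.
    with open(filepath, 'w', encoding='utf-8') as f:
        json.dump(metadata_dict, f, indent=4)
```

Keeping 'write' as the default means existing calls behave exactly as before, and overwriting only happens when the caller asks for it explicitly.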
@npatki npatki added feature request Request for a new feature feature:metadata Related to describing the dataset labels Feb 25, 2025
@gsheni gsheni changed the title from "Allow re-writes to metatadata JSON files" to "Allow re-writes to metadata JSON files" Feb 25, 2025