Rationalize quoting configs + properties #2986

Open
jtcohen6 opened this issue Dec 31, 2020 · 13 comments
Labels
enhancement New feature or request paper_cut A small change that impacts lots of users in their day-to-day

Comments

@jtcohen6
Contributor

jtcohen6 commented Dec 31, 2020

Describe the feature

Picks up from issues like #2468 and #2975, which are narrower in scope and offer more straightforward near-term fixes

  • Why do we call it quoting when configuring database/schema/identifier names, but then quote when describing properties of column names?
  • Why does each adapter have to implement its quoting character sort-of twice? (Adapter quoting should use self.Relation.quote_character #2243)
  • There's also quote_columns, which is a seed-only config item that lives on its own level but surely belongs inside the quoting config item (or will it be quote?!) as columns
  • Not to mention quoted, which while not itself a config, returns the quoted version of a column name from a Relation based on the configs above

Instead, we should have a single config/property, and I think it should be quote. This would take over from the current project-level quoting config:

quote:
  database: true|false    # or `project` on dbt-bigquery
  schema: true|false      # or `dataset` on dbt-bigquery
  identifier: true|false
  columns: true|false

The quote: {columns: true} would also replace quote_columns as a bespoke config for seeds. If that config is specified in dbt_project.yml, it can be superseded by:

  • setting quote: {} inside the config() block for a specific model
  • setting quote: true|false for a specific column in models/*.yml (it's implied that this really means quote: {column: true})
  • once "Set configs in schema.yml files" (#2401) lands, setting a model's quote: {} config within models/*.yml, too

If quote is not set, it falls back to the default behavior of the adapter plugin, which also sets the character used for quoting (almost always " or `).
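The precedence described above can be sketched roughly as follows. This is an illustration of the proposal only, not dbt's implementation: `resolve_quote`, `column_is_quoted`, and the `ADAPTER_DEFAULT` values are hypothetical names, and merging key by key (rather than replacing the whole dict at each level) is an assumption the issue does not settle.

```python
from typing import Mapping, Optional

# Hypothetical adapter defaults; real adapter plugins define their own policy.
ADAPTER_DEFAULT = {"database": True, "schema": True, "identifier": True, "columns": False}


def resolve_quote(
    adapter_default: Mapping[str, bool],
    project: Optional[Mapping[str, bool]] = None,
    model: Optional[Mapping[str, bool]] = None,
) -> dict:
    """Merge quote settings; more specific levels win, key by key (assumed)."""
    resolved = dict(adapter_default)
    for level in (project, model):  # dbt_project.yml first, then the model's config()
        if level:
            resolved.update(level)
    return resolved


def column_is_quoted(resolved: Mapping[str, bool], column_quote: Optional[bool] = None) -> bool:
    """A column-level `quote: true|false` overrides the resolved `columns` key."""
    return resolved["columns"] if column_quote is None else column_quote


resolved = resolve_quote(ADAPTER_DEFAULT, project={"columns": True}, model={"schema": False})
print(resolved)
print(column_is_quoted(resolved))                      # falls back to the resolved config
print(column_is_quoted(resolved, column_quote=False))  # the column-level setting wins
```

Anything left unset at every level keeps the adapter default, which matches the fallback described in the paragraph above.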

Questions

Here's what we have in the docs FAQs today for sources:

By default, dbt will not quote the database, schema, or identifier for the source tables that you've specified.

Should sources start respecting project-level quote settings? Or should they continue to act independently, with the option to turn this config/property on or off for all sources in dbt_project.yml:

sources:
  quote:
    schema: true

Describe alternatives you've considered

Retaining all of these configs/properties/adapter methods and documenting them exceptionally well so as to avoid confusion

Additional context

This isn't specific to any one database, though it is likely most helpful on databases that support special characters if quoted (Postgres, Redshift) or are particularly sensitive to quoting (Snowflake).

There's a round-up of all the known documentation related to quoting in dbt-labs/docs.getdbt.com#3518.

@jtcohen6 jtcohen6 added enhancement New feature or request 1.0.0 Issues related to the 1.0.0 release of dbt labels Dec 31, 2020
@leahwicz
Contributor

  • Keep accepting the old way but don't advertise (don't break how users are currently using it)
  • Tricky: getting the inheritance to work
  • Related to the configs vs properties battle

@nathaniel-may
Contributor

  • What are the exit criteria for this issue?
    • A new project-level config called "quote" that matches the behavior of the current quote and quoting configs.
    • It can be used in config blocks to override project-level quote settings in old or new style.
  • What are the high-level items of the work that need to be done (i.e. create x, split out y, etc.)
    • Add a new config to yaml parsing.
    • Add a new config to the internal config logic
    • Make sure overrides work (we probably don't get this for free.)
    • Testing
  • What are the open questions on this issue that still need answers?
    • Is there anything this would do beyond what's expressible today, or is it just ergonomic? A: Just ergonomic.
    • Would we deprecate the old way of doing these things or just let them both ride? A: Not initially.
    • How do we want to handle sources? (question in ticket since they seem to work independently)
  • Are there blockers/prerequisites to starting this work?
    • I think it could be started.
    • It's related to the configs vs. properties work, so make sure the two don't overlap. Neat inheritance of these configs would need to take into account that some quoting settings are dbt configs and some are dbt properties.
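
One way to keep accepting the old spellings while introducing the new one is to fold them into a single dict before any inheritance logic runs. The sketch below is hypothetical: `normalize_quote_config` is not a dbt function, and letting the new `quote` key win when both spellings are present is an assumption, not something decided in this thread.

```python
from typing import Any, Dict


def normalize_quote_config(raw: Dict[str, Any]) -> Dict[str, bool]:
    """Fold legacy `quoting` / `quote_columns` keys into one `quote` dict.

    Assumption: if old and new spellings are both set, the new `quote` key wins.
    Only project-level and model-level dicts are handled here; a column-level
    `quote: true|false` would need separate handling.
    """
    quote: Dict[str, bool] = {}
    quote.update(raw.get("quoting") or {})       # legacy project-level config
    if "quote_columns" in raw:                   # legacy seed-only config
        quote["columns"] = bool(raw["quote_columns"])
    quote.update(raw.get("quote") or {})         # proposed unified config
    return quote


print(normalize_quote_config({"quoting": {"schema": False}, "quote_columns": True}))
print(normalize_quote_config({"quoting": {"schema": False}, "quote": {"schema": True}}))
```

Normalizing first means the override logic only ever sees one shape, which is one way to keep the "old or new style" requirement from leaking into the inheritance code.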

@github-actions
Contributor

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days.

@github-actions github-actions bot added the stale Issues that have gone stale label May 11, 2022
@jtcohen6 jtcohen6 removed the stale Issues that have gone stale label May 11, 2022
@jtcohen6
Contributor Author

I still care about this one :)

@adamcunnington-mlg

I really care about this one!

All I want to do is ensure that dbt quotes all column names that I reference or create (via sql selects) and my only option right now is to explicitly define every single column in every single model. I've not actually tried doing that but I suspect that will only quote OUTPUT columns in a model, not columns that I select during my sql.
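
For illustration, quoting every referenced column amounts to something like the sketch below. This is not how dbt does it: `quote_identifier` and `select_list` are hypothetical helpers, and escaping an embedded quote character by doubling it follows the standard SQL convention for double-quoted identifiers, which is an assumption for other dialects.

```python
from typing import Iterable


def quote_identifier(name: str, quote_char: str = '"') -> str:
    """Wrap a name in the quote character, doubling any embedded occurrence (assumed convention)."""
    escaped = name.replace(quote_char, quote_char * 2)
    return f"{quote_char}{escaped}{quote_char}"


def select_list(columns: Iterable[str], quote_char: str = '"') -> str:
    """Render a comma-separated select list with every column name quoted."""
    return ", ".join(quote_identifier(col, quote_char) for col in columns)


print(select_list(["id", "order date", 'say "hi"']))
print(select_list(["id", "order date"], quote_char="`"))
```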

@github-actions
Contributor

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

@github-actions github-actions bot added the stale Issues that have gone stale label Jan 14, 2024
@github-actions github-actions bot removed the stale Issues that have gone stale label Feb 27, 2024
@dbeatty10
Contributor

Here's a couple other issues related to quoting (specifically about applying proper escaping prior to quoting):

@gwenwindflower

this is in adapters now, but also adding that seeds are not consistently quoted and it would be cool if they were also a config option under the proposed quote config.

dbt-labs/dbt-adapters/issues/178

@alison985

Adding ~quoting related issue for snapshots. #10356 cc: @jeremyyeo

@graciegoheen graciegoheen added the paper_cut A small change that impacts lots of users in their day-to-day label Oct 29, 2024