Make yaml check error on duplicate keys #81

asottile · 2015-09-16T22:59:01Z

We make this mistake all the time:

# foo.yaml

'key1': false
'key2': true
'key1': true

ssbarnea · 2018-07-01T07:24:47Z

I am not sure about check-yaml implementation but I am using the yamllint linter which is probably the most advanced, and it does catch duplicate keys.

  - repo: https://github.com/adrienverge/yamllint.git
    rev: v1.11.1
    hooks:
      - id: yamllint
        files: \.(yaml|yml)$

Obviously that this should not be taken as a reason not to fix it official pre-commit hooks. I would myself support getting yamllint into main repo, maybe even removing the check-yaml one.

vin01 · 2018-07-12T19:01:55Z

Related: pyyaml issue (in NEW state but inactive).

Similar situation in Goland: go-yaml/yaml#154, go-swagger/go-swagger#1464

yamllint handles it well: adrienverge/yamllint@f6bab05

Instead of re-implementing the same in check-yaml, I would also like to replace the same with yamllint. Will be happy to send a PR if that makes sense.

asottile · 2018-07-12T19:03:45Z

yamllint is already supported via pre-commit so it probably just makes more sense to suggest yaml lint for this behaviour (instead of replacing / rewriting this hook)

ssbarnea · 2018-07-13T09:48:12Z

@asottile The reality is that at this moment yamllint hook is not as visible as the check-yaml on https://pre-commit.com/hooks.html page.

Just search for yaml on this page and imagine that you are a developer that is interested about pre-commit but never used it. I bet 9/10 of them will pick the check-yaml hook which is hosted in the main repository.

I would say that we should better find a way to avoid this confusion and hint users towards the current best yamllint maybe? I would personally prefer to have it directly in pre-commit

asottile · 2018-07-13T14:56:27Z

I suspect the answer to that is to "improve search".

I'm currently trying to reduce the dependencies in this repository as it is already quite slow to install and has had some conflict issues in the past. Introducing another dependency for yaml linting would go in the opposite direction. Additionally I want to give credit where credit is due by leading users to the repositories which wrote these great tools! Lastly, yaml lint and check-yaml have very different goals - - the former being a full blown linter and the latter just being a quick syntax check.

The eventual goal for this repository is to only deal with simple, unopinionated, language agnostic checks. It currently isn't quite doing that because of the python-specific checks but I hope to slowly move those out.

ssbarnea · 2018-07-13T15:43:31Z

Let's assume that pre-commit success will explode and it will be inundated with a huge number of hooks available. As each test is usually inexpensive people will be likely to want to add more to their projects, like 20-30 or even more. If each of this would point to another git repository installation would become a real issue.

I am in favour of having hooks defined in a centralized repository (registry?) (without loosing ability to have additional repos) but at the same time I do not expect this repository to be used for development of these hook. Hook dependencies should be installed per hook instead of per repository.

One way or another I think you will struggle with "simple", it will grow regardless your effort to keep the goal small. I think that the trick here is to decouple development and maintenance of hooks from their publication in a centralised registry.

There is also one thing that concerns me a lot, security: without a central "trustable" registry it would be too easy to add some cool hook today which could inject some nasty stuff into thousands of project two years from now. I am not saying that someone will do it by design, but once you have 10 git repositories pointing to 10 different individuals on github, you are creating an easy way for someone to hack you build process: they need to hack only one of these accounts. This is one place where package repositories are doing a little bit better job. For example if something bad happens on pypi, i can contact them by email or irc report the malicious item and in (hopefully) minutes it will be taken down. I doubt this will happen so quick with github repositories or other git servers.

One thing for sure, future is going to be interesting ;)

asottile · 2018-07-13T15:59:27Z

If each of this would point to another git repository installation would become a real issue.

I'm not convinced of this -- I have several thousand git repos cloned on my machine and it's not currently an issue. pre-commit also is smart about sharing these in a central cache (it's not an NxN problem, just an N problem).

I am in favour of having hooks defined in a centralized repository (registry?)

A good portion of the design is intentionally away from centralized control -- because it is entirely done through git it makes it very easy for users to pick and choose, modify in site-specific ways, or even run entirely off the grid with private repositories. I don't see this changing, it's one of the core design decisions. The closest to a "central registry" is the hooks.html page (and associated json file) -- yes the discoverability and such could be better here but for now it mostly serves its purpose.

Hook dependencies should be installed per hook instead of per repository.

This would actually make the disk space problem much much worse -- in almost all cases the installed environment's disk usage dwarfs the repository size.

There is also one thing that concerns me a lot, security

I mean, if you find a repository you don't want to use just remove it from your configuration :P. That said I do at some point want to add a hardening option (sha1 locking) but I haven't found a sound design for it that isn't just a merge-conflict-by-design.

asottile · 2018-12-28T22:18:55Z

#351 adds checking for duplicate keys

matt212 · 2019-01-31T16:32:00Z

can we disable the duplicate keys error message since anyof
swagger-api/swagger-ui#3803 is still not support
atleast i can use different payload with same endpoint ! and have a workaround on same

asottile · 2019-01-31T16:35:36Z

@matt212 could you provide a sample yaml file? the yaml specification forbids duplicate keys. the linked issue seems to have a list of maps (not duplicate keys)

matt212 · 2019-01-31T19:53:01Z

@asottile i have same endpoint with different payload structure and since we have cannot choose from multiple payload option oneOf swagger-api/swagger-ui#3803
hence one can use same endpoint twice with different payloads. however that swagger identifies as duplicate keys

asottile · 2019-01-31T20:16:42Z

This repo has nothing to do with swagger as far as I know, you'll need to provide a minimal example and invocation so I can see what you're even trying to do. Please include commands you're running and file contents and what you expect

matt212 · 2019-01-31T20:20:36Z

@asottile https://gist.github.com/matt212/97b996a8f9e98caac8afaf83d6f427e2

asottile · 2019-01-31T20:36:10Z

@matt212 according to the swagger spec that is invalid: https://swagger.io/docs/specification/paths-and-operations/

This also means that it is impossible to have multiple paths that differ only in query string [...] This is because OpenAPI considers a unique operation as a combination of a path and the HTTP method

matt212 · 2019-01-31T21:16:02Z

@asottile i knew about what my problem statement is if i have one endpoint and multiple payload how to use it in swagger or in openapi3 mode btw small workaround is "oneof" works in openapi3 but not in swagger 😞

asottile · 2019-01-31T21:19:17Z

🤷‍♂️

if you want a workaround you can use args: [--unsafe], but like, I don't see how this helps with swagger because it can't represent dupe keys in json anyway

matt212 · 2019-01-31T21:23:00Z

@asottile 😞😞

asottile closed this as completed Jul 12, 2018

asottile mentioned this issue Jul 19, 2018

New pre-commit hooks for Java, Kotlin, Golang and YAML formatting #307

Closed

kowalcj0 mentioned this issue Sep 20, 2019

yaml check complains about valid use of multiple anchors & aliases << #412

Closed

asottile mentioned this issue Feb 3, 2020

Move this hook definition to pre-commit-hooks shellcheck-py/shellcheck-py#6

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make yaml check error on duplicate keys #81

Make yaml check error on duplicate keys #81

asottile commented Sep 16, 2015

ssbarnea commented Jul 1, 2018

vin01 commented Jul 12, 2018

asottile commented Jul 12, 2018

ssbarnea commented Jul 13, 2018

asottile commented Jul 13, 2018

ssbarnea commented Jul 13, 2018

asottile commented Jul 13, 2018

asottile commented Dec 28, 2018

matt212 commented Jan 31, 2019 •

edited

Loading

asottile commented Jan 31, 2019

matt212 commented Jan 31, 2019 •

edited

Loading

asottile commented Jan 31, 2019

matt212 commented Jan 31, 2019

asottile commented Jan 31, 2019

matt212 commented Jan 31, 2019

asottile commented Jan 31, 2019

matt212 commented Jan 31, 2019

Make yaml check error on duplicate keys #81

Make yaml check error on duplicate keys #81

Comments

asottile commented Sep 16, 2015

ssbarnea commented Jul 1, 2018

vin01 commented Jul 12, 2018

asottile commented Jul 12, 2018

ssbarnea commented Jul 13, 2018

asottile commented Jul 13, 2018

ssbarnea commented Jul 13, 2018

asottile commented Jul 13, 2018

asottile commented Dec 28, 2018

matt212 commented Jan 31, 2019 • edited Loading

asottile commented Jan 31, 2019

matt212 commented Jan 31, 2019 • edited Loading

asottile commented Jan 31, 2019

matt212 commented Jan 31, 2019

asottile commented Jan 31, 2019

matt212 commented Jan 31, 2019

asottile commented Jan 31, 2019

matt212 commented Jan 31, 2019

matt212 commented Jan 31, 2019 •

edited

Loading

matt212 commented Jan 31, 2019 •

edited

Loading