Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pydantic model for dvc.yaml #9606

Closed
wants to merge 2 commits into from
Closed

Conversation

skshetry
Copy link
Member

Can be useful on comparing with voluptuous schema. Is compatible with pydantic both v1 and v2.

For now, it's only for playing around, and does not replace voluptuous based schema or dvcyaml-schema that is used in vscode.

cc @aguschin

@codecov
Copy link

codecov bot commented Jun 14, 2023

Codecov Report

Patch coverage has no change and project coverage change: -0.21 ⚠️

Comparison is base (97348cf) 90.53% compared to head (11348cb) 90.32%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #9606      +/-   ##
==========================================
- Coverage   90.53%   90.32%   -0.21%     
==========================================
  Files         471      472       +1     
  Lines       36040    36132      +92     
  Branches     5185     5189       +4     
==========================================
+ Hits        32628    32636       +8     
- Misses       2818     2902      +84     
  Partials      594      594              
Impacted Files Coverage Δ
dvc/pydantic_schema.py 0.00% <0.00%> (ø)

... and 2 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@efiop
Copy link
Contributor

efiop commented Jun 14, 2023

@skshetry Is this intended to be ready for review? What's the purpose/user for adding this on the side as dead code?

@skshetry
Copy link
Member Author

Is this intended to be ready for review? What's the purpose/user for adding this on the side as dead code?

Eventually, I want this to be a foundation for parsing dvc.yaml/dvc.lock/.dvc files.
voluptuous based schema is getting complicated to maintain and is very hard to understand.
And that whole code is just for validation. We still need separate data structures, parsing logic, etc.

We also have some needs in parametrization, where we want to know the type of some fields at runtime, which is not possible today. And we are already maintaining something like this in dvcyaml-schema.

That's a lot of things, which pydantic may or may not be able to fulfil, but I wanted to at least experiment. Also wanted to test pydantic2 and compare performance with voluptuous based schema.

Is this intended to be ready for review

This is more or less ready for review. The validation errors may be confusing, but that's not the point of this PR.

@efiop
Copy link
Contributor

efiop commented Jun 14, 2023

@skshetry Can we just add this once we actually are going to use it? I'm really hesitant to add such a dead code without any purpose. Seems like you can do the testing that you want without merging this into the upstream. Or do you need it in upstream specifically?

@skshetry
Copy link
Member Author

Can we just add this once we actually are going to use it? I'm really hesitant to add such a dead code without any purpose. Seems like you can do the testing that you want without merging this into the upstream. Or do you need it in upstream specifically?

If it's not going to get merged here, this is going nowhere, and will have the same fate as #5531. This is experimental for now, and the changes that are needed are too many that it is hard to keep up in my own fork. Neither we will be able to convince spending time on something that is not upstream (it's already hard to do that in upstream).

It is easy to keep parity between two schema for the time being. I won't make any promises here, and if it ends up going nowhere, it'll definitely be a replacment for dvcyaml-schema.

@efiop
Copy link
Contributor

efiop commented Jun 15, 2023

@skshetry It looks like a separate code, what's the problem in maintaining it? Doesn't look like you are touching any existing code even.

No one except for you will probably even remember to update it for parity. Overall feels like dead code, hence why I'm trying to figure out what the plan is for this. We have some dead code already like in-module testing snippets, it doesn't seem right to add more dead code for no strong purpose.

If you think that this is worth migrating to, let's discuss it and plan it. I don't remember discussing #5531 during planning. If this is important we can definitely allocate time for it.

@dberenbaum
Copy link
Contributor

I see positive reactions from @aguschin and @daavoo here. Any thoughts from you or the rest of @iterative/dvc on this one?

I'm adding it to items to discuss in refinement in case it doesn't get resolved by then so it doesn't get stuck in limbo.

@skshetry
Copy link
Member Author

skshetry commented Jun 16, 2023

No one except for you will probably even remember to update it for parity. Overall feels like dead code

Tbh I have been the de-facto maintainer of the file parsing in dvc and in dvcyaml-schema, which should not be this way, but unfortunately that is the case.

I don't quite understand the "dead code" argument, I think I have been honest with that this being experimental, and the direction I want the subsystem to take in.

A lot of things that needs to be solved here are technical. If it was a solved problem, I would have come with a giant PR. Also, we have a lot of complexity and UI requirements (think of nicer error messages that we return today) that it is not possible to do in one shot. I am being conservative here.

It'll gradually be worked on. (Isn't everything a dead code that is in an active development? 😄

I'd prefer to discuss this on technical merits.

@efiop
Copy link
Contributor

efiop commented Jun 16, 2023

@skshetry Your work is appreciated, it really is. I'm just trying to understand the plan for this, as I have no idea where this is going to go. Can you lay down a plan in an issue or something with purpose and steps to achieve it? (the steps could be vague, that's fine). If you only want to migrate from voluptuous to pydantic, shouldn't that be doable reasonably fast, as it is just swapping one library for another? Or am I not understanding the complexity or scope or other changes?

@skshetry
Copy link
Member Author

Closing the PR for now.

@skshetry skshetry closed this Sep 28, 2023
@skshetry skshetry deleted the pydantic2 branch September 28, 2023 13:43
@SoyGema
Copy link

SoyGema commented Oct 20, 2023

Under a context that is not relevant to this conversation and without being aware of this specific initiative , maybe worth citing that found that voluptuous is no longer actively maintained and failing with pip-rating . This could nudge the decision beyond any DVC functionality and add a maintenance dimension to the discussion.

Captura de pantalla 2023-10-20 a las 16 41 26

@SoyGema SoyGema mentioned this pull request Oct 20, 2023
2 tasks
@dacbd
Copy link

dacbd commented Oct 24, 2023

FTR, I'm also in favor of consolidating validation tooling around pydantic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Archived in project
Development

Successfully merging this pull request may close these issues.

5 participants