Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate discriminator? #2143

Closed
tedepstein opened this issue Feb 16, 2020 · 104 comments
Closed

Deprecate discriminator? #2143

tedepstein opened this issue Feb 16, 2020 · 104 comments
Labels
discriminator Moved to Moonwalk Issues that can be closed or migrated as being addressed in Moonwalk
Milestone

Comments

@tedepstein
Copy link
Contributor

tedepstein commented Feb 16, 2020

I just created an example to illustrate what I think is idiomatic use of discrimator, so I could help answer #2141. I find it helpful to use JSON Schema Lint so I can validate in real-time. To make sure the discriminating logic worked correctly in a standard JSON Schema validator (not aware of OAS discriminator), I used standard JSON Schema keywords to duplicate that logic.

This begs the question: now that 3.0 supports a large-enough subset of JSON Schema to describe discriminated subtypes, and 3.1 is planned to support full JSON Schema, do we still need discriminator?

@handrews mentions the same idea here in #2031, so I think the idea deserves its own issue for future version planning.

I can see that discriminator might have some value for code generators. It might be easier for a code generator to be told explicitly to discriminate based on a given property, rather than relying on a JSON Schema validator to identify the matched subtype, or recognizing a pattern involving oneOf, enum (or const), etc.

But discriminator, as it's defined, is kind of problematic. @handrews pointed out that it ignores some available JSON Schema semantics. And I've observed, generally, that the documentation is difficult (for me) to follow, and seems to leave some questions unanswered. Without trying to create a comprehensive list here:

  • I have doubts about the combination of mapping and default name matching shown in the last example of this section.
  • It seems that mapping is supposed to be supplemental to name matching, rather than replacing it. In that case, is there a way for the discriminator to ensure that the discriminator property value is one of a known list of subtypes? Do we always need a separate oneOf to validate this?
  • Does name matching (without mapping) assume it's going to find the named subtype schema in #/components/schemas, or is there some other expectation?

Maybe this is more weight than we need to carry, and we'd be better off leaving this problem to JSON Schema and some new vocabulary for code generation, O-O type system mapping, or something like that.

@mewalig
Copy link

mewalig commented Feb 16, 2020

FYI-- the JSON Schema Lint link is incorrect or isn't working https://github.com/OAI/OpenAPI-Specification/issues/jsonschemalint.com

@tedepstein
Copy link
Contributor Author

Thanks, @mewalig. Fixed.

@mewalig
Copy link

mewalig commented Feb 16, 2020

Are there any simple examples of how "discriminator" in OpenAPI would compare to the analogous schema as described using JSON Schema?

Also, do any validators exist today that will properly enforce either the OpenAPI discriminator or the JSON Schema discriminator equivalent? I understand that this repo is not intended for tools, but in evaluating whether, out of two standards, one should be ditched, it would be nice to be able to play around a bit more with each in order to form a better opinion

@tedepstein
Copy link
Contributor Author

Are there any simple examples of how "discriminator" in OpenAPI would compare to the analogous schema as described using JSON Schema?

If you take my example schema from #2141 and remove the discriminator, that's how you'd do it in JSON Schema.

Also, do any validators exist today that will properly enforce either the OpenAPI discriminator or the JSON Schema discriminator equivalent?

Any compliant JSON Schema validator should be able to enforce a JSON Schema like the one I posted.

I don't know about tools that enforce OpenAPI discriminator, but it sounds like you have found some already. You might want to try them again with the schema I provided (modified back to your original properties property name, otherwise you'd need to adjust your test examples).

@mewalig
Copy link

mewalig commented Feb 16, 2020

Yes, absolutely, please get rid of discriminator. My reasons explained in more detail at #2141

@handrews
Copy link
Member

@tedepstein thanks for filing this!

@mewalig a possible way to solve this problem based on adding some extension keywords to the most recent draft of JSON Schema might look like the following.

First, let's just look at how to make inheritance work better at all:

{
    "oneOf": [
        {   
            "$ref": "#/$defs/child1"
        },  
        {   
            "$ref": "#/$defs/child2"
        },  
        {   
            "$ref": "#/$defs/child3"
        }
    ],  
    "unevaluatedProperties": false,
    "$defs": {
        "base": {
            "className": "TheBaseClass",
            "type": "object",
            ... 
        },  
        "child1": {
            "$ref": "#/$defs/base",
            "classRelation": "is-a",
            "className": "Foo",
            ... 
        },  
        "child2": {
            "$ref": "#/$defs/base",
            "classRelation": "is-a",
            "className": "Bar",
            ...
        },  
        "child3": {
            "$ref": "#/$defs/base",
            "classRelation": "is-a",
            "className": "SomethingElse",
            ... 
        }   
    }   
}

Here we've introduced two new keywords, className and classRelation. These are annotations rather than assertions, meaning they have no impact on validation.

className tells code generators that each of the definitions is intended to represent an object-oriented class, and gives that class an explicit name (rather than relying on the defs names, e.g. "child1", "child2", etc., which are implementation details of the schema, not interfaces).

classRelation is a semantic clarifier of the adjacent $ref, indicating that the $ref target is to be treated as the base class for the current class by code generators. Because $ref doesn't always mean inheritance, so when we want it to do that, we should be explicit about it.

Code Generation

When a code generator looks at this, it is going to statically analyze the schema without an instance. It will ignore the oneOf, but notice that there are four classes here, and that three of them designate the fourth to be their base class. That is enough information to generate the correct classes. The oneOf is superfluous for code generation, which is good because you might just have a schema document consisting of the $defs part, and only use them with a oneOf somewhere else. I think this part is pretty straightforward, let me know if it is not clear.

Validation

When you validate an instance against this schema, the oneOf becomes important. Only one of those oneOf branches (at most) will validate. So if your validator supports annotations in validation output, and it matches the 2nd child entry (Bar), you'll get the following information, more or less:

  • a className annotation of "Bar" from /oneOf/1/$ref/className
  • a classRelation annotation of "is-a" from /oneOf/1/$ref/classRelation
  • a className annotation of "TheBaseClass" from /oneOf/1/$ref/$ref/className

Note that the "from" pointers are showing the dynamic scope, the path traversed at runtime including references. Note also that the names "base", "child1", "child2", etc. from under $defs do not appear, because we reach those schemas through a $ref dynamically, not by static traversal of the schema.

Instantiation

If we're trying to instantiate a class from a JSON instance, first we validate it and get the above annotation outputs. We notice that there are two className values attached. But one of those is accompanied by a classRelation annotation in the same schema object.

So we immediately know that this JSON instance data is a valid "Bar", and that a "Bar" "is-a" ...
That's probably enough information to instantiate the thing right there, as the code generation should have already taken inheritance into account, and you know the data is valid.

If we want to make sure the other className is not what we really should be using, we look a little deeper. The dynamic schema path prefix of these "Bar" and "is-a" annotations is /oneOf/1/$ref. classRelation works next to a $ref, so we look for /oneOf/1/$ref/$ref, and sure enough we find that it is also a named class, "TheBaseClass." So now we know that "Bar" "is-a" "TheBaseClass".

We notice that there aren't any other className annotations in these results, and we obviously want to instantiate the derived class (that's what you use oneOf for! otherwise you'd just $ref the base class). So we're done, and we can be certain that we want to instantiate "Bar" and not "TheBaseClass."

Optimization

Technically, this is all we need. Whether you look at this schema statically (as a code generator would) or dynamically (as an instantiator would), you can figure out everything you need from your task. And none of it gets in the way of validation (unknown keywords are ignored by the validator).

One of the reasons OAS has discriminator is that it can be confusing to figure out how data maps to subclasses. And potentially expensive to do that validation. We could come up with another keyword that would optimize that, although honestly I'd want to understand the use case a bit more. An advantage of just going with className and classRelation (or something like them- I pretty much made these up on the spot and have not thought them through) is that it works with arbitrarily complex subclass differences. You aren't forced to have a type enumeration field.

But you could definitely have one. I started trying to write it out, but it gets pretty annoying because either you end up with magical behavior (the way the value of the field suddenly has to match some other thing in the schema structure, and has to validate, which means the same value has to appear in two different places and be kept in sync), or it provides minimal performance benefits because you still have to do some validation.

I'd want to understand what needs to be optimized in this case before throwing in more keywords.

@liquidaty
Copy link

liquidaty commented Feb 17, 2020

Thanks for the thoughtful response.

I'd want to understand what needs to be optimized in this case before throwing in more keywords.

Just to clarify, my suggestion would not be to add any keywords, but rather, to remove them-- in particular, to remove "discriminator" and everything associated with it.

Using the schema provided by @tedepstein, I was able to do everything I wanted in OpenAPI/JSON Schema-- without the discriminator keyword-- and it worked better without discriminator, because it was clearer (in my opinion), and none of the validation tools I could find work with discriminator, but at least one (the first one I tried) worked using the approach without discriminator (I understand this forum is not meant to cover tool issues, but from an end user standpoint, since I would rather use than build tools for OpenAPI, discriminator is of little value to me if I can't validate it).

That said, I'm not familiar enough with the issues to know that there isn't some other use case that discriminator solves, which can't be solved without it. However, if, or to the extent that, such case does not exist, my preference would be to eliminate discriminator.

@handrews
Copy link
Member

@liquidaty

Just to clarify, my suggestion would not be to add any keywords, but rather, to remove them-- in particular, to remove discriminator and everything associated with it.

Yes, my example would remove discriminator and the Discriminator Object entirely. Or at least works without it- removal is a function of the deprecation policy, so I doubt it would be removed before 4.0.

Using the schema provided by @tedepstein, I was able to do everything I wanted in OpenAPI/JSON Schema-- without the discriminator keyword

Just to be clear, we're talking about this example schema? #2141 (comment)

Doesn't it use discriminator? That looks like one in the "ObjectBaseType" schema.

@tedepstein
Copy link
Contributor Author

tedepstein commented Feb 18, 2020

@handrews , that is a switch-hitting schema that includes discriminator, but also includes everything a standard JSON Schema parser would need in order to validate a message:

  • A wrapper schema (called Object in this example) that uses oneOf to direct the validator to one of the concrete subtypes. Doesn't rely on special discriminator logic to determine the subtype and perform an additional validation against that subtype.

  • A single-valued enum assertion on the designated discriminator property in the concrete subtype schemas, to explicitly identify the subtype so oneOf works reliably. (This would be a const assertion, but that's not supported in OAS 3.0, thus enum.)

@mewalig couldn't get any available OpenAPI validator to work with discriminator, but the pattern above worked fine with a standard JSON Schema validator. That's what reminded me of your comment in #2031 about getting rid of discriminator, and prompted me to open this issue.

If you remove discriminator from that example schema, the only thing missing is a straightforward way for a code generator to:

  • know about the class hierarchy; and
  • optimize validation based on the designated discriminator property, or at least coordinate with the JSON Schema validator to know which subtype was matched by the oneOf.

At first glance, the pattern you've described seems to address both of these.

@mewalig
Copy link

mewalig commented Feb 18, 2020

fwiw, personally, if a code generator works fine with a discriminator-free schema, I can't think of any value in having a redundant-and-more-verbose alternative schema just to generate some alternative but functionally equivalent code, and if there was more value in having the alternative code, I would find it better for the code generator to figure it out based on the schema pattern. That would eliminate the possibility of inconsistency between the information related to ```discriminator`` and the information in the rest of the schema.

@tedepstein
Copy link
Contributor Author

@mewalig , I don't think anyone here is making an argument to keep discriminator indefinitely. We all want to deprecate it and see it replaced by a standard JSON Schema pattern. But this isn't something that happens overnight, because there is a large and diverse community of OpenAPI users and tool providers who need to weigh in.

Also, the earliest release in which we could deprecate discriminator would be 3.1, and that assumes the community agrees that it's OK to deprecate features in a minor release. I don't know if we have really discussed that. Anyway, just saying, don't hold your breath waiting for a final confirmation that we're removing discriminator. :-)

@handrews
Copy link
Member

@tedepstein

Doesn't rely on special discriminator logic to determine the subtype and perform an additional validation against that subtype.

It says something (I'm not sure what, but something) that I totally forgot about this behavior, which is in fact my biggest objection to the keyword! 🤣

@handrews
Copy link
Member

@tedepstein We merged a deprecation of nullable, not sure if that counts as a deprecation of a "feature", particularly given that there is a simple alternative that can be easily migrated to.

discriminator would be more tricky, and I would not expect keywords such as those I used above to be added for OAS 3.1. Which is why I speculated about deprecating it in OAS 3.2 if such a thing ends up going out (or OAS 4 if we drop minor releases or just don't do another one in the 3.x sequence).

@mewalig
Copy link

mewalig commented Feb 20, 2020

At the very least, I think it would be immensely helpful for the documentation (e.g. the spec and related example) to:

  1. make it clear that discriminator is entirely optional, and
  2. provide the analogous example that achieves the same result without the use of discriminator

It might also be worth mentioning that #2 relies on fewer specification features, which are common to JSON Schema, and as such are better supported by validation (and other?) tools.

Those changes alone would have saved me more than 10 hours of time going in circles with discriminator only to end up with a better solution that doesn't use it (not to mention the time so generously contributed by other folks on this issue and #2141), and judging by other online forums where similar questions are asked but not comprehensively answered, I would think others would similarly benefit. Adding that to the docs would also help to pave the way for a smoother transition if/when deprecation happens.

@sm-Fifteen
Copy link

I still see discriminator as a valid way of representing a tagged union, where a single field is used to determine the actual value type, as opposed to the more standard way of validating combined schemas without it. JSON Schema supports far more complex conditional schema validation, which technically subsumes what discriminator tries to do in terms of features, but as far as API typing goes, I think something as flexible as conditional typing has the risk of making contracts harder to verify.

Unions as currently supported by OpenAPI (through oneOf, anyOf and allOf) and tagged unions (what discriminator does) are probably as far as an API specification can go typing-wise without leaking implementation details (classes being an OOP concept). As far as I can tell, conditional validation makes it harder to represent all expressable schemas as valid data types.

@tedepstein
Copy link
Contributor Author

tedepstein commented Feb 29, 2020

@sm-Fifteen , thanks for articulating this:

I still see discriminator as a valid way of representing a tagged union, where a single field is used to determine the actual value type, as opposed to the more standard way of validating combined schemas without it.

That's the benefit of discriminator: It provides a direct, unambiguous way of expressing the intent to model a tagged union, and gives processors a straightforward way to interpret the tags.

Without these clear markers, pattern recognition is much more complex.

  • Processors have to deal with different variants of the pattern, e.g. using const, enum or pattern to assert the discriminator value.
  • They have to decide how to handle cases where some branch of the schema diverges from the pattern, e.g. by introducing additional assertions in a oneOf subschema
  • They may have to improvise their own custom annotations to disambiguate or to provide additional metadata, like name mapping.

The case for deprecating discriminator will be stronger as and when there's an equivalent JSON Schema vocabulary for this. If that doesn't happen on its own, maybe OpenAPI can contribute a vocabulary, alone or in collaboration with other interested parties.

@mewalig
Copy link

mewalig commented Mar 2, 2020

Since a functional JSON equivalent to discriminator already exists, is there any reason to keep discriminator, aside from legacy continuity (which admittedly is important), other than convenience (i.e. easier read, write or generate code from the schema)?

If that's the goal, then another possibility might be to have a "type" value of class, which would essentially be a specialized form of object (or alternatively, having a reserved keyword/value pair for the object type e.g. "subtype": "class"). This class object could then just follow a traditional IDL class syntax. Using this approach, the example might look like:

{
    "oneOf": [
        {   
            "$ref": "#/$defs/base"
        }
    ],
    "$defs": {
        "base": {
            "type": "class",
            "name": "TheBaseClass",
            "properties": {
              "base_prop": ...
            }
        },  
        "child1": {
            "type": "class",
            "name": "Foo",
            "extends": "base",
            "properties": {
              "foo_prop": ...
            }
            ... 
        },  
        "child2": {
            "type": "class",
            "name": "Bar",
            "extends": "base",
            "properties": {
              "bar_prop": ...
            }
            ...
        },  
        "child3": {
            "type": "class",
            "name": "SomethingElse",
            "extends": "base",
            "properties": {
              "other_prop": ...
            }
            ... 
        }   
    }   
}

and the corresponding JSON like this:

{
  "base_prop": ...,
  "other_prop": ... // class ```child3``` is implied
}

or

{ // explicit class
  "className": "Bar", // alternatively: "classId": "child2"
  "base_prop": ...,
  "other_prop": ... // invalid: not a property of Bar class
}

It seems to me that this would be easy to convert to equivalent JSON Schema, which would allow for easy validation of the OpenAPI schema, because the validator could just convert to JSON Schema and then validate that. The same would be true for code generators. In addition, there would be a lot of benefits if it allows OpenAPI to reuse familiar, mature and proven IDL constructs and/or syntax (lower barriers to adoption, higher functional value, less need for revision / change, etc)

@tedepstein
Copy link
Contributor Author

is there any reason to keep discriminator, aside from legacy continuity (which admittedly is important), other than convenience (i.e. easier read, write or generate code from the schema)?

Clarity of intent. Standardized, unambiguous way to denote tagged unions, rather than relying on a loosely defined convention. "Convenience" translates into availability of code generators, validators and other tools: more of them, better consistency, and higher quality. All good things for the OpenAPI ecosystem and user community. These are the reasons described in recent posts here.

I'm not arguing that it's ideal to keep discriminator in its current form, in OpenAPI, forever. But I would like to see it replaced by a JSON Schema vocabulary that has the same level (at least) of clarity, consistency, and simplicity.

@mewalig
Copy link

mewalig commented Mar 2, 2020

Clarity of intent. Standardized, unambiguous way to denote tagged unions, rather than relying on a loosely defined convention.

Those are exactly the reasons that, at least imho, a class type that then followed in the footsteps of a mature IDL would be an ideal solution. I know that change would be significant though.

I'm not arguing that it's ideal to keep discriminator in its current form, in OpenAPI, forever. But I would like to see it replaced by a JSON Schema vocabulary [...]

Any idea if that is currently being contemplated for JSON Schema? I'd definitely be interested to learn more about that.

[...] that has the same level (at least) of clarity, consistency, and simplicity

fwiw, personally I would think this is a very low bar. In searching for more examples, I found a lot of posts asking unanswered questions about how to use discriminator, and my impression was that there doesn't seem to be much as far as documentation or complete examples, other than maybe one simple example which might be hard to translate into real-world use cases.

@handrews
Copy link
Member

handrews commented Mar 3, 2020

@mewalig we (the JSON Schema project) have enabled extensible vocabularies in the new draft. We are hoping that OpenAPI folks will take the lead on code generation proposals. Not necessarily the OpenAPI project itself (although I'd be happy with that), but the OpenAPI community. OAS is the primary driver of codegen. Also, there are a lot more of y'all than there are of us.

@mkistler
Copy link

In the hope that it will inform this discussion, I’ll describe how we (IBM) are using discriminators in our APIs and code generation tools.

Here’s an example use of discriminator in the API for the IBM Discovery service:

      "QueryAggregation": {
        "type": "object",
        "description": "An aggregation produced by  Discovery to analyze the input provided.",
        "discriminator": {
          "propertyName": "type",
          "mapping": {
            "histogram": "#/components/schemas/Histogram",
            "max": "#/components/schemas/Calculation",
            "min": "#/components/schemas/Calculation",
            "average": "#/components/schemas/Calculation",
            "sum": "#/components/schemas/Calculation",
            "unique_count": "#/components/schemas/Calculation",
            "term": "#/components/schemas/Term",
            "filter": "#/components/schemas/Filter",
            "nested": "#/components/schemas/Nested",
            "timeslice": "#/components/schemas/Timeslice",
            "top_hits": "#/components/schemas/TopHits"
          }
        },
        "properties": {
          "type": {
            "type": "string",
            "description": "The type of aggregation command used. For example: term, filter, max, min, etc."
          },

QueryAggregation is "class" of schemas that may be returned from a query that requests aggregation of results in various forms. The various aggregation types vary quite significantly, but notice that some types, e.g. "max", "min", "average", share a common schema. In this particular case, the "child" schemas are composed using allOfwith QueryAggregation as one element and then the specific properties of the child in a second element. E.g.


      "Calculation": {
        "allOf": [
          {
            "$ref": "#/components/schemas/QueryAggregation"
          },
          {
            "properties": {
              "field": {
                "type": "string",
                "description": "The field where the aggregation is located in the document."
              },
              "value": {
                "type": "number",
                "format": "double",
                "description": "Value of the aggregation."
              }
            }
          }
        ]
      },

Next I'll describe how this is used in our tooling. The first thing to say about our SDK generation tooling is that it does not do any validation based on JSON schema. Some may consider this heresy, but we use the schemas in the API def purely for modeling.

In Java and similar type-strict languages, the QueryAggregation schema is rendered as a public class (it is not abstract, but if the composition were "flipped" to use oneOf it would be). The "child" schemas are rendered as subclasses of QueryAggregation, e.g. Calculation.

The discriminator in the QueryAggregation schema is rendered into the QueryAggregation class as static metadata:

  protected static String discriminatorPropertyName = "type";
  protected static java.util.Map<String, Class<?>> discriminatorMapping;
  static {
    discriminatorMapping = new java.util.HashMap<>();
    discriminatorMapping.put("histogram", Histogram.class);
    discriminatorMapping.put("max", Calculation.class);
    discriminatorMapping.put("min", Calculation.class);
    discriminatorMapping.put("average", Calculation.class);
    discriminatorMapping.put("sum", Calculation.class);
    discriminatorMapping.put("unique_count", Calculation.class);
    discriminatorMapping.put("term", Term.class);
    discriminatorMapping.put("filter", Filter.class);
    discriminatorMapping.put("nested", Nested.class);
    discriminatorMapping.put("timeslice", Timeslice.class);
    discriminatorMapping.put("top_hits", TopHits.class);
  }

This metadata is used by our deserialization logic to trigger and guide the use of a custom TypeAdapter that is created in DiscriminatorBasedTypeAdapterFactory. The custom TypeAdapter uses the value of the discriminator to choose a concrete class, based on the discriminatorMapping metadata in the class, to be produced by the deserialization logic.

If there were no discriminator, the generated code would look very different. We would instead create a QueryAggregation class containing the union of all the properties of the child schemas, and the returned class would be an instance of this "generic" QueryAggregation class.

Sorry for the long post. I hope this has been clear and informative.

@mewalig
Copy link

mewalig commented Mar 22, 2020

That is a great example. Is there any reason your generated code would (or should) be different for the equivalent JSON Schema (which I have attempted to generate in the below Case 2)? Obviously, the second one is much less compact, and I am not suggesting that it is better, or that anyone should have to use that instead of discriminator. Rather, I am wondering if discriminator might be more useful if treated (and documented) as optional syntactic sugar (which tools can simply be convert into straight JSON Schema, which in turn offers various other benefits such as validation tool support).

Case 1 (same as your example):

      "QueryAggregation": {
        "type": "object",
        "description": "An aggregation produced by  Discovery to analyze the input provided.",
        "discriminator": {
          "propertyName": "type",
          "mapping": {
            "histogram": "#/components/schemas/Histogram",
            "max": "#/components/schemas/Calculation",
            ...

Case 2:

      "QueryAggregation": {
        "type": "object",
        "description": "An aggregation produced by  Discovery to analyze the input provided.",
        "oneOf": [
          {
            "allOf": [
              {
                "required": ["type"],
                "properties": {
                  "type": { "enum": ["histogram"] }
                }
              },
              { "$ref": "#/components/schemas/Histogram" }
            ]
          },
          {
            "allOf": [
              {
                "required": ["type"],
                "properties": {
                  "type": { "enum": ["max"] }
                }
              },
              { "$ref": "#/components/schemas/Calculation" }
            ]
          }
        ]
        ...

@darrelmiller
Copy link
Member

@mewalig While case 2 is appealing due to the removal of the need for the extra discriminator keyword, it does make codegen a bit more challenging. The appearance of a discriminator keyword is a signal that there is a derived type scenario. It is possible that a oneOf keyword might signal the same thing, but it is a bit more opaque to recognize that the "constant" enum is the discriminator.
It is interesting to consider that if validation can short-circuit, then it might be possible to efficiently use schema validation to identity the appropriate schema rather than depending on a discriminator mapping.

@handrews
Copy link
Member

Although, I just noticed that discriminator also works with anyOf:

When used in conjunction with the anyOf construct, the use of the discriminator can avoid ambiguity where multiple schemas may satisfy a single payload.

I think this still works as long as it's understood that this wouldn't short-circuit validation (if the validator is also collecting annotations, you need to check all anyOf branches because there may be an important annotation produced by any of them, for example the ones that make unevaluatedProperties work).

And discriminator also works with allOf although the use case isn't clear to me there. Is that for an inheritance chain and the discriminator indicates the most derived class? Regardless, since you need to evaluate all allOf branches no matter what during validation, I think selectBy works there as well.

@mkistler
Copy link

I think the common pattern for using a discriminator with allOf is different than the ones we have been discussing for oneOf and anyOf. With allOf, there is a "base schema" that defines the discriminator and possibly some other common properties, and then there are separate schemas for each of the "variants" that "allOf" the base schema with the particular properties for that variant. There is a good example in the OAS 3.03 spec here -- the third or fourth example in that section.

I think the discriminator with allOf is a holdover from Swagger, where oneOf and anyOf were not supported, so allOf was the only way to express this kind of structure. Now that we have oneOf and anyOf, I don't think the allOf expression has any advantages over the more straightforward expressions using these applicators, so this is another aspect of the discriminator I think we could consider dropping if it made for easier translation to/from JSON schema.

@ChALkeR
Copy link

ChALkeR commented Nov 17, 2020

@mkistler, re: #2143 (comment)
{ "petType": "whatever" } will pass with the second example, but won't pass with discriminator, to my understanding (not because of discriminator, but because of oneOf). Can be fixed by adding petType to required and specifying it as a enum. Also, "type": "object".

It's easy to make a mistake while reimplementing discriminator to a basic JSON Schema.

So, while that is possible, that's another argument why discriminator might add value.

@handrews
Copy link
Member

@ChALkeR no, since "whatever" will pass multiple branches of the oneOf (because the validation result of if is discarded, so if you fail it and there's no else then the schema object passes) it causes oneOf to fail.

But you can make that more explicit by just need to tacking on an else: false on each branch of the oneOf, so it's not complicated and easily documented (or generated by tools or caught by a linter).

@ChALkeR
Copy link

ChALkeR commented Nov 17, 2020

@handrews Ah.

no, since "whatever" will pass multiple branches of the oneOf

It will fail on anything then, because anything will pass multiple branches of the oneOf, unless else is added.
So there is still a mistake there.

@handrews
Copy link
Member

@ChALkeR ah, yes, that is correct- that's what I get for looking at schemas 10 minutes after I wake up 😴

Either way, it's a trivial fix. Alternatively, you could replace the if/then with a two-element allOf, although unlike with if/then there's no guarantee of order of evaluation with allOf. Most implementations will do them in order though because that's easier than anything else.

if/then/else shows the intent better, though.

selectBy: '0/petType'
oneOf:
- allOf:
   - properties:
       petType:
         const: 'cat'
   - $ref: '#/components/schemas/Cat'
- allOf:
   - properties:
       petType:
         const: 'dog'
   - $ref: '#/components/schemas/Dog'
- allOf:
   - properties:
       petType:
         const: 'lizard'
   - $ref: '#/components/schemas/Lizard'
- allOf:
   - properties:
       petType:
         const: 'monster'
   - $ref: 'https://gigantic-server.com/schemas/Monster/schema.json'

@ChALkeR
Copy link

ChALkeR commented Nov 17, 2020

@handrews Why is there a 0/ in selectBy? Where can I find a spec for selectBy?
Upd: ah, that's a relative json pointer, missed that.

Why is that a relative json pointer, though? Can it ascend upwards? What would be an example for that?

@handrews
Copy link
Member

@ChALkeR it can ascend upwards. And in the next draft, it can do some limited sideways motion so you can express things like "the next array element". With selectBy I would not use it in those ways, but there are use cases in JSON Hyper-Schema which, in its recent form, is not widely implemented and more or less on hold for now, but the use cases are still valid. There are examples in that spec- it has to do with link templates which might need to be attached somewhere such as to each element in an array, but the template might resolve partially from the value in the array (different for each link) and partially from data outside of the array (the same for each link).

@ChALkeR
Copy link

ChALkeR commented Nov 21, 2020

@handrews I still don't understand the usecase of relative json pointers with selectBy (more specifically, of those that don't start with 0/) in this scenario. Could you provide some concrete example, if possible, please?

Or, if they are not usable with selectBywhy would that be a relative json pointer or even to just a top-level property name? That looks like an overcomplication and I don't understand the usecase yet, but I might be missing something.
Also — how does it work in that case?

As for the example above — it looks good in principle except for the scenario where there is only one branch.
I.e., after refactoring, this schema can (and is relatively likely) to catastrophically fail when/if all branches but one are removed / commented out. A carefully designed specialiazed construction wouldn't have such a problem.

Also, It's not immediately clear to me what should the annotation look like if more than one branch pass.

@handrews
Copy link
Member

@ChALkeR I'd rather not dive into an involved explanation of all possible use cases here without some indication that there's broad interest, so let's see if anyone else would like to see more.

@liquidaty
Copy link

Agree w @ChALkeR that without further reasons/examples to the contrary, which aren't readily apparent (though, would be interested to see some if anyone can provide), it would seem more intuitive, less prone to error, more concise and easier overall to omit the 0/ prefix

@handrews
Copy link
Member

handrews commented Nov 22, 2020

@ChALkeR @liquidaty so I'm trying to remember exactly how this (several years old) proposal is supposed to work. It pre-dates some work we did on the processing model, and might need tweaking. This is why I don't want to spend the effort to sort all of that out unless there's a clear interest in putting something like this into OAS 3.1. If the TSC expresses interest in that, I'd be happy to go work up a proposal that I'm sure works in all cases and provides efficiency, etc.

On the more narrow topic: selectBy is an instance pointer and the same schema can be applied to multiple locations in the instance (e.g. if you stuff the given example under an items: keyword to make it apply to every element in the array). It is not possible to write an absolute pointer for such a scenario.

As for why you might use a negative number, it's possible that the schema to apply to elements in an array might switch based on a property outside of the array (because it applies to the whole array). Depending on how complex the structure is, referencing the outer field from an inner subschema might be substantially less complicated than needing to replicate a bunch of structure to ensure that the field is at the "top level" of the switch.

But really, it's the first case (same schema, many instance locations) that makes it flatly impossible to implement this with an absolute pointer.

@darrelmiller
Copy link
Member

It will be easier to deprecate this once we have a codegen vocabulary that replaces these semantics.

@Saud96525
Copy link

#3216 (comment)

@handrews
Copy link
Member

I'm going to mark this as "Moved to Moonwalk" and close it as there is no way we can meaningfully deprecate discriminator in 3.x and retain compatibility. We don't have a deprecation policy, and we don't have a replacement option (nor is any clearly on the horizon in a reasonable time frame). We also don't need to deprecate things that will be changed in the next major release as it's understood that many things will change dramatically.

@handrews
Copy link
Member

Forgot to link to where this is tracked for Moonwalk:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discriminator Moved to Moonwalk Issues that can be closed or migrated as being addressed in Moonwalk
Projects
Development

No branches or pull requests