Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Service renames #734

Merged
merged 7 commits into from
Mar 12, 2021
Merged

Service renames #734

merged 7 commits into from
Mar 12, 2021

Conversation

mtdowling
Copy link
Member

@mtdowling mtdowling commented Mar 7, 2021

This pull request adds the ability for service shapes to use shapes that have conflicting shape names in the service closure.

Motivation

All the shapes referenced as part of a service are called the service closure. The service closure is computed by recursively following each reference to operation and resource shapes defined on the service, then to their operations and resources, and then to the shapes they reference. With some exceptions defined in the specification, Smithy requires every shape in the closure to have a unique shape name regardless of their namespaces. Given that Smithy supports defining shapes in namespaces, this limitation that shapes names must be unique was counter-intuitive, and there was previously no recourse when the situation occurs. Smithy models are often used by teams to combine the models of several independent microservices into a unified service, so it's a common occurrence that teams end up with name conflicts when building their "frontend" services.

Background

The intentional limitation that shape names must be case-insensitively unique in a service closure is intended to help with code generation to generate unambiguous type names. Borrowing from DDD terminology, the goal is to give the service a ubiquitous language by treating a service as a bounded context.

  • Not every programming language supports namespaces
  • Generating names based on the namespace and shape name leads to obviously generated type names
  • Actually using namespaces could lead to situations where end-users need may need to choose between two identical auto-completed type names when working with a client
  • Namespaces would lead to very complex packaging requirements if disparate namespaces are used (for example, needing to publish individual packages for each namespace used as part of a service to prevent things like split packages in Java, coordinating and publishing shared models with dependencies, etc.)

Solution

The solution proposed in this pull request is to introduce a rename property to the service shape to disambiguate shape name conflicts in the service closure. This property is a map of shape IDs contained in the service to the name to give the shape when used in the context of the service.

  • Each shape ID must be part of the service closure.
  • Each map value MUST match the identifier production used for shape IDs.
  • No renamed shape name can case-insensitively match the effective name of any other shape (where effective name is defined as the renamed name of the shape if the shape is renamed, or the name of the shape).
  • Renaming a shape does not give the shape a new shape ID. Renaming a shape simply gives the shape an unambiguous name in the context of the service.
  • Member shapes MAY NOT be renamed. Renaming members would imply creating synthetic copies of aggregate shapes with new shape IDs.
  • Resource, operation, and error shapes MAY NOT be renamed.
    • Renaming shapes is intended for incidental naming conflicts, not for renaming the fundamental concepts of a service.
    • Most conflicts that occur in naming when combining models occurs in the shapes that make up resource and operation shapes, not in the resource and operation shapes themselves. Independent teams hoping to build a cohesive frontend service will need to coordinate at least on the concepts of a service, and that includes deciding which team owns which resources and operations.
    • Forbidding the renaming of resources and operations can often make other transformations of the model simpler; for example, when determining the type name of a resource in another format the service context is not needed (e.g., AWS CloudFormation resources).
    • Renaming errors brings its own set of problems since errors are typically represented polymorphically in protocols using the shape ID's shape name. Given this current limitation, and the rationale that a service should coordinate on how they present errors to clients, a good starting position is to forbid renaming shapes marked with the error trait. This restriction could be lifted in the future if required.
  • Shapes from other namespaces marked as private MAY be renamed. Incidental conflicts can easily occur in private or public shapes. The rename property does not add a new selector relationship from the service to the renamed shapes, and because of this, the rules around referencing shapes marked as private from other namespaces do not apply to a service.
  • Renames are not limited to shapes in conflict. When integrating a model into another model, shape names could be overly generic and should be disambiguated early to ensure the service is unambiguous as it grows and adds new concepts.

The following example defines a service that contains two shapes named "Widget" in its closure. This model is invalid because on the conflicting names:

namespace smithy.example

service MyService {
    version: "2017-02-11",
    operations: [GetSomething]
}

operation GetSomething {
    output: GetSomethingOutput,
}

structure GetSomethingOutput {
    widget1: Widget,
    fooWidget: foo.example#Widget,
}

structure Widget {}

The conflicting Widget shapes can be disambiguated by renaming one of the "Widget" shapes used in the service:

service MyService {
    version: "2017-02-11",
    operations: [GetSomething],
    rename: {
        "foo.example#Widget": "FooWidget"
    }
}

Impact on code generation

Code generators should now take into account the rename property of a service to determine the name of a shape for the target language. Code generators previously based the name of a generated type solely on the name of a shape's shape ID, but the now must now have the context of the service being generated to choose an appropriate type name for shapes, particularly conflicting shapes.

Impact on OpenAPI conversion

Like code generators, when generating OpenAPI models from Smithy, the converter must now use the rename property of the service.

Impact on non-Smithy AWS SDKs

Several AWS SDKs have not yet migrated to Smithy models and code generation. These SDKs will not need to do any work in order for this proposal to work. The conversions from Smithy to the format used by these SDKs will automatically rename shapes as part of the generated model.

Alternatives considered

Do nothing, copy and paste

If a conflict occurs when integrating models, the shapes that are in conflict can be copied and renamed to names that no longer conflict with the rest of the model. This solution is simple in implementation and the specification, but it's an awful experience for the team that has this problem. Sometimes renaming a shape requires renaming and editing an entire tree of interconnected shapes; any member that targeted the previous shape name needs to be updated, traits need to be updated, etc. This solution is also dangerous too; the copied shapes are now disjoint from their source. As the source model evolves, the copied shapes only evolve if another explicit copy is performed, or if some kind of automated copy is performed. However, there is still an opportunity for the copied shapes to become out of sync because traits can be applied to the copied shapes that have no impact on the source shapes.

Relax the unique name requirement and use build tooling

Another option is to relax the requirement that shape names are unique across namespaces and rely on codegen configuration in smithy-build.json files to disambiguate shape names. For example:

{
    "smithy": "1.0",
    "plugins": {
        "java-codegen": {
            "shapes": ["a#Facet", "a#Doohickey", "com.foo#Facet"],
             "rename": {
                "com.foo#Facet": "FooFacet"
             }
        },
        "php-codegen": {
            "shapes": ["a#Facet", "a#Doohickey", "com.foo#Facet"],
             "rename": {
                "com.foo#Facet": "FooFacet"
             }
        }
    }
}

The biggest issue with this approach is that it doesn’t scale well if a service needs to be generated in multiple programming languages and potentially by multiple teams (for example, that is the situation with AWS SDKs). By putting rename metadata in the model, the team that owns the service has full control over how the shapes in their service are referenced in things like code and documentation. Placing this configuration in build configuration files pushes the burden onto consumers and can easily lead to inconsistencies.

Use a rename trait

This option is similar to the option proposed, but instead of making rename a property of the service, it becomes a trait on the service:

@rename(foo.example#Widget": "FooWidget")
service MyService {
    version: "2017-02-11",
    operations: [GetSomething],
}

This has the same effect, and really comes down to a matter of taste. Because the concept of a "service closure" is specific to a service and defined by properties within the service like operations and resources, it seems more natural to define renames inside the service as well.

Add a new context shape

A new shape could be added called context that is referenced by a service to handle renames.

namespace smithy.example

service MyService {
    version: "2012-06-1",
    operations: [SomeOperationA, SomeOperationB],
    context: MyServiceContext, 
}

context MyServiceContext {
    rename: {
        com.foo#Facet: "FooFacet"
    }
}

This has the benefit that contexts could be reused by multiple services, and could even be used to generate just types for shapes rather than clients or servers. smithy-build.json files could reference the service to generate a client or server, or the context to generate shapes in a context:

{
     "smithy": "1.0",
    "plugins": {
        "java-codegen": {
            "context": "smithy.example#MyServiceContext"
        }
    }
}

This option has several issues:

  1. If a conflict is detected in a service closure, the context must be updated. This means that the context and the service aren’t actually independent of each other.
  2. A service provides other meaningful information that code generators would need to generate just types without a client (for example, protocol configurations).
  3. Introducing a new shape would likely need a major version bump of the specification and would be a breaking change to the implementation of Smithy (mainly due to how visitors are implemented in Java).

This commit adds the ability to rename shapes within the context of the
closure of a service. This helps to allow models that aggregate other
models of disparate teams to have a recourse when shapes across
namespaces conflict when they are used in the same service.
@rcoh
Copy link
Contributor

rcoh commented Mar 8, 2021

Seems reasonable. Would appreciate some sort of ShapeNameIndex so that in our SymbolVisitor implementations we could easily capture the derived name for a shape.

My one main worry is that now shape.id.getName() becomes essentially unsafe to reference during code generation which seems like a bit of a footgun.

I wonder if that should be deprecated and a service-aware name accessor should be added?

Another wrinkle: What happens if two errors with the same name (but renamed within the closure) are added to the errors for an operation? Is that verboten?

@mtdowling
Copy link
Member Author

Since there are valid use cases where a service context isn't needed, shape ID names shouldn't be deprecated (for example, Smithy to Smithy transformations and the implementation of the Smithy model itself).

For making this easier, rather than adding a new knowledge index, I added a getContextName method to the ServiceShape class. So if you have a reference to a service, you can pass a shape ID or shape to getContextName to get the possibly renamed shape name:

myService.getContextName(someShape);

But maybe it would be better to put that method on a ShapeId and pass in the service instead?

someShapeId.getServiceName(service);

Another wrinkle: What happens if two errors with the same name (but renamed within the closure) are added to the errors for an operation? Is that verboten?

Great question. For context: errors are one of the few places where a polymorphic type needs to be communicated. For example, clients need to look at the response x-amzn-errorType or __type and figure out what to deserialize it into. Since the type names used in protocols like AWS rest-json are all based on shape ID shape names, they don't include the namespace as part of the name, and the name is not based on the service context name, the conflict would cause ambiguity (i.e., which FooException is this?).

We could address this in a few ways:

  1. Just like how we forbid resource and operation shapes from being renamed, forbid error shape names from being renamed. This is probably the simplest option, but it doesn't provide a recourse for naming conflicts.
  2. Allow conflicting shape ID names for errors to be renamed, but forbid conflicts in an operation. So OperationA can refer to a#FooException, and OperationB can refer to b#FooException, and as long as one is renamed, there's no issue. I doubt this would have an impact on most use cases, and the biggest cost here is just the added lawyering to the spec and validation. With this constraint, clients know the operation context in which an operation was called and can then properly disambiguate between two conflicting error shape names.
  3. Implement option 1 or 2, but instead enforce this constraint only in protocol-specific validation. Since this is an issue specific to currently implemented protocols like AWS rest-json, maybe it should be protocol specific validation? The major drawback here is that a team might not know about this requirement until they add protocol traits to their service.

@rcoh
Copy link
Contributor

rcoh commented Mar 8, 2021

[Discussed offline]

  • On accessing the name of a ShapeId, we decided it makes sense to add an additional accessor getName(ServiceShape) to return the name of a shape in the context of a service.
  • On error disambiguation, we decided to prevent errors from being renamed. If this poses a problem for service teams, we can revisit this decision at a later date.

In terms of codegen, a couple of protocol tests should be sufficient to make sure that teams are properly respecting the rename flag for structures.

* Add more discoverable methods to ShapeId to get a contextual name.
* Add a protocol test to ensure generators can handle renames.
* Update spec to forbid renaming errors.
* Update validator to forbid error renames, and give better error
  messages.
* Update ServiceShape#getContextName to ServiceShape#getContextualName
@mtdowling
Copy link
Member Author

I've updated the proposal and the pull request with the discussed changes. Thanks for the suggestions!

@mtdowling mtdowling requested a review from JordonPhillips March 9, 2021 18:09
@mtdowling mtdowling requested a review from kstich March 10, 2021 04:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants