all: support gradual code repair while moving a type between packages #18130

Closed
rsc opened this issue Dec 1, 2016 · 225 comments

@rsc
Contributor

rsc commented Dec 1, 2016

Original Title: proposal: support gradual code repair while moving a type between packages

Go should add the ability to create alternate equivalent names for types, in order to enable gradual code repair during codebase refactoring. This was the target of the Go 1.8 alias feature, proposed in #16339 but held back from Go 1.8. Because we did not solve the problem for Go 1.8, it remains a problem, and I hope we can solve it for Go 1.9.

In the discussion of the alias proposal, there were many questions about why this ability to create alternate names for types in particular is important. As a fresh attempt to answer those questions, I wrote and posted an article, “Codebase Refactoring (with help from Go).” Please read that article if you have questions about the motivation. (For an alternate, shorter presentation, see Robert's Gophercon lightning talk. Unfortunately, that video wasn't available online until October 9. Update, Dec 16: here's my GothamGo talk, which was essentially the first draft of the article.)

This issue is not proposing a specific solution. Instead, I want to gather feedback from the Go community about the space of possible solutions. One possible avenue is to limit aliases to types, as mentioned at the end of the article. There may be others we should consider as well.

Please post thoughts about type aliases or other solutions as comments here.

Thank you.

Update, Dec 16: Design doc for type aliases posted.
Update, Jan 9: Proposal accepted, dev.typealias repository created, implementation due at the start of the Go 1.9 cycle for experimentation.


Discussion summary (last updated 2017-02-02)

Do we expect to need a general solution that works for all declarations?

If type aliases are 100% necessary, then var aliases are maybe 10% necessary, func aliases are 1% necessary, and const aliases are 0% necessary. Because const already has = and func could plausibly use = too, the key question is whether var aliases are important enough to plan for or implement.

As argued by @rogpeppe (#16339 (comment)) and @ianlancetaylor (#16339 (comment)) in the original alias proposal and as mentioned in the article, a mutating global var is usually a mistake. It probably doesn't make sense to complicate the solution to accommodate what is usually a bug. (In fact, if we can figure out how, it would not surprise me if in the long term Go moves toward requiring global vars to be immutable.)

Because richer var aliases are likely not important enough to plan for, it seems like the right choice here is to focus only on type aliases. Most of the comments here seem to agree. I won't list everyone.

Do we need a new syntax (= vs => vs export)?

The strongest argument for new syntax is the need to support var aliases, either now or in the future (#18130 (comment) by @Merovius). It seems okay to plan not to have var aliases (see previous section).

Without var aliases, reusing = is simpler than introducing new syntax, whether => like in the alias proposal, ~ (#18130 (comment) by @joegrasse), or export (#18130 (comment) by @cznic).

Using = would also exactly match the syntax of type aliases in Pascal and Rust. To the extent that other languages have the same concepts, it's nice to use the same syntax.

Looking ahead, there could be a future Go in which func aliases exist too (see #18130 (comment) by @nigeltao), and then all declarations would permit the same form:

const C2 = C1
func F2 = F1
type T2 = T1
var V2 = V1

The only one of these that wouldn't establish a true alias would be the var declaration, because V2 and V1 can be redefined independently as the program executes (unlike the const, func, and type declarations which are immutable). Since one main reason for variables is to allow them to vary, that exception would at least be easy to explain. If Go moves toward immutable global vars, then even that exception would disappear.

To be clear, I am not suggesting func aliases or immutable global vars here, just working through the implications of such future additions.
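To make the type/var distinction concrete, here is a minimal sketch using the `type T2 = T1` form as it would work under the type-alias design. The names T1, T2, V1, V2 follow the listing above; the helper functions are hypothetical and exist only for illustration.

```go
package main

import "fmt"

// T1 and V1 stand in for declarations being moved (names from the listing above).
type T1 struct{ N int }

var V1 = 1

// T2 is a true alias: T2 and T1 denote the same type.
type T2 = T1

// V2 is a separate variable initialized from V1; it is not an alias.
var V2 = V1

// sameType reports whether a T2 value can be used as a T1 with no conversion.
func sameType() bool {
	var a T1 = T2{N: 7} // compiles only because T1 and T2 are identical
	return a.N == 7
}

// diverge assigns to V1 and returns both variables to show V2 is unaffected.
func diverge() (int, int) {
	V1 = 42
	return V1, V2
}

func main() {
	fmt.Println(sameType())
	fmt.Println(diverge())
}
```

The type declaration gives one type with two names, while the var declaration gives two variables that merely start out equal.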

@jimmyfrasche suggested (#18130 (comment)) aliases for everything except consts, so that const would be the exception instead of var:

const C2 = C1 // no => form
func F2 => F1
type T2 => T1
var V2 => V1
var V2 = V1 // different from => form

Having inconsistencies with both const and var seems more difficult to explain than having just an inconsistency for var.

Can this be a tooling- or compiler-only change instead of a language change?

It's certainly worth asking whether gradual code repair can be enabled purely by side information supplied to the compiler (for example, #18130 (comment) by @btracey).

Or maybe if the compiler can apply some kind of rule-based preprocessing to transform input files before compilation (for example, #18130 (comment) by @tux21b).

Unfortunately, no, the change really can't be confined that way. There are at least two compilers (gc and gccgo) that would need to coordinate, but so would any other tools that analyze programs, like go vet, guru, goimports, gocode (code completion), and others.

As @bcmills said (#18130 (comment)), “a ‘non-language-change’ mechanism which must be supported by all implementations is a de facto language change — it’s just one with poorer documentation.”

What other uses might aliases have?

We know of the following. Given that type aliases in particular were deemed important enough for inclusion in Pascal and Rust, there are likely others.

  1. Aliases (or just type aliases) would enable creating drop-in replacements that expand other packages. For example see https://go-review.googlesource.com/#/c/32145/, especially the explanation in the commit message.

  2. Aliases (or just type aliases) would enable structuring a package with a small API surface but a large implementation as a collection of packages for better internal structure but still present just one package to be imported and used by clients. There's a somewhat abstract example described at #16339 (comment).

  3. Protocol buffers have an "import public" feature whose semantics is trivial to implement in generated C++ code but impossible to implement in generated Go code. This causes frustration for authors of protocol buffer definitions shared between C++ and Go clients. Type aliases would provide a way for Go to implement this feature. In fact, the original use case for import public was gradual code repair. Similar issues may arise in other kinds of code generators.

  4. Abbreviating long names. Local (unexported or not-package-scoped) aliases might be handy to abbreviate a long type name without introducing the overhead of a whole new type. As with all these uses, the clarity of the final code would strongly influence whether this is a suggested use.
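As a rough illustration of use 4, the sketch below abbreviates a long type literal with an unexported alias. The name `index` and both helper functions are made up for this example; the point is only that the alias introduces no new type.

```go
package main

import (
	"fmt"
	"reflect"
)

// index abbreviates a long type literal without creating a new type.
type index = map[string]map[int]interface{}

// lookup is written against the long form of the type.
func lookup(m map[string]map[int]interface{}, k string, i int) interface{} {
	return m[k][i]
}

// demo builds a value through the alias and passes it to lookup unchanged.
func demo() (interface{}, bool) {
	idx := index{"a": {1: "x"}}
	// The alias and the literal describe the same reflect.Type.
	same := reflect.TypeOf(idx) == reflect.TypeOf(map[string]map[int]interface{}{})
	return lookup(idx, "a", 1), same
}

func main() {
	fmt.Println(demo())
}
```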

What other issues does a proposal for type aliases need to address?

Listing these for reference. Not attempting to solve or discuss them in this section, although a few were discussed later and are summarized in separate sections below.

  1. Handling in godoc. (#18130 (comment) by @nigeltao and #18130 (comment) by @jimmyfrasche)

  2. Can methods be defined on types named by alias? (#18130 (comment) by @ulikunitz)

  3. If aliases to aliases are allowed, how do we handle alias cycles? (#18130 (comment) by @thwd)

  4. Should aliases be able to export unexported identifiers? (#18130 (comment) by @thwd)

  5. What happens when you embed an alias (how do you access the embedded field)? (#18130 (comment) by @thwd, also #17746)

  6. Are aliases available as symbols in the built program? (#18130 (comment) by @thwd)

  7. Ldflags string injection: what if we refer to an alias? (#18130 (comment) by @thwd; this only arises if there are var aliases.)

Is versioning a solution by itself?

"In that case maybe versioning is the whole answer, not type aliases."
(#18130 (comment) by @iainmerrick)

As noted in the article, I think versioning is a complementary concern. Support for gradual code repair, such as with type aliases, gives a versioning system more flexibility in how it builds a large program, which can be the difference between being able to build the program and not.

Can the larger refactoring problem be solved instead?

In #18130 (comment), @niemeyer points out that there were actually two changes for moving os.Error to error: the name changed but so did the definition (the current Error method used to be a String method).

@niemeyer suggests that perhaps we can find a solution to the broader refactoring problem that fixes types moving between packages as a special case but also handles things like method names changing, and he proposes a solution built around "adapters".

There is a fair amount of discussion in the comments that I can't easily summarize here. The discussion isn't over, but so far it is unclear whether "adapters" can fit into the language or be implemented in practice. It does seem clear that adapters are at least one order of magnitude more complex than type aliases.

Adapters need a coherent solution to the subtyping problems noted below as well.

Can methods be declared on alias types?

Certainly aliases do not allow bypassing the usual method definition restrictions: if a package defines type T1 = otherpkg.T2, it cannot define methods on T1, just as it cannot define methods directly on otherpkg.T2. That is, if type T1 = otherpkg.T2, then func (T1) M() is equivalent to func (otherpkg.T2) M(), which is invalid today and remains invalid. However, if a package defines type T1 = T2 (both in the same package), then the answer is less clear. In this case, func (T1) M() would be equivalent to func (T2) M(); since the latter is allowed, there is an argument to allow the former. The current design doc does not impose a restriction here (in keeping with the general avoidance of restrictions), so that func (T1) M() is valid in this situation.

In #18130 (comment), @jimmyfrasche suggests that instead defining "no use of aliases in method definitions" would be a clear rule and avoid needing to know what T is defined as to know if func (T) M() is valid. In #18130 (comment), @rsc points out that even today there are certain T for which func (T) M() is not valid: https://play.golang.org/p/bci2qnldej. In practice this doesn't come up because people write reasonable code.

We will keep this possible restriction in mind but wait until there is strong evidence that it is needed before introducing it.
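For reference, the same-package case looks like this. It is a small sketch with hypothetical types, assuming the unrestricted design described above in which func (T1) M() is accepted when T1 = T2 and T2 is defined in the same package.

```go
package main

import "fmt"

// T2 is an ordinary defined type in this package (hypothetical name).
type T2 struct{ Name string }

// T1 is an alias for T2 declared in the same package.
type T1 = T2

// M is declared with receiver T1, which is the same as declaring it on T2.
func (t T1) M() string { return "hello, " + t.Name }

// callOnT2 calls M through a value written as T2.
func callOnT2() string {
	v := T2{Name: "gopher"}
	return v.M()
}

func main() {
	fmt.Println(callOnT2())
}
```

If T1 instead named otherpkg.T2, the method declaration would be rejected, exactly as a direct func (otherpkg.T2) M() is.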

Is there a cleaner way to handle embedding and, more generally, field renames?

In #18130 (comment), @Merovius points out that an embedded type that changes its name during a package move will cause problems when that new name must eventually be adopted at the use sites. For example if user type U has an embedded io.ByteBuffer that moves to bytes.Buffer, then while U embeds io.ByteBuffer the field name is U.ByteBuffer, but when U is updated to refer to bytes.Buffer, the field name necessarily changes to U.Buffer.

In #18130 (comment), @neild points out that there is at least a workaround if references to io.ByteBuffer must be excised: the package P that defines U can also define 'type ByteBuffer = bytes.Buffer' and embed that type into U. Then U still has a U.ByteBuffer, even after io.ByteBuffer is gone entirely.
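A sketch of that workaround, assuming type aliases as designed. U and ByteBuffer are the names from the example above; the `fill` helper is hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
)

// ByteBuffer keeps the old field name alive after the type has moved to bytes.Buffer.
type ByteBuffer = bytes.Buffer

// U embeds the alias, so the implicit field name is ByteBuffer, not Buffer.
type U struct {
	ByteBuffer
}

// fill writes through the old field name and through a promoted method.
func fill() string {
	var u U
	u.ByteBuffer.WriteString("old field name") // existing field accesses still compile
	u.WriteString(" and promoted method")      // methods of bytes.Buffer are promoted as usual

	// The field really is a bytes.Buffer; no conversion is needed.
	var b *bytes.Buffer = &u.ByteBuffer
	return b.String()
}

func main() {
	fmt.Println(fill())
}
```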

In #18130 (comment), @bcmills suggests the idea of field aliases, to allow a field to have multiple names during a gradual repair. Field aliases would allow defining something like type U struct { bytes.Buffer; ByteBuffer = Buffer } instead of having to create the top-level type alias.

In #18130 (comment), @rsc raises yet another possibility: some syntax for 'embed this type with this name', so that it is possible to embed a bytes.Buffer as the field name ByteBuffer, without needing a top-level type or an alternate name. If that existed, then the type name could be updated from io.ByteBuffer to bytes.Buffer while preserving the original name (and not introducing a second, nor a clumsy exported type).

These all seem worth exploring once we have more evidence of large-scale refactorings blocked by problems with fields changing names. As @rsc wrote, "If type aliases help us get to the point where lack of field aliases is the next big roadblock for large-scale refactorings, that will be progress!"

There was a suggestion of restricting the use of aliases in embedded fields or changing the embedded name to use the target type's name, but those make the alias introduction break existing definitions that must then be fixed atomically, essentially preventing any gradual repair. @rsc: "We discussed this at some length in #17746. I was originally on the side of the name of an embedded io.ByteBuffer alias being Buffer, but the above argument convinced me I was wrong. @jimmyfrasche in particular made some good arguments about the code not changing depending on the definition of the embedded thing. I don't think it's tenable to disallow embedded aliases completely."

What is the effect on programs using reflection?

Programs using reflection see through aliases. In #18130 (comment), @atdiar points out that if a program is using reflection to, for example, find the package in which a type is defined or even the name of a type, it will observe the change when the type is moved, even if a forwarding alias is left behind. In #18130 (comment), @rsc confirmed this and wrote "Like the situation with embedding, it's not perfect. Unlike the situation with embedding, I don't have any answers except maybe code shouldn't be written using reflect to be quite that sensitive to those details."

The use of vendored packages today also changes package import paths seen by reflect, and we have not been made aware of significant problems caused by that ambiguity. This suggests that programs are not commonly inspecting reflect.Type.PkgPath in ways that would be broken by use of aliases. Even so, it's a potential gap, just like embedding.
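To show what "see through" means in practice, here is a small sketch. OldBuffer is a hypothetical forwarding alias; reflect reports only the name and package of the type it points at.

```go
package main

import (
	"bytes"
	"fmt"
	"reflect"
)

// OldBuffer is a forwarding alias left behind after a hypothetical move.
type OldBuffer = bytes.Buffer

// describe returns the type name and package path that reflect reports
// for a value declared through the alias.
func describe() (name, pkgPath string) {
	t := reflect.TypeOf(OldBuffer{})
	return t.Name(), t.PkgPath()
}

func main() {
	name, pkg := describe()
	fmt.Println(name, pkg)
}
```

Nothing in the reflect.Type mentions OldBuffer or the package declaring the alias, so code keyed on these strings would observe the move.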

What is the effect on separate compilation of programs and plugins?

In #18130 (comment), @atdiar raises the question of the effect on object files and separate compilation. In #18130 (comment), @rsc replies that there should be no need to make changes here: if X imports Y and Y changes and is recompiled, then X needs to be recompiled too. That's true today without aliases, and it will remain true with aliases. Separate compilation means being able to compile X and Y in distinct steps (the compiler does not have to process them in the same invocation), not that it is possible to change Y without recompiling X.

Would sum types or some kind of subtyping be an alternative solution?

In #18130 (comment), @iand suggests "substitutable types", "a list of types that may be substituted for the named type in function arguments, return values etc.". In #18130 (comment), @j7b suggests using algebraic types "so we also get an empty interface equivalent with compile time type checking as a bonus". Other names for this concept are sum types and variant types.

In general this does not suffice to allow moving types with gradual code repair. There are two ways to think about this.

In #18130 (comment), @bcmills takes the concrete way, pointing out that algebraic types have a different representation than the original, which makes it not possible to treat the sum and the original as interchangeable: the latter has type tags.

In #18130 (comment), @rsc takes the theoretical way, expanding on #18130 (comment) by @gri pointing out that in a gradual code repair, sometimes you need T1 to be a subtype of T2 and sometimes vice versa. The only way for both to be subtypes of each other is for them to be the same type, which not coincidentally is what type aliases do.
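The "both directions" requirement can be seen in a short sketch. T1, T2, and T3 are hypothetical; T3 is a separate defined type included only for contrast with the alias.

```go
package main

import "fmt"

type T1 struct{ ID int }

// T2 is an alias: identical to T1 in every context.
type T2 = T1

// T3 is a distinct defined type with the same underlying type, for contrast.
type T3 T1

func takesT1s(xs []T1) int { return len(xs) }

func makesT2() T2 { return T2{ID: 1} }

// check uses T2 where T1 is expected, in both argument and result position.
func check() (int, int) {
	// A []T2 is a []T1, so it is accepted where []T1 is expected.
	n := takesT1s([]T2{{ID: 1}, {ID: 2}})

	// A func returning T2 can be stored as a func returning T1.
	var f func() T1 = makesT2

	// T3 values need an explicit conversion, and a []T3 could not be
	// passed to takesT1s at all.
	t3 := T3{ID: 3}
	_ = T1(t3)

	return n, f().ID
}

func main() {
	fmt.Println(check())
}
```

Composite types such as slices and function types are where a one-directional subtype relation would fall short, since they need the element and result types to be identical.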

As a side tangent, in addition to not solving the gradual code repair problem, algebraic types / sum types / union types / variant types are by themselves hard to add to Go. See the FAQ answer and the Go 1.6 AMA discussion for more.

In #18130 (comment), @thwd suggests that since Go has a subtyping relationship between concrete types and interfaces (bytes.Buffer can be seen as a subtype of io.Reader) and between interfaces (io.ReadWriter is a subtype of io.Reader in the same way), making interfaces "recursively covariant (according to the current variance rules) down to their method arguments" would solve the problem provided that all future packages only use interfaces, never concrete types like structs ("encourages good design, too").

There are three problems with that as a solution. First, it has the subtyping issues above, so it doesn't solve gradual code repair. Second, it doesn't apply to existing code, as @thwd noted in this suggestion. Third, forcing the use of interfaces everywhere may not actually be good design and introduces performance overheads (see for example #18130 (comment) by @Merovius and #18130 (comment) by @zombiezen).

Restrictions

This section collects proposed restrictions for reference, but keep in mind that restrictions add complexity. As I wrote in #18130 (comment), "we should probably only implement those restrictions after actual experience with the unrestricted, simpler design helps us understand whether the restriction would bring enough benefits to pay for its cost."

Put another way, any restriction would need to be justified by evidence that it would prevent some serious misuse or confusion. Since we haven't implemented a solution yet, there is no such evidence. If experience did provide that evidence, these will be worth returning to.

Restriction? Aliases of standard library types can only be declared in standard library.

(#18130 (comment) and #18130 (comment) by @iand)

The concern is "code that has renamed standard library concepts to fit a custom naming convention", or "long spaghetti chains of aliases across multiple packages that end up back at the standard library", or "aliasing things like interface{} and error".

As stated, the restriction would disallow the "extension package" case described above involving x/image/draw.

It's unclear why the standard library should be special: the problems would exist with any code. Also, neither interface{} nor error is a type from the standard library. Rephrasing the restriction as "aliasing predefined types" would disallow aliasing error, but the need to alias error was one of the motivating examples in the article.

Restriction? Alias target must be package-qualified identifier.

(#18130 (comment) by @jba)

This would make it impossible to make an alias when renaming a type within a package, which may be used widely enough to necessitate a gradual repair (#18130 (comment) by @bcmills).

It would also disallow aliasing error as in the article.

Restriction? Alias target must be package-qualified identifier with same name as alias.

(proposed during alias discussion in Go 1.8)

In addition to the problems of the previous section with limiting to package-qualified identifiers, forcing the name to stay the same would disallow the conversion from io.ByteBuffer to bytes.Buffer in the article.

Restriction? Aliases should be discouraged in some way.

"How about hiding aliases behind an import, just like for "C" and “unsafe”, to further discourage it's usage? In the same vein, I would like the aliases syntax to be verbose and stand out as a scaffold for on going refactoring." - #18130 (comment) by @xiegeo

"Should we also automatically infer that an aliased type is legacy and should be replaced by the new type? If we enforce golint, godoc and similar tools to visualize the old type as deprecated, it would limit the abuse of type aliasing very significantly. And the final concern of aliasing feature being abused would be resolved." - #18130 (comment) by @rakyll

Until we know that they will be used wrong, it seems premature to discourage usage. There may be good, non-temporary uses (see above).

Even in the case of code repair, either the old or new type may be the alias during the transition, depending on the constraints imposed by the import graph. Being an alias does not mean the name is deprecated.

There is already a mechanism for marking certain declarations as deprecated (see #18130 (comment) by @jimmyfrasche).

Restriction? Aliases must target named types.

"Aliases shouldn't not apply to unnamed type. Their is no "code repair" story in moving from one unnamed type to another. Allowing aliases on unnamed types means I can no longer teach Go as simply named and unnamed types." - #18130 (comment) by @davecheney

Until we know that they will be used wrong, it seems premature to discourage usage. There may be good uses with unnamed targets (see above).

As noted in the design doc, we do expect to change the terminology to make the situation clearer.

rsc added the Proposal label Dec 1, 2016
rsc added this to the Proposal milestone Dec 1, 2016
@variadico

I like how visually uniform this looks.

const OldAPI => NewPackage.API
func  OldAPI => NewPackage.API
var   OldAPI => NewPackage.API
type  OldAPI => NewPackage.API

But since we can almost gradually move most elements, maybe the simplest
solution is just to allow an = for types.

const OldAPI = NewPackage.API
func  OldAPI() { NewPackage.API() }
var   OldAPI = NewPackage.API
type  OldAPI = NewPackage.API

@zquestz

zquestz commented Dec 1, 2016

So first, I just wanted to thank you for that excellent write-up. I think the best solution is to introduce type aliases with an assignment operator. This requires no new keywords/operators, uses a familiar syntax, and should solve the refactoring problem for large code bases.

@iand
Contributor

iand commented Dec 1, 2016

As Russ's article points out, any alias-like solution needs to gracefully solve #17746 and #17784

@travisjeffery

travisjeffery commented Dec 1, 2016

Thank you for the write up of that article.

I find the type-only aliases using the assignment operator to be best:

type OldAPI = NewPackage.API

My reasons:

  • It's simpler.
    The alternative solution => having subtly different meaning based on its operand feels out of place for Go.
  • It's focused and conservative.
    The issue at hand with types is solved and you don't need to worry about imagining the complications of the generalized solution.
  • It's aesthetic.
    I think it looks more pleasing.

All of the above (the result being simple, focused, conservative, and aesthetic) make it easy for me to picture it being a part of Go.

@cznic
Contributor

cznic commented Dec 1, 2016

If the solution would be limited to types only then the syntax

type NewFoo = old.Foo

already considered before, as discussed in the @rsc's article, looks very good to me.

If we would like to be able to do the same for constants, variables and functions, my preferred syntax would be (as proposed before)

package newfmt

import (
	"fmt"
)

// No renaming.
export fmt.Printf // Note: Same as `export Printf fmt.Printf`.

export (
        fmt.Sprintf
        fmt.Formatter
)

// Renaming.
export Foo fmt.Errorf // Foo must be exported, ie. `export foo fmt.Errorf` would be invalid.

export (
	Bar fmt.Fprintf
	Qux fmt.State
)

As discussed before, the disadvantage is that a new, top-level only keyword is introduced, which is admittedly awkward, even though technically feasible and fully backwards compatible. I like this syntax because it reflects the pattern of imports. It would seem natural to me that exports would be permitted only in the same section where imports are allowed, ie. between the package clause and any var, type, constant or function TLD.

The renaming identifiers would be declared in the package scope. However, the new names are not visible in the package declaring them (newfmt in the example above), except with respect to redeclaration, which is disallowed as usual. Given the previous example, these TLDs would both be errors:

var v = Printf // undefined: Printf.
var Printf int // Printf redeclared, previous declaration at newfmt.go:8.

In the importing package the renaming identifiers are visible normally, as any other exported identifier of the (newfmt's) package block.

package foo

import "newfmt"

type bar interface {
	baz(qux newfmt.Qux) // qux type is identical to fmt.State.
}

In conclusion, this approach does not introduce any new local name binding in newfmt, which I believe avoids at least some of the problems discussed in #17746 and solves #17784 completely.

@4ad
Member

4ad commented Dec 1, 2016

My first preference is for a type-only type NewFoo = old.Foo.

If a more general solution is desired, I agree with @cznic that a dedicated keyword is better than a new operator (especially an asymmetric operator with confusing directionality[1]). That being said, I don't think the export keyword conveys the right meaning. Neither the syntax nor the semantics mirrors import. What about alias?

I understand why @cznic doesn't want the new names to be accessible in the package declaring them, but, to me at least, that restriction feels unexpected and artificial (although I perfectly well understand the reason behind it).

[1] I have been using Unix for almost 20 years, and I still can't create a symlink on the first try. And I usually fail even on the second try, after I have read the manual.

@iand
Contributor

iand commented Dec 1, 2016

I would like to propose an additional constraint: type aliases to standard library types may only be declared in the standard library.

My reasoning is that I don't want to work with code that has renamed standard library concepts to fit a custom naming convention. I also don't want to deal with long spaghetti chains of aliases across multiple packages that end up back at the standard library.

@quentinmit
Contributor

@iand: That constraint would block the use of this feature to migrate anything into the standard library. Case in point, the current migration of Context into the standard library. The old home of Context should become an alias for the Context in the standard library.

@iand
Contributor

iand commented Dec 1, 2016

@quentinmit that is unfortunately true. It also limits the use case for golang.org/x/image/draw in this CL https://go-review.googlesource.com/#/c/32145/

My real concern is with people aliasing things like interface{} and error

@joegrasse

If it is decided to introduce a new operator, I would like to propose ~. In the English language, it is generally understood to mean "similar to", "approximately", "about", or "around". As @4ad stated above, the => is an asymmetric operator with confusing directionality.

For example:

const OldAPI ~ NewPackage.API
func  OldAPI ~ NewPackage.API
var   OldAPI ~ NewPackage.API
type  OldAPI ~ NewPackage.API

@jba
Contributor

jba commented Dec 1, 2016

@iand if we limit the right-hand side to a package-qualified identifier, then that would eliminate your specific concern.

It would also mean you couldn't have aliases to any types in the current package, or to long type expressions like map[string]map[int]interface{}. But those uses have nothing to do with the main goal of gradual code repair, so maybe they are no great loss.

@rsc
Contributor Author

rsc commented Dec 1, 2016

@cznic, @iand, others: Please note that restrictions add complexity. They complicate the explanation of the feature, and they add cognitive load for any user of the feature: if you forget about a restriction, you have to puzzle through why something you thought should work doesn't.

It's often a mistake to implement restrictions on a trial of a design solely due to hypothetical misuse. That happened in the alias proposal discussions, and it made the aliases in the trial unable to handle the io.ByteBuffer => bytes.Buffer conversion from the article. Part of the goal of writing the article is to define some cases we know we want to be able to handle, so that we don't inadvertently restrict them away.

As another example, it would be easy to make a misuse argument to disallow non-pointer receivers, or to disallow methods on non-struct types. If we'd done either of those, you couldn't create enums with String() methods for printing themselves, and you couldn't have http.Header both be a plain map and provide helper methods. It's often easy to imagine misuses; compelling positive uses can take longer to appear, and it's important to create space for experimentation.

As yet another example, the original design and implementation for pointer vs value methods did not distinguish between the method sets on T and *T: if you had a *T, you could call the value methods (receiver T), and if you had a T, you could call the pointer methods (receiver *T). This was simple, with no restrictions to explain. But then actual experience showed us that allowing pointer method calls on values led to a specific class of confusing, surprising bugs. For example, you could write:

var buf bytes.Buffer
io.Copy(buf, reader)

and io.Copy would succeed, but buf would have nothing in it. We had to choose between explaining why that program ran incorrectly or explaining why that program didn't compile. Either way there were going to be questions, but we came down on the side of avoiding incorrect execution. Even so, we still had to write a FAQ entry about why the design has a hole cut out of it.
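The snippet above no longer compiles, so the following sketch reproduces the same symptom in today's Go by other means. It assumes a hypothetical valueBuf type whose Write method has a value receiver; it is not the historical bytes.Buffer behavior, only an illustration of a write landing on a copy.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// valueBuf is a hypothetical buffer whose Write has a value receiver,
// so every call operates on a copy of the struct.
type valueBuf struct{ data []byte }

func (b valueBuf) Write(p []byte) (int, error) {
	b.data = append(b.data, p...) // updates the copy only
	return len(p), nil
}

// copyInto copies a short string into a valueBuf and reports how many
// bytes io.Copy says it wrote and how many the buffer actually holds.
func copyInto() (written int64, kept int, err error) {
	var buf valueBuf
	written, err = io.Copy(buf, strings.NewReader("hello"))
	return written, len(buf.data), err
}

func main() {
	fmt.Println(copyInto())
}
```

io.Copy reports success for all five bytes while the caller's buffer stays empty, which is the kind of silent misbehavior the method-set restriction was introduced to turn into a compile error.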

Again, please remember that restrictions add complexity. Like all complexity, restrictions need significant justification. At this stage in the design process it is good to think about restrictions that might be appropriate for a particular design, but we should probably only implement those restrictions after actual experience with the unrestricted, simpler design helps us understand whether the restriction would bring enough benefits to pay for its cost.

@rsc
Contributor Author

rsc commented Dec 1, 2016

Also, my hope is that we can reach a tentative decision about what to try and then have something ready for experimentation at the beginning of the Go 1.9 cycle (ideally the day the cycle opens). Having more time to experiment will have many benefits, among them an opportunity to learn whether a particular restriction is compelling. One mistake with alias was not committing a complete implementation until near the end of the Go 1.8 cycle.

@btracey
Contributor

btracey commented Dec 1, 2016

One thing about the original alias proposal is that in the intended use case (enabling refactoring) the actual use of the alias type should only be temporary. In the protobuffer example, the io.ByteBuffer stub was deleted once the gradual repair had been concluded.

If the alias mechanism should only be seen temporarily, does it actually require a language change? Perhaps instead there could be a mechanism to supply gc with a list of "aliases". gc could temporarily make the substitutions, and the author of the downstream codebase could gradually remove items in this file as fixes are merged. I realize this suggestion also has tricky consequences, but it at least encourages a temporary mechanism.

@Merovius
Contributor

Merovius commented Dec 1, 2016

I will not participate in the bikeshedding about syntax (I basically don't care), with one exception: if aliases are adopted and restricted to types, please use a syntax that is consistently extensible to at least var, if not also func and const (all proposed syntactical constructs allow for all of these, except type Foo = pkg.Bar). The reason is that, while I agree that cases where aliases for var make the difference might be rare, I don't think they are non-existent, and so I believe we might well decide to add them at some point. At that point we will definitely want all alias declarations to be consistent; it would be bad to end up with type Foo = pkg.Bar and var Foo => pkg.Bar.

I'd also slightly argue for having all four. The reasons are

  1. there is a distinction for var and I do sometimes use it. For example I often expose a global var Debug *log.Logger, or reassign global singletons like http.DefaultServeMux to intercept/remove registrations of packages that add handlers to it.

  2. I also think that, while func Foo() { pkg.Bar() } does the same thing as func Foo => pkg.Bar, the intention of the latter is much clearer (especially if you already know about aliases). It clearly states "this isn't really meant to be here". So while technically identical, the alias syntax might serve as documentation.

It's not the hill I'd die on, though; type-aliases alone for now would be fine with me, as long as there is the option to extend them later.
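To make the var and func points concrete, here is a small sketch in today's Go, without any alias feature. The names pkgLimit and pkgBar are hypothetical stand-ins for declarations that would live in another package; a pointer is used only to emulate what a var alias would provide.

```go
package main

import "fmt"

// pkgLimit and pkgBar stand in for pkg.Limit and pkg.Bar (hypothetical names).
var pkgLimit = 10

func pkgBar() string { return "pkg.Bar" }

// Limit is initialized from pkgLimit once; afterwards it is an independent
// variable and does not observe later assignments to pkgLimit.
var Limit = pkgLimit

// LimitRef emulates a var alias: it refers to the same storage as pkgLimit.
var LimitRef = &pkgLimit

// Foo forwards to pkgBar. It behaves the same as the original, but nothing in
// the declaration says that it is only a forwarding stub.
func Foo() string { return pkgBar() }

func main() {
	pkgLimit = 20
	fmt.Println(Limit, *LimitRef) // the copy keeps 10, the reference sees 20
	fmt.Println(Foo())
}
```

The copy is why a plain var declaration cannot stand in for an alias when the original variable is reassigned, as with a global logger or http.DefaultServeMux.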

I'm also super glad that this was written up like it was. It summarizes a bunch of opinions I had about API design and stability for a while and will, in the future, serve as a simple reference to link people to :)

However, I also want to emphasize that there were additional use cases covered by aliases that are different from the ones in the doc (and, AIUI, from the more general intention of this issue, which is to find some solution for gradual repair). I am very glad if the community can agree on the concept of enabling gradual repair, but if a solution other than aliases is chosen, I think there should simultaneously be discussion about if and how to support things like the protobuf public imports or the x/image/draw use case of drop-in replacement packages (both somewhat near to my heart too) with a different solution. @btracey's proposal of a go-tool/gc flag for aliases is an example where I believe that, while it covers gradual repair relatively well, it is not really acceptable for those other use cases. You can't really expect everyone who wants to compile something that uses x/image/draw to pass those flags; they should just be able to go get.

@bcmills
Contributor

bcmills commented Dec 1, 2016

@jba

@iand if we limit the right-hand side to a package-qualified identifier, then that would eliminate your specific concern.

It would also mean you couldn't have aliases to any types in the current package, […]. But those uses have nothing to do with the main goal of gradual code repair, so maybe they are no great loss.

Renaming within a package (e.g. to a more idiomatic or consistent name) is certainly a type of refactoring one might reasonably want to do, and if the package is used widely then that necessitates gradual repair.

I think a restriction to only package-qualified names would be a mistake. (A restriction to only exported names might be more tolerable.)

@bcmills
Contributor

bcmills commented Dec 1, 2016

@btracey

Perhaps instead there could be a mechanism to supply gc with a list of "aliases". gc could temporarily make the substitutions, and the author of the downstream codebase could gradually remove items in this file as fixes are merged.

A mechanism for gc would either mean that the code is only buildable with gc during the repair process, or that the mechanism would have to be supported by the other compilers (e.g. gccgo and llgo) too. A "non-language-change" mechanism which must be supported by all implementations is a de facto language change — it's just one with poorer documentation.

@rsc
Contributor Author

rsc commented Dec 1, 2016

@btracey and @bcmills, and not just the compilers: any tool that analyzes source code, like guru or anything else people have built. It's certainly a language change no matter how you slice it.

@btracey
Contributor

btracey commented Dec 1, 2016

Okay, thanks.

@jimmyfrasche
Member

Another possibility is aliases for everything except consts (and @rsc please forgive me for proposing a restriction!)

For consts, => is really just a longer way to write =. There's no new semantics, as with types and vars. There's no saved keystrokes as with funcs.

That would resolve #17784 at least.

The counterargument would be that tooling could treat the cases differently and that it could be an indicator of intent. That's a good counterargument, but I don't think it outweighs the fact that it's basically two ways to do exactly the same thing.

That said, I'm fine with just type aliases for now; they are certainly the most important. I definitely agree with @Merovius that we should strongly consider retaining the option of adding var and func aliases in the future, even if those don't happen for some time.

@xiegeo

xiegeo commented Dec 1, 2016

How about hiding aliases behind an import, just like for "C" and "unsafe", to further discourage its usage? In the same vein, I would like the alias syntax to be verbose and stand out as a scaffold for ongoing refactoring.

@josharian
Contributor

As an attempt to open up the design space a little, here are some ideas. They're not fleshed out. They're probably bad and/or impossible; the hope is mainly to trigger new/better ideas in others. And if there's any interest, we can explore further.

The motivating idea for (1) and (2) is to somehow use conversion instead of aliases. In #17746, aliases ran into issues around having multiple names for the same type (or multiple ways to spell the same name, depending on whether you think of aliases as like #define or as like hard links). Using conversion sidesteps that by keeping the types distinct.

  1. Add more automatic conversion.

When you call fmt.Println("abc") or write var e interface{} = "abc", "abc" is automatically converted to an interface{}. We could change the language so that when you have declared type T struct { S }, and T has no non-promoted methods, the compiler will automatically convert between S and T as necessary, including recursively inside other structs. T could then serve as a de-facto alias of S (or vice versa) for gradual refactoring purposes.

  2. Add a new "looks like" kind of type.

Let type T ~S declare a new type T that is a type that "looks like S". More precisely, T is "any type convertible to and from type S". (As always, syntax could be discussed later.) Like interface types, T cannot have methods; to do basically anything at all with T, you need to convert it to S (or a type convertible to/from S). Unlike interface types, there is no "concrete type": conversion from S to T and from T to S involves no representation changes. For gradual refactoring, these "looks like" types would allow authors to write APIs accepting both old and new types. ("Looks like" types are basically a highly restricted, simplified union type.)

  3. Type tags

Bonus super-hideous idea. (Please don't bother telling me this is awful--I know it. I'm only trying to spur new ideas in others.) What if we introduced type tags (like struct tags), and used special type tags to set up and control aliases, like say type T S "alias:\"T\"". Type tags will have other uses as well and it provides scope for more specification of aliases by the package author than merely "this type is an alias"; for example, the author of the code could specify embedding behavior.
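As a point of comparison for the first two ideas, here is a sketch of what the language requires today. It is not an implementation of either idea; the type names S, Wrapped and Twin and the function greet are invented for illustration.

```go
package main

import "fmt"

// S plays the role of the original type.
type S struct{ Name string }

// Wrapped embeds S and adds no methods of its own, the shape the first idea
// would convert automatically.
type Wrapped struct{ S }

// Twin has the same underlying type as S, so the two are convertible with no
// change of representation, which is the relationship the second idea builds on.
type Twin S

func greet(s S) string { return "hello, " + s.Name }

func main() {
	w := Wrapped{S: S{Name: "w"}}
	t := Twin{Name: "t"}

	// greet(w) and greet(t) do not compile; the caller has to spell out
	// the field selection or the conversion.
	fmt.Println(greet(w.S))
	fmt.Println(greet(S(t)))
}
```

Both ideas aim to remove the explicit w.S and S(t) steps at call sites while keeping the types distinct.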

@nigeltao
Contributor

nigeltao commented Dec 1, 2016

If we do try aliases again, it might be worth thinking about "what does godoc do", similar to the "what does iota do" and "what does embedding do" issues.

Specifically, if we have

type OldAPI => NewPackage.API

and NewPackage.API has a doc comment, are we expected to copy/paste that comment next to "type OldAPI", are we expected to leave it un-commented (with godoc automatically providing a link or automatically copy/pasting), or will there be some other convention?

@nigeltao
Contributor

nigeltao commented Dec 1, 2016

Somewhat tangential, while the primary motivation is and should be supporting gradual code repair, a minor use case (going back to the alias proposal, since that is a concrete proposal) could be to avoid a double function-call overhead when presenting a single function backed by multiple, build-tag-dependent implementations. I'm only hand-waving right now, but I feel like aliases could have been useful in the recent https://groups.google.com/d/topic/golang-nuts/wb5I2tjrwoc/discussion "Avoiding function call overhead in packages with go+asm implementations" discussion.

@jimmyfrasche
Member

@nigeltao re godoc, I think:

It should always link to the original, regardless.

If there are docs on the alias, those should be displayed, regardless.

If there are no docs on the alias, it's tempting to have godoc display the original docs, but the name of the type would be wrong if the alias also changed the name, the docs could refer to items not in the current package, and, if it's being used for gradual refactoring, there could be a message that says "Deprecated: use X" when you're looking at X.

However, maybe that wouldn't matter for the majority of use cases. Those are things that could go wrong, not things that will go wrong. And some of them could be detected by linting, like renamed aliases and accidentally copying deprecation warnings.
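One possible convention, sketched with the type alias syntax that eventually shipped (type A = B) rather than the => form used above: the alias carries its own short doc comment with a "Deprecated:" paragraph pointing at the new name. The names API and OldAPI are illustrative, and both types live in one package here only so that the example runs as a single file.

```go
package main

import "fmt"

// API is the new home of the type and carries the full documentation.
type API struct{ Version int }

// OldAPI is the previous name of API.
//
// Deprecated: use API.
type OldAPI = API

// typeName reports the dynamic type of v as fmt prints it.
func typeName(v interface{}) string { return fmt.Sprintf("%T", v) }

func main() {
	var v OldAPI = API{Version: 2} // no conversion: both names denote one type
	fmt.Println(typeName(v) == typeName(API{}))
}
```

With this layout the deprecation notice sits on the old name only, so it cannot be shown by accident next to the new one.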

@tux21b
Contributor

tux21b commented Dec 1, 2016

I am not sure if the following idea has been posted before, but what about a mostly tool-based "gofix" / "gorename"-like approach? To elaborate:

  • any package can contain a set of rewriting rules (e.g. mapping pkg.Ident => otherpkg.Ident)
  • those rewriting rules can be specified with //+rewrite ... tags inside arbitrary go files
  • those rewriting rules are not limited to ABI-compatible changes; it's also possible to do other things (e.g. pkg.MyFunc(a) => pkg.MyFunc(context.TODO(), a))
  • a gofix like tool can be used to apply all transformations to the current repository. This makes it easy for users of a package to update their code.
  • it's not necessary to call the gofix tool in order to compile successfully. A library that still wants to use the old API of a dependency X (to stay compatible with old and new versions of X) can still do so. The go build command should apply the transformations (specified in the rewrite tags of package X) on-the-fly without changing the files on disk.

The last step might complicate or slow down the compiler a bit, but it's basically just a pre-processor, and the number of rewrite rules should be kept small anyway. So, enough brainstorming for today :)
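The simplest kind of rule in the list above, pkg.Ident => otherpkg.Ident, can be sketched with the standard go/parser, go/ast and go/format packages. This is only an illustration under stated assumptions: the qualified type and the rewrite function are invented here, the match is purely syntactic (it does not check that the left-hand identifier really names an imported package), and import declarations are not updated.

```go
package main

import (
	"bytes"
	"fmt"
	"go/ast"
	"go/format"
	"go/parser"
	"go/token"
)

// qualified is a package-qualified identifier such as io.ByteBuffer.
type qualified struct{ pkg, name string }

// rewrite parses src, renames every selector expression that matches a rule,
// and returns the reformatted source.
func rewrite(src string, rules map[qualified]qualified) (string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "input.go", src, parser.ParseComments)
	if err != nil {
		return "", err
	}
	ast.Inspect(file, func(n ast.Node) bool {
		sel, ok := n.(*ast.SelectorExpr)
		if !ok {
			return true
		}
		x, ok := sel.X.(*ast.Ident)
		if !ok {
			return true
		}
		if to, ok := rules[qualified{x.Name, sel.Sel.Name}]; ok {
			x.Name, sel.Sel.Name = to.pkg, to.name
		}
		return true
	})
	var buf bytes.Buffer
	if err := format.Node(&buf, fset, file); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	src := "package p\n\nvar b io.ByteBuffer\n"
	rules := map[qualified]qualified{
		{pkg: "io", name: "ByteBuffer"}: {pkg: "bytes", name: "Buffer"},
	}
	out, err := rewrite(src, rules)
	if err != nil {
		fmt.Println("rewrite failed:", err)
		return
	}
	fmt.Print(out)
}
```

A real tool would need type information to resolve identifiers and would have to fix up imports, which is where most of the complexity hinted at in the last bullet would come from.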

@uluyol
Contributor

uluyol commented Dec 1, 2016

Using aliases to avoid function call overhead seems like a hack to work around the compiler's inability to inline non-leaf functions. I don't think implementation deficiencies should influence the language spec.

@bcmills
Contributor

bcmills commented Feb 3, 2017

We do need a new term for these newly created types because any type can now have a name.

Some ideas:

  • "distinguished" or "distinct" (as in, can be distinguished from other types)
  • "unique" (as in, it is a type different from all other types)
  • "concrete" (as in, it is an entity that exists in the runtime)
  • "identifiable" (as in, the type has an identity)

@griesemer
Contributor

@bcmills We've been thinking about distinguished, unique, distinct, branded, colored, defined, non-alias, etc. types. "Concrete" is misleading because an interface can be colored as well, and an interface is the incarnation of an abstract type. "Identifiable" also seems misleading because a "struct{int}" is just as identifiable as any explicitly (non-alias) named type.

@bcmills
Contributor

bcmills commented Feb 3, 2017

I would recommend against:

  • "colored" (in non-programming contexts the phrase "colored types" carries strong racial-bias connotations)
  • "non-alias" (it's confusing, since the target of the alias may or may not be what was formerly called a "named type")
  • "defined" (aliases are defined too, they're just defined to be aliases)

"branded" could work: it carries a "types as cattle" connotation but that doesn't strike me as intrinsically bad.

@jimmyfrasche
Member

Unique and distinct seem like the standout options so far.

They're simple and understandable without a lot of additional context or knowledge. If I didn't know the distinction, I think I'd at least have a general sense of what they imply. I can't say that about the other choices.

Once you learn the term it doesn't matter, but a connotative name avoids unnecessary barriers to internalizing the distinction.

@rsc
Contributor Author

rsc commented Feb 3, 2017

This is the definition of a bikeshed argument. Robert has a pending CL at https://go-review.googlesource.com/#/c/36213/ that seems perfectly fine.

@gopherbot
Contributor

CL https://golang.org/cl/36213 mentions this issue.

gopherbot pushed a commit that referenced this issue Feb 6, 2017
To avoid confusion caused by the term "named type" (which now just
means a type with a name, but formerly meant a type declared with
a non-alias type declaration), a type declaration now comes in two
forms: alias declarations and type definitions. Both declare a type
name, but type definitions also define new types.

Replace the use of "named type" with "defined type" elsewhere in
the spec.

For #18130.

Change-Id: I49f5ddacefce90354eb65ee5fbf10ba737221995
Reviewed-on: https://go-review.googlesource.com/36213
Reviewed-by: Rob Pike <[email protected]>
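
The distinction drawn in the commit message above, between an alias declaration and a type definition, can be seen in a few lines. The type names Celsius and Temp and the helper describe are made up for this sketch.

```go
package main

import "fmt"

// Celsius is a type definition: it creates a new, distinct type whose
// underlying type is float64.
type Celsius float64

// String can be declared because Celsius is a defined type in this package.
func (c Celsius) String() string { return fmt.Sprintf("%.1f°C", float64(c)) }

// Temp is an alias declaration: it only introduces another name for Celsius.
type Temp = Celsius

// describe reports the dynamic type of v as fmt prints it.
func describe(v interface{}) string { return fmt.Sprintf("%T", v) }

func main() {
	var c Celsius = 21.5
	var t Temp = c  // no conversion: Temp and Celsius are the same type
	f := float64(c) // conversion required: Celsius and float64 are distinct
	fmt.Println(t, f, describe(t) == describe(c))
}
```

Both declarations bind a type name, but only the first one produces a type that differs from every other type.
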
@LionNatsu
Contributor

I want to bring up the issue of go fix again.

To be clear, I am not suggesting that we 'take down' the alias. It may well be useful and suitable for other jobs, but that is another story.

I think it is very important that the title is about moving a type. I don't want to confuse the issue: our aim is to deal with one kind of interface change in a project. When an interface changes, we do not actually want users to keep treating the two interfaces (old and new) as the same forever, and that is why we say 'gradual code repair'. We hope that users will remove or change their uses of the old one.

I still consider a tool to be the best way to repair the code, something like the idea that @tux21b suggested. For example:

$ cat "$GOROOT"/RENAME
# This file could be used for `go fix`
[package]
x/net/context=context
[type]
io.ByteBuffer=bytes.Buffer

$ go fix -rename "$GOROOT"/RENAME [packages]
# -- or --
# use a standard libraries rename table as default
$ go fix -rename [packages]
# -- or --
# include this fix as default
$ go fix [packages]

The only reason @rsc says no here is that the change would affect other tools. But I don't think that holds in this workflow: if an out-of-date package (e.g. a dependency) uses a deprecated name or package path, e.g. x/net/context, we can fix the code first, just as the docs describe how to migrate code to a new version, except that the migration is driven by a configurable table in text format instead of being hard-coded. After that, you can use any tools you like, exactly as with the new version of Go. There is one side effect: it modifies code.
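To show how little machinery such a table would need, here is a sketch of a reader for the RENAME format shown above. The format itself is only the suggestion from this comment, not something go fix supports, and the function name parseRenames is invented.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseRenames reads "[section]" headers followed by "old=new" entries and
// returns a map from section name to its rename table. Blank lines and lines
// starting with "#" are ignored.
func parseRenames(text string) (map[string]map[string]string, error) {
	table := make(map[string]map[string]string)
	section := ""
	sc := bufio.NewScanner(strings.NewReader(text))
	for line := 1; sc.Scan(); line++ {
		s := strings.TrimSpace(sc.Text())
		switch {
		case s == "" || strings.HasPrefix(s, "#"):
			continue
		case strings.HasPrefix(s, "[") && strings.HasSuffix(s, "]"):
			section = s[1 : len(s)-1]
			if table[section] == nil {
				table[section] = make(map[string]string)
			}
		default:
			i := strings.Index(s, "=")
			if i < 0 || section == "" {
				return nil, fmt.Errorf("line %d: malformed entry %q", line, s)
			}
			table[section][s[:i]] = s[i+1:]
		}
	}
	return table, sc.Err()
}

// sample mirrors the RENAME file from the comment above.
const sample = "# This file could be used for `go fix`\n" +
	"[package]\n" +
	"x/net/context=context\n" +
	"[type]\n" +
	"io.ByteBuffer=bytes.Buffer\n"

func main() {
	table, err := parseRenames(sample)
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	fmt.Println(table["package"]["x/net/context"], table["type"]["io.ByteBuffer"])
}
```

Applying the resulting table to source files is the harder half of the job and is left out here.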

@rsc
Contributor Author

rsc commented Feb 7, 2017

@LionNatsu, I think you are right, but I think that's a separate issue: should we adopt conventions for packages to explain to potential clients how to update their code in response to API changes in a mechanical way? Perhaps, but we'd have to figure out what those conventions are. Can you open a separate issue for this topic, pointing back at this conversation? Thanks.

@gopherbot
Contributor

CL https://golang.org/cl/36691 mentions this issue.

gopherbot pushed a commit that referenced this issue Feb 10, 2017
…rsions

We missed this in https://golang.org/cl/36213.
Thanks to Chris Hines for pointing it out.

For #18130.

Change-Id: I6279ab19966c4391c4b4458b21fd2527d3f949dd
Reviewed-on: https://go-review.googlesource.com/36691
Reviewed-by: Ian Lance Taylor <[email protected]>
@crawshaw
Member

With this proposal at tip, I can now create this package:

package safe

import "unsafe"

type Pointer = unsafe.Pointer

which allows programs to create unsafe.Pointer values without importing unsafe directly:

package main

import "safe"

func main() {
	x := []int{4, 9}
	y := *(*int)(safe.Pointer(uintptr(safe.Pointer(&x[0])) + 8))
	println(y)
}

The original alias declarations design doc calls this out as explicitly supported. It is not explicit in this newer type alias proposal, but it works.

On the alias declaration issue the rationale for this is: "The reason we allow aliasing for unsafe.Pointer is that it's already possible to define a type that has unsafe.Pointer as underlying type." #16339 (comment)

While that's true, I think allowing an alias of unsafe.Pointer introduces something new: programs can now create unsafe.Pointer values without explicitly importing unsafe.

To write the program above before this proposal, I would have to move the safe.Pointer cast into a package that imports unsafe. This may make it a bit harder to audit programs for their use of unsafe.

@bcmills
Contributor

bcmills commented Feb 26, 2017

@crawshaw, couldn't you have just done this before?

package safe

import (
  "reflect"
  "unsafe"
)

func Pointer(p interface{}) unsafe.Pointer {
  switch v := reflect.ValueOf(p); v.Kind() {
  case reflect.Uintptr:
    return unsafe.Pointer(uintptr(v.Uint()))
  default:
    return unsafe.Pointer(v.Pointer())
  }
}

I believe that would allow exactly the same program to compile, with the same lack of import in package main.

(It wouldn't necessarily be a valid program: the uintptr-to-Pointer conversion includes a function call, so it doesn't meet the unsafe package constraint that "both conversions must appear in the same expression, with only the intervening arithmetic between them". However, I suspect it would be possible to construct an equivalent, valid program without importing unsafe from main by making use of things like reflect.SliceHeader.)

@dr2chase
Contributor

Seems like exporting a hidden unsafe type is just another rule to add to the audit.

@crawshaw
Member

Yes, I wanted to point out that directly aliasing unsafe.Pointer makes code harder to audit, enough so that I hope no one ends up doing so.

@griesemer
Contributor

@crawshaw Per my comment, this was also true before we had type aliasing. The following is valid:

package a

import "unsafe"

type P unsafe.Pointer

package main

import "./a"
import "fmt"

var x uint64 = 0xfedcba9876543210
var h = *(*uint32)(a.P(uintptr(a.P(&x)) + 4))

func main() {
	fmt.Printf("%x\n", h)
}

That is, in package main, I can do unsafe arithmetic using a.P even though there's no unsafe package and a.P is not an alias. This was always possible.

Is there something else you are referring to?

@crawshaw
Member

My mistake. I thought that didn't work. (I was under the impression that the special rules applied to unsafe.Pointer would not propagate to new types defined from it.)

@griesemer
Contributor

The spec is actually not clear on this. Looking at the implementation of go/types, it turns out that my initial implementation required unsafe.Pointer exactly, not just some type that happened to have an underlying type of unsafe.Pointer. I just found #6326 which is when I changed go/types to be gc compliant.

Perhaps we should disallow this for regular type definitions and also disallow aliases of unsafe.Pointer. I can't see any good reason for allowing it and it does compromise the explicitness of having to import unsafe for unsafe code.

@griesemer
Contributor

I created #19306.

@bradfitz
Contributor

bradfitz commented May 3, 2017

This happened. I don't think anything remains here.

@bradfitz bradfitz closed this as completed May 3, 2017
gopherbot pushed a commit that referenced this issue Jun 7, 2017
As motivated by https://golang.org/design/18130-type-alias which says:

https://github.com/golang/proposal/blob/master/design/18130-type-alias.md#relationship-to-byte-and-rune

> The language specification already defines byte as an alias for
> uint8 and similarly rune as an alias for int32, using the word alias
> as an informal term. It is a goal that the new type declaration
> semantics not introduce a different meaning for alias. That is, it
> should be possible to describe the existing meanings of byte and
> uint8 by saying that they behave as if predefined by:
>
>     type byte = uint8
>     type rune = int32

So, do that. Seems to work.

Updates #18130

Change-Id: I0740bab3f8fb23e946f3542fdbe819007a99465a
Reviewed-on: https://go-review.googlesource.com/45017
Reviewed-by: Ian Lance Taylor <[email protected]>
Reviewed-by: Robert Griesemer <[email protected]>
Run-TryBot: Brad Fitzpatrick <[email protected]>
TryBot-Result: Gobot Gobot <[email protected]>
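
The practical effect of byte and rune being aliases can be observed without any knowledge of how they are predefined. In this sketch the function name sum is illustrative.

```go
package main

import "fmt"

// sum adds up the elements of a []uint8.
func sum(bs []uint8) int {
	total := 0
	for _, b := range bs {
		total += int(b)
	}
	return total
}

func main() {
	data := []byte("abc")
	// []byte and []uint8 are the same type, so no conversion is needed.
	fmt.Println(sum(data))

	var r rune = 'x'
	var i int32 = r // likewise, rune and int32 are the same type
	fmt.Println(i == r)
}
```

With a type definition such as type myByte uint8, the call sum([]myByte(...)) would not compile, because slice types with distinct element types are themselves distinct.
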
@golang golang locked and limited conversation to collaborators May 3, 2018