diff --git a/docs/Generics/chapters/generic-environments.tex b/docs/Generics/chapters/archetypes.tex similarity index 86% rename from docs/Generics/chapters/generic-environments.tex rename to docs/Generics/chapters/archetypes.tex index 3bb22bd6ed1c5..d8bc1b0a243c4 100644 --- a/docs/Generics/chapters/generic-environments.tex +++ b/docs/Generics/chapters/archetypes.tex @@ -2,15 +2,11 @@ \begin{document} -\chapter{Generic Environments}\label{genericenv} +\chapter{Archetypes}\label{genericenv} \lettrine{A}{n archetype encapsulates} a type parameter together with a generic signature. This self-contained representation can answer ``what protocols do you conform to'' and similar questions without further context. Contrast this with type parameters, which are always interpreted with respect to a separately-given generic signature. Archetypes are created by mapping type parameters into a \emph{generic environment}, an object which associates a generic signature with some additional state. -\index{SILGen}% -\index{IRGen}% -\index{runtime type metadata}% -\index{expression}% -While there can be more than one generic environment for a single generic signature, exactly one of them is the \emph{primary generic environment}, its archetypes being \emph{primary archetypes}. We will discuss these first. Primary archetypes appear when the constraint solver assigns types to expressions. SILGen lowers these expressions to SIL instructions which consume and produce SIL values, whose types can also in turn contain archetypes. Finally, when IRGen generates code for a generic function, archetypes become \emph{values} representing the runtime type metadata provided by the caller of the generic function. 
To completely nail down the semantic distinction between type parameters and primary archetypes, consider this simple program: +While there can be more than one generic environment for a single generic signature, exactly one of them is the \emph{primary generic environment}, its archetypes being \emph{primary archetypes}. We will discuss these first. Primary archetypes appear when the constraint solver assigns types to \index{expression}expressions. \index{SILGen}SILGen lowers these expressions to \index{SIL}SIL instructions which consume and produce SIL values, whose types can also in turn contain archetypes. Finally, when \index{IRGen}IRGen generates code for a generic function, archetypes become \emph{values} representing the \index{runtime type metadata}runtime type metadata provided by the caller of the generic function. To completely nail down the semantic distinction between type parameters and primary archetypes, consider this simple program: \begin{Verbatim} struct G { var iter: I @@ -60,7 +56,7 @@ \chapter{Generic Environments}\label{genericenv} \end{quote} However, a restricted form of mapping out of an environment is defined for non-primary archetypes, by projecting the type parameter from an \emph{individual} archetype. -\paragraph{Archetype equality.} Per Section~\ref{reducedtypes}, \index{same-type requirement}same-type requirements define an equivalence relation on the type parameters of a generic signature. Two type parameters that belong to the same equivalence class essentially represent two different ``spellings'' of the same \index{reduced type}reduced type parameter. In constrast, the underlying type parameter of an archetype is always reduced. So, a single archetype represents an entire equivalence class of type parameters in a fixed generic environment, and every type parameter in this equivalence class maps to the same archetype. 
This has several consequences: +\paragraph{Archetype equality.} Per \SecRef{type params}, \index{same-type requirement}same-type requirements define an equivalence relation on the type parameters of a generic signature. Two type parameters that belong to the same equivalence class essentially represent two different ``spellings'' of the same \index{reduced type}reduced type parameter. In contrast, the underlying type parameter of an archetype is always reduced. So, a single archetype represents an entire equivalence class of type parameters in a fixed generic environment, and every type parameter in this equivalence class maps to the same archetype. This has several consequences: \begin{itemize} \item The three relations of \index{reduced type equality}reduced type equality, \index{canonical type equality}canonical type equality and \index{type pointer equality}type pointer equality coincide on archetypes. Note that types \emph{containing} archetypes may still be canonically equal without being pointer equal, if they differ by type sugar in other positions. @@ -138,7 +134,7 @@ \chapter{Generic Environments}\label{genericenv} \end{tikzpicture} \end{quote} -Unlike primary archetypes, opaque archetypes are visible outside of any lexical scope; using the substitution map, they can represent a reference to a (substituted) return type of the owner declaration. Opaque archetypes are also valid inside interface types, in particular they also represent the return type in the interface type of the owner declaration itself. Details are in Chapter~\ref{opaqueresult}. +Unlike primary archetypes, opaque archetypes are visible outside of any lexical scope; using the substitution map, they can represent a reference to a (substituted) return type of the owner declaration. Opaque archetypes are also valid inside interface types; in particular, they also represent the return type in the interface type of the owner declaration itself. Details are in \ChapRef{opaqueresult}.
\index{call expression} \index{expression} \item An \textbf{opened generic environment} is created when an existential value is opened at a call site. Formally, opened archetypes represent the concrete payload stored inside an existential type. @@ -155,25 +151,25 @@ \chapter{Generic Environments}\label{genericenv} \end{scope} \end{tikzpicture} \end{quote} -Every opening expression gets a fresh UUID, and therefore a new opened generic environment. In the AST, opened archetypes cannot ``escape'' outside of their opening expression; in SIL, they are introduced by an opening instruction and are similarly scoped by the dominance relation on the control flow graph. Details in Chapter~\ref{existentialtypes}. +Every opening expression gets a fresh UUID, and therefore a new opened generic environment. In the AST, opened archetypes cannot ``escape'' outside of their opening expression; in SIL, they are introduced by an opening instruction and are similarly scoped by the dominance relation on the control flow graph. Details in \ChapRef{existentialtypes}. \end{itemize} \section{Local Requirements}\label{local requirements} -The \IndexDefinition{local requirements}\emph{local requirements} of an archetype describe the behavior of the archetype's type parameter. Abstractly, a local requirement of an archetype $\archetype{T}$ is a \index{derived requirement}derived requirement whose subject type is \texttt{T}. Concretely, the local requirements are the results of \index{generic signature query}generic signature queries against the archetype's generic signature (Section~\ref{genericsigqueries}): +The \IndexDefinition{local requirements}\emph{local requirements} of an archetype describe the behavior of the archetype's type parameter. Abstractly, a local requirement of an archetype $\archetype{T}$ is a \index{derived requirement}derived requirement whose subject type is \texttt{T}. 
Concretely, the local requirements are the results of \index{generic signature query}generic signature queries against the archetype's generic signature (\SecRef{genericsigqueries}): \begin{itemize} \item \textbf{Required protocols:} a minimal and canonical list of protocols the archetype is known to conform to. This is the result of the \Index{getRequiredProtocols()@\texttt{getRequiredProtocols()}}\texttt{getRequiredProtocols()} generic signature query. \item \textbf{Superclass bound:} an optional superclass type that the archetype is known to be a subclass of. This is the result of the \Index{getSuperclassBound()@\texttt{getSuperclassBound()}}\texttt{getSuperclassBound()} generic signature query. \item \textbf{Requires class flag:} a boolean indicating if the archetype is class-constrained. This is the result of the \Index{requiresClass()@\texttt{requiresClass()}}\texttt{requiresClass()} generic signature query. \item \textbf{Layout constraint:} an optional layout constraint the archetype is known to satisfy. This is the result of the \Index{getLayoutConstraint()@\texttt{getLayoutConstraint()}}\texttt{getLayoutConstraint()} generic signature query. \end{itemize} -By storing their local requirements, archetype take on certain behaviors of the fully-concrete types, namely that they can be interpreted without an explicitly-provided generic signature. In a sense, an archetype is really the ``most general'' concrete type satisfying a set of generic requirements. (As a minor point, we don't actually perform all of the above generic signature queries in succession when constructing an archetype. A special \index{getLocalRequirements()@\texttt{getLocalRequirements()}}\texttt{getLocalRequirements()} query returns all of the necessary information in one shot.) +By storing their local requirements, archetypes take on certain behaviors of fully concrete types, namely that they can be interpreted without an explicitly-provided generic signature.
In a sense, an archetype is really the ``most general'' concrete type satisfying a set of generic requirements. (As a minor point, we don't actually perform all of the above generic signature queries in succession when constructing an archetype. A special \Index{getLocalRequirements()@\texttt{getLocalRequirements()}}\texttt{getLocalRequirements()} query returns all of the necessary information in one shot.) -\paragraph{Global conformance lookup.} We've seen \index{global conformance lookup}global conformance lookup with nominal types (Section~\ref{conformance lookup}) and type parameters (Section~\ref{abstract conformances}). It also generalizes to archetypes. Let $\archetype{T}$ be an archetype, and \texttt{P} be a protocol. +\paragraph{Global conformance lookup.} We've seen \index{global conformance lookup}global conformance lookup with nominal types (\SecRef{conformance lookup}) and type parameters (\SecRef{abstract conformances}). It also generalizes to archetypes. Let $\archetype{T}$ be an archetype, and \texttt{P} be a protocol. \begin{enumerate} \item If a superclass requirement $\ConfReq{T}{C}$ can be \index{derived requirement}derived from the archetype's generic signature, and the class type \texttt{C} conforms to \texttt{P}, then the archetype \emph{conforms concretely}. Global conformance lookup recursively calls itself with the archetype's superclass type. \[\protosym{P}\otimes\archetype{T}=\protosym{P}\otimes\texttt{C}=\ConfReq{C}{P}\] -The result is actually wrapped in an inherited conformance, which doesn't do much except report the conforming type as $\archetype{T}$ rather than \texttt{C} (Section~\ref{inheritedconformance}). +The result is actually wrapped in an inherited conformance, which doesn't do much except report the conforming type as $\archetype{T}$ rather than \texttt{C} (\SecRef{inheritedconformance}). 
\item If the conformance requirement $\ConfReq{T}{P}$ can be derived from the generic signature, the archetype \emph{conforms abstractly}, and thus global conformance lookup returns an abstract conformance: \[\protosym{P}\otimes\archetype{T}=\ConfReq{$\archetype{T}$}{P}\] \item Otherwise, $\archetype{T}$ does not conform to \texttt{P}, and global conformance lookup returns an invalid conformance. @@ -187,7 +183,7 @@ \section{Local Requirements}\label{local requirements} \protosym{Q}\otimes \archetype{T} = \ConfReq{C<$\archetype{U}$, $\archetype{V}$>}{Q} \] -\paragraph{Qualified name lookup.} An archetype can serve as the base type for a \index{qualified lookup}qualified name lookup. Recall from Section~\ref{contextsubstmap} that qualified lookup searches within the reachable declaration contexts of the given base type. The reachable declaration contexts of an archetype are the protocols it conforms to, the class declaration of its superclass type, and any protocols the superclass conforms to. One can write a protocol composition type with each of these as terms; a qualified lookup with an archetype will find the same members as a qualified lookup with this protocol composition type as the base type. +\paragraph{Qualified name lookup.} An archetype can serve as the base type for a \index{qualified lookup}qualified name lookup. Recall from \ChapRef{decls} that qualified lookup searches within the reachable declaration contexts of the given base type. The reachable declaration contexts of an archetype are the protocols it conforms to, the class declaration of its superclass type, and any protocols the superclass conforms to. One can write a protocol composition type with each of these as terms; a qualified lookup with an archetype will find the same members as a qualified lookup with this protocol composition type as the base type. To continue with our example, consider a member reference \index{expression}expression where the base type is $\archetype{T}$. 
The reachable declaration contexts are the class declaration of \texttt{C}, the protocol declaration of \texttt{P}, and the protocol declaration of \texttt{Q}. The corresponding protocol composition type is just \texttt{C<$\archetype{U}$, $\archetype{V}$> \& P} (we said that \texttt{C} conforms to \texttt{Q}, so \texttt{Q} is not an explicit part of this composition). @@ -216,8 +212,8 @@ \section{Local Requirements}\label{local requirements} In the generic signature \verb||, the archetype \archetype{S} conforms to \texttt{Sequence}, and \archetype{S.Iterator} conforms to \texttt{IteratorProtocol}, both abstractly: \begin{gather*} -\Proto{Sequence}\otimes \archetype{S} = \ConfReq{$\archetype{S}$}{Sequence}\\ -\Proto{IteratorProtocol}\otimes \archetype{S.Iterator} = \ConfReq{$\archetype{S.Iterator}$}{IteratorProtocol} +\protosym{Sequence}\otimes \archetype{S} = \ConfReq{$\archetype{S}$}{Sequence}\\ +\protosym{IteratorProtocol}\otimes \archetype{S.Iterator} = \ConfReq{$\archetype{S.Iterator}$}{IteratorProtocol} \end{gather*} We can project the type witness for \texttt{[Sequence]Element} from the first conformance, and arrive at the archetype $\archetype{S.Element}$: \begin{gather*} @@ -234,7 +230,7 @@ \section{Local Requirements}\label{local requirements} \AssocType{[Sequence]Element} \otimes \ConfReq{$\archetype{S}$}{Sequence} = \texttt{Int} \\ \AssocType{[IteratorProtocol]Element}\otimes \ConfReq{$\archetype{S.Iterator}$}{IteratorProtocol} = \texttt{Int} \end{gather*} -It turns out that we can define a directed graph where the vertices are archetypes are edges are given by type witness projection. We will explore this in Section~\ref{type parameter graph}. +It turns out that we can define a directed graph where the vertices are archetypes and the edges are given by type witness projection. We will explore this in \SecRef{type parameter graph}.
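Before moving on, the three-case global conformance lookup described earlier in this section can be sketched abstractly. The following Python model is purely illustrative, not compiler code; the class \texttt{Archetype}, the function \texttt{lookup\_conformance()}, and the tuple encodings of conformances are all hypothetical names invented for this sketch:

```python
# Illustrative model of global conformance lookup on archetypes.
# All names are hypothetical; real conformances are far richer objects.

class Archetype:
    def __init__(self, name, conforms_to, superclass=None):
        self.name = name
        self.conforms_to = set(conforms_to)   # protocols P with derivable [T: P]
        self.superclass = superclass          # optional (class name, its protocols)

def lookup_conformance(archetype, proto):
    """Model the three cases of global conformance lookup on an archetype."""
    # Case 1: a superclass bound C conforms to the protocol -- the archetype
    # conforms concretely; the concrete conformance [C: P] is wrapped in an
    # inherited conformance that reports the archetype as the conforming type.
    if archetype.superclass is not None:
        cls, cls_protos = archetype.superclass
        if proto in cls_protos:
            return ("inherited", archetype.name, ("concrete", cls, proto))
    # Case 2: the conformance requirement [T: P] is derivable -- the
    # archetype conforms abstractly.
    if proto in archetype.conforms_to:
        return ("abstract", archetype.name, proto)
    # Case 3: no derivation exists -- an invalid conformance.
    return ("invalid",)
```

Running the model on the earlier example, where $\archetype{T}$ conforms abstractly to \texttt{P} and has a superclass \texttt{C} conforming to \texttt{Q}, reproduces the case analysis: looking up \texttt{Q} goes through the superclass, while looking up \texttt{P} yields an abstract conformance.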
\section{Primary Archetypes}\label{archetypesubst} @@ -254,7 +250,7 @@ \section{Primary Archetypes}\label{archetypesubst} \paragraph{Forwarding substitution map.} When working in the expression type checker, the SIL optimizer, or anywhere else that deals with both interface types and contextual types, a special substitution map often appears. If $G$ is a generic signature, the \IndexDefinition{forwarding substitution map}\emph{forwarding substitution map} of $G$, denoted \index{$1_{\EquivClass{G}}$}\index{$1_{\EquivClass{G}}$!z@\igobble|seealso{forwarding substitution map}}$1_{\EquivClass{G}}$, sends each generic parameter \ttgp{d}{i} of $G$ to the corresponding archetype $\archetype{\ttgp{d}{i}}$ in the primary generic environment of $G$: \[1_{\EquivClass{G}}:=\{\archetype{\ttgp{0}{0}},\,\ldots,\,\archetype{\ttgp{m}{n}};\,\ldots,\,\ConfReq{$\archetype{\ttgp{0}{0}}$}{P},\,\ldots\}\] -The forwarding substitution map looks similar to the identity substitution map $1_G$ that we saw in Section~\ref{submapcomposition}, which sends every generic parameter to itself: +The forwarding substitution map looks similar to the identity substitution map $1_G$ that we saw in \SecRef{submapcomposition}, which sends every generic parameter to itself: \[1_{G}:=\{\ttgp{0}{0},\,\ldots,\,\ttgp{m}{n};\,\ldots,\,\ConfReq{\ttgp{0}{0}}{P},\,\ldots\}\] If $\texttt{T}\in\TypeObj{G}$ and $\texttt{T}^\prime\in\TypeObj{\EquivClass{G}}$, we can apply $1_G$ and $1_{\EquivClass{G}}$ to each one of \texttt{T} and $\texttt{T}^\prime$: \begin{gather*} @@ -313,9 +309,9 @@ \section{Primary Archetypes}\label{archetypesubst} \section{The Type Parameter Graph}\label{type parameter graph} -The recursive structure of type parameters (Section~\ref{fundamental types}) suggests an intuitive viewpoint where we consider a dependent member type, like \texttt{\ttgp{0}{0}.Iterator.Element}, to be a ``child'' of \texttt{\ttgp{0}{0}.Iterator}, which is a ``child'' of \ttgp{0}{0}, a ``top level'' generic parameter 
type. The derived requirements formalism (Section~\ref{derived req}) tells us that a base type's ``children'' are formed from the associated type declarations of the base type's protocol conformances. Thus, if \ttgp{0}{0} conforms to \texttt{Sequence}, we can think of it as having two ``children,'' \texttt{\ttgp{0}{0}.Element} and \texttt{\ttgp{0}{0}.Iterator}. This gives us a visual aid to understand the type parameters of a generic signature. +The recursive structure of type parameters (\SecRef{fundamental types}) suggests an intuitive viewpoint where we consider a dependent member type, like \texttt{\ttgp{0}{0}.Iterator.Element}, to be a ``child'' of \texttt{\ttgp{0}{0}.Iterator}, which is a ``child'' of \ttgp{0}{0}, a ``top level'' generic parameter type. The derived requirements formalism (\SecRef{derived req}) tells us that a base type's ``children'' are formed from the associated type declarations of the base type's protocol conformances. Thus, if \ttgp{0}{0} conforms to \texttt{Sequence}, we can think of it as having two ``children,'' \texttt{\ttgp{0}{0}.Element} and \texttt{\ttgp{0}{0}.Iterator}. This gives us a visual aid to understand the type parameters of a generic signature. -So far, we've described a tree structure, where each generic parameter type has zero or more dependent member types as children, some of which may then have their own children, and so on. However, it is more illuminative to consider not the type parameters themselves, but rather equivalence classes of type parameters, under the reduced type equality relation of a generic signature (Section~\ref{reducedtypes}). This is justified by observing that if two type parameters are equivalent, any requirement that applies to one also applies to the other. +So far, we've described a tree structure, where each generic parameter type has zero or more dependent member types as children, some of which may then have their own children, and so on. 
However, it is more illuminating to consider not the type parameters themselves, but rather equivalence classes of type parameters, under the reduced type equality relation of a generic signature (\SecRef{type params}). This is justified by observing that if two type parameters are equivalent, any requirement that applies to one also applies to the other. Thus, we no longer have a tree, because ``parents'' can share ``children,'' and children can have children that point back at their own parents, and so on. In fact, we only get a tree in the special case where the generic signature does not have any derived same-type requirements; then, every equivalence class contains exactly one representative (if we ignore \index{unbound dependent member type}unbound dependent member types). In the general case where the equivalence relation may be non-trivial, we get a directed graph. We will have occasion to study other directed graphs later, so we'll start with some general definitions. @@ -345,7 +341,7 @@ \section{The Type Parameter Graph}\label{type parameter graph} \end{tikzpicture} \end{wrapfigure} \noindent -Intuitively, a directed graph can be visualized by representing each vertex as a point, with each edge drawn as an arrow joining the source to the destination. The direction of the arrow points towards the destination vertex. Our definition allows for $V$ or $E$ to be an infinite set; if so, we say $(V, E)$ is an \IndexDefinition{infinite graph}\emph{infinite graph}. We cannot hope to draw the entire directed graph in this case, but we can still look at a finite \IndexDefinition{subgraph}\emph{subgraph} $(V^\prime, E^\prime)$, where $V^\prime\subset V$ and $E^\prime\subset E$. +A directed graph can be visualized by representing each vertex as a point, with each edge drawn as an arrow joining the source to the destination. The direction of the arrow points towards the destination vertex.
Our definition allows for $V$ or $E$ to be an infinite set; if so, we say $(V, E)$ is an \IndexDefinition{infinite graph}\emph{infinite graph}. We cannot hope to draw the entire directed graph in this case, but we can still look at a finite \IndexDefinition{subgraph}\emph{subgraph} $(V^\prime, E^\prime)$, where $V^\prime\subset V$ and $E^\prime\subset E$. \begin{wrapfigure}{r}{2.8cm} @@ -437,7 +433,7 @@ \section{The Type Parameter Graph}\label{type parameter graph} \item For each pair of reduced types \texttt{U} and \texttt{V}, we add an edge with source vertex \texttt{U} and destination vertex \texttt{V} if for some protocol \texttt{P} and associated type \texttt{A} of \texttt{P}, the following two conditions hold: \begin{enumerate} \item \texttt{U} is a type parameter conforming to the protocol \texttt{P} (that is, $G\vDash\ConfReq{U}{P}$), -\item the dependent member type \texttt{U.[P]A} with base type \texttt{U} is equivalent to \texttt{V} (that is, $G\vDash\FormalReq{U.[P]A == V}$). +\item the dependent member type \texttt{U.[P]A} with base type \texttt{U} is equivalent to \texttt{V} (that is, $G\vDash\SameReq{U.[P]A}{V}$). \end{enumerate} The added edge is labeled by the associated type declaration, which we denote by its name prefixed with ``\texttt{.}'', like ``\texttt{.A}''. The intuitive interpretation is that we can ``jump'' from \texttt{U} to \texttt{V} by suffixing \texttt{U} with \texttt{.A} to get \texttt{U.A}, which is equivalent to \texttt{V}. \end{itemize} @@ -563,7 +559,7 @@ \section{The Type Parameter Graph}\label{type parameter graph} Notice how the graph exhibits the equivalence between \texttt{\ttgp{0}{0}.Iterator.Element} and \texttt{\ttgp{0}{0}.Element}. \end{example} -\begin{example} We saw this example already at the end of Section~\ref{reducedtypes}. 
Recall our protocol \texttt{N} whose \index{recursive conformance requirement}associated type conforms to itself: +\begin{example} We saw this example already at the end of \SecRef{type params}. Recall our protocol \texttt{N} whose \index{recursive conformance requirement}associated type conforms to itself: \begin{Verbatim} protocol N { associatedtype A: N @@ -666,18 +662,20 @@ \section{The Type Parameter Graph}\label{type parameter graph} \item An archetype represents the reduced type parameter of some equivalence class, and therefore defines a vertex. \item The edge relation is implied by each archetype storing its \index{local requirements}local requirements, in particular its list of conformed protocols; the associated type declarations of each protocol define the outgoing edges of each vertex. \end{itemize} -In general, the type parameter graph may be infinite, but the compiler can only ever instantiate a finite set of archetypes in a compilation session. Archetypes are built lazily as needed when type parameters are mapped into an environment, so we explore a finite, but arbitrarily large, \index{subgraph}subgraph of the type parameter graph. We will see a similar construction in Chapter~\ref{conformance paths}, where we show that the \emph{conformance path graph} is infinite in the general case, but we are able to construct arbitrarily-large finite subgraphs during compilation. +In general, the type parameter graph may be infinite, but the compiler can only ever instantiate a finite set of archetypes in a compilation session. Archetypes are built lazily as needed when type parameters are mapped into an environment, so we explore a finite, but arbitrarily large, \index{subgraph}subgraph of the type parameter graph. We will see a similar construction in \ChapRef{conformance paths}, where we show that the \emph{conformance path graph} is infinite in the general case, but we are able to construct arbitrarily-large finite subgraphs during compilation.
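This lazy exploration of a bounded subgraph can be illustrated with a small model. The following Python sketch is hypothetical (it is not compiler code, and the names \texttt{edges()} and \texttt{explore()} are invented); it encodes the recursive protocol \texttt{N} from the example above, where every vertex conforming to \texttt{N} has a single child \texttt{.A} that again conforms to \texttt{N}, and it instantiates only a bounded number of vertices of the resulting infinite graph:

```python
# Hypothetical model: lazily explore a bounded subgraph of an infinite
# type parameter graph. Vertices are reduced type parameters, written as
# strings; edges model `protocol N { associatedtype A: N }`, so every
# vertex T, T.A, T.A.A, ... has exactly one outgoing edge labeled ".A".

from collections import deque

def edges(vertex):
    # Every type parameter conforming to N has one child, `.A`, which
    # again conforms to N -- the source of the graph's infinitude.
    yield ".A", vertex + ".A"

def explore(root, limit):
    """Breadth-first exploration, instantiating at most `limit` vertices."""
    seen = [root]
    queue = deque([root])
    while queue and len(seen) < limit:
        v = queue.popleft()
        for _label, w in edges(v):
            if w not in seen and len(seen) < limit:
                seen.append(w)
                queue.append(w)
    return seen
```

For example, \texttt{explore("T", 4)} instantiates the chain \texttt{T}, \texttt{T.A}, \texttt{T.A.A}, \texttt{T.A.A.A} and then stops, just as the compiler only ever builds the archetypes actually demanded by mapping type parameters into an environment.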
-\paragraph{Historical aside.} \index{history}A generic environment gives a complete semantic description of a generic signature: the type parameter graph encodes the reduced type equality relation, and archetypes store their local requirements. Today, this is a derived representation, and not a ``source of truth''; the reduced types and local requirements of archetypes are determined by generic signature queries, which are ultimately implemented with a rewrite system, as we will see in in Chapter~\ref{rqm basic operation}. One might ask if it is possible to go the other way, and directly construct a type parameter graph from a generic signature, and then use this graph to compute reduced types and answer generic signature queries. In fact, Swift generics were formerly implemented in this manner. +\paragraph{Historical aside.} A generic environment gives a complete semantic description of a generic signature: the type parameter graph encodes the reduced type equality relation, and archetypes store their local requirements. Today, this is a derived representation, and not a ``source of truth''; the reduced types and local requirements of archetypes are determined by generic signature queries, which are ultimately implemented with a rewrite system, as we will see in \ChapRef{rqm basic operation}. One might ask if it is possible to go the other way, and directly construct a type parameter graph from a generic signature, and then use this graph to compute reduced types and answer generic signature queries. In fact, Swift generics were formerly implemented in this manner. -In Swift 3.1 and earlier, the type parameter graph took on a much simpler form: +% Another important generalization allowed an associated type to conform to the same protocol that it appears in, either directly or indirectly. The ability to declare a so-called \index{recursive conformance requirement}\emph{recursive conformance} was introduced in \IndexSwift{4.1}Swift 4.1 \cite{se0157}.
This feature has some profound implications, which are further explored in \SecRef{type parameter graph}, \ref{recursive conformances}, \ref{monoidsasprotocols}, and \ref{recursive conformances redux}. + +In \IndexSwift{3.1}Swift 3.1 and earlier, the type parameter graph took on a much simpler form: \begin{itemize} -\item Prior to the introduction of \texttt{where} clauses on protocols and associated types in Swift 4.0 \cite{se0142}, protocols were quite limited in what requirements they could impose on the protocol \texttt{Self} type. A protocol's contribution to the type parameter graph was defined by its protocol inheritance relationships, together with conformance and superclass requirements on the protocol's immediate associated types. -\item Even more importantly, prior to the introduction of recursive conformances in Swift~4.1 \cite{se0157}, the set of equivalence classes in a generic signature was always finite, and thus the set of vertices in the type parameter graph was also finite. +\item Prior to the introduction of \texttt{where} clauses on protocols and associated types in \IndexSwift{4.0}Swift 4 \cite{se0142}, protocols were quite limited in what requirements they could impose on the protocol \texttt{Self} type. A protocol's contribution to the type parameter graph was defined by its protocol inheritance relationships, together with conformance and superclass requirements on the protocol's immediate associated types. +\item Even more importantly, prior to the introduction of recursive conformances in \IndexSwift{4.1}Swift~4.1 \cite{se0157}, the set of equivalence classes in a generic signature was always finite, and thus the set of vertices in the type parameter graph was also finite. 
\end{itemize} Given the above restrictions, the so-called \IndexDefinition{ArchetypeBuilder@\texttt{ArchetypeBuilder}}\texttt{ArchetypeBuilder} algorithm was able to directly construct the type parameter graph from the user-written requirements of a generic declaration in one fell swoop. -While now obsolete, the algorithm was quite elegant, and understanding its limitations helps motivate the material in Part~\ref{part rqm}, so we're going to review it here. Note that Swift~3.1 did not have layout requirements, and to simplify matters we're going to ignore superclass and concrete type requirements also, since they only involve a little bit of additional bookkeeping and are not central to the algorithm; so we will only consider conformance requirements and same-type requirements between type parameters below. +While now obsolete, the algorithm was quite elegant, and understanding its limitations helps motivate the material in \PartRef{part rqm}, so we're going to review it here. Note that Swift~3.1 did not have layout requirements, and to simplify matters we're going to ignore superclass and concrete type requirements also, since they only involve a little bit of additional bookkeeping and are not central to the algorithm; so we will only consider conformance requirements and same-type requirements between type parameters below. The \texttt{ArchetypeBuilder} represented each equivalence class of type parameters by a structure containing these fields: \begin{itemize} @@ -732,19 +730,19 @@ \section{The Type Parameter Graph}\label{type parameter graph} \item (Reprocess) If the pending list is empty and the flag is set, move any requirements from the delayed list to the pending list, and clear the flag. \item (Check) If the pending list is still empty, go to Step~8. \item (Resolve) Remove a requirement from the pending list. 
-\item (Conformance) For a conformance requirement $\ConfReq{T}{P}$, invoke Algorithm~\ref{archetype builder lookup} to resolve \texttt{T} to an equivalence class. If this equivalence class does not exist, add $\ConfReq{T}{P}$ to the delayed list and set the flag. Otherwise, invoke Algorithm~\ref{archetype builder expand} to expand the conformance requirement. -\item (Same-type) For a same-type requirement $\FormalReq{T == U}$, invoke Algorithm~\ref{archetype builder lookup} to resolve each one of \texttt{T} and \texttt{U} to an equivalence class. If either equivalence class does not exist, add $\FormalReq{T == U}$ to the delayed list and set the flag. Otherwise, invoke Algorithm~\ref{archetype builder merge} to merge the two equivalence classes. +\item (Conformance) For a conformance requirement $\ConfReq{T}{P}$, invoke \AlgRef{archetype builder lookup} to resolve \texttt{T} to an equivalence class. If this equivalence class does not exist, add $\ConfReq{T}{P}$ to the delayed list and set the flag. Otherwise, invoke \AlgRef{archetype builder expand} to expand the conformance requirement. +\item (Same-type) For a same-type requirement $\SameReq{T}{U}$, invoke \AlgRef{archetype builder lookup} to resolve each one of \texttt{T} and \texttt{U} to an equivalence class. If either equivalence class does not exist, add $\SameReq{T}{U}$ to the delayed list and set the flag. Otherwise, invoke \AlgRef{archetype builder merge} to merge the two equivalence classes. \item (Repeat) Go back to Step~2. \item (Diagnose) If we end up here, there are no more pending requirements. Any requirements remaining on the delayed list reference type parameters which cannot be resolved; diagnose them as invalid. \end{enumerate} \end{algorithm} Once the type parameter graph was complete, the \texttt{ArchetypeBuilder} would live up to its name, and actually build archetypes; each archetype stored its conformed protocols, member types, and other information. 
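The control flow of this worklist can be sketched in Python. This is a hypothetical model, not the actual implementation: the functions \texttt{process()}, \texttt{resolve()}, \texttt{expand()}, and \texttt{merge()} are invented stand-ins for the real equivalence-class operations, and the ``flag'' is modeled as a progress indicator so that the loop terminates:

```python
# Hypothetical sketch of the ArchetypeBuilder-style worklist loop; not the
# real implementation. Requirements are tuples like ("conforms", "T", "P")
# or ("same", "T", "U"); resolve/expand/merge are caller-supplied stand-ins
# for the real resolution, expansion, and merging steps.

def process(requirements, resolve, expand, merge):
    """Run pending requirements to a fixed point; return the requirements
    that could never be resolved (to be diagnosed as invalid)."""
    pending = list(requirements)
    delayed = []
    while True:
        progress = False
        while pending:
            req = pending.pop()                  # (Resolve)
            classes = resolve(req)
            if classes is None:                  # subject type unresolved:
                delayed.append(req)              # delay and retry later
                continue
            if req[0] == "conforms":
                expand(req, classes)             # (Conformance)
            else:
                merge(req, classes)              # (Same-type)
            progress = True
        if delayed and progress:                 # (Reprocess)
            pending, delayed = delayed, []
        else:
            return delayed                       # (Diagnose) leftovers
```

The key structural point survives the simplification: a delayed requirement is retried only after some other requirement makes progress (for instance, a conformance requirement whose expansion introduces the member type that a delayed same-type requirement mentions), and whatever remains when no progress is possible is diagnosed as invalid.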
Generic signature queries did not exist back then; semantic questions were answered from the archetype representation. A nascent form of requirement minimization was implemented by traversing the entire type parameter graph and collecting the requirements from each equivalence class, but we'll skip over the details here. -To use modern terminology, we can say that the \texttt{ArchetypeBuilder} discovered all \index{derived requirement}derived requirements of a generic signature (of which there were only finitely many, at the time) by exhaustive enumeration. This design survived the introduction of protocol \texttt{where} clauses in Swift 4.0 with only relatively minor complications. The introduction of recursive conformances in Swift 4.1 necessitated a much larger overhaul. The type parameter graph could now be infinite, so the eager conformance requirement expansion of Algorithm~\ref{archetype builder expand} no longer made sense. The \texttt{ArchetypeBuilder} was renamed to the \IndexDefinition{GenericSignatureBuilder@\texttt{GenericSignatureBuilder}}\texttt{GenericSignatureBuilder}, and the up-front graph construction was replaced with incremental expansion of the type parameter graph when resolving type parameters to equivalence classes \cite{implrecursive}. +To use modern terminology, we can say that the \texttt{ArchetypeBuilder} discovered all \index{derived requirement}derived requirements of a generic signature (of which there were only finitely many, at the time) by exhaustive enumeration. This design survived the introduction of protocol \texttt{where} clauses in Swift 4 with only relatively minor complications. The introduction of recursive conformances in Swift 4.1 necessitated a much larger overhaul. The type parameter graph could now be infinite, so the eager conformance requirement expansion of \AlgRef{archetype builder expand} no longer made sense. 
The \texttt{ArchetypeBuilder} was renamed to the \IndexDefinition{GenericSignatureBuilder@\texttt{GenericSignatureBuilder}}\texttt{GenericSignatureBuilder}, and the up-front graph construction was replaced with incremental expansion of the type parameter graph when resolving type parameters to equivalence classes \cite{implrecursive}. After a period of years, the limitations of the lazy type parameter graph approach also made themselves apparent. Unlike the lazy construction of archetypes that exists today, this kind of lazy expansion would not only create new vertices, but also merge existing vertices to form new equivalence classes, as more derived same-type requirements could be ``discovered'' at any time. The ongoing mutation made things difficult to understand and debug, and also prevented sharing structure between \texttt{GenericSignatureBuilder} instances. Any generic signature referencing one of the more complicated protocol towers from the standard library, such as \texttt{RangeReplaceableCollection}, would essentially construct an entire copy of a large subgraph of associated types, causing problems with memory usage and performance. -The \texttt{GenericSignatureBuilder} was later discovered to suffer from another problem.
As we will learn in \SecRef{word problem}, it is possible to write down a generic signature with an undecidable theory of derived requirements. By virtue of its design, the \texttt{GenericSignatureBuilder} pretended to accept all generic signatures as input, meaning the original generalization to infinite type parameter graphs was fundamentally incorrect. All these problems motivated the search for a sound and decidable foundation on top of which generic signature queries and requirement minimization can be implemented, which became the Requirement Machine. \section{Source Code Reference} @@ -763,31 +761,31 @@ \section{Source Code Reference} \end{itemize} \apiref{GenericSignature}{class} -See also Section~\ref{genericsigsourceref}. +See also \SecRef{genericsigsourceref}. \begin{itemize} \item \texttt{getGenericEnvironment()} returns the \IndexSource{primary generic environment}primary generic environment associated with this generic signature. \end{itemize} \apiref{TypeBase}{class} -See also Section~\ref{typesourceref}. +See also \SecRef{typesourceref}. \begin{itemize} \item \texttt{mapTypeOutOfContext()} returns the interface type obtained by \IndexSource{map type out of environment}mapping this contextual type out of its generic environment. \end{itemize} \apiref{SubstitutionMap}{class} -See also Section~\ref{substmapsourcecoderef}. +See also \SecRef{substmapsourcecoderef}. \begin{itemize} \item \texttt{mapReplacementTypesOutOfContext()} returns the substitution map obtained by \IndexSource{map replacement types out of environment}mapping this substitution map's replacement types and conformances out of their generic environment. \end{itemize} \apiref{ProtocolConformanceRef}{class} -See also Section~\ref{conformancesourceref}. +See also \SecRef{conformancesourceref}. \begin{itemize} \item \texttt{mapConformanceOutOfContext()} returns the protocol conformance obtained by mapping this protocol conformance out of its generic environment. 
\end{itemize} \apiref{DeclContext}{class} -See also Section~\ref{declarationssourceref}. +See also \SecRef{declarationssourceref}. \begin{itemize} \item \texttt{getGenericEnvironmentOfContext()} returns the generic environment of the innermost generic declaration containing this declaration context. \item \texttt{mapTypeIntoContext()} maps an interface type into the primary generic environment for the innermost generic declaration. If at least one outer declaration context is generic, this is equivalent to: @@ -809,9 +807,9 @@ \section{Source Code Reference} \item \texttt{getGenericEnvironment()} returns the archetype's generic environment. \item \texttt{isRoot()} answers if the reduced type parameter is a generic parameter type. \end{itemize} -Local requirements (Section~\ref{local requirements}): +Local requirements (\SecRef{local requirements}): \begin{itemize} -\item \texttt{getConformsTo()} returns the archetype's required protocols. +\item \texttt{getConformsTo()} returns the archetype's required protocols. This set does not include inherited protocols. To actually check if an archetype conforms to a specific protocol, use global conformance lookup (\SecRef{conformancesourceref}) instead of looking through this array. \item \texttt{getSuperclass()} returns the archetype's superclass bound, or the empty \texttt{Type} if there isn't one. \item \texttt{requiresClass()} answers with the requires class flag. \item \texttt{getLayoutConstraint()} returns the layout constraint, or the empty layout constraint if there isn't one.
diff --git a/docs/Generics/chapters/basic-operation.tex b/docs/Generics/chapters/basic-operation.tex index c94ec19686923..c3c78fdee56fa 100644 --- a/docs/Generics/chapters/basic-operation.tex +++ b/docs/Generics/chapters/basic-operation.tex @@ -4,13 +4,13 @@ \chapter{Basic Operation}\label{rqm basic operation} -\lettrine{C}{onsider the problem} of \index{generic signature query}generic signature queries and \index{minimal requirement}minimization. Ultimately, we must be able to look at the requirements appearing in generic signatures and protocol declarations, and make inferences in accordance with the \index{derived requirement}derived requirements formalism. We use the proper noun \IndexDefinition{requirement machine}``The Requirement Machine'' to mean the compiler component responsible for this, while \index{requirement machine}\emph{a} requirement machine---a common noun---is an \emph{instance} of a specific data structure that encodes a fixed set of generic requirements in a form amenable to this kind of automated reasoning. +\lettrine{C}{onsider the problem} of \index{generic signature query}generic signature queries and \index{minimal requirement}minimization. Ultimately, we must be able to look at the requirements appearing in generic signatures and protocol declarations, and make inferences in accordance with the \index{derived requirement}derived requirements formalism. We use the proper noun \IndexDefinition{requirement machine}``the Requirement Machine'' to mean the compiler component responsible for this, while \index{requirement machine}\emph{a} requirement machine---a common noun---is an \emph{instance} of a specific data structure that encodes a fixed set of generic requirements in a form amenable to this kind of automated reasoning. -In this chapter, we will see that the generic signatures and protocols declared by a Swift program define a directed acyclic graph of requirement machines. 
The global singleton \IndexDefinition{rewrite context}\emph{rewrite context} manages the lazily construction of this graph. Each requirement machine is a \index{string rewrite system}\emph{string rewrite system}, whose \emph{rewrite rules} are formed by taking the union of its successors in this graph, together with the machine's own generic requirements. After we develop the high-level picture in this chapter, we delve into the workings of an individual machine in subsequent chapters; Chapter~\ref{monoids} develops the theory of string rewrite systems, and Chapter~\ref{symbols terms rules} describes how requirements map to \index{rewrite rule}rewrite rules. +In this chapter, we will see that the generic signatures and protocols declared by a Swift program define a directed acyclic graph of requirement machines. The global singleton \IndexDefinition{rewrite context}\emph{rewrite context} manages the lazy construction of this graph. Each requirement machine is a \index{string rewrite system}\emph{string rewrite system}, whose \emph{rewrite rules} are formed by taking the union of the rules of its successors in this graph, together with the machine's own generic requirements. After we develop the high-level picture in this chapter, we delve into the workings of an individual machine in subsequent chapters; \ChapRef{monoids} develops the theory of string rewrite systems, and \ChapRef{symbols terms rules} describes how requirements map to \index{rewrite rule}rewrite rules. \smallskip -Let's go. We begin by observing that both entry points into The Requirement Machine build new requirement machines: +Let's go. We begin by observing that both entry points into the Requirement Machine build new requirement machines: \begin{itemize} \item To answer a \index{generic signature query}generic signature query, we construct a new requirement machine from the minimal requirements of a generic signature; we call this a \IndexDefinition{query machine}\emph{query machine}.
We then consult the machine's \emph{property map}: a description of all conformance, superclass, layout and concrete type requirements imposed on each type parameter. @@ -18,7 +18,7 @@ \chapter{Basic Operation}\label{rqm basic operation} \item To build a new generic signature, we construct a new requirement machine from an initial set of requirements; we call this a \IndexDefinition{minimization machine}\emph{minimization machine}. We read off the minimal requirements from the string rewrite system after construction. -Minimization machines have a temporary lifetime, scoped to one of the requests for building a generic signature. We previously covered these requests in Chapter~\ref{building generic signatures}; recall from Figure \ref{inferred generic signature request figure}~and~\ref{abstract generic signature request figure} that requirement minimization was the final step in the process, after requirement desugaring. +Minimization machines have a temporary lifetime, scoped to one of the requests for building a generic signature. We previously covered these requests in \ChapRef{building generic signatures}; recall from \FigRef{inferred generic signature request figure}~and~\ref{abstract generic signature request figure} that requirement minimization was the final step in the process, after requirement desugaring.
\end{itemize} \begin{figure}\captionabove{Building a query machine}\label{rqm flowchart generic signature} @@ -74,15 +74,15 @@ \chapter{Basic Operation}\label{rqm basic operation} \end{figure} \paragraph{Query machines.} -Figure~\ref{rqm flowchart generic signature} shows how a query machine is constructed from the minimal requirements of a generic signature: +\FigRef{rqm flowchart generic signature} shows how a query machine is constructed from the minimal requirements of a generic signature: \begin{enumerate} \item The \index{rule builder}\emph{rule builder} lowers the generic signature's minimal requirements to \emph{rewrite rules}; these become the \IndexDefinition{local rule}\emph{local rules} of our new requirement machine. \item The rule builder also collects rewrite rules from those protocols referenced by this signature; we call these the \IndexDefinition{imported rule}\emph{imported rules}. -\item We run the \index{completion}\emph{completion procedure} (Chapter~\ref{completion}) with our local and imported rules. Completion introduces new local rules which are ``consequences'' of other rules. This gives us a \index{convergent rewrite system}\emph{convergent rewrite system}. +\item We run the \index{completion}\emph{completion procedure} (\ChapRef{completion}) with our local and imported rules. Completion introduces new local rules which are ``consequences'' of other rules. This gives us a \index{convergent rewrite system}\emph{convergent rewrite system}. -\item We construct the property map data structure, to be used for generic signature queries (Chapter~\ref{propertymap}). +\item We construct the property map data structure, to be used for generic signature queries (\ChapRef{propertymap}). \item Property map construction may also introduce new local rules, in which case we go back to Step~3; completion and property map construction are iterated until no more rules are added. 
\end{enumerate} @@ -92,16 +92,16 @@ \chapter{Basic Operation}\label{rqm basic operation} Step~2 is where the directed acyclic graph of requirement machines comes in; each machine has some set of \emph{protocol dependencies} which determine its successors in the graph. We will see shortly that the successors of query and minimization machines are \index{protocol machine}\emph{protocol machines}. \paragraph{Cycle detection.} -Property map construction may need to look up \index{type witness}type witnesses of \index{conformance}conformances. This lookup may recursively perform generic signature queries, for example as part of type substitution or associated type inference. This introduces the possibility that we might query a generic signature whose query machine is in the process of being constructed. This is not supported; we guard against re-entrant construction by checking if a query machine is complete before we perform queries against it. If the query machine is incomplete, it is currently being constructed further up in the call stack, so the source program has a circular dependency which cannot be resolved. It is very difficult to hit this in practice, so for simplicity of implementation we report a fatal error which ends compilation. The lazy construction of query machines resembles how the \index{request evaluator}request evaluator evaluates requests (Section~\ref{request evaluator}), but it is a hand-coded cache that does not actually use the request evaluator. The re-entrant construction of a query machine is the equivalent of a request cycle. +Property map construction may need to look up \index{type witness}type witnesses of \index{conformance}conformances. This lookup may recursively perform generic signature queries, for example as part of type substitution or associated type inference. This introduces the possibility that we might query a generic signature whose query machine is in the process of being constructed. 
This is not supported; we guard against re-entrant construction by checking if a query machine is complete before we perform queries against it. If the query machine is incomplete, it is currently being constructed further up in the call stack, so the source program has a circular dependency which cannot be resolved. It is very difficult to hit this in practice, so for simplicity of implementation we report a fatal error which ends compilation. The lazy construction of query machines resembles how the \index{request evaluator}request evaluator evaluates requests (\SecRef{request evaluator}), but it is a hand-coded cache that does not actually use the request evaluator. The re-entrant construction of a query machine is the equivalent of a request cycle. \paragraph{Minimization machines.} -The process of building a minimization machine, shown in Figure~\ref{rqm flowchart generic signature minimization}, differs from query machine construction in a few respects: +The process of building a minimization machine, shown in \FigRef{rqm flowchart generic signature minimization}, differs from query machine construction in a few respects: \begin{enumerate} -\item The rule builder receives desugared requirements (Section~\ref{requirement desugaring}). +\item The rule builder receives desugared requirements (\SecRef{requirement desugaring}). \item Completion records \emph{rewrite loops}, which describe relations between rewrite rules---in particular, a loop will encode if a rewrite rule is redundant because it is a consequence of existing rules. \item Property map construction records \index{conflicting requirement}conflicting requirements, to be diagnosed if this generic signature has a source location. -\item After completion and property map construction, \emph{homotopy reduction} processes rewrite loops to find a minimal subset of rewrite rules which completely describe the rewrite system. This is the topic of Chapter \ref{rqm minimization}. 
-\item The \index{requirement builder}\emph{requirement builder} converts the minimal set of rewrite rules into minimal requirements. This is explained in Section~\ref{requirement builder}. +\item After completion and property map construction, \emph{homotopy reduction} processes rewrite loops to find a minimal subset of rewrite rules which completely describe the rewrite system. This is the topic of \ChapRef{rqm minimization}. +\item The \index{requirement builder}\emph{requirement builder} converts the minimal set of rewrite rules into minimal requirements. This is explained in \SecRef{requirement builder}. \end{enumerate} \paragraph{An optimization.} @@ -127,18 +127,17 @@ \chapter{Basic Operation}\label{rqm basic operation} This equivalence almost always holds true, except for these four situations where the final list of minimal requirements does not completely describe all consequences of the requirements that were input: \begin{enumerate} -\item User-written requirements that mention non-existent type parameters are dropped. In this case, a new query machine will not include the corresponding rewrite rules from the old minimization machine. +\item User-written requirements that are not \index{well-formed requirement}well-formed are dropped. In this case, a new query machine will not include the corresponding rewrite rules from the old minimization machine.
In this case the minimization machine will include all rewrite rules that were added up to the point of failure, but the set of minimal requirements will be empty. +\item As we will see in \SecRef{word problem}, we may fail to construct a string rewrite system if the user-written requirements are too complex to reason about. In this case the minimization machine will include all rewrite rules that were added up to the point of failure, but the set of minimal requirements will be empty. -\item If a conformance requirement is made redundant by a same-type requirement that fixes a type parameter to a concrete type (such as $\FormalReq{T == S}$ and $\ConfReq{T}{P}$ where \texttt{S} is a concrete type and $\ConfReq{S}{P}$ is a concrete conformance), the rewrite system cannot be reused for technical reasons; we will talk about this in Chapter~\ref{concrete conformances}. +\item If a conformance requirement is made redundant by a same-type requirement that fixes a type parameter to a concrete type (such as $\SameReq{T}{S}$ and $\ConfReq{T}{P}$ where \texttt{S} is a concrete type and $\ConfReq{S}{P}$ is a concrete conformance), the rewrite system cannot be reused for technical reasons; we will talk about this in \ChapRef{concrete conformances}. \end{enumerate} The first three only occur with invalid code, and are accompanied by diagnostics. The fourth is not an error, just a rare edge case where our optimization cannot be performed. All conditions are checked for during minimization, and recorded in the form of a flags field. We cannot install the minimization machine if any of these flags are set; doing so would associate a generic signature with a minimization machine that contains rewrite rules not explained by the generic signature itself. This would confuse subsequent generic signature queries. 
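The reuse decision itself amounts to a test of the recorded flags. A minimal sketch in Python; the flag names below are invented for illustration, and the compiler's actual identifiers differ:

```python
# Hypothetical flag names for the four conditions above (illustrative
# only; not the compiler's actual identifiers).
DROPPED_ILL_FORMED   = 1 << 0  # a requirement was not well-formed
DROPPED_CONFLICT     = 1 << 1  # two requirements conflicted
COMPLETION_FAILED    = 1 << 2  # requirements too complex to reason about
CONCRETE_CONFORMANCE = 1 << 3  # conformance subsumed by a concrete type

def can_install(flags):
    """A minimization machine may serve as the query machine only if
    its rewrite rules are fully explained by the minimal requirements
    it produced, i.e. if no flag was set during minimization."""
    return flags == 0
```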
In the event that the above does not cover some other unforeseen scenario where equivalence fails to hold, the \IndexFlag{disable-requirement-machine-reuse}\texttt{-disable-requirement-machine-reuse} frontend flag forces minimization machines to be discarded immediately after use, instead of being installed. - \begin{example} The compiler builds several requirement machines while type checking the code below: \begin{Verbatim} @@ -179,10 +178,10 @@ \chapter{Basic Operation}\label{rqm basic operation} \end{Verbatim} \end{example} -We'll say more about debugging flags in Section~\ref{rqm debugging flags}. +We'll say more about debugging flags in \SecRef{rqm debugging flags}. \paragraph{Protocols.} -To reason about the derived requirements and valid type parameters of a generic signature, we must consider not only the generic parameters and minimal requirements of the generic signature itself, but also the associated type declarations and minimal requirements of each protocol referenced by our signature. This set of \emph{protocol dependencies} is a property of a generic signature that we will make precise in Section~\ref{protocol component}. +To reason about the derived requirements and valid type parameters of a generic signature, we must consider not only the generic parameters and minimal requirements of the generic signature itself, but also the associated type declarations and minimal requirements of each protocol referenced by our signature. This set of \emph{protocol dependencies} is a property of a generic signature that we will make precise in \SecRef{protocol component}. The associated types and requirements of each protocol define a set of rewrite rules. We may need to incorporate the rewrite rules of a large number of protocols into a single requirement machine.
Similarly, any two requirement machines that import the same protocols will also have many rules in common; for example, any machine with a conformance to \texttt{Collection} will contain rewrite rules for the associated types and requirements of \texttt{Collection}, \texttt{Sequence} and \texttt{IteratorProtocol}. @@ -196,7 +195,7 @@ \chapter{Basic Operation}\label{rqm basic operation} \paragraph{Summary.} Requirement machines come in four varieties: \begin{enumerate} -\item \textbf{Query}, built from an existing generic signature, used to answer generic signature queries. +\item \textbf{Query}, built from an existing generic signature, for generic signature queries. \item \textbf{Minimization}, built from user-written requirements of a generic declaration, used for building a new generic signature. \item \textbf{Protocol}, built from an existing protocol requirement signature, used for sharing rules across requirement machines. \item \textbf{Protocol minimization}, built from user-written requirements of a protocol, used for building a new protocol requirement signature. @@ -204,9 +203,11 @@ \chapter{Basic Operation}\label{rqm basic operation} This list represents all four combinations of ``domain'' and ``purpose.'' In case (1) and~(2), the central entity is a generic signature; in (3) and~(4), we have a protocol requirement signature. In case (1) and~(3), we start with minimal requirements from an existing entity; in case (2) and~(4), we take user-written requirements, and perform minimization to build a new entity of the desired sort. +The Requirement Machine was first used for generic signature queries in \IndexSwift{5.6}Swift~5.6, and then for minimization in \IndexSwift{5.7}Swift~5.7, replacing the \Index{GenericSignatureBuilder@\texttt{GenericSignatureBuilder}}\texttt{GenericSignatureBuilder}. + \section{Protocol Components}\label{protocol component} -Now we will give a full account of how imported rules work in the Requirement Machine. 
We recall the definition of the protocol dependency graph from Section~\ref{recursive conformances}: we take the \index{vertex}vertex set of all protocol declarations, and \index{edge}edge set of all \index{associated conformance requirement}associated conformance requirements, with the endpoints defined follows for an arbitrary edge $\ConfReq{Self.A}{P}_\texttt{Q}$: +Now we will give a full account of how imported rules work in the Requirement Machine. We recall the definition of the protocol dependency graph from \SecRef{recursive conformances}: we take the \index{vertex}vertex set of all protocol declarations, and \index{edge}edge set of all \index{associated conformance requirement}associated conformance requirements, with the endpoints defined as follows for an arbitrary edge $\ConfReq{Self.A}{P}_\texttt{Q}$: \begin{align*} \Src(\ConfReq{Self.A}{P}_\texttt{Q})&=\protosym{Q},\\ \Dst(\ConfReq{Self.A}{P}_\texttt{Q})&=\protosym{P}. @@ -214,10 +215,10 @@ \section{Protocol Components}\label{protocol component} We will show that the protocols that can appear in the \index{derived requirement}derivations of a given generic signature are precisely those reachable from an initial set in the \index{protocol dependency graph}protocol dependency graph. We begin by considering protocol dependencies between protocols. Along the way, we also introduce the \emph{protocol component graph} to get around complications caused by circular dependencies. \begin{definition}
The \emph{protocol dependency set} of a protocol $\protosym{P}$ is then the set of all protocols $\protosym{Q}$ such that $\protosym{P}\prec\protosym{Q}$. +A protocol $\protosym{P}$ \emph{depends on} a protocol $\protosym{Q}$ (or has a \emph{protocol dependency} on~$\protosym{Q}$) if we can derive a conformance to $\protosym{Q}$ from the protocol generic signature $G_\texttt{P}$; that is, if $G_\texttt{P}\vDash\ConfReq{T}{Q}$ for some type parameter \texttt{T}. We write $\protosym{P}\prec\protosym{Q}$ if this relationship holds. The \IndexDefinition{protocol dependency set}\emph{protocol dependency set} of a protocol $\protosym{P}$ is then the set of all protocols $\protosym{Q}$ such that $\protosym{P}\prec\protosym{Q}$. \end{definition} -Lemma~\ref{subst lemma} implies that $\prec$ is a \index{transitive relation}transitive relation; that is, if $\protosym{P}\prec\protosym{Q}$ and $\protosym{Q}\prec\protosym{R}$, then $\protosym{P}\prec\protosym{R}$. In fact, $\prec$ is the \index{reachability relation}reachability relation in the protocol dependency graph. Before we can prove this, we need a technical result. +\LemmaRef{subst lemma} implies that $\prec$ is a \index{transitive relation}transitive relation; that is, if $\protosym{P}\prec\protosym{Q}$ and $\protosym{Q}\prec\protosym{R}$, then $\protosym{P}\prec\protosym{R}$. In fact, $\prec$ is the \index{reachability relation}reachability relation in the protocol dependency graph. Before we can prove this, we need a technical result. \smallskip @@ -270,7 +271,7 @@ \section{Protocol Components}\label{protocol component} If $\protosym{P}$ and $\protosym{Q}$ are any two protocols, then $\protosym{P}\prec\protosym{Q}$ if and only if there exists a path from $\protosym{P}$ to $\protosym{Q}$ in the protocol dependency graph. \end{proposition} \begin{proof} -First, suppose $\protosym{P}\prec\protosym{Q}$. Then, there is a type parameter \texttt{T} such that $G_\texttt{P}\vDash\ConfReq{T}{Q}$. 
By Theorem~\ref{conformance paths theorem}, there exists a conformance path for $\ConfReq{T}{Q}$. This conformance path defines a path in the protocol dependency graph from \protosym{P} to $\protosym{Q}$. Now, suppose that we have a path from $\protosym{P}$ to $\protosym{Q}$ in the protocol dependency graph. Every protocol dependency path in originating at $\protosym{P}$ lifts to a path originating at $\ConfReq{Self}{P}$ in the conformance path graph of $G_\texttt{P}$. By Algorithm~\ref{invertconformancepath}, this conformance path defines a derived conformance requirement $\ConfReq{T}{Q}$ in $G_\texttt{P}$, showing that $\protosym{P}\prec\protosym{Q}$. +First, suppose $\protosym{P}\prec\protosym{Q}$. Then, there is a type parameter \texttt{T} such that $G_\texttt{P}\vDash\ConfReq{T}{Q}$. By \ThmRef{conformance paths theorem}, there exists a conformance path for $\ConfReq{T}{Q}$. This conformance path defines a path in the protocol dependency graph from \protosym{P} to $\protosym{Q}$. Now, suppose that we have a path from $\protosym{P}$ to $\protosym{Q}$ in the protocol dependency graph. Every protocol dependency path originating at $\protosym{P}$ lifts to a path originating at $\ConfReq{Self}{P}$ in the conformance path graph of $G_\texttt{P}$. By \AlgRef{invertconformancepath}, this conformance path defines a derived conformance requirement $\ConfReq{T}{Q}$ in $G_\texttt{P}$, showing that $\protosym{P}\prec\protosym{Q}$. \end{proof} \begin{listing}\captionabove{Protocol component demonstration}\label{protocol component listing} @@ -326,9 +327,9 @@ \section{Protocol Components}\label{protocol component} \end{wrapfigure} \paragraph{Recursive conformances.} -We realized in Section~\ref{recursive conformances} that recursive conformance requirements create cycles in the protocol dependency graph. A cycle appears in the protocol dependency graph for the declarations of Listing~\ref{protocol component listing}, shown on the left.
Each one of $\protosym{Foo}$ and~$\protosym{Bar}$ points at the other via two mutually-recursive associated conformance requirements. Based on what's been described so far, we cannot build one protocol machine without first building the other: a circular dependency. +We realized in \SecRef{recursive conformances} that recursive conformance requirements create cycles in the protocol dependency graph. A cycle appears in the protocol dependency graph for the declarations of \ListingRef{protocol component listing}, shown on the left. Each one of $\protosym{Foo}$ and~$\protosym{Bar}$ points at the other via two mutually-recursive associated conformance requirements. Based on what's been described so far, we cannot build one protocol machine without first building the other: a circular dependency. -We solve this by grouping protocols into \emph{protocol components}, such that in our example, $\protosym{Foo}$ and $\protosym{Bar}$ belong to the same component. A protocol machine describes an entire protocol component, so the local rules of a protocol machine may include requirements from multiple protocols. We will see that if we consider dependencies between protocol \emph{components} as opposed to \emph{protocols}, we get a directed acyclic graph. To make all this precise, we step back to consider directed graphs in the abstract. +We solve this by grouping protocols into \IndexDefinition{protocol component}\emph{protocol components}, such that in our example, $\protosym{Foo}$ and $\protosym{Bar}$ belong to the same component. A protocol machine describes an entire protocol component, so the local rules of a protocol machine may include requirements from multiple protocols. We will see that if we consider dependencies between protocol \emph{components} as opposed to \emph{protocols}, we get a directed acyclic graph. To make all this precise, we step back to consider directed graphs in the abstract. 
\paragraph{Strongly connected components.} Suppose $(V, E)$ is any \index{directed graph}directed graph, and $\prec$ is the \index{reachability relation}reachability relation, so $x\prec y$ if there is a path with source $x$ and destination $y$. We say that $x$ and $y$ are \IndexDefinition{strongly connected component}\index{SCC|see{strongly connected component}}\emph{strongly connected} if both $x\prec y$ and $y\prec x$ are true; we denote this relation by $x\equiv y$ in the following. This is an \index{equivalence relation}equivalence relation: @@ -356,7 +357,7 @@ \section{Protocol Components}\label{protocol component} \end{tikzpicture} \end{wrapfigure} -The protocol component graph of Listing~\ref{protocol component listing} is shown on the left. With the exception of $\protosym{Foo}$ and $\protosym{Bar}$, which have collapsed to a single vertex, every protocol is in a protocol component by itself. The protocol component graph is always acyclic. However, it is still not a tree or a forest; as we see in our example, we have two distinct paths with source $\protosym{Top}$ and destination~$\protosym{Bot}$, so components can share ``children.'' To ensure we do not import duplicate rules, we must be careful to only visit every downstream protocol machine once, and take only the \emph{local} rules from each. (Every imported rule was local in its original protocol machine, so we import it exactly once from there, never ``transitively.'') +The protocol component graph of \ListingRef{protocol component listing} is shown on the left. With the exception of $\protosym{Foo}$ and $\protosym{Bar}$, which have collapsed to a single vertex, every protocol is in a protocol component by itself. The protocol component graph is always \index{directed acyclic graph}acyclic. 
However, it is still not a tree or a forest; as we see in our example, we have two distinct paths with source $\protosym{Top}$ and destination~$\protosym{Bot}$, so components can share ``children.'' To ensure we do not import duplicate rules, we must be careful to only visit every downstream protocol machine once, and take only the \emph{local} rules from each. (Every imported rule was local in its original protocol machine, so we import it exactly once from there, never ``transitively.'') A \IndexTwoFlag{debug-requirement-machine}{protocol-dependencies}debugging flag prints out each connected component as it was formed. Let's try with our example: \begin{Verbatim}[fontsize=\footnotesize,numbers=none] @@ -399,7 +400,7 @@ \section{Protocol Components}\label{protocol component} \newcommand{\Lowlink}[1]{\texttt{LOWLINK}(#1)} \newcommand{\OnStack}[1]{\texttt{ONSTACK}(#1)} -To form the strongly connected component of a vertex $v$, Tarjan's algorithm performs a \index{depth-first search}\emph{depth-first search}: given $v$, we look for edges originating from $v$ that lead to unvisited vertices, and recursively visit each of those successor vertices, and so on, exploring the entire subgraph reachable from there before moving on to the next successor. (Contrast this with the \index{breadth-first search}\emph{breadth-first search} for finding a conformance path in Section~\ref{finding conformance paths}, where we visit all vertices at a certain level before exploring deeper.) +To form the strongly connected component of a vertex $v$, Tarjan's algorithm performs a \index{depth-first search}\emph{depth-first search}: given $v$, we look for edges originating from $v$ that lead to unvisited vertices, and recursively visit each of those successor vertices, and so on, exploring the entire subgraph reachable from there before moving on to the next successor. 
(Contrast this with the \index{breadth-first search}\emph{breadth-first search} for finding a conformance path in \SecRef{finding conformance paths}, where we visit all vertices at a certain level before exploring deeper.) \begin{wrapfigure}[11]{r}{2.9cm} \begin{tikzpicture}[x=1cm, y=1.3cm] @@ -423,7 +424,7 @@ \section{Protocol Components}\label{protocol component} We number vertices in the order they are visited, by assigning the current value of a counter to each visited vertex $v$, prior to visiting the successors of $v$. We denote the value assigned to a vertex $v$ by $\Number{v}$. Our traversal can determine if a vertex has been previously visited by checking if $\Number{v}$ is set. -We return to the protocol dependency graph from Listing~\ref{protocol component listing}. A depth-first search originating from $\protosym{Top}$ produces the numbering of vertices shown on the right, where the first vertex is assigned the value 1. Here, the entire graph was ultimately reachable from 1; more generally, we get a numbering of the subgraph reachable from the initial vertex. +We return to the protocol dependency graph from \ListingRef{protocol component listing}. A depth-first search originating from $\protosym{Top}$ produces the numbering of vertices shown on the right, where the first vertex is assigned the value 1. Here, the entire graph was ultimately reachable from 1; more generally, we get a numbering of the subgraph reachable from the initial vertex. When looking at a vertex $v$, we consider each edge $e\in E$ with $\Src(e)=v$. Suppose that $\Dst(e)$ is some other vertex~$w$; Tarjan considers the state of $\Number{w}$, and classifies the edge $e$ into one of four kinds: tree edges, ignored edges, fronds, and cross-links. @@ -506,7 +507,7 @@ \section{Protocol Components}\label{protocol component} \end{algorithm} After the outermost recursive call returns, the stack will always be empty. 
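The bookkeeping described above can be condensed into a short Python sketch (an illustration, not the compiler's C++ implementation). The dictionaries below play the roles of $\Number{v}$, $\Lowlink{v}$ and $\OnStack{v}$, and the example graph is a hypothetical stand-in for the chapter's protocol dependency graph:

```python
# Hypothetical sketch of Tarjan's strongly connected components algorithm,
# following the NUMBER / LOWLINK / ONSTACK presentation in the text.

def tarjan(graph):
    number, lowlink = {}, {}
    stack, on_stack = [], set()
    counter = [0]
    sccs = []

    def visit(v):
        counter[0] += 1
        number[v] = lowlink[v] = counter[0]  # assign NUMBER(v) before successors
        stack.append(v)
        on_stack.add(v)
        for w in graph[v]:
            if w not in number:              # tree edge: recurse into w
                visit(w)
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif w in on_stack:              # frond or cross-link into the stack
                lowlink[v] = min(lowlink[v], number[w])
            # otherwise: edge into an already-formed component; ignore it
        if lowlink[v] == number[v]:          # v roots a strongly connected component
            comp = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.append(w)
                if w == v:
                    break
            sccs.append(comp)

    for v in graph:
        if v not in number:
            visit(v)
    return sccs  # components are produced in reverse topological order

# Hypothetical protocol dependency graph: Foo and Bar form a cycle.
protocols = {
    "Top": ["Foo", "Bot"],
    "Foo": ["Bar"],
    "Bar": ["Foo", "Bot"],
    "Bot": [],
}
print([sorted(c) for c in tarjan(protocols)])  # [['Bot'], ['Bar', 'Foo'], ['Top']]
```

Note how a component is popped off the stack as soon as its root is found, so every vertex is pushed and popped exactly once; this is what makes the algorithm run in linear time.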
Note that while this algorithm is recursive, it is not re-entrant; in particular, the act of getting the successors of a vertex must not trigger the computation of the same strongly connected components. This is enforced in Step~2. In our case, this can happen because getting the successors of a protocol performs type resolution; in practice this should be extremely difficult to hit, so for simplicity we report a fatal error and exit the compiler instead of attempting to recover. -\paragraph{Protocol components.} The rewrite context lazily populates a mapping from protocol declarations to \emph{protocol nodes}. A protocol is a vertex in our graph; the protocol node data structure for $p$ stores $\Number{p}$, $\Lowlink{p}$, $\OnStack{p}$ and a component ID for $p$. A second table maps each component ID to a \emph{protocol component}, which is a list of protocol declarations together with a requirement machine. This requirement machine is either a protocol requirement machine, or a protocol minimization requirement machine; which one depends on whether we need to build the requirement signatures for each protocol, or if we already have requirement signatures. There is a second level of indirection here, as this requirement machine is lazily constructed when first requested, and not when the component is initially formed by Algorithm~\ref{tarjan}. +\paragraph{Protocol components.} The rewrite context lazily populates a mapping from protocol declarations to \emph{protocol nodes}. A protocol is a vertex in our graph; the protocol node data structure for $p$ stores $\Number{p}$, $\Lowlink{p}$, $\OnStack{p}$ and a component ID for $p$. A second table maps each component ID to a \index{protocol component}\emph{protocol component}, which is a list of protocol declarations together with a requirement machine. 
This requirement machine is either a protocol requirement machine, or a protocol minimization requirement machine; which one depends on whether we need to build the requirement signatures for each protocol, or if we already have requirement signatures. There is a second level of indirection here, as this requirement machine is lazily constructed when first requested, and not when the component is initially formed by \AlgRef{tarjan}.

\paragraph{Generic signatures.} The idea of a protocol having a dependency relationship on another protocol generalizes to a generic signature depending on a protocol. The protocol dependencies of a generic signature are those that may appear in a derivation.

@@ -527,17 +528,17 @@ \section{Protocol Components}\label{protocol component}

\item (Initialize) Initialize a worklist and add all given protocols to the worklist in any order. Initialize \texttt{S} to an empty set of visited protocols. Initialize \texttt{M} to an empty set of requirement machines (compared by pointer equality).
\item (Check) If the worklist is empty, go to Step~8.
\item (Next) Otherwise, remove the next protocol $p$ from the worklist. If $p\in\texttt{S}$, go back to Step~2, otherwise set $\texttt{S}\leftarrow\texttt{S}\cup\{p\}$.
-\item (Component) Use Algorithm~\ref{tarjan} to compute the component ID for $p$.
-\item (Machine) Let $m$ be the requirement machine for this component, creating it first if necessary. If $m\not\in\texttt{M}$, set $\texttt{M}\leftarrow\texttt{M}\cup\{m\}$.
+\item (Component) Use \AlgRef{tarjan} to compute the component ID for $p$.
+\item (Machine) Let $m$ be the requirement machine for this component, creating it first if necessary. If $m\notin\texttt{M}$, set $\texttt{M}\leftarrow\texttt{M}\cup\{m\}$.
\item (Successors) Add each successor of $p$ to the worklist.
\item (Loop) Go back to Step~2.
\item (Collect) Return the concatenation of the local rules from each $m\in\texttt{M}$.
\end{enumerate}
\end{algorithm}

-We will encounter the protocol dependency graph one last time when we introduce the Knuth-Bendix completion procedure in Chapter~\ref{completion}. We will show that the rewrite rules we construct have a certain structure that enables an optimization where completion does not need to check for \emph{overlaps} between pairs of imported rules.
+We will encounter the protocol dependency graph one last time when we introduce the Knuth-Bendix completion procedure in \ChapRef{completion}. We will show that the rewrite rules we construct have a certain structure that enables an optimization where completion does not need to check for \emph{overlaps} between pairs of imported rules.

-A protocol component is always reasoned about as an indivisible unit; for example, in Chapter~\ref{rqm minimization} we will that requirement minimization must consider all protocols in a component simultaneously to get correct results.
+A protocol component is always reasoned about as an indivisible unit; for example, in \ChapRef{rqm minimization} we will see that requirement minimization must consider all protocols in a component simultaneously to get correct results.

\section{Debugging Flags}\label{rqm debugging flags}

@@ -550,29 +551,29 @@ \section{Debugging Flags}\label{rqm debugging flags}

\end{quote}
More generally, the argument to this flag is a comma-separated list of options, where the possible options are the following:
\begin{itemize}
-\item \texttt{timers}: Chapter~\ref{rqm basic operation}.
-\item \texttt{protocol-dependencies}: Section~\ref{protocol component}.
-\item \texttt{simplify}: Section~\ref{term reduction}.
-\item \texttt{add}, \texttt{completion}: Chapter~\ref{completion}.
-\item \texttt{concrete-unification}, \texttt{conflicting-rules}, \texttt{property-map}: Chapter~\ref{propertymap}.
-\item \texttt{concretize-nested-types}, \texttt{conditional-requirements}: Section~\ref{rqm type witnesses}.
-\item \texttt{concrete-contraction}: Section~\ref{concrete contraction}.
+\item \texttt{timers}: \ChapRef{rqm basic operation}.
+\item \texttt{protocol-dependencies}: \SecRef{protocol component}.
+\item \texttt{simplify}: \SecRef{term reduction}.
+\item \texttt{add}, \texttt{completion}: \ChapRef{completion}.
+\item \texttt{concrete-unification}, \texttt{conflicting-rules}, \texttt{property-map}: \ChapRef{propertymap}.
+\item \texttt{concretize-nested-types}, \texttt{conditional-requirements}: \SecRef{rqm type witnesses}.
+\item \texttt{concrete-contraction}: \SecRef{concrete contraction}.
\item \texttt{homotopy-reduction}, \texttt{homotopy-reduction-detail},\\
-\texttt{propagate-requirement-ids}: Section~\ref{homotopy reduction}.
-\item \texttt{minimal-conformances}, \texttt{minimal-conformances-detail}: Section~\ref{minimal conformances}.
+\texttt{propagate-requirement-ids}: \SecRef{homotopy reduction}.
+\item \texttt{minimal-conformances}, \texttt{minimal-conformances-detail}: \SecRef{minimal conformances}.
\item \texttt{minimization}, \texttt{redundant-rules}, \texttt{redundant-rules-detail},\\
-\texttt{split-concrete-equiv-class}: Chapter~\ref{requirement builder}.
+\texttt{split-concrete-equiv-class}: \ChapRef{requirement builder}.
\end{itemize}

Two more debugging flags are defined. The \IndexFlag{analyze-requirement-machine}\texttt{-analyze-requirement-machine} flag dumps a variety of \index{histogram}histograms maintained by the rewrite context after the compilation session ends. These are mostly intended for performance tuning various data structures:
\begin{itemize}
-\item A count of the unique symbols allocated, by kind (Section~\ref{rqm symbols}).
-\item A count of the number of terms allocated, by length (Section~\ref{building terms}).
-\item Statistics about the rule trie (Section~\ref{term reduction}) and property map trie (Chapter~\ref{propertymap}).
-\item Statistics about the minimal conformances algorithm (Section~\ref{minimal conformances}). +\item A count of the unique symbols allocated, by kind (\SecRef{rqm symbols}). +\item A count of the number of terms allocated, by length (\SecRef{building terms}). +\item Statistics about the rule trie (\SecRef{term reduction}) and property map trie (\ChapRef{propertymap}). +\item Statistics about the minimal conformances algorithm (\SecRef{minimal conformances}). \end{itemize} -The \IndexFlag{dump-requirement-machine}\texttt{-dump-requirement-machine} flag prints each requirement machine before and after \index{completion}completion. The printed representation includes a list of rewrite rules, the property map, and all \index{rewrite loop}rewrite loops. The output will begin to make sense after Chapter~\ref{symbols terms rules}. +The \IndexFlag{dump-requirement-machine}\texttt{-dump-requirement-machine} flag prints each requirement machine before and after \index{completion}completion. The printed representation includes a list of rewrite rules, the property map, and all \index{rewrite loop}rewrite loops. The output will begin to make sense after \ChapRef{symbols terms rules}. \section{Source Code Reference}\label{rqm basic operation source ref} @@ -580,7 +581,7 @@ \section{Source Code Reference}\label{rqm basic operation source ref} \begin{itemize} \item \SourceFile{lib/AST/RequirementMachine/} \end{itemize} -The Requirement Machine implementation is private to \texttt{lib/AST/}. The remainder of the compiler interacts with it indirectly, through the generic signature query methods on \texttt{GenericSignature} (Section~\ref{genericsigsourceref}) and the various requests for building new generic signatures (Section~\ref{buildinggensigsourceref}). +The Requirement Machine implementation is private to \texttt{lib/AST/}. 
The remainder of the compiler interacts with it indirectly, through the generic signature query methods on \texttt{GenericSignature} (\SecRef{genericsigsourceref}) and the various requests for building new generic signatures (\SecRef{buildinggensigsourceref}). \subsection*{The Rewrite Context} @@ -608,7 +609,7 @@ \subsection*{The Rewrite Context} \index{generic signature} \apiref{GenericSignature}{class} -See also Section~\ref{genericsigsourceref}. +See also \SecRef{genericsigsourceref}. \begin{itemize} \item \texttt{getRequirementMachine()} returns the requirement machine for this generic signature, by asking the rewrite context to produce one and then caching the result in an instance variable of the \texttt{GenericSignature} instance itself to speed up subsequent access. This method is used by the implementation of generic signature queries; apart from those, there should be no reason to reach inside the requirement machine instance yourself. \end{itemize} @@ -618,7 +619,7 @@ \subsection*{The Rewrite Context} A request evaluator request which collects all protocols referenced from a given protocol's associated conformance requirements. \apiref{ProtocolDecl}{class} -See also Section~\ref{genericdeclsourceref}. +See also \SecRef{declarationssourceref}. \begin{itemize} \item \texttt{getProtocolDependencies()} evaluates the \texttt{ProtocolDependenciesRequest}. \end{itemize} @@ -683,4 +684,4 @@ \subsection*{Debugging} \item \texttt{dump()} prints the histogram as ASCII art. 
\end{itemize} -\end{document} \ No newline at end of file +\end{document} diff --git a/docs/Generics/chapters/building-generic-signatures.tex b/docs/Generics/chapters/building-generic-signatures.tex index 21ccf7e8ecb47..304ee52a99d6c 100644 --- a/docs/Generics/chapters/building-generic-signatures.tex +++ b/docs/Generics/chapters/building-generic-signatures.tex @@ -4,79 +4,78 @@ \chapter{Building Generic Signatures}\label{building generic signatures} -\index{generic signature} -\index{generic context} -\lettrine{B}{uilding a generic signature} from user-written requirements is an important topic which we have yet to cover. This bridges the gap between Chapter~\ref{generic declarations}, where we discussed the syntax for declaring generic parameters and stating requirements, and Chapter~\ref{genericsig}, which introduced generic signatures as semantic objects which collect the generic parameters and requirements of a generic declaration. +\lettrine{B}{uilding a generic signature} from user-written requirements is something we glossed over before, and it's time to detail it now. We're going to fill in missing steps between the syntax for declaring generic parameters and stating requirements of Sections \ref{generic params}~and~\ref{requirements}, and the \index{generic signature}generic signature of \ChapRef{genericsig}, a semantic representation of the generic parameters and requirements of a declaration. -Unlike user-written requirements, the requirements of a generic signature are minimal, reduced, and appear in a certain order (we will see the formal definitions in Section~\ref{minimal requirements}). A fair amount of machinery is involved in building generic signatures. We're going to peel away the layers, starting from the entry points: +The requirements in a generic signature must be \index{reduced requirement}reduced, \index{minimal requirement}minimal, and ordered in a certain way (we will see the formal definitions in \SecRef{minimal requirements}). 
To build a generic signature, then, we must convert user-written requirements into an \emph{equivalent} set of requirements that satisfy these additional invariants. We're going to start from the entry points for building generic signatures and peel away the layers:
\begin{itemize}
-\item All generic signatures are ultimately created by the \emph{primitive constructor}, which takes generic parameters and requirements known to already obey the necessary invariants.
-\item The next code path concerns generic signatures of generic contexts. The \textbf{generic signature request} takes a generic context apart and delegates to the \textbf{inferred generic signature request}. The latter builds requirements from requirement representations with type resolution, hands the requirements to the minimization algorithm, and finally uses the primitive constructor to build the generic signature.
-\item Another entry point exists to build a generic signature from scratch. The \textbf{abstract generic signature request} takes a list of requirements, hands the requirements off to the minimization algorithm, and again invokes the primitive constructor to build the generic signature.
-\item For protocols, the \textbf{requirement signature request} solves the closely-related problem of building a requirement signature.
-\end{itemize}
+\item The \textbf{generic signature request} lazily builds a generic signature for a declaration written in source. This is a multi-step process; we call out to type resolution to construct the user-written requirements from syntactic representations, and then ultimately pass these requirements to the minimization algorithm. We will see how this request decomposes the declaration's syntactic forms before delegating to the \textbf{inferred generic signature request} to do most of the work.
+\item The \textbf{abstract generic signature request} instead builds a generic signature from a list of generic parameters and requirements.
No name lookup or type resolution occurs here, because the inputs are semantic objects already.
+\item All generic signatures are ultimately created by the \textbf{primitive constructor}, which takes generic parameters and requirements known to already obey the necessary invariants, and simply allocates and initializes the semantic object.
+
+Outside of the generics implementation, the \IndexDefinition{generic signature constructor}primitive constructor is used in a handful of cases when it is already known that the requirements satisfy the necessary conditions. When deserializing a generic signature from a \index{serialized module}serialized module, for example, we know it satisfied the invariants when it was serialized.

-\paragraph{Primitive constructor}
-\index{minimal requirement}%
-\index{reduced requirement}%
-\index{serialized module}%
-\IndexDefinition{generic signature constructor}%
-The primitive constructor is used directly when deserializing a generic signature, which was known to satisfy the invariants when it was serialized; or when the requirements are particularly trivial and the invariants hold by definition, such as the generic signature of a protocol, \verb||, or a generic signature with no requirements, like \verb||. In almost all other situations that call for building a generic signature ``from scratch,'' the \textbf{abstract generic signature request} should be used instead.
+We also use the primitive constructor to build the \index{protocol generic signature}protocol generic signature \verb|<Self where Self: P>| for a protocol \texttt{P}, because we know it satisfies the invariants by definition. Anywhere else we need to build a generic signature ``from scratch,'' we use the \textbf{abstract generic signature request} instead.

-\paragraph{Generic signature request}
+\item The \textbf{requirement signature request} lazily builds a requirement signature for a protocol written in source.
The flow resembles the inferred generic signature request; we start by resolving user-written requirements, then perform minimization. There's no ``abstract'' equivalent of the requirement signature request; a requirement signature is always attached to a specific named protocol declaration.
+\end{itemize}
+Now we look at each involved request in detail.
+\paragraph{Generic signature request.}
\index{request}%
\IndexDefinition{generic signature request}%
\index{request evaluator}%
\index{protocol declaration}%
\index{protocol extension}%
\Index{protocol Self type@protocol \texttt{Self} type}%
-The generic signature of a generic context is constructed lazily by evaluating the \Request{generic signature request}. There are two easy cases to handle first:
-\begin{itemize}
-\item If the generic context is a protocol or an unconstrained protocol extension, the generic signature \verb|| is built using the primitive constructor.
+Two easy cases are handled first:
+\begin{enumerate}
+\item If the declaration is a protocol or an unconstrained protocol extension, we build the \index{protocol generic signature}protocol generic signature \verb|<Self where Self: P>| using the primitive constructor.
\Index{where clause@\texttt{where} clause}
\index{inheritance clause}
\index{generic parameter list}
-\item If the generic context does not have a generic parameter list or trailing \texttt{where} clause of its own, it simply inherits the generic signature of the parent context. If the parent context is the top-level source file context, this is the empty generic signature. Otherwise, we recursively evaluate the generic signature request again, this time with the parent context.
-\end{itemize}
-
-\IndexDefinition{inferred generic signature request}
-\index{request}
-In all cases not covered by the two easy fallbacks above, the generic signature request actually has to do its work.
It does this by kicking off the lower-level \Request{inferred generic signature request}, handing it a list of arguments: +\item A declaration having neither a generic parameter list nor a trailing \texttt{where} clause simply inherits the generic signature from its parent context. If the declaration is at the top level of a source file, we return the empty generic signature. Otherwise, we recursively evaluate the \Request{generic signature request} against the parent. +\end{enumerate} +In every other case, the generic signature request kicks off the lower-level \index{request}\IndexDefinition{inferred generic signature request}\Request{inferred generic signature request}, handing it a list of arguments: \begin{enumerate} \item The generic signature of the parent context, if any. + +The generic signature of a nested declaration extends that of the parent with additional generic parameters and requirements. + \item The current generic context's generic parameter list, if any. + +At least one of the first two inputs must be specified; absent a parent signature or any generic parameters to add, the resulting generic signature is necessarily \index{empty generic signature}empty, and the caller handles this case by not evaluating the request at all. + \item The current generic context's trailing \texttt{where} clause, if any. + +The inferred generic signature request will call out to \index{type resolution}type resolution to resolve the \index{requirement representation}requirement representations written here into \index{requirement}requirements. + \item Any additional requirements to add. -\item A list of types eligible for requirement inference. -\item A flag indicating whether generic parameters can be subject to concrete same-type requirements. 
+
+This enables a shorthand syntax for declaring a \index{constrained extension}constrained extension by writing a generic nominal type or a ``pass-through'' generic type alias as the \index{extended type}extended type (\SecRef{constrained extensions}).
+
+\item A list of types eligible for \index{requirement inference}requirement inference.
+
+If we're building the generic signature of a function or subscript declaration, this consists of the declaration's parameter and return types; otherwise, it is empty (\SecRef{requirementinference}).
+
+\item A source location for diagnostics.
\end{enumerate}

-\paragraph{Inferred generic signature request} The name ``inferred generic signature request'' is a bit of a misnomer. Originally, it was named such because it performs requirement inference. The signature is not ``inferred'' in any sense.
+\paragraph{Inferred generic signature request.} A possibly apocryphal story says the name ``inferred generic signature request'' was chosen because ``requirement inference'' is one of the steps below, but this does not infer the generic signature in any real sense. Rather, this request transforms user-written requirements into a minimal, reduced form via a multi-step process shown in \FigRef{inferred generic signature request figure}:
+\begin{enumerate}
+\item \index{requirement resolution}\textbf{Requirement resolution} builds user-written requirements from constraint types stated in generic parameter inheritance clauses and requirement representations in the trailing \texttt{where} clause. This happens in the \index{structural resolution stage}structural resolution stage (\ChapRef{typeresolution}), so the resolved requirements may contain \index{unbound dependent member type}unbound dependent member types, which will be reduced to bound dependent member types in requirement minimization.
-\index{constrained extension} -\index{type resolution} -\index{requirement resolution} -\index{requirement inference} -A few words about some of the parameters above. At least one of the first two parameters must be specified; if there is no parent signature, and there are no generic parameters to add, the resulting generic signature is necessarily empty, which should have been handled already. The list of additional requirements (parameter 4) is only used for special inference behavior in extensions, described in Section~\ref{constrained extensions}. For all other declarations, the inferred generic signature request builds the list of requirements from syntactic representations, by resolving type representations in inheritance clauses and requirement representations in the \texttt{where} clause. If the generic context is a function or subscript declaration, the list of requirement inference sources (parameter 5) consists of the parameter and return types; otherwise, it is empty. The flag (parameter 6) enforces an artificial restriction whereby only certain kinds of declarations can constrain their innermost generic parameters to concrete types or other generic parameters. While it is generally useful to be able to write this, -\begin{Verbatim} -struct Outer { - func f() where T == Int {...} -} -\end{Verbatim} -or this, -\begin{Verbatim} -extension Outer where Element == Int { - func f() {...} -} -\end{Verbatim} -something like the following is nonsensical and probably indicates a mistake on the user's part, because the innermost generic parameter \texttt{T} may as well be removed entirely, with all references to \texttt{T} replaced with \texttt{Int}: -\begin{Verbatim} -// error: same-type requirement makes generic parameter `T' non-generic -func add(_ lhs: T, _ rhs: T) -> T where T == Int { - return lhs + rhs -} -\end{Verbatim} +Diagnostics are emitted here if type resolution fails. 
+ +\item \textbf{Requirement inference} (\SecRef{requirementinference}) allows certain requirements to be omitted in source if they can be inferred. These inferred requirements are added to the list of user-written requirements. + +\item \textbf{Decomposition and desugaring} (\SecRef{requirement desugaring}) transforms requirements collected by the first two stages, rewriting conformance and same-type requirements into a simpler form, and detecting trivial requirements that are \index{satisfied requirement}always or never satisfied. + +Diagnostics are emitted here if a requirement is always unsatisfied as written. + +\item \textbf{Requirement minimization} is where the real magic happens; we will talk about the invariants established by this in \SecRef{minimal requirements}, and pick up the topic again when we get to the construction of a requirement machine from desugared requirements in \ChapRef{rqm basic operation}. + +Diagnostics are emitted here if no substitution map can satisfy the generic signature as written; the signature is said to contain \index{conflicting requirement}\emph{conflicting requirements} in this case. +\end{enumerate} \begin{figure}\captionabove{Overview of the inferred generic signature request}\label{inferred generic signature request figure} \begin{center} @@ -98,25 +97,25 @@ \chapter{Building Generic Signatures}\label{building generic signatures} \end{center} \end{figure} -With the various parameters at hand, the inferred generic signature request transforms arbitrary user-written requirements into minimal, reduced requirements via a multi-step process shown in Figure~\ref{inferred generic signature request figure}: -\begin{enumerate} -\item -\index{unbound dependent member type}% -\index{structural resolution stage}% -The first step is to construct a list of user-written requirements, resolving constraint types of generic parameter declarations and the requirement representations of the trailing \texttt{where} clause. 
This uses the structural resolution stage, meaning the resolved requirements might contain unbound dependent member types---they will be reduced to bound dependent member types before requirement minimization builds the final generic signature though. -\item Requirement inference (Section~\ref{requirementinference}) implements a heuristic for adding additional requirements which do not need to be explicitly stated for brevity. -\item Requirement desugaring (Section~\ref{requirement desugaring}) splits up conformance requirements where the constraint types are protocol composition types and parameterized protocol types, as well as certain forms of same-type requirements. -\item Requirement minimization is where the real magic happens; we will talk about the invariants established by this in Section~\ref{minimal requirements}, and pick up the topic again when we talk about the construction of a requirement machine from desugared requirements in Chapter \ref{rqm basic operation}. -\end{enumerate} - -\IndexDefinition{conflicting requirement}% -\IndexDefinition{redundant requirement}% -\IndexFlag{warn-redundant-requirements}% -The inferred generic signature request emits diagnostics at the source location of the generic context in the following circumstances: -\begin{enumerate} -\item If the compiler is able to prove that no substitution map can ever simultaneously satisfy all of the requirements in the generic signature, the signature is said to contain \emph{conflicting requirements}. These will diagnose an error. -\item If some requirement can be shown to be a consequence of other requirements, the requirement is called a \emph{redundant requirement}. By default, redundant requirements are silently dropped, but if the \texttt{-Xfrontend -warn-redundant-requirements} flag is passed they are diagnosed as \index{warning}warnings. 
-\end{enumerate}
+The \index{diagnostic!from inferred generic signature request}diagnostics mentioned above are emitted at the source location of the declaration, which is given to the request. This source location is used for one more diagnostic, an artificial restriction of sorts. Once we have a generic signature, we ensure that every innermost generic parameter is a \index{reduced type parameter}reduced type. If a generic parameter is not reduced, it must be \index{reduced type equality}equivalent to a concrete type or an earlier generic parameter; it serves no purpose and should be removed, so we diagnose an error\footnote{Well, it's a warning prior to \texttt{-swift-version 6}.}:
+\begin{Verbatim}
+// error: same-type requirement makes generic parameter `T' non-generic
+func add<T>(_ lhs: T, _ rhs: T) -> T where T == Int {
+  return lhs + rhs
+}
+\end{Verbatim}
+The restriction only concerns innermost generic parameters, so we allow this:
+\begin{Verbatim}
+struct Outer<T> {
+  func f() where T == Int {...}
+}
+\end{Verbatim}
+And this:
+\begin{Verbatim}
+extension Outer where T == Int {
+  func f() {...}
+}
+\end{Verbatim}
\begin{figure}\captionabove{Overview of the abstract generic signature request}\label{abstract generic signature request figure}
\begin{center}
@@ -134,1022 +133,1185 @@ \chapter{Building Generic Signatures}\label{building generic signatures}
\end{center}
\end{figure}
-\paragraph{Abstract generic signature request}
-\IndexDefinition{abstract generic signature request}%
-\index{request}%
-\index{substituted requirement}%
-This request builds a generic signature from a list of requirements that are not associated with a generic declaration or syntactic representations. The flow here is simpler, as shown in Figure~\ref{abstract generic signature request figure}. As input, it takes an optional parent generic signature, a list of generic parameter types to add, and a list of requirements to add.
- -At least one of the first two parameters must be specified; if there is no parent generic signature and there are no generic parameters to add, the result would be the empty generic signature, which the caller is expected to handle by not evaluating this request at all. The list of requirements to add is sometimes a list of substituted requirements, obtained by applying a substitution map to each requirement of some original generic signature. This will come up in Section~\ref{overridechecking} and Section~\ref{witnessthunksignature}. +\paragraph{Abstract generic signature request.} +This \index{request}\IndexDefinition{abstract generic signature request}request builds a generic signature from a list of generic parameters and requirements provided by the caller. The flow here is simpler, as shown in \FigRef{abstract generic signature request figure}. As input, it takes an optional parent generic signature, a list of generic parameter types to add, and a list of requirements to add. At least one of the first two parameters must be specified; if there is no parent generic signature and there are no generic parameters to add, the result would be the empty generic signature, which the caller is expected to handle by not evaluating this request at all. -\index{requirement inference}% -Like the inferred generic signature request, the abstract generic signature request does requirement desugaring and minimization. However, this request does not perform requirement inference, nor does it emit diagnostics, because there is no associated source location where to emit them. 
- -\paragraph{Requirement signature request} -\index{request}% -\index{requirement signature}% -\IndexDefinition{requirement signature request}% -\IndexDefinition{structural requirements request}% -\IndexDefinition{type alias requirements request}% -\index{associated type declaration}% -\index{protocol declaration}% -\index{protocol type alias}% -\IndexDefinition{requirement signature constructor}% -This request builds a requirement signature from the protocol's requirements and type alias members, if any. It kicks off two other requests. The \Request{structural requirements request} collects user-written requirements from the protocol’s inheritance clause, associated type inheritance clauses, and \texttt{where} clauses on the protocol’s associated types and the protocol itself. The \Request{type alias requirements request} collects the protocol type aliases and converts them to same-type requirements. Finally, there is also a primitive constructor for building a requirement signature from a serialized representation, which bypasses requirement desugaring and minimization. +Like the inferred generic signature request, the abstract generic signature request decomposes, desugars and minimizes requirements; for this reason, it is preferred over using the primitive constructor. Often this request is invoked with a list of \index{substituted requirement}substituted requirements obtained by applying a substitution map to each requirement of some original generic signature. This request is used in various places: +\begin{itemize} +\item Computing the generic signature of an opaque type declaration (\ChapRef{opaqueresult}). +\item Computing the generic signature for an opened existential type (\SecRef{open existential archetypes}). +\item Checking class method overrides (\SecRef{overridechecking}). +\item Witness thunks (\SecRef{witnessthunksignature}). 
+\end{itemize} -\paragraph{Protocol inheritance clauses} -\index{inheritance clause} -Recall from Section~\ref{protocols} that a constraint type written in a protocol's inheritance clause declares a conformance requirement with a subject type of \texttt{Self}. Qualified name lookup must be aware of protocol inheritance relationships, since a lookup into a protocol can also find members of inherited protocols. However, qualified lookup sits ``below'' generics. Building a protocol's requirement signature performs type resolution, which queries name lookup; those name lookups cannot in turn depend on the requirement signature having already been constructed. Thus, qualified name lookup can only look at syntactic constructs and cannot query the requirement signature. +This request does not perform \index{requirement inference}requirement inference. No \index{diagnostic!from abstract generic signature request}diagnostics are emitted either; instead, an error value is returned which is checked by the caller. -The practical consequence of this design is that protocol inheritance must be explicitly stated, and cannot be implied as a non-trivial consequence of same-type requirements. After building a protocol's requirement signature, we ensure that any conformance requirements known to be satisfied by \texttt{Self} are actually explicit requirements written in the protocol's inheritance clause, or \texttt{where} clause entires with a subject type of \texttt{Self}. Anything else is diagnosed with a \index{warning}warning. +\paragraph{Requirement signature request.} This \index{request}\IndexDefinition{requirement signature request}request builds the \index{requirement signature}requirement signature for a given \index{protocol component}\emph{protocol component}, or a collection of one or more mutually-recursive protocols; this will be explained in \SecRef{protocol component}. 
The evaluation function begins by evaluating two subordinate requests to collect the user-written requirements of each protocol:
+\begin{itemize}
+\item The \IndexDefinition{structural requirements request}\Request{structural requirements request} collects \index{associated requirement}associated requirements from the protocol’s inheritance clause, \index{associated type declaration}associated type inheritance clauses, any \texttt{where} clauses on the protocol’s associated types, and the \texttt{where} clause on the protocol itself. Refer to \SecRef{protocols} for a description of the syntax.
+\item The \IndexDefinition{type alias requirements request}\Request{type alias requirements request} collects \index{protocol type alias}protocol type aliases and converts them to same-type requirements. These are discussed further in \SecRef{building rules}.
+\end{itemize}
+The requirement signature request takes these user-written associated requirements, and decomposes, desugars and minimizes them, much like the requirements of a generic signature. There is also a primitive constructor for requirement signatures, which is the last step of the requirement signature request. We also use it after reading a protocol from a \index{serialized module}serialized module, because the requirements were already minimal when serialized.

-\begin{listing}\captionabove{Example showing non-obvious protocol inheritance relationship}\label{badinheritance}
+\paragraph{Protocol inheritance clauses.} When searching for a member of a protocol, \index{name lookup}name lookup must also visit \index{inherited protocol}inherited protocols. Consider this example:
\begin{Verbatim}
protocol Base {
  associatedtype Other: Base
  typealias Salary = Int
}

protocol Good: Base {
  typealias Income = Salary
}
+\end{Verbatim}
+The underlying type of \texttt{Income} in \texttt{Good} refers to \texttt{Salary} in \texttt{Base}.
+Protocol inheritance relationships are encoded in the protocol's requirement signature.
A concrete type that conforms to \texttt{Good} must also conform to \texttt{Base}, so \texttt{Good} has an associated conformance requirement $\ConfReq{Self}{Base}$. Name lookup cannot issue \index{generic signature query}generic signature queries though, because building the requirement signature depends on \index{type resolution}type resolution, which depends on name lookup. To avoid \index{request cycle}request cycles, name lookup must interpret the \index{inheritance clause}inheritance clause of a protocol directly, without interfacing with generics. This introduces a minor \index{limitation!protocol inheritance clauses}limitation, which we now describe. -// warning: protocol `Bad' should be declared to refine `Base' due to a -// same-type constraint on `Self' +A general fact is that the protocol inheritance relation is \index{transitive relation}transitive, so \texttt{Most} also inherits from \texttt{Base} here, because \texttt{Most} inherits from \texttt{Good}, and \texttt{Good} inherits from \texttt{Base}: +\begin{Verbatim} +protocol Most: Good {} +\end{Verbatim} +We can understand this behavior by writing down a derivation for the requirement $\ConfReq{\rT}{Base}$ in the \index{protocol generic signature}protocol generic signature $G_\texttt{Most}$. We start with $\ConfReq{\rT}{Most}$ and apply the associated conformance requirements $\ConfReq{Self}{Good}_{\texttt{Most}}$ and $\ConfReq{Self}{Base}_{\texttt{Good}}$: +\begin{gather*} +\ConfStep{\rT}{Most}{1}\\ +\AssocConfStep{1}{\rT}{Good}{2}\\ +\AssocConfStep{2}{\rT}{Base}{3} +\end{gather*} +When the protocol inheritance relationship is a syntactic consequence of the inheritance clause, we can write down a derivation like the above where every step has a subject type of $\rT$. This is why a name lookup into \texttt{Good} knows to look into \texttt{Base}. 
On the other hand, it is possible to devise a protocol declaration where $\ConfReq{\rT}{Base}$ is a non-trivial consequence of a same-type requirement between $\rT$ and some other type parameter. For example, here we have $G_\texttt{Bad}\vdash\ConfReq{\rT}{Base}$, which is not immediately apparent, because nothing is stated in the protocol's inheritance clause: +\begin{Verbatim} protocol Bad { - associatedtype Tricky: Base where Tricky.Other == Self - - typealias Income = Salary - // error: cannot find type `Salary' in scope + associatedtype Tricky: Base where Self == Tricky.Other + typealias Income = Salary // error } \end{Verbatim} -\end{listing} -\begin{example} -In Listing~\ref{badinheritance}, the \texttt{Self} type of the \texttt{Bad} protocol is equivalent to the type parameter \texttt{Self.Tricky.Other} via a same-type requirement. The \texttt{Tricky} associated type conforms to \texttt{Base}, and the \texttt{Other} associated type of \texttt{Base} also conforms to \texttt{Base}. For this reason, the \texttt{Self} type of \texttt{Bad} actually conforms to \texttt{Base}. +We diagnose an error, because we are unable to resolve \texttt{Salary} as a member of \texttt{Bad}. To understand why, notice that the derivation $G_\texttt{Bad} \vdash \ConfReq{\rT}{Base}$ involves the \index{associated same-type requirement}associated same-type requirement $\SameReq{Self}{Self.Tricky.Other}_\texttt{Bad}$, so it is not a consequence of the protocol's syntactic \index{inheritance clause}inheritance clause: +\begin{gather*} +\ConfStep{\rT}{Bad}{1}\\ +\AssocConfStep{1}{\rT.Tricky}{Base}{2}\\ +\AssocConfStep{2}{\rT.Tricky.Other}{Base}{3}\\ +\AssocSameStep{1}{\rT}{\rT.Tricky.Other}{4}\\ +\SameConfStep{3}{4}{\rT}{Base}{5} +\end{gather*} +A special code path detects this problem and \index{diagnostic!protocol inheritance clause}diagnoses a \index{warning}warning to explain what is going on. 
After building a protocol's requirement signature, the \Request{type-check source file request} verifies that any conformance requirements known to be satisfied by \texttt{Self} actually appear in the protocol's inheritance clause, or \texttt{where} clause entries with a subject type of \texttt{Self}. + +In our example, we discover the unexpected derived requirement $\ConfReq{\ttgp{0}{0}}{Base}$ of $G_\texttt{Bad}$, but at this point it is too late to attempt the failed name lookup of \texttt{Salary} again. The compiler instead suggests that the user should explicitly state the inheritance from \texttt{Base} in the inheritance clause of \texttt{Bad}: +\begin{Verbatim} +$ swiftc bad.swift + +bad.swift:8:22: error: cannot find type `Salary' in scope + typealias Income = Salary + ^~~~~~ +bad.swift:6:10: warning: protocol `Bad' should be declared to refine +`Base' due to a same-type constraint on `Self' +protocol Bad { + ^ +\end{Verbatim} + +A mistake of this sort is baked into the standard library's \texttt{SIMDScalar} protocol. The \texttt{Self} type of \texttt{SIMDScalar} must conform to all of \texttt{Equatable}, \texttt{Hashable}, \texttt{Encodable} and \texttt{Decodable}, via an associated same-type requirement: +\begin{Verbatim} +public protocol SIMDScalar { + ... + associatedtype SIMD2Storage: SIMDStorage + where SIMD2Storage.Scalar == Self + ... +} -However, this inheritance relationship is invisible to name lookup, so resolution of the underlying type of \texttt{Income} fails to find the declaration of \texttt{Salary}. After building the protocol's requirement signature, the type checker discovers the unexpected conformance requirement on \texttt{Self}, but at this stage, it is too late to attempt the failed name lookup again! The compiler instead emits a \index{warning}warning suggesting the user explicitly states the inheritance relationship in the declaration of \texttt{Bad}. 
-\end{example}
+public protocol SIMDStorage {
+  associatedtype Scalar: Codable, Hashable
+  ...
+}
+\end{Verbatim}
+However, these requirements are not stated in \texttt{SIMDScalar}'s inheritance clause. Since the inheritance clause of a protocol is part of the Swift \index{ABI}ABI, this omission cannot be fixed at this point, so the type checker specifically muffles the warning when building the standard library.

\section{Requirement Inference}\label{requirementinference}

-\index{requirement}
-\IndexDefinition{requirement inference}
-\index{type resolution}
-\index{generic arguments}
-\index{interface resolution stage}
-\index{structural resolution stage}
-Requirement inference allows the user to omit any requirement that is implied by the application of a bound generic type to a generic argument. It is easiest to explain with an example. Recall that the standard library \texttt{Set} type declares a single \texttt{Element} generic parameter and requires that it conform to \texttt{Hashable}:
+Consider a function that returns the unique elements of a collection:
\begin{Verbatim}
-struct Set {...}
+func uniqueElements<S: Sequence>(_ seq: S) -> Set<S.Element>
+  where S.Element: Hashable {...}
\end{Verbatim}
-Say you're writing a function that returns returns a set of unique elements in a collection:
+When \index{type resolution}resolving the function's return type, we must establish that the \index{generic argument}generic argument \texttt{\rT.Element} satisfies the requirements in the generic signature of the \texttt{Set} type declared in the standard library:
\begin{Verbatim}
-func uniqueElements(_ seq: S) -> Set {...}
+struct Set<Element: Hashable> {...}
\end{Verbatim}
-The type \verb|Set| only makes sense if \texttt{S.Element} conforms to \texttt{Hashable}. This is checked by type resolution in the interface resolution stage, after the generic signature has been built. However this requirement is not explicitly stated here.
Instead, requirement inference \emph{adds} this requirement during the structural resolution stage when the generic signature is being built, ensuring that the type representation \verb|Set| can successfully resolve later.
-
-You can always state an inferred requirement explicitly instead. Since this is useful for documentation purposes, the \verb|-warn-redundant-requirements| flag will not emit a \index{warning}warning if an explicit requirement is made redundant by an inferred requirement.
+The sole \index{substituted requirement}substituted requirement is satisfied (\SecRef{checking generic arguments}) because the generic signature of \texttt{uniqueElements()} contains the requirement $\ConfReq{\rT.Element}{Hashable}$. In fact, this requirement \emph{has} to be part of our function's generic signature, for the return type, and thus the overall function declaration, to be valid at all. The \IndexDefinition{requirement inference}\emph{requirement inference} feature allows us to omit this requirement:
\begin{Verbatim}
func uniqueElements<S: Sequence>(_ seq: S) -> Set<S.Element>
- where S.Element: Hashable {...}
+ /* where S.Element: Hashable */ {...}
\end{Verbatim}
+These \emph{inferred requirements} are not the derived requirements in the sense of \SecRef{derived req}. They appear in the generic signature, along with user-written requirements; they are not consequences of other requirements. Both declarations of \texttt{uniqueElements()} above really are identical, and in particular they have the \emph{same} generic signature with an explicit conformance requirement $\ConfReq{\rT.Element}{Hashable}$. Consumers of generic signatures do not know or care about requirement inference.

-Requirement inference has an elegant formulation in terms of applying a substitution map to the requirements of a generic signature. In a sense, the problem being solved here is the opposite of checking generic arguments in type resolution (Section~\ref{checking generic arguments}).
There, we determine if a substitution map satisfies the requirements of the referenced type declaration's generic signature, by applying the substitution map to each requirement and evaluating the truth of the substituted requirement. Requirement inference, on the other hand, \emph{adds} the substituted requirements to the generic signature being built, in order to \emph{make them true}. - -In our example, we consider the type representation \texttt{Set} while building the generic signature of \texttt{uniqueElements()}. After resolving this to a type, we take its context substitution map, which has a single replacement type \texttt{S.Element}. Let's call this substitution map $\Sigma$: -\[\Sigma:=\SubstMapLongC{\SubstType{Element}{S.Element}}{\SubstConf{Element}{S.Element}{Hashable}}\] -Applying $\Sigma$ to both types in the requirement $\ConfReq{Element}{Hashable}$ of \texttt{Set}'s generic signature yields the substituted requirement $\ConfReq{S.Element}{Hashable}$, which is added to the list of desugared requirements which are fed into the minimization algorithm: -\[\ConfReq{Element}{Hashable}\otimes \Sigma = \ConfReq{S.Element}{Hashable}\] - -\index{inheritance clause} -\index{generic parameter declaration} -\Index{where clause@\texttt{where} clause} -\index{underlying type} -\index{function declaration} -\index{subscript declaration} -\index{structural resolution stage} -\index{generic nominal type} -\index{type alias type} -\index{generic type alias} -Requirement inference considers type representations in the following positions when building the generic signature of a generic context: +\paragraph{Inferred requirements.} When building the generic signature of a declaration, we collect inferred requirements by visiting \index{type representation}type representations written in the following specific positions: \begin{enumerate} -\item The constraint types in inheritance clauses of generic parameter declarations. 
-\item Any types appearing inside the requirements of a \texttt{where} clause. -\item An additional list of type representations passed in to the inferred generic signature request. For functions and subscripts, this is the list of function parameter types together with the return type. For type aliases, this is the underlying type of the type alias. -\end{enumerate} -\index{substitution map} -\index{context substitution map} -Requirement inference resolves each of the above type representations in the structural type resolution stage (Chapter~\ref{typeresolution}), then recursively walks the type to find all generic nominal types and generic type alias types appearing within: -\begin{itemize} -\item Generic nominal types decompose into a nominal type declaration together with the context substitution map (Section~\ref{contextsubstmap}). -\item Type alias types decompose into a type alias declaration and a substitution map directly stored inside the type alias type (Section~\ref{more types}). -\end{itemize} -\index{substituted requirement}% -In both cases, we build a list of substituted requirements by applying the substitution map to each requirement of the type declaration's generic signature. These substituted requirements are precisely those that must become true in order for the type's generic arguments to satisfy the referenced type declaration's generic signature. Together with the user-written requirements, these substituted requirements form the basis for the generic signature being built. -\begin{example} -The subject type of a substituted requirement need not be a type parameter. To help motivate the \emph{requirement desugaring} algorithm introduced in the next section, let's first define a generic struct with some non-trivial requirements: +\item The parameter and return types of \index{function declaration}function and \index{subscript declaration}subscript declarations, if the generic declaration we're given is one of those. 
In our \texttt{uniqueElements()} example, we infer requirements from the function's return type.
+
+\item Types appearing in \index{inheritance clause}inheritance clauses of \index{generic parameter declaration}generic parameter declarations. The only interesting case is a \index{generic class}\index{superclass requirement}generic superclass bound. Here, the generic signature of \texttt{Foo} has the explicit superclass requirement $\ConfReq{\rT}{Base<\rU>}$ and an inferred requirement $\ConfReq{\rU}{Equatable}$, the latter written in the commented-out \texttt{where} clause:
\begin{Verbatim}
-struct Transform
- where X.Element == Y.Element {}
+class Base<T: Equatable> {...}
+struct Foo<T: Base<U>, U>
+  /* where U: Equatable */ {...}
\end{Verbatim}
+
+\item Types appearing within the requirements of a \Index{where clause@\texttt{where} clause}\texttt{where} clause; for example, the right-hand side of a same-type requirement.
Here, we get the inferred requirement $\ConfReq{\rU}{Hashable}$:
\begin{Verbatim}
-struct Transformer {
- func transform(_: Transformer,
- Array>,
- Array>) {}
-}
+struct Foo<T: Sequence, U>
+  where T.Element == Set<U> /*, U: Hashable */ {...}
\end{Verbatim}
-The generic signature of \texttt{Transformer} has four requirements:
-\begin{quote}
-\begin{tabular}{lll}
-\toprule
-\textbf{Kind}&\textbf{Subject type}&\textbf{Constraint type}\\
-\midrule
-Conformance&\texttt{Key}&\texttt{Hashable}\\
-Conformance&\texttt{X}&\texttt{Sequence}\\
-Conformance&\texttt{Y}&\texttt{Sequence}\\
-Same type&\texttt{X.[Sequence]Element}&\texttt{Y.[Sequence]Element}\\
-\bottomrule
-\end{tabular}
-\end{quote}
-The context substitution map of the function's parameter type is this:
-\[
-\SubstMapLongC{\SubstType{Key}{Set}\\
-\SubstType{X}{Array>}\\
-\SubstType{Y}{Array>}}{
-\SubstConf{Key}{Set}{Hashable}\\
-\SubstConf{X}{Array>}{Sequence}\\
-\SubstConf{Y}{Array>}{Sequence}}
-\]
-Applying this substitution map to each requirement of the generic signature yields four substituted requirements:
-\begin{quote}
-\begin{tabular}{lll}
-\toprule
-\textbf{Kind}&\textbf{Subject type}&\textbf{Constraint type}\\
-\midrule
-Conformance&\texttt{Set}&\texttt{Hashable}\\
-Conformance&\texttt{Array>}&\texttt{Sequence}\\
-Conformance&\texttt{Array>}&\texttt{Sequence}\\
-Same type&\texttt{Array}&\texttt{Array}\\
-\bottomrule
-\end{tabular}
-\end{quote}
-Not all of these requirements are useful. The first three are conformance requirements with a concrete subject type, so they are discarded by requirement desugaring after they are checked to be satisfied. The fourth substituted requirement states that the two array types \texttt{Array} and \texttt{Array} are the same type. Requirement desugaring replaces this requirement with the simpler equivalent requirement $\FormalReq{E}{Int}$.
-In the end, the generic signature of \texttt{Transformer.transform()} becomes the following:
-\begin{quote}
-\begin{verbatim}
-
-\end{verbatim}
-\end{quote}
-The \verb|T: Hashable| requirement was explicitly stated, while the $\FormalReq{E == Int}$ requirement was inferred.
-\end{example}
-\index{generic type alias}
-\begin{example}
-Requirement inference is specifically defined to look at type alias types, making the building of generic signatures one of a handful of places in the language where writing a sugared type has a semantic effect. Consider this example:
+\item The typed throws feature, introduced with \IndexSwift{6.0}Swift~6 \cite{se0413}, allows specifying the type of error thrown by a function. For function, subscript and constructor declarations, we infer a conformance requirement to \texttt{Error} with the declaration's thrown error type as the subject type:
\begin{Verbatim}
-typealias EquatableArray = Array
- where Element: Equatable
-
-func allEqual(_: EquatableArray) -> Bool {}
+func f<E>(_: Int) throws(E) /* where E: Error */ {...}
\end{Verbatim}
-The \texttt{allEqual()} function has a single \texttt{Element} generic parameter, and a parameter of type \texttt{EquatableArray}. The \texttt{EquatableArray} generic type alias requires that its generic parameter is \texttt{Equatable}.
Thus, the generic signature of \texttt{allEqual()} contains a single requirement introduced by requirement inference:
-\begin{quote}
-\begin{verbatim}
-
-\end{verbatim}
-\end{quote}
-
-Suppose we wrote our function without reference to the type alias:
+If requirement inference encounters a function type in any other position, it also considers the thrown error type there:
\begin{Verbatim}
-func allEqual(_: Array) -> Bool {}
+func f<E>(_: () throws(E) -> ()) /* where E: Error */ {...}
\end{Verbatim}
-Now there is nothing for requirement inference to do, and \texttt{allEqual()} receives a different generic signature with no requirements:
-\begin{quote}
-\begin{verbatim}
-
-\end{verbatim}
-\end{quote}
-\end{example}
+\end{enumerate}
+(From this list, we see that types appearing in the \emph{body} of a declaration do not participate in requirement inference; it would be quite unfortunate if the interface type of a function, for example, could not be determined without type checking the function body.)

-\index{parameterized protocol type}
-\begin{example}
-The underlying type of a type alias need not mention the generic parameter types of the type alias at all. Before parameterized protocol types were added to the language, a few people discovered a funny trick that could be used to simulate them:
-\begin{Verbatim}
-typealias SequenceOf = Any where T: Sequence, T.Element == E
+We resolve each of the enumerated type representations in the \index{structural resolution stage}structural resolution stage, because the generic signature of the current declaration isn't known; we're building it right now. Let's say that $H$ is the generic signature being built; we can still talk about $H$ in the abstract, we just can't make anything depend on generic signature queries against~$H$. The resolved type possibly contains type parameters of $H$, so it is an element of $\TypeObj{H}$.
From the resolved type, we obtain a set of inferred requirements by walking its child types recursively. We look for children that are generic nominal types or generic type alias types, and extract a \index{substitution map}substitution map from each one: +\begin{itemize} +\item For a \index{generic nominal type}generic nominal type, we take the context substitution map (\SecRef{contextsubstmap}). +\item For a generic \index{type alias type}type alias type, we take the substitution map stored inside the type. +\end{itemize} +Let's call this substitution map $\Sigma$, and let $G$ be the \index{input generic signature}input generic signature of $\Sigma$. This is the generic signature of the referenced nominal type or type alias declaration. The \index{output generic signature}output generic signature of $\Sigma$ is $H$, so $\Sigma\in\SubMapObj{G}{H}$. We proceed as if we were checking generic arguments, applying $\Sigma$ to each requirement of $G$. This gives us a substituted requirement in $\ReqObj{H}$. (Recall that when checking generic arguments in the interface stage, we worked with archetypes, so we'd get something in $\ReqObj{\EquivClass{H}}$; here, we instead want the substituted requirement to talk about type parameters). -func sum>(_: T) {...} -\end{Verbatim} -To understand what this does, and how, consider the generic signature of \texttt{SequenceOf}: -\begin{quote} -\begin{verbatim} - -\end{verbatim} -\end{quote} -This signature has two requirements: -\begin{quote} -\begin{tabular}{lll} -\toprule -\textbf{Kind}&\textbf{Subject type}&\textbf{Constraint type}\\ -\midrule -Conformance&\texttt{T}&\texttt{Sequence}\\ -Same type&\texttt{E}&\texttt{T.[Sequence]Element}\\ -\bottomrule -\end{tabular} -\end{quote} -The \texttt{sum()} function declares a generic parameter \texttt{S} with the type \texttt{SequenceOf} in its inheritance clause. 
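+For example, recall \texttt{uniqueElements()} from the start of this section. While building its generic signature $H$, the function's return type resolves to a generic nominal type referencing the \texttt{Set} declaration, so $G$ is the generic signature of \texttt{Set}, and the context substitution map is:
+\[\Sigma:=\SubstMapLongC{\SubstType{Element}{\rT.Element}}{\SubstConf{Element}{\rT.Element}{Hashable}}\]
+Applying $\Sigma$ to the single requirement $\ConfReq{Element}{Hashable}$ of $G$ yields the inferred requirement we saw earlier:
+\[\ConfReq{Element}{Hashable}\otimes \Sigma = \ConfReq{\rT.Element}{Hashable}\]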
The underlying type of this type alias is \texttt{Any}, so the inheritance clause introduces the trivial requirement $\ConfReq{S}{Any}$. However, requirement inference also visits the type \texttt{SequenceOf} in the inheritance clause. This type alias type has the following substitution map:
-\[
-\SubstMapC{
-\SubstType{T}{S},\,\SubstType{E}{Int}
-}{\SubstConf{T}{S}{Sequence}}
-\]
-Applying this substitution map to each requirement in the type alias declaration's generic signature yields the following two substituted requirements:
-\begin{quote}
-\begin{tabular}{lll}
-\toprule
-\textbf{Kind}&\textbf{Subject type}&\textbf{Constraint type}\\
-\midrule
-Conformance&\texttt{S}&\texttt{Sequence}\\
-Same type&\texttt{Int}&\texttt{S.[Sequence]Element}\\
-\bottomrule
-\end{tabular}
-\end{quote}
-So the generic signature of \texttt{sum()} becomes the following:
-\begin{quote}
-\begin{verbatim}
-
-\end{verbatim}
-\end{quote}
-Indeed, our function \texttt{sum()} could have been written as follows:
+In all examples we've seen so far, the inferred requirement's subject type was a type parameter of~$H$, and the inferred requirement was just added to~$H$ unchanged. To preview the next section, the requirement's subject type might be \index{fully-concrete type}fully concrete after substitution; for example, here we get the useless inferred requirement $\ConfReq{Int}{Hashable}$:
\begin{Verbatim}
-func sum(_: T) where S: Sequence, S.Element == Int {...}
+func f<T>(_: T, _: Set<Int>) {}
\end{Verbatim}
-Which is of course equivalent to the modern syntax:
+This requirement is not a statement about the type parameters of \texttt{f()} at all; it is just ``always true,'' so it does not affect the generic signature of \texttt{f()}. A more complex scenario is possible with \index{conditional conformance}conditional conformances.
In the next example, the inferred requirement is $\ConfReq{\rT}{Hashable}$: \begin{Verbatim} -func sum(_: T) where S: Sequence {...} +func f<T>(_: Set<Array<T>>) /* where T: Hashable */ {} \end{Verbatim} -\end{example} +From the generic nominal type \texttt{Set<Array<\rT>>}, we obtain the inferred conformance requirement $\ConfReq{Array<\rT>}{Hashable}$. The standard library declares a conditional conformance of \texttt{Array} to \texttt{Hashable} when the element type is \texttt{Hashable}. We replace our original requirement with the conditional requirements of this conformance, which gives us $\ConfReq{\rT}{Hashable}$. -\index{generic arguments} -\index{interface resolution stage} -\begin{example} -It is instructive to consider the behavior of type resolution's generic argument checking, with and without requirement inference. Requirement inference only happens if a new generic signature actually needs to be built, either because the generic context adds generic parameters or has a trailing \texttt{where} clause. However, if neither is present, the generic context always inherits the generic signature from its parent context, without giving requirement inference the opportunity to introduce requirements. +In general, when a requirement's subject type is not a type parameter, we repeatedly rewrite it into a (possibly empty) set of simpler requirements, until only requirements about type parameters remain; this \emph{requirement desugaring} procedure is described in the next section. + +After desugaring, inferred requirements are passed on to \index{minimal requirement}requirement minimization, where they might become redundant. For example, a user who is simply unaware of requirement inference or has a preference against it will re-state all inferred requirements explicitly, as in the first version of \texttt{uniqueElements()}; these duplicate requirements are eliminated by requirement minimization.
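+Desugaring may also take several rounds of rewriting. In this sketch (a hypothetical variation of the earlier example), the inferred requirement $\ConfReq{Array<Array<\rT>>}{Hashable}$ is first replaced by $\ConfReq{Array<\rT>}{Hashable}$, using the conditional conformance of \texttt{Array} to \texttt{Hashable}, and then replaced once more, leaving $\ConfReq{\rT}{Hashable}$:
+\begin{Verbatim}
+func g<T>(_: Set<Array<Array<T>>>) /* where T: Hashable */ {}
+\end{Verbatim}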
More elaborate examples where inferred requirements become redundant can also be constructed. In the example below, we infer the requirement $\ConfReq{\rT.Iterator}{IteratorProtocol}$, but it's not an explicit requirement of the generic signature of \texttt{f()} because it can be derived from $\ConfReq{\rT}{Sequence}$: +\begin{Verbatim} +struct G<I: IteratorProtocol> {} + +func f<T: Sequence>(_: T) -> G<T.Iterator> {...} +\end{Verbatim} +Of course, the user can explicitly re-state the redundant requirement in the \texttt{where} clause just for good luck, which would make it twice redundant, in a way. -Consider the below generic struct type containing three functions: +\paragraph{Outer generic parameters.} +The outer generic parameters of a declaration can be subject to inferred requirements. Consider a generic struct with three methods: \begin{Verbatim} struct G<T, U> { - func example1(_: V, _: Set) {} // infers T: Hashable - func example2(_: Set) where U: Sequence {} // infers T: Hashable - func example3(_: Set) {} // nothing inferred; error diagnosed + func f<V>(_: Set<U>, _: V) /* where U: Hashable */ {} + func f(_: Set<U>) where T: Equatable /*, U: Hashable */ {} + func f(_: Set<U>) {} // error! } \end{Verbatim} -\begin{enumerate} -\item The first function declares a generic parameter list, so the inferred generic signature request runs. Requirement inference adds the inferred requirement $\FormalReq{T}{Hashable}$; thus the generic signature is -\begin{quote} -\begin{verbatim} - -\end{verbatim} -\end{quote} -Type resolution then checks the generic arguments of \texttt{Set} in the interface resolution stage. The generic argument \texttt{T} is mapped into the function's generic environment, producing the archetype $\archetype{T}$. The substituted requirement $\Conf{\archetype{T}}{Hashable}$ is satisfied, because $\archetype{T}$ conforms to \texttt{Hashable}. +We infer requirements if the declaration has generic parameters \emph{or} a \texttt{where} clause, so in the first two, we infer $\ConfReq{\rU}{Hashable}$.
The third method inherits the generic signature of the struct, so the requirement is not satisfied and we diagnose an error. -\item In the second example, the generic signature is slightly different, but the generic argument check succeeds for the same reason: -\begin{quote} -\begin{verbatim} - -\end{verbatim} -\end{quote} +\paragraph{Generic type aliases.} +A reference to a \index{generic type alias}generic type alias resolves to a \index{sugared type}sugared \index{type alias type}type alias type (\SecRef{more types}). This sugared type prints as written when it appears in \index{diagnostic!sugared type alias type}diagnostic messages, but it is canonically equal to its \index{substituted underlying type}substituted underlying type, and otherwise behaves like it. So here, the interface type of ``\texttt{x}'' prints as \texttt{OptionalElement<Array<Int>>}, but it is canonically equal to its substituted underlying type, \texttt{Optional<Int>}: +\begin{Verbatim} +typealias OptionalElement<T: Sequence> = Optional<T.Element> -\item In the third example, the declaration does not have a generic parameter list or \texttt{where} clause, so it inherits the generic signature from the parent context, \texttt{}. In this generic environment, the archetype $\archetype{T}$ does not conform to \texttt{Hashable}, so the type representation \texttt{Set} cannot be resolved. An error is diagnosed, pointing at the source location of the type representation. -\end{enumerate} -\end{example} +let x: OptionalElement<Array<Int>> = ... +\end{Verbatim} -\paragraph{Requirement signatures} -\index{requirement signature request} -\index{limitation} -\index{protocol dependency graph} -Unlike the inferred generic signature request, the requirement signature request does not perform requirement inference. Originally the reasoning was that all requirements imposed on the concrete conforming type should be explicitly stated inside the protocol body, for clarity.
+ +In the above, we form the substituted underlying type \texttt{Optional<Int>} by applying a substitution map that replaces $\rT$ with \texttt{Array<Int>} to the underlying type of the type alias declaration, \texttt{Optional<\rT.Element>}. This gives us the \index{structural components}structural components of a type alias type: a reference to a type alias declaration, a substituted underlying type, and a substitution map. The substitution map is used when printing the sugared type's generic arguments. Crucially for our topic at hand, this substitution map is also considered by requirement inference, making this one of a handful of language features where the appearance of a sugared type \emph{does} have a semantic effect. In this example, we infer the requirement $\ConfReq{\rT}{Sequence}$ from considering \texttt{OptionalElement<\rT>}: +\begin{Verbatim} +func maybePickElement<T>(_ sequence: T) -> OptionalElement<T> +\end{Verbatim} -\index{protocol dependency graph} -\index{associated conformance requirement} -This has since been retconned\footnote{``revise (an aspect of a fictional work) retrospectively, typically by introducing a piece of new information that imposes a different interpretation on previously described events.''} into an appeal for implementation simplicity. For various reasons, when computing the requirement signature of a protocol we need to construct something called the \emph{protocol dependency graph} very early. This graph, which you will meet in Section~\ref{recursive conformances}, records the relationship between a protocol and other protocols it references from the right hand side of associated conformance requirements. Since requirement inference can introduce new conformance requirements, it is incompatible with protocol requirement signature minimization. +Mostly of theoretical interest is the fact that the underlying type of a type alias need not mention the generic parameter types of the type alias at all.
Before \index{parameterized protocol type}parameterized protocol types were added to the language, a few people discovered a funny trick to simulate something similar: +\begin{Verbatim} +typealias SequenceOf<T, E> = Any + where T: Sequence, T.Element == E +\end{Verbatim} +The underlying type of \texttt{SequenceOf} is just \texttt{Any}, and its generic signature has the two requirements $\ConfReq{\rT}{Sequence}$ and $\SameReq{\rU}{\rT.Element}$. Now, we can state this type alias in the inheritance clause of a generic parameter declaration: +\begin{Verbatim} +func sum<T: SequenceOf<T, Int>>(_: T) {...} +\end{Verbatim} +The inheritance clause entry introduces the requirement $\ConfReq{\rT}{Any}$; this has no effect, because \texttt{Any} is the empty protocol composition. However, requirement inference also visits the type alias type \texttt{SequenceOf<\rT, Int>}, which is the whole trick. This type alias type has the following substitution map: +\[ +\Sigma := \SubstMapC{ +\SubstType{\rT}{\rT},\,\SubstType{\rU}{Int} +}{\SubstConf{\rT}{\rT}{Sequence}} +\] +Applying $\Sigma$ to the generic signature of \texttt{SequenceOf} gives us our two inferred requirements: +\begin{gather*} +\ConfReq{\rT}{Sequence}\otimes\Sigma = \ConfReq{\rT}{Sequence}\\ +\SameReq{\rU}{\rT.Element}\otimes\Sigma = \SameReq{Int}{\rT.Element} +\end{gather*} +These requirements survive minimization and appear in the generic signature of \texttt{sum()}.
We get the same generic signature as both of the following: +\begin{Verbatim} +func sum<T: Sequence<Int>>(_: T) {...} +func sum<T: Sequence>(_: T) where T.Element == Int {...} +\end{Verbatim} -For example, the $\ConfReq{Particle}{Hashable}$ requirement cannot -be omitted below, because the type \texttt{Set} would no longer satisfy \texttt{Set}'s $\ConfReq{Element}{Hashable}$ requirement: +\paragraph{Protocols.} +Unlike the inferred generic signature request, the \index{requirement signature request}requirement signature request \index{limitation!requirement inference in protocols}does not perform requirement inference; all associated requirements imposed on the concrete conforming type must be explicitly stated in source. For example, we must state the $\ConfReq{Self.Particle}{Hashable}$ requirement below, because otherwise the type \texttt{Set<Self.Particle>} appearing in the same-type requirement would not satisfy \texttt{Set}'s $\ConfReq{\rT}{Hashable}$ requirement: \begin{Verbatim} protocol Cloud { - associatedtype Particle: Hashable + associatedtype Particle /* must be : Hashable */ associatedtype Particles: Sequence - where Particles.Element == Set + where Particles.Element == Set<Particle> // error } \end{Verbatim} -\section{Requirement Desugaring}\label{requirement desugaring} +The technical reason for this restriction is the following. Before we can build the requirement signature of a protocol, we must construct the \index{protocol dependency graph}\emph{protocol dependency graph}. This graph, which we will meet in \SecRef{recursive conformances}, encodes the relation between each protocol and any protocols that appear on the right-hand sides of its \index{associated conformance requirement}associated conformance requirements. A key property is that this graph can be recovered by performing name lookup operations alone, without queries against other requirement signatures or generic signatures.
If we could infer requirements in protocols, this would no longer hold; we could define \texttt{Cloud} as above without explicit mention of \texttt{Hashable}, so our protocol dependency graph would have a ``non-obvious'' edge relation. -After requirement resolution and requirement inference, the inferred generic signature request has collected a list of requirements, which together with the requirements in the parent generic signature, form the basis for a new generic signature. (The abstract generic signature request \emph{starts} here; the list of requirements are directly provided by the caller, rather than being constructed from user-written requirement representations and requirement inference.) +\section{Decomposition and Desugaring}\label{requirement desugaring} -There are several extra steps before we can arrive at the \emph{minimal}, \emph{reduced} list of requirements that appear in the final generic signature. The first step is \emph{requirement desugaring}, which establishes the invariants defined below. These invariants will become important when we build rewrite rules from requirements in Section~\ref{building rules}. -\IndexDefinition{desugared requirement} -\index{requirement desugaring|see {desugared requirement}} -\begin{definition}\label{desugaredrequirementdef} A \emph{desugared requirement} satisfies two conditions: -\begin{enumerate} -\item If the requirement is a conformance requirement, the constraint type must be a protocol type (and not a protocol composition or parameterized protocol). -\item The subject type must be a type parameter. -\end{enumerate} -\end{definition} +Before proceeding to requirement minimization, we eliminate some unneeded generality in the specification of requirements. 
To backtrack a little, here is where we are: +\begin{itemize} +\item If we're evaluating the \index{inferred generic signature request}\Request{inferred generic signature request}, we're about half-way through \FigRef{inferred generic signature request figure}; we've gathered a set of user-written and inferred \index{requirement}requirements that together describe the type parameters of a generic declaration. +\item If we're evaluating the \index{abstract generic signature request}\Request{abstract generic signature request}, we \emph{start} here; as shown in \FigRef{abstract generic signature request figure}, the caller gives us a set of requirements as input. +\end{itemize} +In the \index{derived requirement}derived requirements formalism, the right-hand side of a \index{conformance requirement}conformance requirement is always a protocol type; in the syntax, however, we can also write a protocol composition type or a parameterized protocol type. Such conformance requirements must be split up---this is \emph{requirement decomposition}. We also recall that the left-hand side of a derived requirement is always a type parameter, whereas requirement inference (or even the user) can write down a requirement with any subject type. Such requirements are also split up into zero or more simpler requirements, which gives us \emph{requirement desugaring}. + +\paragraph{Decomposition.} This formalizes the syntax sugar from Sections \ref{requirements}~and~\ref{protocols}.
For example, the standard library defines the \texttt{Codable} \index{type alias type}type alias, whose underlying type is a \index{protocol composition type}composition of two protocols, \texttt{Decodable} and \texttt{Encodable}: +\begin{Verbatim} +typealias Codable = Decodable & Encodable +\end{Verbatim} +We can state conformance to \texttt{Codable} in the inheritance clause of a \index{horse}generic parameter: +\begin{Verbatim} +func ride<T: Codable>(_: Horse<T>) {} +\end{Verbatim} +We can understand the above by decomposing $\ConfReq{\rT}{Decodable \& Encodable}$ into the two conformance requirements $\ConfReq{\rT}{Decodable}$ and $\ConfReq{\rT}{Encodable}$. The right-hand side of a conformance requirement might also be a \index{parameterized protocol type}parameterized protocol type, which is sugar for a series of same-type requirements, constraining each of the protocol's \index{primary associated type}primary associated types to the corresponding \index{generic argument}generic argument: +\begin{Verbatim} +func search<T, U: Sequence<T>>(...) {} +\end{Verbatim} +The conformance requirement $\ConfReq{\rU}{Sequence<\rT>}$ written above decomposes into two requirements, $\ConfReq{\rU}{Sequence}$ and $\SameReq{\rU.[Sequence]Element}{\rT}$. -\index{conformance requirement} -\index{protocol type} -\index{protocol composition type} -\index{parameterized protocol type} -The idea behind establishing the first invariant was already introduced informally in Sections~\ref{constraints} and Section~\ref{protocols}. Now we will see the precise semantics in the form of an algorithm. +We now present requirement decomposition in the form of an algorithm. Mostly this is a restatement of the above, with additional details and a couple of edge cases. \index{protocol substitution map} \index{primary associated type} \index{same-type requirement} -\begin{algorithm}[Expanding conformance requirements]\label{expand conformance req algorithm} Takes a list of conformance requirements as input.
Outputs a new list of equivalent requirements where all conformance requirements have a protocol type on the right hand side. +\begin{algorithm}[Decomposition of conformance requirements]\label{expand conformance req algorithm} Takes a list of conformance requirements as input. Outputs a new list of equivalent requirements where all conformance requirements have a protocol type on the right hand side. \begin{enumerate} -\item Initialize the output list to an empty list. -\item Initialize the worklist with all conformance requirements from the input list. +\item Initialize an empty list to collect the output requirements. +\item Add all input requirements to the worklist. \item (Check) If the worklist is empty, return the output list. -\item (Loop) Take a conformance requirement $\texttt{T}:~\texttt{C}$ from the worklist. -\item (Base case) If \texttt{C} is a protocol type, add this conformance requirement \texttt{T:~C} to the output list. -\item (Composition) If \texttt{C} is a protocol composition type, visit each protocol composition member $\texttt{M}\in\texttt{C}$. If \texttt{M} is a class type or \texttt{AnyObject}, add the superclass or layout requirement \texttt{T:~M} to the output list. Otherwise, \texttt{M} might need to be decomposed further, so add a conformance requirement \texttt{T:~M} to the worklist. -\item (Parameterized) If \texttt{C} is a parameterized protocol type \texttt{P} with base type \texttt{P} and generic arguments \texttt{Gi}, decompose the requirement as follows: +\item (Loop) Remove a conformance requirement $\ConfReq{T}{X}$ from the worklist, where \texttt{X} is a ``protocol-like'' type, with one of the three kinds below. +\item (Base case) If \texttt{X} is a protocol type, output the conformance requirement $\ConfReq{T}{X}$. +\item (Composition) If \texttt{X} is a protocol composition type \texttt{$\texttt{M}_1$ \& ...~\& $\texttt{M}_n$}, visit each member $\texttt{M}_i\in\texttt{X}$. 
If $\texttt{M}_i$ is a class type, output the superclass requirement $\ConfReq{T}{$\texttt{M}_i$}$. If $\texttt{M}_i$ is \texttt{AnyObject}, output the layout requirement $\ConfReq{T}{AnyObject}$. Otherwise, $\texttt{M}_i$ is either a protocol type, or it can be decomposed further. Add the conformance requirement $\ConfReq{T}{$\texttt{M}_i$}$ to the worklist. +\item (Parameterized) If \texttt{X} is a parameterized protocol type \texttt{P<$\texttt{B}_1$, ..., $\texttt{B}_n$>} with base protocol type \texttt{P} and generic arguments $\texttt{B}_i$, decompose the requirement as follows: \begin{enumerate} -\item The base type \texttt{P} is always a protocol type. Add the conformance requirement \texttt{T:~P} to the output list. -\item For each primary associated type \texttt{Ai} of \texttt{P}, construct the protocol substitution map for the subject type \texttt{T}, and apply it to the declared interface type of \texttt{A}: -\[\texttt{Self.[P]Ai} \otimes \SubstMapLongC{\SubstType{Self}{T}}{\SubstConf{Self}{T}{P}} = \texttt{T.[P]Ai}\] -(Note that if \texttt{T} is a type parameter, the substituted type is the dependent member type \texttt{T.[P]Ai}; however, \texttt{T} might also be a concrete type, in which case we look up the type witness of \texttt{Ai} in the conformance $\ConfReq{T}{P}$.) - -Add the same-type requirement $\FormalReq{T.[P]Ai == Gi}$ to the output list. +\item Output the conformance requirement $\ConfReq{T}{P}$. +\item For each primary associated type $\texttt{A}_i$ of \texttt{P}, apply the protocol substitution map for \texttt{T} to the declared interface type of $\texttt{A}_i$: +\[\texttt{Self.[P]A}_i \otimes \SubstMapLongC{\SubstType{Self}{T}}{\SubstConf{Self}{T}{P}} = \texttt{T.[P]A}_i\] +Output the same-type requirement $\SameReq{$\texttt{T.[P]A}_i$}{$\texttt{B}_i$}$. \end{enumerate} +\item (Error) If \texttt{X} is anything else, we have an invalid requirement, for example $\ConfReq{T}{Int}$. \index{diagnostic!invalid requirement}Diagnose an error.
+\item (Next) Go back to Step~4. \end{enumerate} \end{algorithm} -This gives us the first invariant of Definition~\ref{desugaredrequirementdef}. To understand how requirement desugaring establishes the second invariant, we need to understand what it means for a requirement to have a concrete subject type. Requirements involving concrete types can appear when building a new generic signature from substituted requirements obtained by applying a substitution map to the requirements of a different generic signature. We already saw how this can happen with requirement inference, and it will come up again in Section~\ref{overridechecking} and Section~\ref{witnessthunksignature}. +Usually \texttt{T} will be a type parameter in Step~7, and the substituted type becomes the dependent member type $\texttt{T.[P]A}_i$. Our algorithm relies on type substitution to also cope with \texttt{T} being a concrete type. When \texttt{T} is a concrete type, the subject type of each introduced same-type requirement has to be the \index{type witness}type witness for $\texttt{A}_i$ in the concrete conformance $\ConfReq{T}{P}$. For example, given $\ConfReq{Array<\rT>}{Sequence<Int>}$, we output $\ConfReq{Array<\rT>}{Sequence}$ and $\SameReq{\rT}{Int}$. Admittedly, this is rather a silly edge case. One might think that requirement inference would exercise this scenario: +\begin{Verbatim} +typealias G<T: Sequence<Int>> = T +func f<T>(_: G<Array<T>>) {} +\end{Verbatim} +However, we build the generic signature of \texttt{G} first, so we've already decomposed the requirement $\ConfReq{\rT}{Sequence<Int>}$ by the time we get around to building the generic signature of \texttt{f()}; when we visit \texttt{G} in requirement inference, we substitute the decomposed and desugared (and indeed, the minimal) requirements of \texttt{G}. In fact, the only way to exercise this is to write it down directly: +\begin{Verbatim} +func f<T>(_: Array<T>) where Array<T>: Sequence<Int> {} +\end{Verbatim} + +A more useful application of decomposition is the following.
The \Request{abstract generic signature request} performs decomposition to deal with the opaque return types of \ChapRef{opaqueresult} and existential types of \ChapRef{existentialtypes}. We reason about these kinds of types by using this request to build an ``auxiliary'' generic signature describing the type. The \index{constraint type}constraint type after the \texttt{some} or \texttt{any} keyword defines a conformance requirement; this requirement must be decomposed by the same algorithm, so that we can interpret something like ``\texttt{any Sequence<Int>}'' or ``\verb|some Equatable & AnyObject|''. + +\paragraph{Desugaring.} +Requirement inference is one of several instances of a useful pattern: we apply a substitution map to the minimal requirements of a generic signature, and build a new generic signature from these substituted requirements. \SecRef{overridechecking} will use the \Request{abstract generic signature request} in this way to check class method overrides, for example. When the original requirements are minimal, the substituted requirements are already decomposed, but it is possible for such a requirement to have a left-hand side that is not a type parameter, but a concrete type, introduced by the substitution. + +A thought experiment helps us understand the meaning of a requirement whose left-hand side is not a type parameter. If the subject type of some requirement $R$ is an interface type, not necessarily a type parameter, we cannot describe it with the derived requirements formalism. However, we can still apply a substitution map $\Sigma$ to $R$, and check if the \index{substituted requirement}substituted requirement $R\otimes\Sigma$ is \index{satisfied requirement}satisfied with \AlgRef{reqissatisfied}; nothing in the algorithm depends on the \emph{original} requirement's subject type being a type parameter.
+ +We will transform $R$ into a set of simpler requirements $\{R_1,\ldots,R_n\}$, such that for \emph{every} substitution map $\Sigma$, $R\otimes\Sigma$ is satisfied if and only if for all $i\le n$, $R_i\otimes\Sigma$ is satisfied. Of course, we're not iterating over all possible substitution maps in the implementation; this is just a thought experiment. If we believe that the below transformation upholds the desired property, we can conclude that $R$ can be replaced with $\{R_1,\ldots,R_n\}$ without changing the meaning of our generic signature. + +This rule essentially determines the implementation of requirement desugaring. Let's first consider a requirement that does not contain type parameters at all, such as $\ConfReq{Int}{Hashable}$ or $\SameReq{Int}{String}$. Applying a substitution map cannot ever change our requirement, so it is always true or always false; we can check it with \AlgRef{reqissatisfied}: +\begin{itemize} +\item If the requirement is satisfied, we can delete it, that is, replace it with the \index{empty set}empty set of requirements, without violating our invariant. It doesn't contribute anything new. +\item If, on the other hand, the requirement is unsatisfied, the only way to proceed is to replace it with something else unsatisfiable, so we must diagnose an error and give up. +\end{itemize} +The second case merits further explanation. A generic declaration whose requirements cannot be satisfied by any substitution map is essentially useless, and we will see in \SecRef{minimal requirements} that we try to uncover situations where two requirements are in conflict with each other and cannot be simultaneously satisfied. Here though, requirement desugaring detects the trivial case where a requirement \index{conflicting requirement}``conflicts'' with itself.
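+Both outcomes can be observed by writing such a requirement directly in a \texttt{where} clause. In this sketch (hypothetical declarations; the exact diagnostics depend on the compiler version), the first requirement is always true and is simply deleted, while the second can never be satisfied and is diagnosed as a conflict:
+\begin{Verbatim}
+func trivial<T>(_: T) where Int: Hashable {}        // requirement deleted
+func contradictory<T>(_: T) where Int: Sequence {}  // error
+\end{Verbatim}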
-\index{global conformance lookup} -Recall from Section~\ref{checking generic arguments} that a requirement where all types are fully concrete is a ``statement'' whose truth can be evaluated. A conformance requirement with a concrete subject type can be checked by performing a global conformance lookup; a same-type requirement between concrete types can be checked by comparing two types for canonical equality, and so on. +\smallskip -\index{conflicting requirement} -\index{redundant requirement} -If a requirement with a concrete subject type is satisfied, the requirement is necessarily redundant because it does not give us any new information about the generic signature's type parameters. If it is unsatisfied, the requirement is a conflicting requirement; an error is diagnosed. +Next, we consider a conformance requirement with a left-hand side that contains type parameters, like $\ConfReq{Array<\rT>}{Sequence}$ or $\ConfReq{Array<\rT>}{Hashable}$. With \index{global conformance lookup}global conformance lookup, we can find the concrete conformance of the subject type to this protocol. If the conformance is \index{invalid conformance}invalid, we have a trivial conflict; if unconditional, the requirement is trivially satisfied. -\begin{example} -The requirement \texttt{Int:\ Hashable} of the first function is redundant because it is true for any replacement type \texttt{T}. The requirement \texttt{Int:\ Sequence} of the second function is a conflict, because it is false independent of \texttt{T}: +A \index{conditional conformance}conditional conformance here finally gives us an interesting behavior. Our invariant \textsl{leaves us with no choice}: we must replace the original requirement with the \index{conditional requirement}conditional requirements of this conformance (an unconditional conformance has an empty set of conditional requirements, so there's really just one rule here). 
For example, for any substitution map $\Sigma$, it is true that $\texttt{Array<\rT>}\otimes\Sigma$ conforms to \texttt{Hashable} if and only if $\rT\otimes\Sigma$ conforms to \texttt{Hashable}. By this logic, $\ConfReq{Array<\rT>}{Hashable}$ must desugar to $\ConfReq{\rT}{Hashable}$, which now fully explains how we infer $\ConfReq{\rT}{Hashable}$ from the parameter type \texttt{Set<Array<\rT>>} below, as we saw previously: \begin{Verbatim} -func trivial(_: T) where Int: Hashable {} -func contradictory(_: T) where Int: Sequence {} +func f<T>(_: Set<Array<T>>) /* where T: Hashable */ {} \end{Verbatim} -\end{example} - -\index{conditional requirement} -\index{conditional conformance} -\index{invalid conformance} -\index{same-type requirement} -A third possibility is that the subject type is a concrete type but it contains type parameters and we cannot conclude outright if it is redundant or conflicting. In this case, we split it up into smaller requirements, which are then processed recursively. This is done with a generalization of Algorithm~\ref{reqissatisfied}. -\index{same-type requirement} -\index{reduced type} -First, let's look at same-type requirements. Before desugaring, there are four possible flavors of same-type requirements to consider: +Finally, to desugar a \index{same-type requirement}same-type requirement, we consider four possibilities: \begin{enumerate} -\item Both sides are type parameters, \verb|T.A == U.B|. -\item Subject type is a type parameter, constraint type is concrete: \verb|T.A == Array|. -\item Subject type is concrete, constraint type is a type parameter: \verb|Array == U.B|. -\item Both sides are concrete types, \verb|Array == Array|. +\item Both sides are type parameters. +\item Left-hand side is a type parameter, right-hand side is a concrete type. +\item Left-hand side is a concrete type, right-hand side is a type parameter. +\item Both sides are concrete types.
\end{enumerate} -The first two cases already meet the definition of a desugared requirement because the subject type is a type parameter, so we're done. To desugar a requirement of the third kind, we swap the subject type and constraint type; this is valid, because a same-type requirement is a statement that two types have the same reduced type, thus changing the order of the two types does not change the meaning of the requirement. - -This leaves us with the fourth case, where \emph{both} types in the same-type requirement are concrete types. We solve this problem by walking the two types in parallel and splitting up the same-type requirement into one or more simpler same-type requirements. - -\begin{example}\label{same-type desugaring example} -Take the requirement $\FormalReq{Array == Array}$, with \texttt{Element} a generic parameter. This states that the two types \texttt{Array} and \texttt{Array} must have the same reduced type. The second type is already fully concrete and cannot be reduced further, so the requirement is really stating that the reduced type of \texttt{Array} must be \texttt{Array}. This can only be true if the reduced type of \texttt{Element} is \texttt{Int}. Therefore, our requirement is actually equivalent to $\FormalReq{Element == Int}$, which satisfies the definition of a desugared requirement. -\end{example} -\begin{example} -Let's look at $\FormalReq{Dictionary == Dictionary}$, with \texttt{K} and \texttt{V} being generic parameters. This splits up into two requirements, $\FormalReq{K == Int}$ and $\FormalReq{String == V}$. The first is desugared; the second is an instance of case (3) above and can be desugared by flipping the subject type and constraint type. -\end{example} -\begin{example}\label{conflicting requirement example} -\index{conflicting requirement} -Now, consider a requirement like \texttt{Array == Set}. 
The reduced type of \texttt{Array} will always be some specialization of \texttt{Array}, which will never equal a specialization of \texttt{Set}. This requirement can never be satisfied, and is diagnosed as a conflict. -\end{example} -To understand why this makes sense, note that computing a reduced type of a concrete type (Algorithm~\ref{reducedtypealgo}) only transforms the leaves that happen to be type parameters, replacing them with other type parameters or concrete types; the overall ``shape'' of the concrete type remains the same. When the derivation rules for derived requirements were introduced in Section~\ref{derived req} -\begin{definition} -\index{tree}% -Two types have \emph{matching sub-components} if they have the same kind, same number of child types, and exactly equal non-type information, such as the declaration of nominal types, the labels of two tuples, value ownership kinds of function parameters, and so on. This property only considers the root of the tree; \texttt{Array} and \texttt{Array>} still have matching sub-components, but the two sub-components \texttt{Int} and \texttt{Set} do not. -\end{definition} -With the above definition, we can finish formalizing the desugaring of same-type requirements. If both sides of a same-type requirement have matching sub-components, the requirement desugars by recursively matching the sub-components of the two sides. If the two sides do not have matching sub-components, the requirement is a conflicting requirement and an error is diagnosed. +The first two cases already have the correct form, so we're done. The third case reduces to the second, because we can swap the two sides. Indeed, a same-type requirement is also a statement that both sides have the \index{reduced type equality}same \index{reduced type}reduced type, and this relation is \index{symmetric relation}symmetric. 
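+For instance (a hypothetical pair of declarations), the following two spellings write the concrete type on opposite sides, and both desugar to the same requirement $\SameReq{\rT.Element}{Int}$:
+\begin{Verbatim}
+func f<T: Sequence>(_: T) where T.Element == Int {}
+func g<T: Sequence>(_: T) where Int == T.Element {}
+\end{Verbatim}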
In the fourth case, we have something like this, +\[\SameReq{Dictionary<\rT, Bool>}{Dictionary<Int, \rU>}\] +which we would like to desugar into two requirements: +\begin{gather*} +\SameReq{\rT}{Int}\\ +\SameReq{Bool}{\rU} +\end{gather*} +The first output is already desugared; the second one must be flipped, and then we're done. On the other hand, say we're given $\SameReq{Array<\rT>}{Set<\rT>}$ instead. No substitution map can transform the two sides into the same type, so this requirement can never be satisfied. This gives us the general rule. + +For two concrete types to be equivalent under all substitutions, they can only differ in certain ways. Two types \emph{match} if they have the same kind, same number of structural component types, and exactly equal non-type information (examples of this include the declaration of a nominal type, the labels of a tuple, value ownership kinds of function parameters, and so on). Our definition of matching is not recursive, so \texttt{Array<Array<Int>>} and \texttt{Array<Set<Int>>} still match because everything lines up at the outermost level, but their two children \texttt{Array<Int>} and \texttt{Set<Int>} do not match. + +If the two types in a same-type requirement do not match, we have a conflict, so we \index{diagnostic!invalid same-type requirement}diagnose an error and give up. Otherwise, the types have an equal number of children; we walk the children in parallel and construct a simpler set of same-type requirements equivalent to the original. This is a recursive process; some of these requirements might need further desugaring, or lead to conflicts, and so on. \index{same-type requirement} -\begin{algorithm}[Same-type requirement desugaring]\label{desugar same type algo} As input, takes an arbitrary same-type requirement. As output, returns three lists of same-type requirements, the \emph{desugared} list, \emph{redundant} list, and \emph{conflict} list.
+\begin{algorithm}[Same-type requirement desugaring]\label{desugar same type algo} Takes an arbitrary same-type requirement as input. Outputs a list of desugared requirements and a list of conflicting requirements. \begin{enumerate} -\item Initialize the desugared list, redundant list and conflict list to empty lists. -\item Initialize a worklist with a single element, the input requirement. -\item (Loop) Take the next requirement \texttt{T == U} from the worklist. -\item (Abstract) If \texttt{T} and \texttt{U} are both type parameters, add \texttt{T == U} to the desugared list. -\item (Concrete) If \texttt{T} is a type parameter and \texttt{U} is concrete, add \texttt{T == U} to the desugared list. -\item (Flipped) If \texttt{T} is concrete and \texttt{U} is a type parameter, add \texttt{U == T} (note the flip) to the desugared list. -\item (Redundant) If \texttt{T} and \texttt{U} are both concrete and canonically equal, add \texttt{T == U} to the redundant list. -\item (Recurse) If \texttt{T} and \texttt{U} are not canonically equal but have matching sub-components, let $\texttt{T1}\ldots\texttt{Tn}$ and $\texttt{U1}\ldots\texttt{Un}$ be the child types of \texttt{T} and \texttt{U}. For each $1\le \texttt{i}\le \texttt{n}$, add the same-type requirement \texttt{Ti == Ui} to the worklist. -\item (Conflict) If \texttt{T} and \texttt{U} are both concrete and do not have matching sub-components, add \texttt{T == U} to the conflict list. -\item (Check) If the worklist is empty, return. Otherwise, go back to Step~3. +\item Initialize empty output and conflict lists. +\item Add the input requirement to the worklist. +\item (Next) Take the next requirement $\SameReq{T}{U}$ from the worklist. +\item (Abstract) If \texttt{T} and \texttt{U} are both type parameters, add $\SameReq{T}{U}$ to the output list. +\item (Concrete) If \texttt{T} is a type parameter and \texttt{U} is concrete, output $\SameReq{T}{U}$. 
+\item (Flipped) If \texttt{T} is concrete and \texttt{U} is a type parameter, output $\SameReq{U}{T}$. +\item (Redundant) If \texttt{T} and \texttt{U} are canonically equal, nothing non-trivial will be generated below, so we immediately go to Step~10. +\item (Recurse) If \texttt{T} and \texttt{U} match, let $\texttt{T}_1\ldots\texttt{T}_n$ and $\texttt{U}_1\ldots\texttt{U}_n$ be the children of \texttt{T} and~\texttt{U}. For each $1\le i\le n$, add $\SameReq{$\texttt{T}_i$}{$\texttt{U}_i$}$ to the worklist. +\item (Conflict) If \texttt{T} and \texttt{U} do not match, add $\SameReq{T}{U}$ to the conflict list and diagnose. +\item (Loop) If the worklist is empty, return. Otherwise, go back to Step~3. \end{enumerate} \end{algorithm} -With all of the above in place, we can finally present the algorithm for desugaring an arbitrary requirement. This algorithm is intended to run after Algorithm~\ref{expand conformance req algorithm}, so we assume the constraint types of conformance requirements are protocol types and never protocol composition types or parameterized protocol types. -\index{conditional conformance} -\index{conformance requirement} -\index{superclass requirement} -\index{layout requirement} -\index{global conformance lookup} -\index{self-conforming protocol} +Next we have the algorithm for desugaring an arbitrary requirement. This runs after \AlgRef{expand conformance req algorithm}, so we assume conformance requirements have been decomposed. Notice how this algorithm generalizes the \index{satisfied requirement}``requirement is satisfied'' check of \AlgRef{reqissatisfied}: if we desugar a requirement that does not contain any type parameters, the conflict list will be empty if and only if the requirement is satisfied. + \begin{algorithm}[Requirement desugaring]\label{requirement desugaring algorithm} -As input, takes an arbitrary requirement. 
As output, returns three lists of requirements, the \emph{desugared} list, \emph{redundant} list, and \emph{conflict} list. +Takes an arbitrary requirement as input. Outputs a list of desugared requirements and a list of conflicting requirements. \begin{enumerate} -\item Initialize the desugared list, redundant list and conflict list to empty lists. -\item Initialize a worklist with a single element, the input requirement. -\item (Loop) Take the next requirement from the worklist. If the requirement's subject type is a type parameter, move it to the desugared list and go back to Step 3. -\item (Desugar) Otherwise, the subject type is a concrete type. Handle each requirement kind as follows: +\item Initialize empty output and conflict lists. +\item Add the input requirement to the worklist. +\item (Next) Take the next requirement from the worklist. If the requirement's subject type is a type parameter, add it to the output list and go to Step~5. +\item (Desugar) Otherwise, the subject type is a concrete type. Handle each \index{requirement kind}requirement kind as follows: \begin{enumerate} -\item \textbf{Conformance requirements:} We perform a global conformance lookup to get the conformance of the subject type to the protocol named by the constraint type. There are three possible outcomes: +\item For a \index{conformance requirement}\textbf{conformance requirement} $\ConfReq{T}{P}$, perform the \index{global conformance lookup}global conformance lookup $\protosym{P}\otimes\texttt{T}$: \begin{enumerate} -\item Unconditional conformance: move the requirement to the redundant list. -\item Conditional conformance: add each conditional requirement to the worklist (Section~\ref{conditional conformance}). -\item Invalid conformance: Move the requirement to the conflict list. +\item If we get a \index{concrete conformance}concrete conformance, add the \index{conditional requirement}conditional requirements, if any, to the worklist. 
+\item If we get an invalid conformance, add $\ConfReq{T}{P}$ to the conflict list. \end{enumerate} -\item \textbf{Superclass requirements:} There are three possible cases: +\item For a \index{superclass requirement}\textbf{superclass requirement} $\ConfReq{T}{C}$: \begin{enumerate} -\item If the subject type and constraint type are both generic class types with the same declaration, add a same-type requirement between the two types to the worklist. -\item If the subject type does not have a superclass type (Chapter~\ref{classinheritance}), move the superclass requirement to the conflict list. -\item The final case is where the subject type has a superclass type. Construct a new requirement by replacing the original requirement's subject type with the superclass type. Add the new requirement to the worklist. +\item If \texttt{T} and \texttt{C} are two specializations of the same \index{class declaration}class declaration, add the same-type requirement $\SameReq{T}{C}$ to the worklist. +\item If \texttt{T} does not have a \index{superclass type}superclass type (\ChapRef{classinheritance}), then \texttt{T} cannot be a subclass of~\texttt{C}; add $\ConfReq{T}{C}$ to the conflict list. +\item Otherwise, let $\texttt{T}^\prime$ be the superclass type of \texttt{T}. Add the superclass requirement $\ConfReq{$\texttt{T}^\prime$}{C}$ to the worklist. \end{enumerate} -\item \textbf{Layout requirements:} Check the requirement with Algorithm~\ref{reqissatisfied}. If it is satisfied, move it the redundant list. Otherwise, move it to the conflict list. -\item \textbf{Same-type requirements:} apply Algorithm~\ref{desugar same type algo} and add the results to the desugared, redundant and conflict lists. +\item For a \index{layout requirement}\textbf{layout requirement} $\ConfReq{T}{AnyObject}$, any type parameters contained in the concrete type \texttt{T} have no bearing on the outcome. It suffices to apply \AlgRef{reqissatisfied}. 
If unsatisfied, add $\ConfReq{T}{AnyObject}$ to the conflict list. +\item For a \index{same-type requirement}\textbf{same-type requirement} $\SameReq{T}{U}$, apply \AlgRef{desugar same type algo} and add the results to the output and conflict lists. \end{enumerate} -\item (Loop) Go back to Step 3. +\item (Loop) If the worklist is empty, return. Otherwise, go back to Step~3. \end{enumerate} -Requirements on the redundant list are either silently dropped, or diagnose a warning if the \verb|-Xfrontend -warn-redundant-requirements| flag was passed. Requirements on the conflict list are diagnosed as errors. Requirements on the desugared list proceed to requirement minimization. \end{algorithm} +Requirements on the conflict list are diagnosed as errors. Requirements on the output list have the correct desugared form for minimization. To restate: +\begin{definition}\label{desugaredrequirementdef} +A \IndexDefinition{desugared requirement}\emph{desugared requirement} is a requirement where: +\begin{enumerate} +\item The left-hand side is a type parameter. +\item For a conformance requirement, the right-hand side is a \index{protocol type}protocol type. +\end{enumerate} +\end{definition} -You might want to compare the above algorithm with the ``requirement is satisfied'' check (Algorithm~\ref{reqissatisfied}). In fact, if you apply Algorithm~\ref{requirement desugaring algorithm} to a requirement that does not contain any type parameters, it will end up in the redundant list if it is satisfied, the conflict list if it is unsatisfied, and never on the desugared list. Requirement desugaring can be seen as a generalized form of checking if a requirement is satisfied. The desugared list contains the requirements that \emph{can} become true, but are not \emph{provably} true from first principles. 
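The worklist loop of the same-type desugaring algorithm above can be sketched in a few lines of Swift. This is a toy model, not the compiler's actual \texttt{Type} representation: the hypothetical \texttt{Ty} enum treats a type as either a type parameter or a nominal declaration applied to child types, which is just enough to exercise the \textsc{Abstract}, \textsc{Concrete}, \textsc{Flipped}, \textsc{Redundant}, \textsc{Recurse}, and \textsc{Conflict} steps:

```swift
// A toy model of same-type requirement desugaring (hypothetical names, not
// the compiler's data structures): a type is a type parameter, or a nominal
// declaration applied to zero or more children.
indirect enum Ty: Equatable {
    case parameter(String)         // e.g. "T" or "T.Element"
    case nominal(String, [Ty])     // e.g. Dictionary<T, Bool>
}

struct SameType: Equatable {
    let lhs: Ty
    let rhs: Ty
}

/// Desugars a same-type requirement, returning the desugared requirements
/// (subject type always a type parameter) and any conflicting requirements.
func desugar(_ req: SameType) -> (output: [SameType], conflicts: [SameType]) {
    var output: [SameType] = []
    var conflicts: [SameType] = []
    var worklist = [req]
    while let next = worklist.popLast() {
        switch (next.lhs, next.rhs) {
        case (.parameter, _):
            output.append(next)                                    // (Abstract), (Concrete)
        case (_, .parameter):
            output.append(SameType(lhs: next.rhs, rhs: next.lhs))  // (Flipped)
        case let (.nominal(d1, c1), .nominal(d2, c2)):
            if next.lhs == next.rhs { continue }                   // (Redundant)
            if d1 == d2 && c1.count == c2.count {
                // (Recurse): the two types match; walk children in parallel.
                worklist += zip(c1, c2).map { SameType(lhs: $0, rhs: $1) }
            } else {
                conflicts.append(next)                             // (Conflict)
            }
        }
    }
    return (output, conflicts)
}
```

With this model, \texttt{Dictionary<T, Bool> == Dictionary<Int, U>} desugars into \texttt{T == Int} and \texttt{U == Bool}, while \texttt{Array<T> == Set<T>} lands on the conflict list.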
+\section{Well-Formed Requirements}\label{generic signature validity} -\section{Requirement Validity}\label{generic signature validity} +After desugaring and decomposition, our user-written requirements are now in a form where we can reason about them with the \index{derived requirement}derived requirements formalism. So far, we have not attempted to impose any restrictions on the explicit requirements from which all else is derived, other than those requirements being \emph{syntactically} well-formed. In the next section, we will precisely state the invariants of a generic signature, but first, we need to extend our theory with a notion of \emph{semantically} well-formed requirements. It turns out we need this to \index{diagnostic!malformed generic signature}diagnose certain malformed generic signatures. -As defined so far, our \index{derived requirement}derived requirement formalism does not impose any semantic restrictions on generic signatures. The derivation steps are sound as long as the generic signature is structurally well-formed. However, this comes with a limitation which makes it impossible to prove certain properties. Specifically, the problem is that if we have a derivation of a requirement, say, $\ConfReq{T.A.B}{P}$, there is no apparent way to obtain a derivation of the type parameter \texttt{T.A.B} appearing in this requirement. +To motivate our notion of well-formedness, let's return to the idea of a \index{valid type parameter}\emph{valid type parameter} from \SecRef{derived req}. Specifically, we're going to look at prefixes of valid type parameters. Consider the type parameter \texttt{\rT.Element.Element} below: +\begin{Verbatim} +struct Concat<C: Collection> where C.Element: Collection { + let x: C.Element.Element = ... +} +\end{Verbatim} -In fact, with the structural definition of a generic signature provided so far, it is possible to write down generic signatures where this property does not hold.
Consider this generic signature: -\begin{quote} -\texttt{<\ttgp{0}{0} where \sout{\ttgp{0}{0}:~Sequence,} \ttgp{0}{0}.[Sequence]Element:~Hashable>} -\end{quote} -If we delete the requirement $\ConfReq{\ttgp{0}{0}}{Sequence}$, we get this: +A \index{type parameter}type parameter is a \index{prefix}\emph{prefix} of another type parameter if it is equal to the other's base type, or the base type of the base type, at any level of nesting. So \texttt{\rT.Element} and $\rT$ are the two prefixes of \texttt{\rT.Element.Element}. From the user's point of view, if ``\texttt{C.Element.Element}'' is a meaningful utterance to write down in the program, ``\texttt{C.Element}'' certainly should be as well! This suggests a reasonable property that any ``good'' generic signature ought to have: \textsl{every prefix of a valid type parameter is itself a valid type parameter}. This is not our final condition though, so we generalize further. + +When we say that a type parameter \texttt{T} is valid in a generic signature $G$, we mean that we have a derivation~$G\vdash\texttt{T}$. Only a handful of derivation steps have a type parameter as a conclusion (see \AppendixRef{derived summary}), and by inspecting these steps, we can see another way to characterize the validity of a type parameter~\texttt{T}: +\begin{itemize} +\item If \texttt{T} is a \index{generic parameter type}generic parameter, then \texttt{T} is valid if and only if it appears in $G$. +\item If \texttt{T} is an \index{unbound dependent member type}unbound \index{dependent member type}dependent member type \texttt{U.A}, then \texttt{T} is valid if and only if some protocol \texttt{P} declares an associated type named \texttt{A}, and $G\vdash\ConfReq{U}{P}$. +\item If \texttt{T} is a \index{bound dependent member type}bound dependent member type \texttt{U.[P]A}, then \texttt{T} is valid if and only if $G\vdash\ConfReq{U}{P}$, where \texttt{P} is the parent protocol of the associated type declaration denoted by \texttt{[P]A}. 
+\end{itemize} + +If we have one of those generic signatures where every prefix of \texttt{T} is also valid, then in the case where \texttt{T} is a dependent member type above, the subject type \texttt{U} of $\ConfReq{U}{P}$ is a prefix of~\texttt{T}, so \texttt{U} must be valid. We say that $\ConfReq{U}{P}$ is a \emph{well-formed} requirement if its subject type \texttt{U} is valid. This idea of well-formedness allows us to subsume ``prefix validity'' with a more general idea: \textsl{every derived conformance requirement must be well-formed}. In other words, we want $G\vdash\ConfReq{U}{P}$ to imply $G\vdash\texttt{U}$ for all \texttt{U} and \texttt{P}. + +We can also find motivation in \index{type substitution}type substitution. Let $G$ be a generic signature where some derived requirement of $G$, not necessarily a conformance requirement, contains an invalid type parameter. By \AlgRef{reqissatisfied}, no substitution map $\Sigma$ can \index{satisfied requirement}satisfy this requirement, because after we apply $\Sigma$, the invalid type parameter becomes an error type. Recalling our earlier notion of a \index{well-formed substitution map}well-formed \emph{substitution map} from \DefRef{valid subst map}, we conclude that~$G$ cannot have any well-formed substitution maps at all! To rule this out, we generalize our condition to cover all requirements, not just conformance requirements. + +\begin{definition}\label{valid requirement} +A requirement is \IndexDefinition{well-formed requirement}\emph{well-formed} with respect to a generic signature~$G$ if all type parameters contained in the requirement are valid type parameters of~$G$: +\begin{itemize} +\item A \textbf{conformance requirement} $\ConfReq{T}{P}$ is well-formed if $G\vdash\texttt{T}$. 
+\item A \textbf{superclass requirement} $\ConfReq{T}{C}$, with $\{\texttt{C}_1,\ldots,\texttt{C}_n\}$ denoting the set of type parameters contained in~\texttt{C}, is well-formed if $G\vdash\texttt{T}$ and $G\vdash\texttt{C}_i$ for all $1\le i\le n$. +\item A \textbf{layout requirement} $\ConfReq{T}{AnyObject}$ is well-formed if $G\vdash\texttt{T}$. +\item A \textbf{same-type requirement} $\SameReq{T}{U}$, with $\{\texttt{U}_1,\ldots,\texttt{U}_n\}$ denoting the set of type parameters contained in~\texttt{U}, is well-formed if $G\vdash\texttt{T}$ and $G\vdash\texttt{U}_i$ for all $1\le i\le n$. (If the right-hand side \texttt{U} is a type parameter, the set of type parameters here is trivially $\{\texttt{U}\}$.) +\end{itemize} +\end{definition} + +We will be talking about both ``derived'' and ``well-formed'' requirements below, so before we completely lose the reader, let's make the distinction totally clear. Recall the generic signature of \texttt{Concat}: \begin{quote} -\texttt{<\ttgp{0}{0} where \ttgp{0}{0}.[Sequence]Element:~Hashable>} +\texttt{<\rT~where \rT:~Collection, \rT.Element:~Collection>} \end{quote} -Intuitively, this generic signature no longer makes sense because \ttgp{0}{0} does not have an \texttt{Element} member type. In our formalism, this means that \texttt{\ttgp{0}{0}.[Sequence]Element} is not a valid type parameter. If it were valid, there would be a derivation ending with an application of the \IndexStep{AssocType}\textsc{AssocType} derivation step to a conformance requirement $\ConfReq{\ttgp{0}{0}}{Sequence}$: +We cannot derive $\ConfReq{\rT.Element.Element}{Hashable}$ from this generic signature; nothing in our signature is \texttt{Hashable}. However, this requirement is certainly well-formed, because \texttt{\rT.Element.Element} is a valid type parameter in our generic signature. Thus, a derived requirement is provably \emph{true}; a well-formed requirement makes sense as a \emph{question}. 
So starting from prefix validity, we've arrived at our final statement: we want our generic signatures to only prove things that make sense! + +\begin{definition}\label{valid generic signature def} +A \index{generic signature}generic signature $G$ is \IndexDefinition{well-formed generic signature}\emph{well-formed} if all derived requirements of~$G$ are well-formed. +\end{definition} +This definition doesn't immediately give us an algorithm for checking well-formedness. The derived requirements of a generic signature are an infinite set in general, so we cannot enumerate them all. We will resolve this dilemma shortly. + +Notice how if $G\vdash\ConfReq{T}{P}$ for some protocol \texttt{P}, the well-formedness of~$G$ implicitly depends on the well-formedness of the associated requirements of~\texttt{P}; we interpret~$G$ with respect to its \index{protocol dependency set}\emph{protocol dependency set}, containing those protocols that can appear on the right-hand side of a derived conformance requirement. We will return to this topic in \SecRef{protocol component}; for now, we can take this set to contain \emph{all} protocols as a conservative approximation. + +One more thing. The result we proved while motivating \DefRef{valid generic signature def} will be useful later, so let's re-state it for posterity: +\begin{proposition}\label{prefix prop} +Let $G$ be a well-formed generic signature, and let \texttt{T} be a \index{valid type parameter}valid type parameter of~$G$. Then every prefix of \texttt{T} is also a valid type parameter of~$G$. +\end{proposition} + +\paragraph{Diagnostics.} We now give an example of a generic signature that is not well-formed. 
We return to our \texttt{Concat} type, except here we ``forget'' to state that \texttt{C} must conform to \texttt{Collection}: +\begin{Verbatim} +struct Bad<C> where C.Element: Collection {} +\end{Verbatim} +Nevertheless, starting with the explicit requirement $\ConfReq{\rT.Element}{Collection}$, we can derive other requirements and valid type parameters; to pick a few at random: \begin{gather*} -\ldots\vdash \ConfReq{\ttgp{0}{0}}{Sequence}\tag{1}\\ -(1)\vdash \texttt{\ttgp{0}{0}.[Sequence]Element}\tag{2} +\ConfStep{\rT.Element}{Collection}{1}\\ +\AssocConfStep{1}{\rT.Element}{Sequence}{2}\\ +\AssocConfStep{2}{\rT.Element.Iterator}{IteratorProtocol}{3}\\ +\AssocNameStep{3}{\rT.Element.Iterator.Element}{4} \end{gather*} -However, we cannot derive $\ConfReq{\ttgp{0}{0}}{Sequence}$ in this generic signature; it is not explicitly stated, and there are no other requirements we could derive it from. +The derived requirements (1)~and~(2) are not well-formed because their subject types are not valid type parameters, and the valid type parameter (4) has an invalid prefix \texttt{\rT.Element}. Clearly, \texttt{Bad} ought to be rejected by the compiler. What actually happens when we type check \texttt{Bad}? Recall the \index{type resolution stage}type resolution stage from \ChapRef{typeresolution}. We first resolve the requirement $\ConfReq{\rT.Element}{Collection}$ in \index{structural resolution stage}structural resolution stage, and we get a requirement whose subject type is an \index{unbound dependent member type}unbound dependent member type. We don't know that this requirement is not well-formed, yet.
The condition talks about explicit requirements, not derived requirements, but we will prove that the more general result follows. +After we build the generic signature for \texttt{Bad}, we revisit the \texttt{where} clause again, and resolve the requirement in the \index{interface resolution stage}interface resolution stage. As the subject type is not a valid type parameter, type resolution \index{diagnostic!invalid type parameter}diagnoses an error and returns an \index{error type}error type: +\begin{Verbatim} +bad.swift:1:23: error: `Element' is not a member type of type `C' +struct Bad<C> where C.Element: Collection {} +                      ^ +\end{Verbatim} -\begin{definition}\label{valid generic signature def} -A generic signature $G$ is \IndexDefinition{valid generic signature}\emph{valid} if the following two conditions hold: +Next, we define what it means for an associated requirement to be well-formed: +\begin{definition} +An associated requirement of a protocol \texttt{P} is \emph{well-formed} if it is a well-formed requirement for the protocol generic signature $G_\texttt{P}$; that is, all type parameters it contains are valid type parameters in $G_\texttt{P}$. +\end{definition} + +A generic signature may depend on protocols written in source, or protocols from serialized modules. For protocols written in source, type resolution checks that their associated requirements are well-formed by visiting them again in the interface resolution stage, once the protocol's \index{requirement signature}requirement signature has been built. Protocols from \index{serialized module}serialized modules already have well-formed associated requirements, because they were checked before serialization. Thus, if type resolution does not diagnose any errors, all user-written requirements are well-formed. The next theorem says this is a sufficient condition for all generic signatures in the \index{main module}main module to be well-formed.
+ +\begin{theorem}\label{valid theorem} +Suppose that a generic signature~$G$ satisfies these conditions: \begin{itemize} -\item For every explicit requirement $R$ of $G$, all type parameters appearing in $R$ are valid in $G$. -\item For every protocol \texttt{P} appearing in a derivation of $G$, for every explicit requirement $R$ of the requirement signature of \texttt{P}, all type parameters appearing in $R$ are valid in \verb|<Self where Self: P>|. +\item Every explicit requirement of $G$ is well-formed. +\item For every protocol \texttt{P} such that $G\vdash\ConfReq{T}{P}$ for some \texttt{T}, every associated requirement of \texttt{P} is well-formed (in the protocol generic signature $G_\texttt{P}$). \end{itemize} -\end{definition} -Indeed, if the user attempts to impose a requirement on a non-existent type parameter, the compiler diagnoses an error instead of constructing a generic signature containing an invalid requirement. Similarly, requirements written inside a protocol can only refer to valid type parameters of the protocol \texttt{Self} type. In Section~\ref{recursive conformances}, we will show how to determine which protocols can appear in the derivations of a given generic signature. For now, it suffices to leave it unspecified. +Then every derived requirement of $G$ is well-formed; in other words, $G$ is well-formed. +\end{theorem} +To prove this theorem, we must expand our repertoire for reasoning about derivations. First, we recall the \index{protocol generic signature}protocol generic signature from \SecRef{requirement sig}. If~\texttt{P} is any protocol, then its generic signature, which we denote by~$G_\texttt{P}$, has the single requirement $\ConfReq{Self}{P}$. As always, the protocol \texttt{Self} type is sugar for $\rT$.
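To see the protocol generic signature at work in source code, here is a small illustration with a hypothetical protocol (the names \texttt{Measured}, \texttt{Unit}, and \texttt{uses} are invented for this sketch, not taken from the standard library). Everything declared in the unconstrained extension is type checked against \verb|<Self where Self: Measured>|:

```swift
// Hypothetical protocol: Self.Unit is a Self-rooted type parameter of the
// protocol generic signature <Self where Self: Measured>, and the associated
// requirement Self.Unit: Equatable is derivable from it.
protocol Measured {
    associatedtype Unit: Equatable
    var units: [Unit] { get }
}

extension Measured {
    // Relies on the derived requirement Self.Unit: Equatable, which lets us
    // call contains(_:) on [Unit].
    func uses(_ unit: Unit) -> Bool {
        return units.contains(unit)
    }
}

// A conforming type; associated type witness inference gives Unit == String.
struct Trip: Measured {
    var units: [String]
}
```

When \texttt{uses(\_:)} is later called on a concrete conforming type such as \texttt{Trip}, the derivations rooted in \texttt{Self} are re-based onto the substituted type, which is exactly the formal substitution discussed below.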
-\paragraph{Formal substitution} -We can derive various type parameters and requirements in a protocol generic signature, like \verb|<Self where Self: Sequence>|, for example: -\begin{gather*} -\ldots\vdash \ConfReq{Self.Iterator}{IteratorProtocol}\tag{1}\\ -(1)\vdash \texttt{Self.Iterator.Element}\tag{2} -\end{gather*} -Intuitively, anything we can say about the protocol \texttt{Self} type in this signature, is also true of an arbitrary type parameter \texttt{T.A.B} in some other generic signature $G$ where we can first derive the conformance requirement $\ConfReq{T.A.B}{Sequence}$. So a particular consequence of the above is that there ought to be a derivation of a valid type parameter \texttt{T.A.B.Iterator.Element} in $G$. We can show that this derivation always exists. +The protocol generic signature describes the structure generated by the protocol's requirement signature. These are the \index{valid type parameter}valid type parameters and derived requirements inside the declaration of the protocol and its unconstrained extensions. These type parameters are all rooted in the protocol \texttt{Self} type, and the derived requirements talk about these \texttt{Self}-rooted type parameters. Informally, anything we can say about the protocol \texttt{Self} type in $G_\texttt{P}$, should also be true of an arbitrary type parameter \texttt{T} in some other generic signature~$G$ where $G\vdash\ConfReq{T}{P}$. We will now make this precise. + +For example, we might first define an algorithm in a protocol extension of \texttt{Collection}, and then call our algorithm from another generic function: +\begin{Verbatim} +extension Collection { + func myComplicatedAlgorithm() {...} +} -We do this with a \IndexDefinition{formal substitution}\emph{formal substitution}. When writing down derivations, we've been doing formal substitution already.
Each kind of derivation step is defined in the form of a ``schema,'' where it is understood the various meta-syntactic variables (\texttt{T} for type parameter, \texttt{P} for protocol, \texttt{A} for associated type, and so on) are replaced with concrete instances of those entities in some specific generic signature $G$. We can also imagine applying a formal substitution to an existing derivation, replacing some elements in a way that preserves the validity of the derivation. +func anotherAlgorithm<C: Collection>(_ c: C, _ index: C.Index) + where C.Element: Collection { + c[index].myComplicatedAlgorithm() +} +\end{Verbatim} +The generic signature of \texttt{anotherAlgorithm()} is the same as \texttt{Concat} from earlier. Let's call it~$G$. The reference to \texttt{myComplicatedAlgorithm()} has this substitution map: +\[ +\SubstMapLongC{\SubstType{Self}{\rT.Element}}{\SubstConf{Self}{\rT.Element}{Collection}} +\] +The \index{input generic signature}input generic signature is $G_\texttt{Collection}$, and \index{output generic signature}output generic signature is $G$, so for generics to ``work'' we would expect that applying this substitution map to a valid type parameter or derived requirement of $G_\texttt{Collection}$ should give us a valid type parameter or derived requirement in~$G$. -Specifically, we want to take a derivation in the protocol generic signature, and replace \texttt{Self} with \texttt{T.A.B} to obtain a derivation in $G$. This looks similar to how we can form a \index{protocol substitution map}protocol substitution map from a conforming type and conformance, and then apply it to a type parameter. In our case, we could express this with our type substitution algebra: +Suppose that \texttt{myComplicatedAlgorithm()} makes use of \texttt{Self.SubSequence.Index} conforming to \texttt{Comparable} inside the body.
We can derive this requirement, together with the validity of \texttt{Self.SubSequence.Index} (so our requirement is well-formed): +\begin{gather*} +\ConfStep{Self}{Collection}{1}\\ +\AssocConfStep{1}{Self.SubSequence}{Collection}{2}\\ +\AssocNameStep{2}{Self.SubSequence.Index}{3}\\ +\AssocConfStep{2}{Self.SubSequence.Index}{Comparable}{4} +\end{gather*} +We apply our substitution map to (3) and (4). Of course, we have yet to explain how to apply a substitution map to a dependent member type! We will get to that in the next chapter, but for now, let's make the simplifying assumption that we're performing a syntactic replacement of ``\texttt{Self}'' with ``\texttt{\rT.Element}''. This gives us: +\begin{gather*} +\texttt{\rT.Element.SubSequence.Index}\\ +\ConfReq{\rT.Element.SubSequence.Index}{Comparable} +\end{gather*} +We can show that the first is a valid type parameter of $G$, and the second a derived requirement of~$G$, by taking our original derivation in $G_\texttt{Collection}$, and replacing \texttt{Self} with \texttt{\rT.Element} throughout: +\begin{gather*} +\ConfStep{\rT.Element}{Collection}{1}\\ +\AssocConfStep{1}{\rT.Element.SubSequence}{Collection}{2}\\ +\AssocNameStep{2}{\rT.Element.SubSequence.Index}{3}\\ +\AssocConfStep{2}{\rT.Element.SubSequence.Index}{Comparable}{4} +\end{gather*} +This works because $\ConfReq{\rT.Element}{Collection}$ was an \emph{explicit} requirement of $G$, so it's not as general as we'd like.
Suppose we started with a more complicated \emph{derived} conformance requirement, like $G\vdash\ConfReq{\rT.Indices}{Collection}$ in the same signature: +\begin{gather*} +\ConfStep{\rT}{Collection}{1}\\ +\AssocConfStep{1}{\rT.Indices}{Collection}{2} +\end{gather*} +To build a derivation for $G\vdash\ConfReq{\rT.Indices.SubSequence.Index}{Comparable}$, we first replace the elementary step $\ConfReq{Self}{Collection}$ with the entire \emph{derivation} of $G\vdash\ConfReq{\rT.Indices}{Collection}$; then we substitute \texttt{Self} with \texttt{\rT.Indices} in all remaining steps: \begin{gather*} -\texttt{Self.Iterator.Element}\otimes\SubstMapLongC{\SubstType{Self}{T.A.B}}{\SubstConf{Self}{T.A.B}{Sequence}}\\ -\qquad {} = \texttt{T.A.B.Iterator.Element} +\ConfStep{\rT}{Collection}{1}\\ +\AssocConfStep{1}{\rT.Indices}{Collection}{2}\\ +\AssocConfStep{2}{\rT.Indices.SubSequence}{Collection}{3}\\ +\AssocConfStep{3}{\rT.Indices.SubSequence.Index}{Comparable}{4} +\end{gather*} +A protocol generic signature has two elementary derivation steps, so we might instead have a derivation that starts with the \textsc{Generic} elementary step for $\texttt{Self}$. For example, we can derive the requirement $\SameReq{Self}{Self}$ in $G_\texttt{Collection}$: +\begin{gather*} +\GenericStep{Self}{1}\\ +\ReflexStep{1}{Self}{2} +\end{gather*} +To get a derivation $G \vdash \SameReq{\rT.Indices}{\rT.Indices}$, we make use of the fact that \texttt{\rT.Indices} is a valid type parameter of~$G$. We replace the elementary derivation step for \texttt{Self} with the entire derivation of $G \vdash \texttt{\rT.Indices}$: +\begin{gather*} +\ConfStep{\rT}{Collection}{1}\\ +\AssocNameStep{1}{\rT.Indices}{2}\\ +\ReflexStep{2}{\rT.Indices}{3} \end{gather*} -However, we haven't formally defined what it means to apply a substitution map to a dependent member type yet, nor do we have any reason to assume that the result of doing so is a valid type parameter. 
When we complete our study of type substitution in Chapter~\ref{conformance paths}, we will make use of the results in this section. By proving the below result purely in terms of derived requirements, without reference to the type substitution algebra, we avoid inadvertently presenting a circular argument.

We now state the general result. We need this for the proof of \ThmRef{valid theorem}, and also later in \SecRef{conformance paths exist}, when we show that every derived conformance requirement has a derivation of a certain form.

\begin{lemma}[Formal substitution]\label{subst lemma}
Let $G$ be an arbitrary generic signature. Suppose that $G\vdash\texttt{T}$ and $G\vdash\ConfReq{T}{P}$ for some type parameter \texttt{T} and protocol \texttt{P}. Then if we take a valid type parameter or derived requirement of~$G_\texttt{P}$ and replace \texttt{Self} with \texttt{T} throughout, we get a valid type parameter or derived requirement of~$G$, just rooted in \texttt{T}. That is:
\begin{itemize}
\item If $G_\texttt{P}\vdash\texttt{Self.U}$, then $G\vdash\texttt{T.U}$.
\item If $G_\texttt{P}\vdash R$, then $G\vdash R^\prime$, where $R^\prime$ is the substituted requirement obtained by replacing \texttt{Self} with \texttt{T} in $R$.
\end{itemize}
\end{lemma}

\begin{proof}
Suppose we wanted to write down an \emph{algorithm} for the above. As input, we're given a generic signature~$G$, a pair of derivations $G\vdash\texttt{T}$ and $G\vdash\ConfReq{T}{P}$, and some derivation in $G_\texttt{P}$. We rewrite the derivation in $G_\texttt{P}$ into a derivation in $G$ by visiting each consecutive step. First, consider the elementary derivation steps of $G_\texttt{P}$:
\begin{gather*}
\texttt{Self}\tag{\textsc{Generic}}\\
\ConfReq{Self}{P}\tag{\textsc{Conf}}
\end{gather*}
Each one is replaced with the \emph{entire derivation} of \texttt{T} or $\ConfReq{T}{P}$, respectively:
\begin{gather*}
\texttt{T}\tag{$\ldots$}\\
\ConfReq{T}{P}\tag{$\ldots$}
\end{gather*}
For all other derivation steps, we substitute \texttt{T} in place of \texttt{Self} anywhere it appears in the derivation step's statement. This gives us the desired derivation in~$G$.
\end{proof}

A few words about the above proof. To be completely rigorous, we should go through each \index{inference rule}inference rule and argue that the substitution outputs a meaningful derivation step in each case. We're not going to do that, because we will demonstrate this form of exhaustive case analysis in the proof of \ThmRef{valid theorem} instead. Finally, note that our lemma makes the assumption that $\ConfReq{T}{P}$ is well-formed, because we need $G\vdash\texttt{T}$. However, we are specifically not assuming that $G$~is well-formed. In fact, we will use this lemma in the proof of \ThmRef{valid theorem}, so such an assumption would be circular.
\paragraph{Structural induction.} Suppose that $P(n)$ is some property of the \index{natural numbers}natural numbers, and we wish to show that it is true of all $n\in\mathbb{N}$. We can argue by \IndexDefinition{induction}\emph{induction}, writing down a proof in two parts: a \emph{base case} establishing $P(0)$, and an \emph{inductive step} showing that $P(n)$ follows from $P(n-1)$ for all $n>0$. \emph{Structural induction} adapts this idea to derivations. To prove that a property $P(D)$ holds of every statement $D$ derivable in a generic signature~$G$, we again write down a proof in two parts:
\begin{itemize}
\item The \emph{base case} establishes that $P(D)$ is true for every elementary statement $D$. Recall that these are the \IndexStep{Generic}\textsc{Generic} steps for each generic parameter of~$G$, and the steps for each explicit requirement of~$G$.
\item In the \emph{inductive step}, we have a derivation step with assumptions $D_1$, \ldots, $D_n$ and conclusion $D$. We assume $P(D_1)$, \ldots, $P(D_n)$ hold, then argue that $P(D)$ must hold as a consequence. We perform a case analysis to handle each kind of inference rule.
\end{itemize}
If we wish to prove a statement about derived requirements only, we slightly modify the above scheme. The base case no longer needs to consider \textsc{Generic} steps, but also, the \IndexStep{Reflex}\textsc{Reflex} step \emph{becomes} a base case, because none of its assumptions are requirements. The following proof, which we are now in a position to state, uses the modified scheme.
\begin{proof}[Proof of Theorem~\ref*{valid theorem}]
We're given a requirement~$R$ such that $G\vdash R$, and a type parameter contained in~$R$. We must construct a derivation of this type parameter.

\smallskip

\emph{Base case.} We start with the elementary statements:
\begin{gather*}
\ConfStepDef\\
\SameStepDef\\
\ConcreteStepDef\\
\SuperStepDef\\
\LayoutStepDef
\end{gather*}
The first assumption in our theorem was that all explicit requirements of~$G$ are well-formed. In other words, every type parameter in~$R$ is a valid type parameter of~$G$, so we're done.
The other base case is that we have a requirement obtained by the \IndexStep{Reflex}\textsc{Reflex} inference rule, from some valid type parameter~\texttt{T}:
\begin{gather*}
\ReflexStepDef
\end{gather*}
We must have $G\vdash\texttt{T}$, so the requirement $\SameReq{T}{T}$ is then well-formed by definition.

\smallskip

\emph{Inductive step.} We must handle each kind of inference rule in turn. Consider an \IndexStep{AssocBind}\textsc{AssocBind} step, for some type parameter~\texttt{T}, protocol~\texttt{P} and associated type~\texttt{A}:
\begin{gather*}
\AssocBindStepDef
\end{gather*}
The conclusion is a derived same-type requirement that contains two type parameters, \texttt{T.[P]A} and \texttt{T.A}. We can derive both from the conformance requirement $\ConfReq{T}{P}$ using \IndexStep{AssocDecl}\textsc{AssocDecl} and \IndexStep{AssocName}\textsc{AssocName}:
\begin{gather*}
\AnyStep{\ConfReq{T}{P}}{1}\\
\AssocDeclStep{1}{T.[P]A}{2}\\
\AssocNameStep{1}{T.A}{3}
\end{gather*}
Next, consider the \IndexStep{SameName}\textsc{SameName} and \IndexStep{SameDecl}\textsc{SameDecl} derivation steps:
\begin{gather*}
\SameNameStepDef\\
\SameDeclStepDef
\end{gather*}
We have four type parameters to derive: \texttt{T.A}, \texttt{T.[P]A}, \texttt{U.A} and \texttt{U.[P]A}.
We first derive $\ConfReq{T}{P}$ from $\ConfReq{U}{P}$, and then apply \textsc{AssocDecl} and \textsc{AssocName} to $\ConfReq{T}{P}$ and $\ConfReq{U}{P}$:
\begin{gather*}
\AnyStep{\ConfReq{U}{P}}{1}\\
\AnyStep{\SameReq{T}{U}}{2}\\
\SameConfStep{1}{2}{T}{P}{3}\\
\AssocNameStep{3}{T.A}{4}\\
\AssocDeclStep{3}{T.[P]A}{5}\\
\AssocNameStep{1}{U.A}{6}\\
\AssocDeclStep{1}{U.[P]A}{7}
\end{gather*}
To handle the derivation steps generated by the \index{associated requirement}associated requirements of a protocol~\texttt{P}, we must make use of the second assumption of our theorem, together with \LemmaRef{subst lemma}. The simplest case is an \IndexStep{AssocConf}\textsc{AssocConf} or \IndexStep{AssocLayout}\textsc{AssocLayout} step:
\begin{gather*}
\AssocConfStepDef\\
\AssocLayoutStepDef
\end{gather*}
By the induction hypothesis, $G\vdash\texttt{T}$. We must show that $G\vdash\texttt{T.U}$. The requirement on the right-hand side is obtained by replacing \texttt{Self} with \texttt{T} in some associated conformance requirement $\ConfReq{Self.U}{Q}_\texttt{P}$.
By assumption, this associated requirement is well-formed with respect to $G_\texttt{P}$, so we have a derivation $G_\texttt{P}\vdash\texttt{Self.U}$. All the conditions of \LemmaRef{subst lemma} are satisfied, thus we can construct a derivation $G\vdash\texttt{T.U}$.

For an \IndexStep{AssocSame}associated same-type requirement $\SameReq{Self.U}{Self.V}_\texttt{P}$, we repeat the same construction to derive $G\vdash\texttt{T.U}$ and $G\vdash\texttt{T.V}$:
\begin{gather*}
\AssocSameStepDef
\end{gather*}
If we are looking at a \IndexStep{AssocConcrete}concrete same-type requirement $\SameReq{Self.U}{X}_\texttt{P}$, or a \IndexStep{AssocSuper}superclass requirement $\ConfReq{Self.U}{C}_\texttt{P}$, we again use \LemmaRef{subst lemma} to derive each type parameter that appears in $\Xprime$ and $\Cprime$, which we use to denote the types obtained from \texttt{X} and \texttt{C}, respectively, by structural replacement of \texttt{Self} with \texttt{T}:
\begin{gather*}
\AssocConcreteStepDef\\
\AssocSuperStepDef
\end{gather*}
Finally, all other derivation steps have the property that the type parameters appearing in their conclusion already appear in their assumptions, so by the induction hypothesis, the derived requirement is well-formed:
\begin{gather*}
\SymStepDef\\
\TransStepDef\\
\SameConfStepDef\\
\SameConcreteStepDef\\
\SameSuperStepDef\\
\SameLayoutStepDef
\end{gather*}
This completes the induction.
\end{proof}
Formally, structural induction depends on a \index{well-founded order}well-founded order (\SecRef{reduced types}), so we would use the ``containment'' order on derivations. However, the ``recursive algorithm'' viewpoint is good enough for us. Induction over the natural numbers is covered in introductory books such as \cite{grimaldi}; for structural induction in formal logic, see something like~\cite{bradley2007calculus}. We will use structural induction over derivations again to study conformance paths in \SecRef{conformance paths exist}, encode finitely-presented monoids as protocols in \SecRef{monoidsasprotocols}, and finally present a correctness proof for the Requirement Machine in \SecRef{rqm correctness}.
\medskip

Before we end this section, we revisit \index{bound type parameter}bound and \index{unbound type parameter}unbound type parameters from \SecRef{type params}, and prove one final result. We previously claimed that every equivalence class of type parameters contains both a bound and an unbound type parameter representative. This actually requires the assumption that our generic signature is well-formed.
\begin{theorem}\label{bound and unbound equiv}
Let $G$ be a well-formed \index{generic signature}generic signature, and suppose \texttt{T} is a \index{valid type parameter}valid type parameter of~$G$. Then, the \index{equivalence class}equivalence class of \texttt{T} also contains two type parameters (not necessarily distinct), which we denote by $\texttt{T}^*$ and $\texttt{T}_*$, such that:
\begin{enumerate}
\item $\texttt{T}^*$ is a bound type parameter,
\item $\texttt{T}_*$ is an unbound type parameter,
\item $\texttt{T}^*$~and~$\texttt{T}_*$ have the same \index{type parameter length}length as~\texttt{T},
\item $\texttt{T}^*\le\texttt{T}\le\texttt{T}_*$ under the type parameter order.
\end{enumerate}
Furthermore, if \texttt{T} is a \index{reduced type parameter}reduced type parameter, then $\texttt{T}^*$ is canonically equal to $\texttt{T}$, and thus every reduced type parameter of~$G$ is a bound type parameter.
\end{theorem}
\begin{proof}
We proceed by \index{induction}induction on the length of the type parameter~\texttt{T}. In the base case, we prove that the property holds for all generic parameters; in the inductive step, we prove it holds for any dependent member type whose base type has the property.
\smallskip

\emph{Base case:} \index{base case}if \texttt{T} is a generic parameter type \ttgp{d}{i}, we can set both $\texttt{T}^*$ and $\texttt{T}_*$ to~\texttt{T}, so $\texttt{T}^*\le\texttt{T}\le\texttt{T}_*$ holds. Then, we derive a trivial same-type requirement twice:
\begin{gather*}
\GenericStep{\ttgp{d}{i}}{1}\\
\ReflexStep{1}{\ttgp{d}{i}}{2}\\
\ReflexStep{1}{\ttgp{d}{i}}{3}
\end{gather*}
\emph{Inductive step:} \index{inductive step}assume that \texttt{T} is a dependent member type, either \texttt{U.[P]A} (bound) or \texttt{U.A} (unbound), for some base type \texttt{U} and associated type \texttt{A} of a protocol~\texttt{P}. We have $G\vdash\ConfReq{U}{P}$ because \texttt{T} is valid, and $G\vdash\texttt{U}$ because $\ConfReq{U}{P}$ is well-formed. The length of \texttt{U} is one less than the length of \texttt{T}, so the induction hypothesis gives us a pair of same-type requirements $\SameReq{U}{$\texttt{U}^*$}$ and $\SameReq{U}{$\texttt{U}_*$}$, such that $\texttt{U}^*\le\texttt{U}\le\texttt{U}_*$. We set $\texttt{T}^*:=\texttt{U}^*\texttt{.[P]A}$ and $\texttt{T}_*:=\texttt{U}_*\texttt{.A}$, and derive three new same-type requirements via \IndexStep{SameDecl}\textsc{SameDecl}, \IndexStep{AssocBind}\textsc{AssocBind} and \IndexStep{SameName}\textsc{SameName}:
\begin{gather*}
\SameDeclStep{$\SameReq{U}{$\texttt{U}^*$}$}{$\ConfReq{U}{P}$}{U.[P]A}{$\texttt{T}^*$}{1}\\
\AssocBindStep{$\ConfReq{U}{P}$}{U.[P]A}{U.A}{2}\\
\SameNameStep{$\SameReq{U}{$\texttt{U}_*$}$}{$\ConfReq{U}{P}$}{U.A}{$\texttt{T}_*$}{3}
\end{gather*}
From the definition of the type parameter order in \SecRef{reduced types}, we also see that:
\[\texttt{T}^*\le\texttt{U.[P]A}<\texttt{U.A}\le\texttt{T}_*\]
If $\texttt{T}$ is the bound dependent member type $\texttt{U.[P]A}$, we already have $\SameReq{T}{$\texttt{T}^*$}$ as~(1), but we must \IndexStep{Trans}derive $\SameReq{T}{$\texttt{T}_*$}$:
\[
\TransStep{2}{3}{T}{$\texttt{T}_*$}{4}
\]
If $\texttt{T}$ is the unbound dependent member type $\texttt{U.A}$, we find ourselves in the opposite situation. We already have $\SameReq{T}{$\texttt{T}_*$}$ as~(3), but we derive $\SameReq{T}{$\texttt{T}^*$}$ with \IndexStep{Sym}\textsc{Sym} and \IndexStep{Trans}\textsc{Trans}:
\begin{gather*}
\SymStep{2}{U.[P]A}{T}{5}\\
\TransStep{5}{1}{T}{$\texttt{T}^*$}{6}
\end{gather*}
This completes the induction. To prove the second part of the theorem, we further assume that \texttt{T} is reduced. By the preceding argument, we get a bound type parameter~$\texttt{T}^*$ such that $\texttt{T}^*\le\texttt{T}$. On the other hand, a reduced type parameter is the smallest element in its equivalence class, so $\texttt{T}\le\texttt{T}^*$. We conclude that $\texttt{T}^*$ is canonically equal to $\texttt{T}$ if~\texttt{T} is reduced.
\end{proof}

\section{Requirement Minimization}\label{minimal requirements}

The last step shown in \FigRef{inferred generic signature request figure}~and~\ref{abstract generic signature request figure} is called \IndexDefinition{requirement minimization}\emph{requirement minimization}.
To finish building our generic signature, we must transform the list of desugared requirements into a list of \emph{minimal} requirements. These minimal requirements are then given to the \index{generic signature constructor}primitive constructor, together with the list of generic parameter types collected at the start of the process, and we have our generic signature!

Because of the central role that generic signatures play in the Swift \index{ABI}ABI, it is worth describing the requirement minimization problem in the abstract, which is our goal in the present section. The implementation itself will be revealed in \PartRef{part rqm}, when we build a rewrite system from desugared requirements (\SecRef{building rules}) and then perform rewrite system minimization (\ChapRef{rqm minimization}). The material in this section was based on~\cite{gensig}, an earlier write-up about the \Index{GenericSignatureBuilder@\texttt{GenericSignatureBuilder}}\texttt{GenericSignatureBuilder}.

We can summarize the key behaviors of requirement minimization:
\begin{enumerate}
\item
Type substitution only accepts bound dependent member types. To ensure that we can apply a substitution map to the requirements of a generic signature, as we do in \AlgRef{check generic arguments algorithm} for example, each requirement is rewritten to use \index{bound dependent member type}bound dependent member types.

\item Generic signatures describe the calling convention of generic functions, the layout of nominal type metadata, the mangling of symbol names, and so on.
To ensure that trivial syntactic changes do not affect ABI, each requirement in a generic signature is \emph{reduced} into the simplest possible form, redundant requirements are dropped to produce a \emph{minimal} list, and this list is sorted in canonical order.

\item We need to detect and \index{diagnostic!conflicting requirement}diagnose generic signatures with \emph{conflicting requirements} that cannot be satisfied by any \index{well-formed substitution map}well-formed substitution map. This allows us to assume that all generic signatures in the \index{main module}main module are satisfiable as long as no diagnostics are emitted during type checking.
\end{enumerate}

\paragraph{Equivalence of requirements.} In our derived requirements formalism, a generic signature is just a list of requirements. We saw in the previous section that the only real assumption we need from those requirements is that they are \index{desugared requirement}desugared requirements. Thus, when taken with the list of generic parameter types, the desugared requirements given to minimization already form a generic signature in our theory.

We will define a \emph{minimal} generic signature as one satisfying the additional conditions we just described. We will understand requirement minimization as a mathematical \index{function}function that takes a generic signature and outputs a minimal generic signature.
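As a concrete illustration of the second behavior, consider a \texttt{where} clause that states more than it needs to. The function below is our own example, using only standard library protocols:
\begin{Verbatim}
// The programmer wrote two requirements, but [T: Sequence] is
// derivable from [T: Collection], because Collection inherits
// from Sequence. Minimization drops the redundant requirement,
// and the minimal signature is <T where T: Collection>.
func elementCount<T>(_ items: T) -> Int
    where T: Collection, T: Sequence {
  return items.count
}
\end{Verbatim}
Both spellings describe the same set of concrete substitutions, so they must produce identical symbol manglings and calling conventions; minimization is what guarantees this.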
In the implementation, all generic signatures come from requirement minimization, and so all generic signatures are actually minimal. To justify this transformation, we introduce the notion of \emph{equivalent} generic signatures. Two generic signatures are equivalent if they generate the same \emph{theory}; that is, they have the same set of \index{valid type parameter}valid type parameters and \index{derived requirement}derived requirements. Note that in particular, if two generic signatures~$G_1$ and $G_2$ are equivalent, then~$G_1$ is well-formed if and only if~$G_2$ is well-formed. + +\begin{proposition}\label{equiv generic signatures} +Any two generic signatures $G_1$~and~$G_2$ satisfying the below conditions are \IndexDefinition{generic signature equivalence}\emph{equivalent}, meaning they generate the same theory: \begin{enumerate} -\item The generic signature $G$ is valid. -\item We have a derivation of $\ConfReq{T}{P}$ in $G$, which was one of our initial assumptions. -\item We have a derivation of $\ConfReq{Self.W}{Q}$ in \verb||, which followed from the validity of \texttt{Self.V}. +\item $G_1$ and $G_2$ have the same list of generic parameter types. +\item Every explicit requirement of $G_1$ can be derived in $G_2$. +\item Every explicit requirement of $G_2$ can be derived in $G_1$. \end{enumerate} +\end{proposition} +\begin{proof} +We will show that $G_1$ and $G_2$ generate the same theory by first arguing that the theory of $G_1$ is a subset of the theory of $G_2$, and vice versa. -We're almost done. We have a conformance requirement $\ConfReq{T.W}{Q}$, and a same-type requirement -$\FormalReq{T.W == U.W}$. 
We can apply the \textsc{Member} derivation step to derive the same-type requirement $\FormalReq{T.W.[Q]A == U.W.[Q]A}$: -\begin{gather*} -\ldots\vdash \ConfReq{Self.W}{Q}\tag{1}\\ -\ldots\vdash \FormalReq{T.W == U.W}\tag{2}\\ -(1),\,(2)\vdash \FormalReq{T.W.[Q]A == U.W.[Q]A}\tag{3} -\end{gather*} -Noting that \texttt{T.W.[Q]A} is \texttt{T.V} and \texttt{U.W.[Q]A} is \texttt{U.V}, we see that we have the exact same-type requirement $\FormalReq{T.V == U.V}$ we set out to derive, completing our proof. +Suppose we are given a derivation~$G_1\vdash D$, where $D$ is either a valid type parameter or derived requirement of~$G_1$. We must show that $D$ is a valid type parameter or derived requirement of~$G_2$. The elementary derivation steps that can appear in $G_1\vdash D$ are those defined by the generic parameters and explicit requirements of~$G_1$. We can mechanically construct a derivation of $G_2\vdash D$ from $G_1\vdash D$ to arrive at our conclusion: +\begin{enumerate} +\item By the first assumption, every \IndexStep{Generic}\textsc{Generic} step that derives a generic parameter of $G_1$ is already a valid \textsc{Generic} step for $G_2$, so we leave it unchanged. +\item By the second assumption, every elementary step for an explicit requirement of~$G_1$ has a corresponding \emph{derivation} of the same requirement in~$G_2$. We replace each elementary step with its derivation. +\item All other derivation steps remain unchanged. +\end{enumerate} +This is a derivation in $G_2$, and so the theory of $G_1$ is a subset of the theory of $G_2$. To get the other inclusion, we make the same argument but with $G_1$ and $G_2$ swapped, and use the third assumption in step~2 instead. 
\end{proof}
-\section{Requirement Minimization}\label{minimal requirements}
-
-\index{conflicting requirement}
-\index{redundant requirement}
-\index{minimal requirement}
-\index{reduced requirement}
-\index{bound dependent member type}
-\index{unbound dependent member type}
-\index{structural resolution stage}
-Requirement minimization reasons about relationships between multiple requirements, detecting redundancies and conflicts in the process. It also reduces unbound dependent member types appearing in desugared requirements with bound dependent member types and performs other simplifications. Requirement minimization is the final step of the process shown in Figure \ref{inferred generic signature request figure}~and~\ref{abstract generic signature request figure}.
-
-\IndexFlag{debug-generic-signatures}
-Let's look at some examples before diving into the details. You can try compiling these with the \texttt{-debug-generic-signatures} flag, which will print each generic signature as its being built. The mode of reasoning employed by the below examples is similar to how the behavior of generic signature queries were justified in Section~\ref{genericsigqueries}.
-This is not a coincidence; generic signature queries and requirement minimization are both built on the same substrate, as we will learn in Part~\ref{part rqm}.
+Note that we do not attempt to delete ``redundant'' generic parameters. For example, \verb|<T, U where T: Sequence, U == T.Element>| and \verb|<T where T: Sequence>| are also ``equivalent'' in some broader sense, because any derivation in the former theory can be transformed into one for the latter by replacing \verb|U| with \verb|T.Element|. The latter is also more ``minimal'' because the calling convention for a function with this signature only needs to pass type metadata for \texttt{T}, rather than both \texttt{T} and \texttt{U}.
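+To make this concrete, here is a hedged sketch of our own (not taken from the compiler sources) showing two declarations whose generic signatures are related in exactly this way:
+\begin{Verbatim}
+func f<T: Sequence>(_: T) {}
+func g<T: Sequence, U>(_: T) where U == T.Element {}
+\end{Verbatim}
+A caller of \texttt{g()} must still provide runtime type metadata for \texttt{U}, even though \texttt{U} is always equal to \texttt{T.Element}; requirement minimization does not rewrite the signature of \texttt{g()} into that of \texttt{f()}.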
Since we do not pursue this direction, our notion of equivalence bakes in the idea that both the original and minimal generic signatures have the same list of generic parameter types.
-\begin{example} Let's model geometric shapes with a class hierarchy, which is rather trite, and furthermore declare a protocol with an associated type subject to a superclass requirement:
-\begin{Verbatim}
-class Shape {}
-class Rectangle: Shape {}
-class Square: Rectangle {}
-class Circle: Shape {}
+We want requirement minimization to output a minimal generic signature that is equivalent, but there are two important exceptions to keep in mind:
+\begin{enumerate}
+\item If any of the original requirements are not \index{well-formed generic signature}well-formed, we cannot rewrite them to use bound dependent member types, so we drop them instead, and the minimal generic signature describes a smaller theory. This is fine; type resolution will have diagnosed errors already, and we will not proceed to code generation.
+\item Our derived requirements formalism does not explain all implemented behaviors of superclass, layout and concrete same-type requirements. If these \index{requirement kind}requirement kinds are among the explicit requirements of the generic signature, or the associated requirements of some protocol, minimization may output a generic signature with a different theory. We will see examples of this later.
+\end{enumerate}
-protocol Sponge {
-  associatedtype S: Rectangle
-}
-\end{Verbatim}
-We're going to look at three functions which all declare a generic parameter \texttt{T} conforming to \texttt{Sponge} and then impose one of three additional superclass requirements on \texttt{T.S}.
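+To illustrate the first exception with a hedged example of our own (the exact diagnostic wording is approximate), consider a requirement whose subject type cannot be resolved:
+\begin{Verbatim}
+func f<T>(_: T) where T.Element == Int {}
+
+// error: 'Element' is not a member type of type parameter 'T'
+\end{Verbatim}
+Nothing constrains \texttt{T} to conform to a protocol with a member type named \texttt{Element}, so the requirement is not well-formed. Type resolution diagnoses the error, minimization drops the requirement, and the minimal generic signature is just \verb|<T>|.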
By choosing different classes for this superclass requirement we observe some different behaviors of requirement minimization:
-\begin{itemize}
-\item The requirement \verb|T.S: Shape| of \texttt{f()} is redundant:
-\begin{Verbatim}
-func f(_: T) where T.S: Shape {}
-\end{Verbatim}
-To see why, note that \texttt{T.S} is already a \texttt{Shape}:
-\begin{itemize}
-\item \texttt{T} conforms to \texttt{P},
-\item \texttt{P} requires that its \texttt{S} associated type is a \texttt{Rectangle},
-\item every \texttt{Rectangle} is a \texttt{Shape}.
-\end{itemize}
-Indeed, for \texttt{f()} to say that \texttt{T.S} is a \texttt{Shape} does not give you anything new. The generic signature of \texttt{f()} is just \verb||.
+As an aside, \PropRef{equiv generic signatures} and \ThmRef{bound and unbound equiv} allow us to finally explain why we can take the explicit requirements in a derivation to contain bound \emph{or} unbound dependent member types, depending on the occasion. Indeed, sometimes we would omit the \texttt{[P]}'s to save space, other times we left them in to be explicit. It turns out that it doesn't matter; as long as the explicit requirements we start with are well-formed, we can derive either list from the other and arrive at the same theory.
-\item The requirement \verb|T.S: Square| of \texttt{g()} is neither redundant, nor conflicting:
+\paragraph{Reduced requirements.}
+We're now going to work towards a definition of a minimal generic signature. Consider this function:
\begin{Verbatim}
-func g(_: T) where T.S: Square {}
+func uniqueElements1<T: Sequence>(_: T) -> Int
+    where T.Element: Hashable {}
\end{Verbatim}
-Since \texttt{Square} inherits from \texttt{Rectangle} it actually makes sense for \texttt{g()} to further constrain \texttt{T.S}, giving us a function operating on square-shaped sponges\footnote{Or at least, it makes sense to the compiler. Whether this class hierarchy is meaningful to a human programmer is another question.}.
The generic signature of \texttt{g()} is:
+When building the generic signature of \texttt{uniqueElements1()}, we get two user-written requirements from type resolution in the \index{structural resolution stage}structural resolution stage:
+\[\{\ConfReq{T}{Sequence},\,\ConfReq{T.Element}{Hashable}\}\]
+The subject type of the second requirement is an \index{unbound dependent member type}unbound dependent member type formed from the \index{generic parameter type}generic parameter type \texttt{T} and identifier \texttt{Element}. We can replace the second requirement with $\ConfReq{T.[Sequence]Element}{Hashable}$; the latter's subject type is a bound dependent member type, formed from \texttt{T} and the associated type declaration \texttt{Element} of \texttt{Sequence}. To justify this, we can derive $\ConfReq{T.[Sequence]Element}{Hashable}$ from $\ConfReq{T.Element}{Hashable}$:
+\begin{gather*}
+\ConfStep{T.Element}{Hashable}{1}\\
+\ConfStep{T}{Sequence}{2}\\
+\AssocBindStep{2}{T.[Sequence]Element}{T.Element}{3}\\
+\SameConfStep{1}{3}{T.[Sequence]Element}{Hashable}{4}
+\end{gather*}
+And vice versa:
+\begin{gather*}
+\ConfStep{T.[Sequence]Element}{Hashable}{1}\\
+\ConfStep{T}{Sequence}{2}\\
+\AssocBindStep{2}{T.[Sequence]Element}{T.Element}{3}\\
+\SymStep{3}{T.Element}{T.[Sequence]Element}{4}\\
+\SameConfStep{1}{4}{T.Element}{Hashable}{5}
+\end{gather*}
+This is how we arrive at the generic signature that we see if we test this example with the \IndexFlag{debug-generic-signatures}\texttt{-debug-generic-signatures} flag:
\begin{quote}
\begin{verbatim}
-
+<T where T : Sequence, T.[Sequence]Element : Hashable>
\end{verbatim}
\end{quote}
-Something interesting happened here, though. In type parameter order, bound dependent member types precede (are ``more reduced'' than) unbound dependent member types (Section~\ref{typeparams}). For this reason, requirement minimization reduced the subject type of the second requirement from \texttt{T.S} to \texttt{T.[Sponge]S}.
-
-\item The requirement \verb|T.S: Circle| of \texttt{h()} is conflicting:
+Now, notice that the second conformance requirement's subject type is not just bound but \index{reduced type parameter}\emph{reduced}. This suggests a stronger condition. We could instead spell the conformance to \texttt{Hashable} with the subject type \texttt{T.Iterator.Element}:
\begin{Verbatim}
-func h(_: T) where T.S: Circle {}
-
-// error: no type for `T.S' can satisfy both `T.S : Circle' and
-// `T.S : Rectangle'
+func uniqueElements2<T: Sequence>(_: T) -> Int
+    where T.Iterator.Element: Hashable {}
\end{Verbatim}
-Our protocol \texttt{P} requires that \texttt{T.S} inherits from \texttt{Rectangle}, while the function requires that \texttt{T.S} inherits from \texttt{Circle}. The same class cannot inherit from both \texttt{Rectangle} and \texttt{Circle} because neither is a superclass of the other (and Swift does not allow multiple inheritance). This means our function \texttt{h()} cannot be invoked at all, because there is no substitution map which simultaneously satisfies all of our requirements. Requirement minimization diagnoses an error to this effect.
+If we test this with \texttt{-debug-generic-signatures}, we see that \texttt{uniqueElements2()} has the same generic signature as \texttt{uniqueElements1()}.
We start from these requirements: +\[\{\ConfReq{T}{Sequence},\,\ConfReq{T.Iterator.Element}{Hashable}\}\] +We can similarly show that this is an equivalent list: +\begin{align*} +\{&\ConfReq{T}{Sequence},\,\\ +&\ConfReq{T.[Sequence]Iterator.[IteratorProtocol]Element}{Hashable}\} +\end{align*} +However, the reduced type of \texttt{T.Iterator.Element} is \texttt{T.[Sequence]Element}, because of the associated same-type requirement in \texttt{Sequence}, so in fact we can simplify the second conformance requirement further without changing the theory: +\[\{\ConfReq{T}{Sequence},\,\ConfReq{T.[Sequence]Element}{Hashable}\}\] +This is a list of \emph{reduced} requirements, because the subject type of each is a reduced type parameter in our generic signature. We will define reduced requirements more generally now. In what follows, there is an important distinction between same-type requirements between type parameters ($\SameReq{T.A}{T.B}$), and same-type requirements where the right hand side is a concrete type ($\SameReq{T.A}{Array}$). We will treat them as essentially two different kinds of requirements. The below definition is straightforward for every \index{requirement kind}requirement kind, except for a same-type requirement between type parameters. + +\begin{definition}\label{reduced requirement} +Let $G$ be a generic signature, and let $R$ be an explicit requirement of~$G$. We say $R$ is a \IndexDefinition{reduced requirement}\emph{reduced requirement} if the following holds: +\begin{itemize} +\item For a \index{conformance requirement}\textbf{conformance requirement} $\ConfReq{T}{P}$: \texttt{T} is a reduced type parameter. +\item For a \index{layout requirement}\textbf{layout requirement} $\ConfReq{T}{AnyObject}$: \texttt{T} is a reduced type parameter. +\item For a \index{superclass requirement}\textbf{superclass requirement} $\ConfReq{T}{C}$: both \texttt{T} and \texttt{C} are reduced types. 
+\item For a \index{same-type requirement}\textbf{same-type requirement} $\SameReq{T}{X}$ where \texttt{X} is a \textbf{concrete type}: \texttt{T} is a reduced type parameter and \texttt{X} is a reduced type.
+\item For a \textbf{same-type requirement} $\SameReq{T}{U}$ where \texttt{U} is a \textbf{type parameter}:
+\begin{enumerate}
+\item $\texttt{T} < \texttt{U}$ under the \index{type parameter order}type parameter order.
+\item \texttt{T} is either a reduced type parameter, or identical to the right-hand side of some other (explicit) same-type requirement in~$G$.
+\item \texttt{T} is not identical to the left-hand side of any \emph{other} same-type requirement of~$G$.
+\item \texttt{U} has the property that the only derivations $G\vdash\SameReq{$\texttt{U}^\prime$}{U}$ where $\texttt{U}^\prime<\texttt{U}$ must involve the explicit requirement $\SameReq{T}{U}$ itself; that is, \texttt{U} cannot be further reduced by any other requirement of~$G$.
+\end{enumerate}
\end{itemize}
-\end{example}
+\end{definition}
-\begin{example}\label{same-type minimization example}
-For the next setup, we need a protocol with three associated types:
+For a same-type requirement between type parameters, we cannot simply say that both sides are reduced type parameters, because the only way this can happen is if we have a trivial same-type requirement $\SameReq{T}{T}$ for some reduced type parameter~\texttt{T}. An example will clarify the situation. Recall this funny protocol from \SecRef{type params}:
\begin{Verbatim}
-protocol P {
-  associatedtype A
-  associatedtype B
-  associatedtype C
+protocol N {
+  associatedtype A: N
}
\end{Verbatim}
-We're going to look at three different ways of relating \texttt{A}, \texttt{B} and \texttt{C} with same-type requirements.
The basic setup is the same as in the previous example; each of the three functions below has a single generic parameter \texttt{T} conforming to the same protocol \texttt{P}, however each function will impose different additional requirements on the protocol's associated types.
-\begin{itemize}
-\item In the first function, the three type parameters \texttt{T.A}, \texttt{T.B} and \texttt{T.C} collapse into one:
+Now, consider these two generic structs:
\begin{Verbatim}
-func f(_: T) where T.A == T.B, T.A == T.C, T.B == T.C {}
+struct Hook1<T, U> where U: N, T == T.A, T.A == U.A {}
+struct Hook2<T, U> where U: N, T == T.A, U.A == T {}
\end{Verbatim}
-\index{transitive relation}
-Note that while we wrote three same-type requirements above, any two alone are sufficient and imply the third (the relation generated by same-type requirements is \emph{transitive}). The minimization algorithm keeps the first and last requirements and diagnoses the second one as redundant, leaving us with this generic signature:
+We will look at \texttt{Hook1} first, but we will see that both in fact have the same generic signature.
To build the generic signature of \texttt{Hook1}, we start with these user-written requirements:
+\[\{\ConfReq{U}{N},\,\SameReq{T}{T.A},\,\SameReq{T.A}{U.A}\}\]
+As before, we derive an equivalent list involving only bound dependent member types:
+\[\{\ConfReq{U}{N},\,\SameReq{T}{T.[N]A},\,\SameReq{T.[N]A}{U.[N]A}\}\]
+This list of requirements is already reduced and minimal, so we get this generic signature:
\begin{quote}
\begin{verbatim}
-
+<T, U where T == T.[N]A, U : N, T.[N]A == U.[N]A>
\end{verbatim}
\end{quote}
-\item In the second function, neither of the two requirements are redundant, but the second one can be simplified further:
-\begin{Verbatim}
-func g(_: T) where T.A == Array, T.A == Array {}
-\end{Verbatim}
-By transitivity, \verb|T.A == Array| and \verb|T.A == Array| together imply that \verb|Array == Array|, which following the reasoning of Example~\ref{same-type desugaring example}, is actually equivalent to \verb|T.B == Int|.
+To better understand this generic signature, note that we can derive $\ConfReq{T}{N}$:
+\begin{gather*}
+\ConfStep{U}{N}{1}\\
+\AssocConfStep{1}{U.[N]A}{N}{2}\\
+\SameStep{T.[N]A}{U.[N]A}{3}\\
+\SameConfStep{2}{3}{T.[N]A}{N}{4}\\
+\SameStep{T}{T.[N]A}{5}\\
+\SameConfStep{4}{5}{T}{N}{6}
+\end{gather*}
+Both \texttt{T} and \texttt{U} conform to $\protosym{N}$, so each has a member type named \texttt{A}, but because of our same-type requirements, these member types are equivalent to~\texttt{T}. We get this \index{type parameter graph}type parameter graph:
+\[
+\begin{tikzpicture}
+\node (T) [interior] {\texttt{T}};
+\node (U) [interior, right=of T] {\texttt{U}};
+
+\begin{scope}[on background layer]
+\path (U) edge [arrow] node [yshift=5pt] {\tiny{\texttt{.A}}} (T);
+\path (T) edge [loop below, arrow] node {\tiny{\texttt{.A}}} ();
+\end{scope}
+\end{tikzpicture}
+\]
+Next, we look at \texttt{Hook2}, which only differs in how the second same-type requirement is specified.
We write the requirements of \texttt{Hook2} with bound dependent member types: +\[\{\ConfReq{U}{N},\,\SameReq{T}{T.[N]A},\,\SameReq{U.[N]A}{T}\}\] +The last requirement is not reduced, because $\texttt{U.[N]A}>\texttt{T}$. However, we can derive $\SameReq{T.[N]A}{U.[N]A}$, which is reduced, from $\SameReq{U.[N]A}{T}$: +\begin{gather*} +\SameStep{U.[N]A}{T}{1}\\ +\SameStep{T}{T.[N]A}{2}\\ +\TransStep{1}{2}{U.[N]A}{T.[N]A}{3}\\ +\SymStep{3}{T.[N]A}{U.[N]A}{4} +\end{gather*} +And vice versa: +\begin{gather*} +\SameStep{T.[N]A}{U.[N]A}{1}\\ +\SameStep{T}{T.[N]A}{2}\\ +\TransStep{1}{2}{T}{U.[N]A}{3}\\ +\SymStep{3}{U.[N]A}{T}{4} +\end{gather*} +This shows that \texttt{Hook1} and \texttt{Hook2} actually have the same minimal generic signature, which \texttt{-debug-generic-signatures} will confirm. -At this point, we can replace \emph{either} of the two original requirements with the new requirement \verb|T.B == Int|. We say \verb|T.A == Array| is ``more minimal'' since the right hand side is fully concrete, so we remove the other requirement, leaving us with: -\begin{quote} -\begin{verbatim} -, T.[P]B == Int> -\end{verbatim} -\end{quote} -\item The third function has a conflicting requirement: -\begin{Verbatim} -func h(_: T) where T.A == Array, T.A == Set {} +There's a bit of history behind this test case. Our derivation of $\ConfReq{T}{N}$ involved the same-type requirement $\SameReq{T}{T.A}$, which contains a member type of \texttt{T}. This was too difficult for the \Index{GenericSignatureBuilder@\texttt{GenericSignatureBuilder}}\texttt{GenericSignatureBuilder} to reason about, so we only accepted \texttt{Hook2} but not \texttt{Hook1}, despite \texttt{Hook1} being the one whose requirements are written in reduced form according to the rules of the Swift~\index{ABI}ABI. This bug was fixed with the Requirement Machine. 
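+We can observe the derived conformance $\ConfReq{T}{N}$ from the surface language. In this hedged sketch of our own (the extension and method are not from the compiler sources), the member type \texttt{T.A} is valid inside the extension, and by our same-type requirements it is equivalent to \texttt{T}:
+\begin{Verbatim}
+extension Hook1 {
+  // T: N is derivable, so T.A resolves; T.A reduces to T,
+  // so returning a value of type T type checks.
+  func project(_ t: T) -> T.A { return t }
+}
+\end{Verbatim}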
-
-// error: error: no type for `T.A' can satisfy both
-// `T.A == Set' and `T.A == Array'
+\paragraph{Minimal requirements.} Here is another variant of a function we saw earlier:
+\begin{Verbatim}
+func uniqueElements3<T: Sequence>(_: T) -> Int
+    where T.Element: Hashable,
+          T.Iterator: IteratorProtocol,
+          T.Element: Equatable {}
+\end{Verbatim}
-We can justify this claim as follows:
+We get this list of reduced requirements:
+\begin{align*}
+\{&\ConfReq{T}{Sequence},\,\ConfReq{T.[Sequence]Element}{Hashable},\\
+&\ConfReq{T.[Sequence]Iterator}{IteratorProtocol},\\
+&\ConfReq{T.[Sequence]Element}{Equatable}\}
+\end{align*}
+The third and fourth requirements do not give us anything new, because they can be derived from the other two. Thus, \texttt{uniqueElements3()} has the same minimal generic signature as \texttt{uniqueElements1()} and \texttt{uniqueElements2()}. (In the past, we \index{diagnostic!redundant requirements}diagnosed redundant requirements as a \index{warning}warning, but this was removed in \IndexSwift{5.7}Swift~5.7, so now they're just dropped.)
+
+\index{requirement minimization|see {minimal requirement}}
+\begin{definition}\label{minimal generic sig def} Let $G$ be a generic signature.
\begin{itemize}
-\item The first requirement makes \texttt{T.A} and \texttt{Array} equal as reduced types.
-\item The second requirement makes \texttt{T.A} and \texttt{Set} equal as reduced types.
-\item Thus, \texttt{Array} and \texttt{Set} are equal as reduced types (transitivity of equality once again being the mathematical justification for this).
-\item However, \texttt{Array} and \texttt{Set} can never be equal as reduced types.
-\end{itemize}
-Indeed, if you had written the requirement \verb|Array == Set|, desugaring would detect the conflict (recall Example~\ref{conflicting requirement example}), but in this case the conflict is a consequence of two requirements.
+\item An explicit requirement~$R$ of $G$ is a \IndexDefinition{redundant requirement}\emph{redundant requirement} if we can write down a derivation $G\vdash R$ using only the remaining requirements, without invoking~$R$ as an \index{elementary statement}elementary statement.
+\item We say $G$ is a \emph{minimal} generic signature if every explicit requirement of~$G$ is reduced, and no explicit requirement of~$G$ is redundant.
\end{itemize}
-\end{example}
+\end{definition}
-\begin{example}\label{conformance minimization}
-Our final example will demonstrate how conformance and concrete same-type requirements interact:
+The next example shows that a list of reduced requirements may have more than one \emph{distinct} minimal subset.
+We declare three generic structs that only differ in conformance requirements; the reader may again want to try this with \texttt{-debug-generic-signatures}:
\begin{Verbatim}
-struct NotHashable {}
-
-struct Box where T.Element: Hashable {
-  func f() where T.Iterator.Element == Int {}
-
-  // error: no type for `T.Element' can satisfy both
-  // `T.Element == NotHashable' and `T.Element : Hashable'
-  func g() where T.Element == NotHashable {}
-}
+struct Knot1<T, U> where T: N, T == U.A, U == T.A {}
+struct Knot2<T, U> where T == U.A, U: N, U == T.A {}
+struct Knot3<T, U> where T: N, T == U.A, U: N, U == T.A {}
\end{Verbatim}
-The generic signature of \texttt{Box} is:
+We're going to look at the generic signature of \texttt{Knot1} first.
Type resolution produces the following list of user-written requirements for \texttt{Knot1}:
+\[\{\ConfReq{T}{N},\,\SameReq{T}{U.A},\,\SameReq{U}{T.A}\}\]
+Here is the equivalent list of reduced requirements:
+\[\{\ConfReq{T}{N},\,\SameReq{T}{U.[N]A},\,\SameReq{U}{T.[N]A}\}\]
+This list is minimal, so we get the following generic signature for \texttt{Knot1}:
\begin{quote}
\begin{verbatim}
-
+<T, U where T : N, T == U.[N]A, U == T.[N]A>
\end{verbatim}
\end{quote}
-The generic signatures of \texttt{f()} and \texttt{g()} are built from the requirements of the generic signature of \texttt{Box}, together with one additional requirement in each method.
-\begin{description}
-\item[\texttt{f()}:] The requirement \verb|T.Iterator.Element == Int| can be more simply written as \verb|T.Element == Int|, via the same-type requirement in the \texttt{Sequence} protocol. Then, we can see that the requirement \verb|T.Element: Hashable| is now redundant, because \verb|T.Element| is fixed to \verb|Int|, which conforms to \verb|Hashable|. So the final generic signature becomes:
+Notice that we have an explicit requirement $\ConfReq{T}{N}$, and we can also derive $\ConfReq{U}{N}$:
+\begin{gather*}
+\ConfStep{T}{N}{1}\\
+\AssocConfStep{1}{T.[N]A}{N}{2}\\
+\SameStep{U}{T.[N]A}{3}\\
+\SameConfStep{2}{3}{U}{N}{4}
+\end{gather*}
+We will now show the type parameter graph for \texttt{Knot1}. The requirements look similar to \texttt{Hook1} because we have two equivalence classes that have a member type named~\texttt{A}, but the same-type requirements act in a different way. Each member type~\texttt{A} now takes us to the \emph{opposite} equivalence class:
+\[
+\begin{tikzpicture}
+\node (T) [interior] {\texttt{T}};
+\node (U) [interior, right=of T] {\texttt{U}};
+
+\begin{scope}[on background layer]
+\path (T) edge [arrow, bend left] node [yshift=5pt] {\tiny{\texttt{.A}}} (U);
+\path (U) edge [arrow, bend left] node [yshift=-5pt] {\tiny{\texttt{.A}}} (T);
+\end{scope}
+\end{tikzpicture}
+\]
+Now, consider \texttt{Knot2}.
Here is our list of reduced requirements:
+\[\{\SameReq{T}{U.[N]A},\,\ConfReq{U}{N},\,\SameReq{U}{T.[N]A}\}\]
+This list is again minimal, so we get the following generic signature for \texttt{Knot2}:
\begin{quote}
\begin{verbatim}
-
+<T, U where T == U.[N]A, U : N, U == T.[N]A>
\end{verbatim}
\end{quote}
+In \texttt{Knot2}, we have an explicit requirement $\ConfReq{U}{N}$, and we can derive $\ConfReq{T}{N}$:
+\begin{gather*}
+\ConfStep{U}{N}{1}\\
+\AssocConfStep{1}{U.[N]A}{N}{2}\\
+\SameStep{T}{U.[N]A}{3}\\
+\SameConfStep{2}{3}{T}{N}{4}
+\end{gather*}
+We've shown that each one of $\ConfReq{T}{N}$ and $\ConfReq{U}{N}$ can be derived in both \texttt{Knot1} and \texttt{Knot2}. Also, the remaining explicit requirements are identical, so we now see that we have two distinct minimal generic signatures that generate the same theory.
-\item[\texttt{g()}:] Here, the requirement \verb|T.Element == NotHashable| conflicts with the conformance requirement in the parent declaration's generic signature, because \verb|NotHashable| does not conform to \verb|Hashable|. There is no replacement type for \texttt{T} which satisfies both requirements simultaneously, so a conflict is diagnosed.
-\end{description}
-\end{example}
+Finally, we look at \texttt{Knot3}. Here are our reduced requirements:
+\[\{\ConfReq{T}{N},\,\SameReq{T}{U.[N]A},\,\ConfReq{U}{N},\,\SameReq{U}{T.[N]A}\}\]
+Unlike the first two, these requirements are not minimal; we already saw how each one of $\ConfReq{T}{N}$ and $\ConfReq{U}{N}$ can be derived from the other. In a situation like this, requirement minimization prefers to delete the redundant requirement with the larger subject type under the \index{type parameter order}type parameter order.
-\paragraph{Formal definitions}
-The list of requirements in a generic signature plays an important role in the Swift \index{ABI}ABI: it forms the basis for the calling convention of generic functions, the layout of generic nominal type metadata, the mangling of symbol names, and more.
In the remainder of this section we expand upon the formal definition of requirement minimization that was first written down in \cite{gensig}.
+In this case, that means we delete $\ConfReq{U}{N}$ first. Our derivation of $\ConfReq{T}{N}$ involves $\ConfReq{U}{N}$, so by \PropRef{equiv generic signatures}, we can replace the explicit requirement $\ConfReq{U}{N}$ with its derivation to obtain a new derivation of $\ConfReq{T}{N}$:
+\begin{gather*}
+\ConfStep{T}{N}{1}\\
+\AssocConfStep{1}{T.[N]A}{N}{2}\\
+\SameStep{U}{T.[N]A}{3}\\
+\SameConfStep{2}{3}{U}{N}{4}\\
+\AssocConfStep{4}{U.[N]A}{N}{5}\\
+\SameStep{T}{U.[N]A}{6}\\
+\SameConfStep{5}{6}{T}{N}{7}
+\end{gather*}
+We are no longer able to derive $\ConfReq{T}{N}$ from the \emph{remaining} requirements, because the above derivation of $\ConfReq{T}{N}$ starts with the elementary derivation step for $\ConfReq{T}{N}$. Thus, having deleted $\ConfReq{U}{N}$, the explicit requirement $\ConfReq{T}{N}$ is no longer redundant, and the generic signature of \texttt{Knot3} is the same as \texttt{Knot1}:
+\begin{quote}
+\begin{verbatim}
+<T, U where T : N, T == U.[N]A, U == T.[N]A>
+\end{verbatim}
+\end{quote}
-\begin{definition}\label{generic signature invariants definition} The requirements of a generic signature are desugared, valid, minimal, and reduced.
-\end{definition}
+The fact that \texttt{Knot2} has a distinct generic signature from the other two was actually due to a quirk of the \Index{GenericSignatureBuilder@\texttt{GenericSignatureBuilder}}\texttt{GenericSignatureBuilder}, and this behavior is now part of the Swift \index{ABI}ABI. The stronger form of requirement minimization that guarantees uniqueness would actually be \emph{simpler} to implement, and we will explain the minor complication with the legacy behavior in \SecRef{minimal conformances}.
-Desugared requirements were previously introduced in Definition~\ref{desugaredrequirementdef}; the remaining concepts are defined below.
+There is another downside, from a theoretical standpoint.
With type parameters, we are able to \index{reduced type equality}check equivalence by comparing their reduced types. However, we cannot check two lists of requirements for ``theory equivalence'' by comparing minimal generic signatures, because they are not unique. In practice though, nothing seems to call for this equivalence check; this is unlike type parameters, which are checked for reduced type equality all over the place. -\IndexDefinition{valid type parameter} -\Index{isValidTypeParameter()@\texttt{isValidTypeParameter()}} -\Index{requiresProtocol()@\texttt{requiresProtocol()}} -\begin{definition}\label{valid type parameter} -A requirement is \emph{valid} if the subject type and any type parameters appearing on the right hand side are valid type parameters. -\end{definition} -\begin{definition} A type parameter is \emph{valid} if one of the following holds: -\begin{itemize} -\item The type parameter is a generic parameter type in the generic signature. -\item The type parameter is a dependent member type \texttt{T.[P]A} with base type \texttt{T} and associated type \texttt{A} of protocol \texttt{P}, and the base type \texttt{T} is both recursively valid, and conforms to the protocol \texttt{P}, that is, the conformance requirement \texttt{T:\ P} is known to be satisfied via the \texttt{requiresProtocol()} generic signature query. -\end{itemize} -The \texttt{isValidTypeParameter()} generic signature query (Section~\ref{genericsigqueries}) determines if a type parameter is valid, except it also deals with unbound dependent member types, which the above definition does not cover (for the purposes of this section, only bound dependent member types are relevant, because a valid type parameter cannot contain an unbound dependent member type if it is also reduced). 
-\end{definition}
+The important invariant that we do maintain, to make \index{textual interface}textual interfaces work for example, is \emph{idempotence}: when we pass from user-written requirements to minimal requirements for the first time, we are allowed to make choices amongst multiple minimal subsets, but if we take this output and build a new generic signature from those minimal requirements \emph{again}, we must arrive at the \emph{same} minimal generic signature. This also justifies the optimization of re-using requirement machines in~\ChapRef{rqm basic operation}.
-\IndexDefinition{minimal requirement}
-\index{requirement minimization|see {minimal requirement}}
-\begin{definition} We can attempt to \emph{delete} a requirement by forming a new generic signature from the remaining requirements and checking the invariants of Definition~\ref{generic signature invariants definition}. A requirement is \emph{minimal} if one of the following holds:
+\paragraph{Conflicting requirements.} In the previous section, we related the ideas of the \index{well-formed substitution map}well-formed substitution map (\DefRef{valid subst map}) and the well-formed generic signature (\DefRef{valid generic signature def}); a necessary condition for a generic signature to have any well-formed substitution maps at all is for it to be a well-formed generic signature.
+
+If we limit ourselves to the subset of the language with only \index{conformance requirement}conformance requirements and \index{same-type requirement}same-type requirements between type parameters, this condition is also sufficient, meaning we can mechanically construct a well-formed substitution map for a well-formed generic signature.
We first declare a single concrete nominal type, call it struct~\texttt{S}, and conform \texttt{S} to each protocol $\texttt{P}_i$ that our generic signature~$G$ depends on:
\begin{itemize}
-\item The requirement cannot be deleted, because at least one of the remaining requirements would become invalid.
-\item The requirement can be deleted, but the resulting generic signature does not satisfy the deleted requirement, in the sense of Algorithm~\ref{reqissatisfied}.
+\item Any \index{function declaration}method, \index{variable declaration}variable and \index{subscript declaration}subscript requirements of $\texttt{P}_i$ can be witnessed by stub implementations that call \texttt{fatalError()}.
+\item Any \index{associated type declaration}associated types of $\texttt{P}_i$ can be witnessed by type alias members of~\texttt{S} declared to have underlying type \texttt{S}.
\end{itemize}
-\end{definition}
+We then build a substitution map for $G$ by replacing each generic parameter type with \texttt{S} and each conformance requirement with the corresponding \index{normal conformance}normal conformance $\ConfReq{S}{$\texttt{P}_i$}$. Every derived requirement of~$G$ has the form $\ConfReq{T}{$\texttt{P}_i$}$ for some type parameter \texttt{T} and protocol $\texttt{P}_i$, or $\SameReq{T}{U}$ for a pair of type parameters \texttt{T} and \texttt{U}. Applying our substitution map, we always get either $\ConfReq{S}{$\texttt{P}_i$}$ or $\SameReq{S}{S}$. Both requirements are satisfied no matter what, so our substitution map is well-formed.
-\begin{example} Consider the generic signature of \verb|Box| from Example~\ref{conformance minimization}:
-\begin{quote}
-\begin{verbatim}
-
-\end{verbatim}
-\end{quote}
-The first requirement cannot be deleted; the second requirement's subject type would become invalid, since \texttt{T} would no longer conform to \texttt{Sequence}.
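+To sketch this construction in source form (the names \texttt{P}, \texttt{S} and \texttt{f()} below are our own, chosen for illustration), suppose our generic signature depends on a single protocol with one associated type and one method requirement:
+\begin{Verbatim}
+protocol P {
+  associatedtype A
+  func f() -> A
+}
+
+struct S: P {
+  typealias A = S                  // associated type witnessed by S itself
+  func f() -> S { fatalError() }   // stub witness
+}
+\end{Verbatim}
+The substitution map sending each generic parameter to \texttt{S}, with the normal conformance $\ConfReq{S}{P}$ witnessing each conformance requirement, is then well-formed.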
The second requirement \emph{can} be deleted, giving us this signature: -\begin{quote} -\begin{verbatim} - -\end{verbatim} -\end{quote} -However, the second conformance requirement is no longer satisfied in this generic signature, since the conformance of \verb|T| to \verb|Sequence| alone does not imply that \verb|T.Element| is \verb|Hashable|. So we see that both requirements are minimal. -\end{example} -We now have all of Definition~\ref{generic signature invariants definition} except for ``reduced,'' which we will define now. In what follows, there is an important distinction between same-type requirements between type parameters (\verb|T.A == T.B|), and same-type requirements where the right hand side is a concrete type (\verb|T.A == Array|). Recall that the left hand side of a same-type requirement is called the subject type, and the right hand side is the constraint type. -\index{type parameter order} -\IndexDefinition{reduced requirement} -\begin{definition} -For all requirement kinds other than same-type requirements between two type parameters, the definition of a \emph{reduced} requirement in a generic signature is straightforward: +With \index{superclass requirement}superclass, \index{layout requirement}layout, and concrete \index{same-type requirement}same-type requirements, the picture is more complicated. When type parameters are required to have specific concrete types, we cannot ``collapse'' the entire generic signature down to a single point. Indeed, as we said, \index{conflicting requirement}\emph{conflicting requirements} might prevent our generic signature from being \index{satisfied requirement}satisfied by \emph{any} substitution map. There is another complication. While we can write derivations involving these ``exotic'' \index{requirement kind}requirement kinds, \index{limitation!derived requirements}certain \index{inference rule}inference rules are missing. 
The \index{type substitution}type substitution algebra will fill in some of these gaps.
+
+Suppose that $\Sigma$ is a substitution map where all replacement types are \index{fully-concrete type}fully concrete, so we can apply $\Sigma$ to a requirement and then apply \AlgRef{reqissatisfied} to check if it is satisfied. Further, let's say that we can derive two concrete \index{same-type requirement}same-type requirements, where both have the same subject type parameter~\texttt{T} of $G$:
+\begin{gather*}
+\AnyStep{\SameReq{T}{$\texttt{X}_1$}}{1}\\
+\AnyStep{\SameReq{T}{$\texttt{X}_2$}}{2}
+\end{gather*}
+(Note that with our existing inference rules, each requirement is either explicit, or an associated requirement of a protocol after substitution of \texttt{Self}; there are no other ways to ``compose'' concrete same-type requirements right now.) We can apply $\Sigma$ to both requirements to get a pair of substituted requirements:
+\[ \SameReq{$\texttt{T}\otimes\Sigma$}{$\texttt{X}_1\otimes\Sigma$}\qquad\text{and}\qquad
+   \SameReq{$\texttt{T}\otimes\Sigma$}{$\texttt{X}_2\otimes\Sigma$}. \]
+If $\Sigma$ is a \index{well-formed substitution map}well-formed substitution map, it must satisfy both substituted requirements; that is, $\texttt{T}\otimes\Sigma$ must be \index{canonical type equality}canonically equal to $\texttt{X}_1\otimes\Sigma$, and also $\texttt{T}\otimes\Sigma$ must be canonically equal to $\texttt{X}_2\otimes\Sigma$. But canonical type equality is \index{transitive relation}transitive, so it follows that $\texttt{X}_1\otimes\Sigma$ must be canonically equal to $\texttt{X}_2\otimes\Sigma$.
+
+In other words, a necessary condition for $\Sigma$ to be well-formed is that it must satisfy the requirement $\SameReq{$\texttt{X}_1$}{$\texttt{X}_2$}\otimes\Sigma$. Now, $\SameReq{$\texttt{X}_1$}{$\texttt{X}_2$}$ is not a derived requirement of~$G$, because it has a concrete type on both sides.
However, we know what to do if the user writes such a requirement directly; we apply \AlgRef{desugar same type algo}. If we do the same here, each of the possible outcomes gives us further information about~$G$: \begin{itemize} -\item A conformance or layout requirement is reduced if the requirement's subject type is a reduced type parameter. -\item A superclass requirement or a same-type requirement with a concrete constraint type is reduced if the subject type is a reduced type parameter, and if any type parameters contained in the constraint type are reduced. +\item If $\texttt{X}_1$ does not match $\texttt{X}_2$ in the sense used by the desugaring algorithm, then no substitution map $\Sigma$ can simultaneously satisfy both $\SameReq{T}{$\texttt{X}_1$}$ and $\SameReq{T}{$\texttt{X}_2$}$, so these two requirements are in conflict with each other, and~$G$ must be rejected. + +\item Otherwise, we always obtain a simpler list of requirements $\{R_1,\ldots,R_n\}$ with the property that $\Sigma$ satisfies $\SameReq{$\texttt{X}_1$}{$\texttt{X}_2$}$ if and only if it satisfies $R_i$ for all $1\le i\le n$. If one of the original derived requirements was actually an explicit requirement of~$G$, we can replace it with $\{R_1,\ldots,R_n\}$ without changing the ``intended meaning'' of the generic signature. (This might change the theory though, but only because our theory is missing some \index{inference rule}inference rules, as we already said.) \end{itemize} -\end{definition} -For a same-type requirement between two type parameters, the above definition does not work; \texttt{T.A == T.B} states that \texttt{T.A} and \texttt{T.B} have the same reduced type, so by definition at least one of \texttt{T.A} or \texttt{T.B} is not reduced. We need a few more steps before we can define a reduced same-type requirement between type parameters. 
-\index{same-type requirement} -\begin{definition} -A same-type requirement between two type parameters is \emph{oriented} if the subject type precedes the constraint type in type parameter order (Section~\ref{typeparams}). -\end{definition} -We can say that the constraint type of an oriented same-type requirement can be reduced to the subject type by the same-type requirement itself. The key property we want is that \emph{no other} same-type requirement in our generic signature can reduce the same type parameter. -\begin{definition} A same-type requirement is \index{left-reduced same-type requirement}\emph{right-reduced} if it is oriented, and the right hand side cannot be reduced by any combination of same-type requirements not involving this requirement itself. -\end{definition} -The last step is to state the condition satisfied by the left hand side of a same-type requirement. An example will be illustrative. In Example~\ref{same-type minimization example}, we started with \verb|T.A == T.B|, \verb|T.A == T.C|, and \verb|T.B == T.C|, and minimization output \verb|T.A == T.B| and \verb|T.B == T.C|. More generally, if you minimize a list of requirements equating each one of \texttt{T.A}, \texttt{T.B}, \texttt{T.C} and \texttt{T.D} with the rest (this is a complete graph of order 4), +Let's revisit \ExRef{concrete type query example}. 
We declare a protocol \texttt{Foo} with two associated types, together with an associated same-type requirement $\SameReq{Self.A}{Array<Self.B>}_\texttt{Foo}$:
+\begin{Verbatim}
+protocol Foo {
+  associatedtype A where A == Array<B>
+  associatedtype B
+}
+\end{Verbatim}
+Now, consider this function:
+\begin{Verbatim}
+func f1<T: Foo>(_: T) where T.A == Array<Int> {}
+\end{Verbatim}
+We can derive two concrete same-type requirements involving \texttt{T}; the first one is explicit, the second is a consequence of the associated same-type requirement in~\texttt{Foo}:
+\begin{gather*}
+\ConcreteStep{T.A}{Array<Int>}{1}\\
+\ConfStep{T}{Foo}{2}\\
+\AssocConcreteStep{2}{T.A}{Array<T.B>}{3}
+\end{gather*}
+As per our discussion above, a substitution map satisfying these requirements must also satisfy $\SameReq{Array<Int>}{Array<T.B>}$, which desugars to $\SameReq{T.B}{Int}$. We can, in fact, replace our explicit same-type requirement with this desugared requirement, so the final generic signature is this:
\begin{quote}
\begin{verbatim}
-T.A == T.B
-T.A == T.C
-T.A == T.D
-T.B == T.C
-T.B == T.D
-T.C == T.D
+<T where T : Foo, T.[Foo]B == Int>
\end{verbatim}
\end{quote}
-the minimization algorithm outputs the ``circuit,''
+Notice how these two lists of requirements are \emph{not} equivalent by \PropRef{equiv generic signatures}, because we cannot derive $\SameReq{T.B}{Int}$ from the first list of requirements; once the derived requirements formalism has been fully fleshed out, these ought to become equivalent:
+\begin{gather*}
+\{ \ConfReq{T}{Foo},\, \SameReq{T.A}{Array<Int>} \}\\
+\{ \ConfReq{T}{Foo},\, \SameReq{T.B}{Int} \}
+\end{gather*}
+
+Next, we're going to change our example slightly to get a pair of conflicting requirements, and we see that we \index{diagnostic!conflicting requirement}diagnose an error:
+\begin{Verbatim}
+func f2<T: Foo>(_: T) where T.A == Set<Int> {}
+
+// error: no type for `T.A' can satisfy both `T.A == Set<Int>' and
+// `T.A == Array<T.B>'
+\end{Verbatim}
+
+What if \emph{both} of our same-type requirements are explicit?
We now take this protocol:
+\begin{Verbatim}
+protocol Foe {
+  associatedtype X
+  associatedtype Y
+}
+\end{Verbatim}
+So that we can define this function:
+\begin{Verbatim}
+func f3<T: Foe>(_: T) where T.X == Set<T.Y>, T.X == Set<Int> {}
+\end{Verbatim}
+In \texttt{f3()}, we can replace \emph{either} requirement with $\SameReq{T.Y}{Int}$ without changing the ``meaning'' of our generic signature, but since the first requirement's constraint type is not reduced, we replace it first. Thus, we get the following generic signature:
\begin{quote}
\begin{verbatim}
-T.A == T.B
-T.B == T.C
-T.C == T.D
+<T where T : Foe, T.[Foe]X == Set<Int>, T.[Foe]Y == Int>
\end{verbatim}
\end{quote}
-and not the ``star,''
+
+We now state the general definition.
+
+\begin{definition}\label{conflicting req def}
+Let $G$ be a well-formed \index{generic signature}generic signature. If $G$ has a pair of \index{derived requirement}derived requirements $R_1$~and~$R_2$ where $R_1\otimes\Sigma$ and $R_2\otimes\Sigma$ cannot both be \index{satisfied requirement}satisfied by the same substitution map~$\Sigma$, then $R_1$~and~$R_2$ define a pair of \IndexDefinition{conflicting requirement}\emph{conflicting requirements}. A generic signature $G$ is \emph{conflict-free} if it does not have any pairs of conflicting requirements. The pairs of derived requirements that can lead to conflicts are enumerated below:
+\begin{enumerate}
+\item For two concrete \index{same-type requirement}same-type requirements $\SameReq{T}{$\texttt{X}_1$}$ and $\SameReq{T}{$\texttt{X}_2$}$, we desugar the ``combined'' requirement $\SameReq{$\texttt{X}_1$}{$\texttt{X}_2$}$, as we already saw. Here and in every remaining case below, desugaring will either detect a conflict, or produce a simpler list of requirements that can replace one of the two original requirements.
+\item For a concrete same-type requirement $\SameReq{T}{X}$ and a superclass requirement $\ConfReq{T}{C}$, we desugar $\ConfReq{X}{C}$, which can be satisfied only if~\texttt{X} is a class type that is also a subclass of~\texttt{C}.
+\item For a concrete same-type requirement $\SameReq{T}{X}$ and a \index{layout requirement}layout requirement $\ConfReq{T}{AnyObject}$, we desugar $\ConfReq{X}{AnyObject}$, which can be satisfied only if \texttt{X} is a class type.
+\item For a concrete same-type requirement $\SameReq{T}{X}$ and a \index{conformance requirement}conformance requirement $\ConfReq{T}{P}$, we desugar $\ConfReq{X}{P}$, which can be satisfied only if \texttt{X} conforms to \texttt{P}.
+\item For two \index{superclass requirement}superclass requirements $\ConfReq{T}{$\texttt{C}_1$}$ and $\ConfReq{T}{$\texttt{C}_2$}$, we must consider the \index{superclass type}superclass relationship between the declarations of $\texttt{C}_1$~and~$\texttt{C}_2$:
+\begin{enumerate}
+\item If the \index{class declaration}class declaration of $\texttt{C}_1$ is a subclass of the declaration of $\texttt{C}_2$, we desugar $\ConfReq{$\texttt{C}_1$}{$\texttt{C}_2$}$, and $\ConfReq{T}{$\texttt{C}_2$}$ becomes redundant.
+\item If the class declaration of $\texttt{C}_2$ is a subclass of the declaration of $\texttt{C}_1$, we desugar $\ConfReq{$\texttt{C}_2$}{$\texttt{C}_1$}$, and $\ConfReq{T}{$\texttt{C}_1$}$ becomes redundant.
+\item If the two declarations are unrelated, the requirements conflict.
+\end{enumerate}
+\item For a superclass requirement $\ConfReq{T}{C}$ and a layout requirement $\ConfReq{T}{AnyObject}$, we desugar $\ConfReq{C}{AnyObject}$, which is always satisfied and cannot conflict.
+\item For a superclass requirement $\ConfReq{T}{C}$ and a conformance requirement $\ConfReq{T}{P}$, we desugar $\ConfReq{C}{P}$. If \texttt{C} conforms to \texttt{P}, the conformance requirement $\ConfReq{T}{P}$ becomes redundant.
However, if \texttt{C} does not conform to \texttt{P}, there is no conflict; the generic signature just requires a subclass of \texttt{C} that \emph{also} conforms to \texttt{P}.
+\end{enumerate}
+\end{definition}
+
+We will explain how the implementation deals with superclass, layout and concrete same-type requirements, sans theory, in Chapters \ref{propertymap}~and~\ref{concrete conformances}, but we're going to look at two examples here.
+
+\smallskip
+
+\begin{wrapfigure}[8]{r}{4.2cm}
+\begin{center}
+\begin{tikzpicture}[node distance=0.5cm]
+\node (Shape) [class] {\texttt{\vphantom{Sp}Shape}};
+\node (Polygon) [class, below=of Shape] {\texttt{Polygon}};
+\node (Star) [class, right=of Polygon] {\texttt{\vphantom{Sp}Star}};
+\node (Pentagon) [class, below=of Polygon] {\texttt{Pentagon}};
+
+\draw [arrow] (Shape) -- (Polygon);
+\draw [arrow] (Shape) -- (Star);
+\draw [arrow] (Polygon) -- (Pentagon);
+\end{tikzpicture}
+\end{center}
+\end{wrapfigure}
+
+Our next example involves superclass requirements. While a description of generic classes awaits us in \ChapRef{classinheritance}, we only need non-generic classes to demonstrate the key ideas in requirement minimization. Since it is such a clich\'e at this point, we're going to follow the trend and go with the classic object-oriented ``shape hierarchy'', shown on the right.
We also introduce a \texttt{Canvas} protocol, which subjects its associated type to an \index{associated superclass requirement}associated superclass requirement $\ConfReq{Self.Boundary}{Polygon}_\texttt{Canvas}$:
+
+\begin{Verbatim}
+class Shape {}
+class Polygon: Shape {}
+class Pentagon: Polygon {}
+class Star: Shape {}
+
+protocol Canvas {
+  associatedtype Boundary: Polygon
+}
+\end{Verbatim}
+
+Our first function imposes a more general superclass bound on \texttt{C.Boundary} than what is implied by the conformance requirement $\ConfReq{C}{Canvas}$:
+\begin{Verbatim}
+func h1<C: Canvas>(_: C) where C.Boundary: Shape {}
+\end{Verbatim}
+Because every \texttt{Polygon} is also a \texttt{Shape}, the requirement $\ConfReq{C.Boundary}{Shape}$ is redundant, so we're just left with $\ConfReq{C}{Canvas}$.
+
+The second function tightens the superclass bound on \texttt{C.Boundary}:
+\begin{Verbatim}
+func h2<C: Canvas>(_: C) where C.Boundary: Pentagon {}
+\end{Verbatim}
+Not every \texttt{Polygon} is a \texttt{Pentagon}, so the minimal generic signature of \texttt{h2()} includes the requirement $\ConfReq{C.[Canvas]Boundary}{Pentagon}$ in addition to $\ConfReq{C}{Canvas}$.
+
+Finally, if we attempt to impose an unrelated superclass bound, we get an error \index{diagnostic!conflicting requirement}diagnosing the conflict:
+\begin{Verbatim}
+func h3<C: Canvas>(_: C) where C.Boundary: Star {}
+
+// error: no type for `C.Boundary' can satisfy both `C.Boundary : Star'
+// and `C.Boundary : Polygon'
+\end{Verbatim}
+
+Our final example looks at the interaction between \index{conformance requirement}conformance and concrete \index{same-type requirement}same-type requirements.
Consider the generic signature of the \index{extension declaration}extension of \texttt{Box}:
+\begin{Verbatim}
+struct Box<Contents: Sequence> {}
+extension Box where Contents == Array<Int> {}
+\end{Verbatim}
+To build the generic signature of the extension, we take the generic signature of \texttt{Box}, and add the requirement $\SameReq{Contents}{Array<Int>}$, so requirement minimization starts with this list of requirements:
+\[\{\ConfReq{Contents}{Sequence},\,\SameReq{Contents}{Array<Int>}\}\]
+By \DefRef{conflicting req def}, we can understand the interaction between the two requirements by desugaring the requirement $\ConfReq{Array<Int>}{Sequence}$. We look up the conformance and see that this requirement is satisfied, so our original requirement $\ConfReq{Contents}{Sequence}$ is redundant. We're left with the same-type requirement $\SameReq{Contents}{Array<Int>}$, so here is our extension's generic signature:
\begin{quote}
\begin{verbatim}
-T.A == T.B
-T.A == T.C
-T.A == T.D
+<Contents where Contents == Array<Int>>
\end{verbatim}
\end{quote}
-This formalizes as follows.
-\begin{definition}\label{left-reduced requirement} A same-type requirement is \index{left-reduced same-type requirement}\emph{left-reduced} in a generic signature if two conditions hold:
-\begin{enumerate}
-\item The requirement's subject type is not equal to any other same-type requirement's subject type.
-\item The requirement's subject type is either equal to the constraint type of some other same-type requirement, or it is a reduced type in our generic signature.
-\end{enumerate}
-\end{definition}
-\begin{definition}
-A same-type requirement between type parameters is \emph{reduced} if it is left-reduced and right-reduced.
-\end{definition}
-We now have a complete picture of what it means for a set of requirements to be well-formed; all that remains is to sort the requirements in a certain order when constructing the new generic signature.
+
+This generic signature certainly generates a different, much smaller theory than the original list of requirements that includes $\ConfReq{Contents}{Sequence}$. For example, the dependent member type \texttt{Contents.Element} is not a valid type parameter in the new generic signature, precisely because we cannot derive $\ConfReq{Contents}{Sequence}$. This gives us another example where requirement minimization does not preserve equivalence of generic signatures, because of gaps in our theoretical understanding.
+
+This is another one of those \Index{GenericSignatureBuilder@\texttt{GenericSignatureBuilder}}\texttt{GenericSignatureBuilder} behaviors that proved to be slightly obnoxious in hindsight; it would have been nicer if the conformance requirement had not been considered redundant here. As a practical matter, it means that a valid type parameter of the extended type's generic signature might no longer be a valid type parameter in the generic signature of the extension. The implementation of the \Index{getReducedType()@\texttt{getReducedType()}}\texttt{getReducedType()} generic signature query makes a special exception to allow such type parameters anyway, by attempting to resolve the \index{concrete conformance}concrete conformance. Other consequences are explored in \SecRef{concrete contraction}.
+
+\paragraph{Requirement order.}
+Once we have our minimal requirements, the last step is to sort them in a canonical way. As defined, the algorithm below is a \index{partial order}partial order because it can return ``$\bot$'', but this can only happen if both requirements have the same subject type and the same kind, and they're not conformance requirements. Consulting \DefRef{conflicting req def}, we see that if two minimal requirements have this property, they must conflict. Thus, the requirements of a minimal conflict-free generic signature can be linearly ordered without ambiguity.
+ \begin{algorithm}[Requirement order]\label{requirement order} \IndexDefinition{requirement order} Takes two requirements as input, and returns one of ``$<$'', ``$>$'', ``$=$'' or \index{$\bot$}``$\bot$'' as output. \begin{enumerate} -\item (Equal) If both requirements are identically equal, return ``$=$''. -\item (Subject) Compare the subject types of the two requirements with Algorithm~\ref{type parameter order}. Return the result if it is ``$<$'' or ``$>$''. -\item (Kind) Otherwise, both requirements have the same subject type. If they have different kinds, return ``$<$'' or ``$>$'' based on the relative position of their kinds in the below list: -\begin{enumerate} -\item superclass, -\item layout, -\item conformance, -\item same-type. -\end{enumerate} -\item (Protocol) Otherwise, both requirements have the same subject type and the same kind. If both are conformance requirements, compare their protocols with Algorithm~\ref{linear protocol order}, and return the result if it is ``$<$'' or ``$>$''. -\item (Incomparable) Otherwise, the requirements are incomparable. Return ``$\bot$''. +\item (Subject) Compare the subject types of the two requirements with \AlgRef{type parameter order}. Return the result if it is ``$<$'' or ``$>$''. Otherwise, both requirements have the same subject type. +\item (Kind) Compare their kinds. If they have different kinds, return ``$<$'' or ``$>$'' based on the relative order below: +\[\text{superclass} < \text{layout} < \text{conformance} < \text{same-type}\] +Otherwise, both requirements have the same subject type and the same kind. +\item (Protocol) If both are conformance requirements, compare their protocols with \AlgRef{linear protocol order}, and return one of ``$<$'', ``$=$'', or ``$>$''. +\item (Incomparable) Otherwise, both requirements have the same subject type and kind, and they're not conformance requirements. Return ``$\bot$''. 
\end{enumerate}
\end{algorithm}
-As defined, the above algorithm is a \index{partial order}partial order because it can return ``$\bot$'', however, we can show that this only occurs on invalid inputs.
-\begin{proposition}
-The requirements of a minimal generic signature can be \index{linear order}linearly ordered.
-\end{proposition}
-\begin{proof}
-Suppose we have two desugared, valid, minimal and reduced requirements that cannot be ordered. This means they have the same subject type and kind, but are not conformance requirements. We can show that each remaining requirement kind leads to a contradiction.
-
-If we have two layout requirements with the same subject type, they must be equal, as the only layout constraint that can be written in the source language is \texttt{AnyObject}. Either duplicate requirement can be deleted, and neither requirement is minimal. This contradicts our assumption that all requirements are minimal.
-If we have two same-type requirements and at least one of the two is a same-type requirement between two type parameters, then the fact that it has the same subject type as the other violates Condition~1 of Definition~\ref{left-reduced requirement}. This means the same-type requirement is not left-reduced, so in particular it is not reduced. This contradicts our assumption that each requirement is reduced.
+\paragraph{Requirement signatures.} The \index{associated requirement}associated requirements in a protocol's \index{requirement signature}requirement signature are minimized just like the explicit requirements of a generic signature. The proof of \PropRef{equiv generic signatures} can be tweaked slightly to instead use an \index{equivalence relation}equivalence relation on requirement signatures. (The key idea is that for every associated requirement of a protocol~\texttt{P}, we can derive the corresponding requirement in the \index{protocol generic signature}protocol generic signature~$G_\texttt{P}$.)
Minimal and reduced associated requirements are defined in the same way, and the \index{requirement signature request}\Request{requirement signature request} always outputs a \IndexDefinition{minimal requirement signature}minimal requirement signature. We can describe the \index{conflicting requirement}conflicts among the associated requirements using \DefRef{conflicting req def}, and finally, sort the minimal associated requirements in a requirement signature using \AlgRef{requirement order}. -If we have two same-type requirements with concrete types on the right hand side, say \texttt{T == C} and \texttt{T == D} where \texttt{C} and \texttt{D} are concrete types, we know \texttt{T}, \texttt{C} and \texttt{D} all have the same reduced type. We also know that \texttt{C} and \texttt{D} are already reduced, because they appear on the right hand side of reduced same-type requirements. This implies that \texttt{C} and \texttt{D} are exactly equal. Again, it follows that we have duplicate requirements, so either requirement can be deleted, and neither requirement is minimal. This contradicts our assumption that all requirements are minimal. - -The only remaining case is that both are superclass requirements. Proving this also leads to a contradiction is left as an exercise for the reader. -\end{proof} -This shows it is not possible to have two layout, superclass or same-type requirements with the same subject type. We can prove an even stronger condition. -\begin{proposition} -The subject type of a same-type requirement with a concrete constraint type cannot equal the subject type of \emph{any} other requirement in a generic signature. -\end{proposition} -\begin{proof} -Suppose our generic signature contains a concrete same-type requirement \texttt{T~==~C} and a a conformance requirement \texttt{T:~P}. This means the subject type of the conformance requirement, \texttt{T}, can be reduced to \texttt{C}, violating the condition that all requirements are reduced. 
The proof for the other requirement kinds is similar. -\end{proof} +The major difference is that we must minimize all requirement signatures of a set of mutually-dependent protocols, or a \index{protocol component}\emph{protocol component}, simultaneously. We will discuss this again in \SecRef{protocol component} and see an example in \SecRef{homotopy reduction}. \section{Source Code Reference}\label{buildinggensigsourceref} @@ -1161,18 +1323,18 @@ \subsection*{Requests} \item \SourceFile{lib/AST/RequirementMachine/RequirementMachineRequests.cpp} \end{itemize} -The header file declares the requests; the evaluation functions are implemented by the Requirement Machine (Section~\ref{rqm basic operation source ref}). +The header file declares the requests; the evaluation functions are implemented by the Requirement Machine (\SecRef{rqm basic operation source ref}). \IndexSource{generic signature constructor} \apiref{GenericSignature}{class} -See also Section~\ref{genericsigsourceref}. +See also \SecRef{genericsigsourceref}. \begin{itemize} \item \texttt{get()} is the primitive constructor, which builds a generic signature directly from a list of generic parameters and minimal requirements. \end{itemize} \IndexSource{generic signature request} \apiref{GenericSignatureRequest}{class} -The \texttt{GenericContext::getGenericSignature()} method (Section~\ref{genericsigsourceref}) evaluates this request, which either returns the parent declaration's generic signature, or evaluates \texttt{InferredGenericSignatureRequest} with the appropriate arguments. +The \texttt{GenericContext::getGenericSignature()} method (\SecRef{genericsigsourceref}) evaluates this request, which either returns the parent declaration's generic signature, or evaluates \texttt{InferredGenericSignatureRequest} with the appropriate arguments. 
\IndexSource{inferred generic signature request} \apiref{InferredGenericSignatureRequest}{class} @@ -1218,25 +1380,24 @@ \subsection*{Requests} \apiref{buildGenericSignature()}{function} A utility function wrapping the \texttt{AbstractGenericSignatureRequest}. It checks and discards the returned error flags. If the \texttt{CompletionFailed} error flag is set, this aborts the compiler. The other two flags are ignored. -\index{conflicting requirement} \apiref{GenericSignatureErrorFlags}{enum class} -Error flags returned by \texttt{AbstractGenericSignatureRequest}. You'll see these conditions again in Chapter~\ref{rqm basic operation}; they prevent the requirement machine for this signature from being \emph{installed}. +Error flags returned by \texttt{AbstractGenericSignatureRequest}. We will see these conditions again in \ChapRef{rqm basic operation}; they prevent the requirement machine for this signature from being \emph{installed}. \begin{itemize} -\item \texttt{HasInvalidRequirements}: the original requirements referenced a non-existent type parameter, or the original requirements were in conflict with each other. Any errors in the requirements handed to this request usually mean there was another error diagnosed elsewhere, like an invalid conformance, so this flag being set is not really actionable to the rest of the compiler. Without source location information, this error cannot be diagnosed in a friendly manner. -\item \texttt{HasConcreteConformances}: the generic signature had non-redundant concrete conformance requirements, which is an internal condition that does not communicate any useful information to the caller. -\item \texttt{CompletionFailed}: the \index{completion}completion procedure could not construct a \index{confluence}confluent \index{rewrite system}rewrite system within the maximum number of steps. This is actually fatal, so the \texttt{buildGenericSignature()} wrapper function aborts the compiler in this case. 
+\item \texttt{HasInvalidRequirements}: the original requirements were not \IndexSource{well-formed requirement}well-formed, or were in \IndexSource{conflicting requirement}conflict with each other. Any errors in the requirements handed to this request usually mean there was another error diagnosed elsewhere, like an invalid conformance, so this flag being set is not really actionable to the rest of the compiler. Without source location information, this error cannot be diagnosed in a friendly manner. +\item \texttt{HasConcreteConformances}: the generic signature had non-redundant concrete conformance requirements, which is an internal flag used to prevent the requirement machine from being installed. It does not indicate an error condition to the caller. See \SecRef{concrete contraction} for discussion. +\item \texttt{CompletionFailed}: the \index{completion}completion procedure could not construct a \index{confluence}convergent \index{rewrite system}rewrite system within the maximum number of steps (see the discussion of termination that immediately follows \AlgRef{knuthbendix}). This is actually fatal, so the \texttt{buildGenericSignature()} wrapper function aborts the compiler in this case. \end{itemize} \IndexSource{requirement signature constructor} \apiref{RequirementSignature}{class} -See also Section~\ref{genericsigsourceref}. +See also \SecRef{genericsigsourceref}. \begin{itemize} \item \texttt{get()} is the primitive constructor, which builds a requirement signature directly from a list of minimal requirements and protocol type aliases. \end{itemize} \IndexSource{requirement signature request} \apiref{RequirementSignatureRequest}{class} -The \texttt{ProtocolDecl::getRequirementSignature()} method (Section~\ref{genericsigsourceref}) evaluates this request, which computes the protocol's requirement signature if the protocol is in the main module, or deserializes it if the protocol is from a serialized module. 
+The \texttt{ProtocolDecl::getRequirementSignature()} method (\SecRef{genericsigsourceref}) evaluates this request, which computes the protocol's requirement signature if the protocol is in the main module, or deserializes it if the protocol is from a serialized module. \IndexSource{structural requirements request} \apiref{StructuralRequirementsRequest}{class} @@ -1289,7 +1450,7 @@ \subsection*{Requirement Desugaring} The \texttt{realizeRequirement()} and \texttt{realizeInheritedRequirements()} functions also perform requirement desugaring. For \texttt{AbstractGenericSignatureRequest}, requirement desugaring is the entry point where the fun begins; it starts from a list of requirements instead of resolving user-written requirement representations. \apiref{rewriting::desugarRequirement()}{function} -Establishes the invariants in Definition~\ref{desugaredrequirementdef}, splitting up conformance requirements and simplifying requirements where the subject type is a concrete type. +Establishes the invariants in \DefRef{desugaredrequirementdef}, splitting up conformance requirements and simplifying requirements where the subject type is a concrete type. \apiref{RequirementError}{class} Represents a redundant or conflicting requirement detected by requirement desugaring or minimization. @@ -1300,13 +1461,13 @@ \subsection*{Requirement Minimization} \begin{itemize} \item \SourceFile{lib/AST/GenericSignature.cpp} \end{itemize} -Just as Section~\ref{minimal requirements} only describes the invariants around minimization, here we only call out the code related to checking those invariants. For the actual implementation of minimization, see Sections \ref{rqm minimization source ref}. +Just as \SecRef{minimal requirements} only describes the invariants around minimization, here we only call out the code related to checking those invariants. For the actual implementation of minimization, see \SecRef{rqm minimization source ref}. 
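To make the minimality invariant concrete, here is a small hedged sketch in Swift (the function is invented for illustration); the second written requirement is redundant and does not survive minimization:

```swift
// T: Equatable is implied by T: Hashable, since Hashable refines
// Equatable. The minimized generic signature is therefore
// <T where T: Hashable>, and the written Equatable requirement
// may be diagnosed as redundant.
func lookup<T>(_ key: T, in table: [T: String]) -> String?
    where T: Hashable, T: Equatable {
  return table[key]
}
```

Minimization drops the \texttt{Equatable} requirement because the \texttt{Hashable} requirement already implies it.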
\IndexSource{requirement order} \apiref{Requirement}{class} -See also Section~\ref{genericsigsourceref}. +See also \SecRef{genericsigsourceref}. \begin{itemize} -\item \texttt{compare()} implements the requirement order (Algorithm~\ref{requirement order}), returning one of the following: +\item \texttt{compare()} implements the requirement order (\AlgRef{requirement order}), returning one of the following: \begin{itemize} \item $-1$ if this requirement precedes the given requirement, \item 0 if the two requirements are equal, @@ -1318,9 +1479,9 @@ \subsection*{Requirement Minimization} \IndexSource{minimal requirement} \IndexSource{reduced requirement} \apiref{GenericSignatureImpl}{class} -See also Section~\ref{genericsigsourceref}. +See also \SecRef{genericsigsourceref}. \begin{itemize} -\item \texttt{verify()} ensures that the requirements of a generic signature are desugared, minimal and reduced (Definition~\ref{generic signature invariants definition}) and correctly ordered (Algorithm~\ref{requirement order}). Any violations are a fatal error that crashes the compiler even in no-assert builds, since such generic signatures should not be built at all. +\item \texttt{verify()} ensures that all explicit requirements in this signature are desugared (\DefRef{desugaredrequirementdef}), reduced (\DefRef{reduced requirement}), minimal (\DefRef{minimal generic sig def}), and ordered (\AlgRef{requirement order}). Any violations report a fatal error that crashes the compiler even in no-assert builds, since such generic signatures should not be built at all. 
\end{itemize} -\end{document} \ No newline at end of file +\end{document} diff --git a/docs/Generics/chapters/class-inheritance.tex b/docs/Generics/chapters/class-inheritance.tex index 24ab34f5fdd68..605f0532162a9 100644 --- a/docs/Generics/chapters/class-inheritance.tex +++ b/docs/Generics/chapters/class-inheritance.tex @@ -2,7 +2,7 @@ \begin{document} -\chapter{Class Inheritance}\label{classinheritance} +\chapter[]{Class Inheritance}\label{classinheritance} \ifWIP @@ -34,7 +34,7 @@ \chapter{Class Inheritance}\label{classinheritance} class Base {} class Derived: Base {} \end{Verbatim} -Now, the declaration \texttt{Derived} has the generic superclass type \texttt{Base}. Intuitively, we expect that \texttt{Derived} is a subtype of \texttt{Base}, and \texttt{Derived} is a subtype of \texttt{Base}, but that \texttt{Derived} and \texttt{Base} are unrelated types. +Now, the declaration \texttt{Derived} has the generic superclass type \texttt{Base}. We expect that \texttt{Derived} is a subtype of \texttt{Base}, and \texttt{Derived} is a subtype of \texttt{Base}, but that \texttt{Derived} and \texttt{Base} are unrelated types. To get a complete picture of the subtype relationship, we need to define the concept of the superclass type \emph{of a type}, and not just the superclass type of a declaration. @@ -50,8 +50,6 @@ \chapter{Class Inheritance}\label{classinheritance} Now that we can compute the superclass type of a type, we can walk up the inheritance hierarchy by iterating the process, to get the superclass type of a superclass type, and so on. -\fi - \begin{algorithm}[Iterated superclass type]\label{superclassfordecl} As input, takes a class type \texttt{T} and a superclass declaration \texttt{D}. Returns the superclass type of \texttt{T} for \texttt{D}. \begin{enumerate} \item Let \texttt{C} be the class declaration referenced by \texttt{T}. If $\texttt{C}=\texttt{D}$, return \texttt{T}. 
@@ -60,7 +58,132 @@ \chapter{Class Inheritance}\label{classinheritance}
\end{enumerate}
\end{algorithm}

-\ifWIP
+\begin{listing}\captionabove{Satisfied and unsatisfied superclass requirements}\label{unsatisfied requirements superclass}
+\begin{Verbatim}
+class Base<T> {}
+class Derived: Base<Int> {}
+
+struct G<T: Base<U>, U> {}
+
+struct H<X: Derived, Y> {
+  // (1) requirement is satisfied
+  typealias A = G<Base<Y>, Y>
+
+  // (1) requirement is satisfied
+  typealias B = G<X, Int>
+
+  // (2) requirement is unsatisfied
+  typealias C = G<X, Y>
+}
+\end{Verbatim}
+\end{listing}
+\begin{example}
+\ListingRef{unsatisfied requirements superclass} shows two examples involving superclass requirements. The generic signature of \texttt{G} is:
+\begin{quote}
+\begin{verbatim}
+<T, U where T: Base<U>>
+\end{verbatim}
+\end{quote}
+The generic signature has a single requirement:
+\begin{quote}
+\begin{tabular}{|l|l|l|}
+\hline
+Kind&Subject type&Constraint type\\
+\hline
+Superclass&\texttt{T}&\texttt{Base<U>}\\
+\hline
+\end{tabular}
+\end{quote}
+
+\paragraph{First type alias} The context substitution map of the underlying type of \texttt{A}:
+\[
+\SubstMap{
+\SubstType{T}{Base<$\archetype{Y}$>}\\
+\SubstType{U}{$\archetype{Y}$}
+}
+\]
+We apply this substitution map to the requirement of our generic signature:
+\begin{quote}
+\begin{tabular}{|l|l|l|c|}
+\hline
+Kind&Subject type&Constraint type&Satisfied?\\
+\hline
+Superclass&\texttt{Base<$\archetype{Y}$>}&\texttt{Base<$\archetype{Y}$>}&$\checkmark$\\
+\hline
+\end{tabular}
+\end{quote}
+The requirement is satisfied, because the subject type is canonical-equal to the constraint type.
+
+\index{superclass type}
+\paragraph{Second type alias} The context substitution map of the underlying type of \texttt{B}:
+\[
+\SubstMap{
+\SubstType{T}{$\archetype{X}$}\\
+\SubstType{U}{Int}
+}
+\]
+We apply this substitution map to the requirement of our generic signature:
+\begin{quote}
+\begin{tabular}{|l|l|l|}
+\hline
+Kind&Subject type&Constraint type\\
+\hline
+Superclass&\archetype{X}&\texttt{Base<Int>}\\
+\hline
+\end{tabular}
+\end{quote}
+The requirement hits the recursive case for superclass requirements in \AlgRef{reqissatisfied}. The archetype \archetype{X} is replaced with its superclass type \texttt{Derived}, via the generic signature of \texttt{H}:
+\begin{quote}
+\begin{tabular}{|l|l|l|}
+\hline
+Kind&Subject type&Constraint type\\
+\hline
+Superclass&\texttt{Derived}&\texttt{Base<Int>}\\
+\hline
+\end{tabular}
+\end{quote}
+The algorithm recurses again, after replacing the class type \texttt{Derived} with its superclass type \texttt{Base<Int>}:
+\begin{quote}
+\begin{tabular}{|l|l|l|c|}
+\hline
+Kind&Subject type&Constraint type&Satisfied?\\
+\hline
+Superclass&\texttt{Base<Int>}&\texttt{Base<Int>}&$\checkmark$\\
+\hline
+\end{tabular}
+\end{quote}
+In its final form, the substituted requirement is trivially seen to be satisfied because the subject type is canonical-equal to the constraint type.
+
+\index{superclass type}
+\paragraph{Third type alias} The context substitution map of the underlying type of \texttt{C}:
+\[
+\SubstMap{
+\SubstType{T}{$\archetype{X}$}\\
+\SubstType{U}{$\archetype{Y}$}
+}
+\]
+We apply this substitution map to the requirement of our generic signature:
+\begin{quote}
+\begin{tabular}{|l|l|l|}
+\hline
+Kind&Subject type&Constraint type\\
+\hline
+Superclass&\archetype{X}&\texttt{Base<\archetype{Y}>}\\
+\hline
+\end{tabular}
+\end{quote}
+The requirement is seen to be unsatisfied, as follows.
As above, the archetype \archetype{X} is replaced with its superclass type \texttt{Derived}, which is replaced with its superclass type \texttt{Base<Int>}:
+\begin{quote}
+\begin{tabular}{|l|l|l|c|}
+\hline
+Kind&Subject type&Constraint type&Satisfied?\\
+\hline
+Superclass&\texttt{Base<Int>}&\texttt{Base<$\archetype{Y}$>}&$\times$\\
+\hline
+\end{tabular}
+\end{quote}
+At this point, the substituted requirement is between two different specializations of the same class declaration, \texttt{Base<Int>} and \texttt{Base<$\archetype{Y}$>}. They are not canonical-equal, because $\archetype{Y}$ is not \texttt{Int}, so the requirement is unsatisfied.
+\end{example}

\begin{listing}\captionabove{Computing superclass types}\label{generic superclass example listing}
\begin{Verbatim}
@@ -77,7 +200,7 @@ \chapter{Class Inheritance}\label{classinheritance}
\end{listing}

\begin{example}\label{genericsuperclassexample}
-Listing~\ref{generic superclass example listing} shows a class hierarchy demonstrating these behaviors:
+\ListingRef{generic superclass example listing} shows a class hierarchy demonstrating these behaviors:
\begin{enumerate}
\item The superclass type of \texttt{Derived} is \texttt{Middle}.
\item The superclass type of \texttt{Middle} is \texttt{Base<(T, T)>}.
@@ -99,15 +222,15 @@ \chapter{Class Inheritance}\label{classinheritance}
\]
\end{example}

-We can finally describe the implementation of Case~3 of Definition~\ref{context substitution map for decl context}.
+We can finally describe the implementation of Case~3 of a context substitution map for a declaration context from \SecRef{checking generic arguments}.

-The base type here is a class type, and the declaration context is some superclass declaration or an extension thereof. We first apply Algorithm~\ref{superclassfordecl} to the base type and superclass declaration to get the correct superclass type.
Then, we compute the context substitution map of this superclass type with respect to our declaration context, which is now either the exact superclass declaration or an extension. Thus we have reduced the problem to Case~1, which we already know how to solve. +The base type here is a class type, and the declaration context is some superclass declaration or an extension thereof. We first apply \AlgRef{superclassfordecl} to the base type and superclass declaration to get the correct superclass type. Then, we compute the context substitution map of this superclass type with respect to our declaration context, which is now either the exact superclass declaration or an extension. Thus we have reduced the problem to Case~1, which we already know how to solve. TODO: example \fi -\section{Inherited Conformances}\label{inheritedconformance} +\section[]{Inherited Conformances}\label{inheritedconformance} \ifWIP @@ -125,7 +248,7 @@ \section{Inherited Conformances}\label{inheritedconformance} The lookup conformance table machinery actually introduces an additional level of indirection by wrapping these specialized conformances in a bespoke \emph{inherited conformance} data type. Conformances store their conforming type; the defining invariant is that if a conformance was the result of a lookup, the stored conforming type should equal the original type of the lookup. With class inheritance however, the conforming type of a conformance declared on the superclass is ultimately always some substitution of the type of the superclass. An inherited conformance stores the original subclass type, but otherwise just delegates to an underlying conformance, either normal or specialized. By wrapping inherited conformances in a special type, the compiler is able to keep track of the original type of a conformance lookup. 
\begin{example} -We can amend Example~\ref{genericsuperclassexample} to add a conformance to the \texttt{Base} class: +We can amend \ExRef{genericsuperclassexample} to add a conformance to the \texttt{Base} class: \begin{Verbatim} protocol P { associatedtype A @@ -138,14 +261,14 @@ \section{Inherited Conformances}\label{inheritedconformance} \fi -\section{Override Checking}\label{overridechecking} +\section[]{Override Checking}\label{overridechecking} \ifWIP When a subclass overrides a method from a superclass, the type checker must ensure the subclass method is compatible with the superclass method in order to guarantee that instances of the subclass are dynamically interchangeable with a superclass. If neither the superclass nor the subclass are generic, the compatibility check simply compares the fully concrete parameter and result types of the non-generic declarations. Otherwise, the superclass substitution map plays a critical role yet again, because the compatibility relation must project the superclass method's type into the subclass to meaningfully compare it with the override. \paragraph{Non-generic overrides} -The simple case is when the superclass or subclass is generic, but the superclass method does not define generic parameters of its own, either explicitly or via the opaque parameters of Section~\ref{opaque parameters}. Let's call such a method ``non-generic,'' even if the class it appears inside is generic. So a non-generic method has the same generic signature as its parent context, which in our case is a class. In the non-generic case, the superclass substitution map is enough to understand the relation between the interface type of the superclass method and its override. +The simple case is when the superclass or subclass is generic, but the superclass method does not define generic parameters of its own, either explicitly or via the opaque parameters of \SecRef{requirements}. 
Let's call such a method ``non-generic,'' even if the class it appears inside is generic. So a non-generic method has the same generic signature as its parent context, which in our case is a class. In the non-generic case, the superclass substitution map is enough to understand the relation between the interface type of the superclass method and its override. \begin{listing}\captionabove{Some method overrides}\label{method overrides} \begin{Verbatim} @@ -165,7 +288,7 @@ \section{Override Checking}\label{overridechecking} \end{Verbatim} \end{listing} -In Listing~\ref{method overrides}, the \texttt{Derived} class overrides the \texttt{doStuff()} method from \texttt{Outer.Inner}. Dropping the first level of function application from the interface type of \texttt{doStuff()} leaves us with \texttt{(T, U) -> ()}, to which we apply the superclass substitution map for \texttt{Derived} to get the final result: +In \ListingRef{method overrides}, the \texttt{Derived} class overrides the \texttt{doStuff()} method from \texttt{Outer.Inner}. Dropping the first level of function application from the interface type of \texttt{doStuff()} leaves us with \texttt{(T, U) -> ()}, to which we apply the superclass substitution map for \texttt{Derived} to get the final result: \[ \texttt{(T, U) -> ()} \otimes \SubstMap{ @@ -177,7 +300,7 @@ \section{Override Checking}\label{overridechecking} This happens to exactly equal the interface type of the subclass method \texttt{doStuff()} in \texttt{Derived}, again not including the self clause. An override with an exact type match is valid. (In fact, some variance in parameter and return types is permitted as well, but it's not particularly interesting from a generics point of view, so here is the executive summary: an override can narrow the return type, and widen the parameter types. 
This means it is valid to override a method returning \texttt{Optional<T>} with a method returning \texttt{T}, because a \texttt{T} can also trivially become an \texttt{Optional<T>} via an injection. Similarly, if \texttt{A} is a superclass of \texttt{B}, a method returning \texttt{A} can be overridden to return \texttt{B}, because a \texttt{B} is always an \texttt{A}. A dual set of rules is in play in method parameter position; if the original method takes an \texttt{Int}, the override can accept \texttt{Optional<Int>}, etc.)

\paragraph{Generic overrides}
-In the non-generic case, applying the superclass substitution map directly to the interface type of a superclass method tells us what ``the type of the superclass method should be'' in the subclass, and this happens to work because the superclass method had the same generic signature as the superclass. Once this is no longer required to be so, the problem becomes more complicated, and the below details were not worked out until Swift 5.2 \cite{sr4206}.
+In the non-generic case, applying the superclass substitution map directly to the interface type of a superclass method tells us what ``the type of the superclass method should be'' in the subclass, and this happens to work because the superclass method had the same generic signature as the superclass. Once this is no longer required to be so, the problem becomes more complicated, and the details below were not worked out until \IndexSwift{5.2}Swift 5.2 \cite{sr4206}.

The generic signature of the superclass (resp. override) method is built by adding any additional generic parameters and requirements to the generic signature of the superclass (resp. subclass) itself. To relate these four generic signatures together, we generalize the superclass substitution map into something called the \emph{attaching map}.
Once we can compute an attaching map, applying it to the interface type of the superclass method produces a substituted type which can be compared against the interface type of the override, just as before. However, while this part is still necessary, it is no longer sufficient, since we also need to compare the \emph{generic signatures} of the superclass method and its override for compatibility. Here the attaching map also plays a role. @@ -201,7 +324,7 @@ \section{Override Checking}\label{overridechecking} \end{algorithm} \begin{example} -To continue the \texttt{doGeneric()} example from Listing~\ref{method overrides}, the superclass method defines a generic parameter \texttt{A} at depth 2, but the ``same'' parameter has depth 1 in the subclass method of \texttt{Derived}. For clarity, the attaching map is written with canonical types (otherwise, it would replace \texttt{A} with \texttt{A}, with a different meaning of \texttt{A} on each side): +To continue the \texttt{doGeneric()} example from \ListingRef{method overrides}, the superclass method defines a generic parameter \texttt{A} at depth 2, but the ``same'' parameter has depth 1 in the subclass method of \texttt{Derived}. For clarity, the attaching map is written with canonical types (otherwise, it would replace \texttt{A} with \texttt{A}, with a different meaning of \texttt{A} on each side): \[ \SubstMapLongC{ \SubstType{\ttgp{0}{0}}{Int}\\ @@ -227,22 +350,22 @@ \section{Override Checking}\label{overridechecking} \begin{enumerate} \item Initialize \texttt{P} to an empty list of generic parameter types. \item Initialize \texttt{R} to an empty list of generic requirements. -\item Let \texttt{S} be the attaching map for \texttt{G}, \texttt{B} and \texttt{D} computed using Algorithm~\ref{superclass attaching map}. -\item (Parent signature) Let $\texttt{G}''$ be the generic signature of \texttt{D}. (In Algorithm~\ref{superclass attaching map}, $\texttt{G}'$ was used for the generic signature of \texttt{B}.) 
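As a hedged Swift-level sketch of the situation (the declarations are invented here, not taken from the book's listing), consider a generic method overridden in a subclass that fixes the class's generic parameter:

```swift
class Base<T> {
  // Generic over both the class parameter T and the method's own
  // parameter A; in Base, A sits one depth below T.
  func combine<A>(_ value: T, with extra: A) -> T {
    return value
  }
}

class Derived: Base<Int> {
  // Derived is non-generic, so the "same" parameter A sits at a
  // shallower depth; the attaching map relates the two method
  // signatures so the override can be checked against the
  // substituted superclass method type, with T := Int.
  override func combine<A>(_ value: Int, with extra: A) -> Int {
    return value + 1
  }
}
```

A call through a \texttt{Base<Int>} reference dispatches to the override as usual; such overrides of generic methods with substituted parameter types are accepted since Swift 5.2.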
+\item Let \texttt{S} be the attaching map for \texttt{G}, \texttt{B} and \texttt{D} computed using \AlgRef{superclass attaching map}. +\item (Parent signature) Let $\texttt{G}''$ be the generic signature of \texttt{D}. (In \AlgRef{superclass attaching map}, $\texttt{G}'$ was used for the generic signature of \texttt{B}.) \item (Additional parameters) For each generic parameter of \texttt{G} at the innermost depth, apply \texttt{S} to the generic parameter. By construction, the result is another generic parameter type; record this type in \texttt{P}. \item (Additional requirements) For each requirement of \texttt{G}, apply \texttt{S} to the requirement and record the result in \texttt{R}. \item (Return) Build a minimized generic signature from $\texttt{G}''$, \texttt{P} and \texttt{R}, and return the result. \end{enumerate} \end{algorithm} -For the override to satisfy the contract of the superclass method, it should accept any valid set of concrete type arguments also accepted by the superclass method. The override might be more permissive, however. The correct relation is that each generic requirement of the actual override signature must be satisfied by the expected override signature, but not necessarily vice versa. This uses the same mechanism as conditional requirement checking for conditional conformances, described in Section~\ref{conditional conformance}. The requirements of one signature can be mapped to archetypes of the primary generic environment of another signature. This makes the requirement types concrete, which allows the \texttt{isSatisfied()} predicate to be checked against the substituted requirement. +For the override to satisfy the contract of the superclass method, it should accept any valid set of concrete type arguments also accepted by the superclass method. The override might be more permissive, however. 
The correct relation is that each generic requirement of the actual override signature must be satisfied by the expected override signature, but not necessarily vice versa. This uses the same mechanism as conditional requirement checking for conditional conformances, described in \SecRef{conditional conformance}. The requirements of one signature can be mapped to archetypes of the primary generic environment of another signature. This makes the requirement types concrete, which allows the \texttt{isSatisfied()} predicate to be checked against the substituted requirement. -\begin{example} In Listing~\ref{method overrides}, the superclass method generic signature is \texttt{}. The generic parameter \texttt{A} belongs to the method; the other two are from the generic signature of the superclass. The override signature glues together the innermost generic parameters and their requirements from the superclass method with the generic signature of the subclass, which is \texttt{}. This operation produces the signature \texttt{}. This is different from the actual override generic signature of \texttt{doStuff()} in \texttt{Derived}, which is \texttt{}. However, the actual signature's requirements are satisfied by the expected signature. +\begin{example} In \ListingRef{method overrides}, the superclass method generic signature is \texttt{}. The generic parameter \texttt{A} belongs to the method; the other two are from the generic signature of the superclass. The override signature glues together the innermost generic parameters and their requirements from the superclass method with the generic signature of the subclass, which is \texttt{}. This operation produces the signature \texttt{}. This is different from the actual override generic signature of \texttt{doStuff()} in \texttt{Derived}, which is \texttt{}. However, the actual signature's requirements are satisfied by the expected signature. 
\end{example} \fi -\section{Designated Initializer Inheritance} +\section[]{Designated Initializer Inheritance} \iffalse @@ -256,7 +379,501 @@ \section{Designated Initializer Inheritance} \fi -\section{Source Code Reference} +\section[]{Witness Thunks}\label{valuerequirements} + +\ifWIP + +When protocol conformances were introduced in \ChapRef{conformances}, our main focus was the mapping from associated type requirements to type witnesses, and how conformances participate in type substitution. Now let's look at the other facet of conformances, which is how they map value requirements to value witnesses.\footnote{The term ``value witness'' is overloaded to have two meanings in Swift. The first is a witness to a value requirement in a protocol. The second is an implementation of an intrinsic operation all types support, like copy, move, destroy, etc., appearing in the value witness table of runtime type metadata. Here I'm talking about the first meaning.} Recording a witness for a protocol requirement requires more detail than simply stating the witness. + +What is the relationship between the generic signature of a protocol requirement and the generic signature of the witness? Well, ``it's complicated.'' A protocol requirement's generic signature has a \texttt{Self} generic parameter constrained to that protocol. If the witness is a default implementation from a protocol extension, it will have a \texttt{Self} generic parameter, too, but it might conform to a \emph{different} protocol. Or if the witness is a member of the conforming type and the conforming type has generic parameters of its own, it will have its own set of generic parameters, with different requirements. A witness might be ``more generic'' than a protocol requirement, where the requirement is satisfied by a fixed specialization of the witness. Conditional conformance and class inheritance introduce even more possibilities. 
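As one hedged sketch of the ``more generic'' case (declarations invented for illustration), a generic method can witness a non-generic requirement, with the requirement satisfied by a fixed specialization of the witness:

```swift
protocol Doubler {
  func double(_ x: Int) -> Int
}

struct Calculator: Doubler {
  // This generic method witnesses the non-generic requirement;
  // the witness thunk fixes T := Int when forwarding the call.
  func double<T: BinaryInteger>(_ x: T) -> T {
    return x * 2
  }
}

// A call through the requirement dispatches via the witness table.
func callDouble<D: Doubler>(_ d: D) -> Int {
  return d.double(21)
}
```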
(There will be examples of all of these different cases at the end of \SecRef{witnessthunksignature}.)
+
+\index{SILGen}
+All of this means that when the compiler generates a witness table to represent a conformance at runtime, the entries in the witness table cannot simply point directly to the witness implementations. The protocol requirement and the witness will have different calling conventions, so SILGen must emit a \emph{witness thunk} to translate the calling convention of the requirement into that of each witness. Conformance checking records a mapping between protocol requirements and witnesses together with the necessary details for witness thunk emission inside each normal conformance.
+
+The \texttt{ProtocolConformance::getWitness()} method takes the declaration of a protocol value requirement, and returns an instance of \texttt{Witness}, which stores all of this information, obtainable by calling getter methods:
+\begin{description}
+\item[\texttt{getDecl()}] The witness declaration itself.
+\item[\texttt{getWitnessThunkSignature()}] The \emph{witness thunk generic signature}, which bridges the gap between the protocol requirement's generic signature and the witness generic signature. Adopting this generic signature is what allows the witness thunk to have the correct calling convention that matches the caller's invocation of the protocol requirement, while providing the necessary type parameters and conformances to invoke a member of the concrete conforming type.
+\item[\texttt{getSubstitutions()}] The \emph{witness substitution map}. Maps the witness generic signature to the type parameters of the witness thunk generic signature. This is the substitution map at the call of the actual witness from inside the witness thunk.
+\item[\texttt{getRequirementToWitnessThunkSubs()}] The \emph{requirement substitution map}. Maps the protocol requirement generic signature to the type parameters of the witness thunk generic signature.
This substitution map is used by SILGen to compute the interface type of the witness thunk, by applying it to the interface type of the protocol requirement.
+\end{description}
+
+TODO:
+\begin{itemize}
+\item diagram with the protocol requirement caller, the protocol requirement type, the witness thunk signature/type, and the witness signature/type.
+\item more details about how the witness\_method CC recovers self generic parameters in a special way
+\end{itemize}
+
+\section[]{Covariant Self Problem}
+
+In Swift, subclasses inherit protocol conformances from their superclass. If a class conforms to a protocol, a requirement of this protocol can be called on an instance of a subclass. When the protocol requirement is witnessed by a default implementation in a protocol extension, the \texttt{Self} parameter of the protocol extension method is bound to the specific subclass substituted at the call site. The subclass can be observed if, for example, the protocol requirement returns an instance of \texttt{Self}, and the default implementation constructs a new instance via an \texttt{init()} requirement on the protocol.
+
+The protocol requirement can be invoked in one of two ways:
+\begin{enumerate}
+\item Directly on an instance of the class or one of its subclasses. Since the implementation is known to always be the default implementation, the call is statically dispatched to the default implementation without any indirection through the witness thunk.
+\item Indirectly via some other generic function with a generic parameter constrained to the protocol. Since the implementation is unknown, the call inside the generic function is dynamically dispatched via the witness thunk stored in the witness table for the conformance. If the generic function in turn is called with an instance of the class or one of its subclasses, the witness thunk stored in the witness table for the conformance will statically dispatch to the default implementation.
+\end{enumerate}
+The two cases are demonstrated in \ListingRef{covariantselfexample}. The \texttt{Animal} protocol defines a \texttt{clone()} requirement returning an instance of \texttt{Self}. This requirement has a default implementation which constructs a new instance of \texttt{Self} via the \texttt{init()} requirement on the protocol. The \texttt{Horse} class conforms to \texttt{Animal}, using the default implementation for \texttt{clone()}. The \texttt{Horse} class also has a subclass, \texttt{Pony}. It follows from substitution semantics that both \texttt{newPonyDirect} and \texttt{newPonyIndirect} should have type \texttt{Pony}:
+\begin{itemize}
+\item The definition of \texttt{newPonyDirect} calls \texttt{clone()} with the substitution map $\texttt{Self} := \texttt{Pony}$. The original return type of \texttt{clone()} is \texttt{Self}, so the substituted type is \texttt{Pony}.
+\item Similarly, the definition of \texttt{newPonyIndirect} calls \texttt{cloneAnimal()} with the substitution map $\texttt{A} := \texttt{Pony}$. The original return type of \texttt{cloneAnimal()} is \texttt{A}, so the substituted type is also \texttt{Pony}.
+\end{itemize}
+The second call dispatches through the witness thunk, so the witness thunk must also ultimately call the default implementation of \texttt{Animal.clone()} with the substitution map $\texttt{Self} := \texttt{Pony}$. When the conforming type is a struct or an enum, the \texttt{self} parameter of a witness thunk has a concrete type. If the conforming type were a class, though, it would not be correct to use the concrete \texttt{Horse} type, because the witness thunk would then invoke the default implementation with the substitution map $\texttt{Self} := \texttt{Horse}$, and the second call would return an instance of \texttt{Horse} at runtime and not \texttt{Pony}, which would be a type soundness hole.
+
+\begin{listing}\captionabove{Statically and dynamically dispatched calls to a default implementation}\label{covariantselfexample}
+\begin{Verbatim}
+protocol Animal {
+  init()
+  func clone() -> Self
+}
+
+extension Animal {
+  func clone() -> Self {
+    return Self()
+  }
+}
+
+class Horse: Animal {}
+class Pony: Horse {}
+
+func cloneAnimal<A: Animal>(_ animal: A) -> A {
+  return animal.clone()
+}
+
+let newPonyDirect = Pony().clone()
+let newPonyIndirect = cloneAnimal(Pony())
+\end{Verbatim}
+\end{listing}
+
+\Index{protocol Self type@protocol \texttt{Self} type}
+This soundness hole was finally discovered and addressed in \IndexSwift{4.1}Swift~4.1 \cite{sr617}. The solution is to model the covariant behavior of \texttt{Self} with a superclass-constrained generic parameter. When the conforming type is a class, witness thunks dispatching to a default implementation have this special generic parameter, in addition to the generic parameters of the class itself (there are none in our example, so the witness thunk just has the single generic parameter for \texttt{Self}). In the next section, the algorithms for building the substitution map and generic signature all take a boolean flag indicating if a covariant \texttt{Self} type should be introduced. The specific conditions under which this flag is set are a bit subtle:
+\begin{enumerate}
+\item The conforming type must be a non-final class. If the class is final, there is no need to preserve variance since \texttt{Self} is always the exact class type.
+\item The witness must be in a protocol extension. If the witness is a method on the class, there is no way to observe the concrete substitution for the protocol \texttt{Self} type, because it is not a generic parameter of the class method.
+\item (The hack) The interface type of the protocol requirement must not mention any associated types.
+\end{enumerate} +The determination of whether to use a static or covariant \texttt{Self} type for a class conformance is implemented by the type checker function \texttt{matchWitness()}. + +Indeed, Condition~3 is a hack; it opens up an exception where the soundness hole we worked so hard to close is once again allowed. In an ideal world, Conditions 1~and~2 would be sufficient, but by the time the soundness hole was discovered and closed, existing code had already been written taking advantage of it. The scenario necessitating Condition~3 is when the default implementation appears in a \emph{constrained} protocol extension: +\begin{Verbatim} +protocol P { + associatedtype T = Self + func f() -> T +} + +extension P where Self.T == Self { + func f() -> Self { return self } +} + +class C: P {} +class D: C {} +\end{Verbatim} +The non-final class \texttt{C} does not declare a type witness for associated type \texttt{T} of protocol~\texttt{P}. The associated type specifies a default, so conformance checking proceeds with the default type witness. The language model is that a conformance is checked once, at the declaration of \texttt{C}, so the default type \texttt{Self} is the ``static'' \texttt{Self} type of the conformance, which is \texttt{C}. Moving on to value requirements, class \texttt{C} does not provide an implementation of the protocol requirement \texttt{f()} either, and the original intent of this code is that the default implementation of \texttt{f()} from the constrained extension of \texttt{P} should be used. + +Without Condition~3, the requirement \texttt{Self.T == Self} would not be satisfied when matching the requirement \texttt{f()} with its witness; the left hand side of the requirement, \texttt{C}, is not exactly equal to the right hand side, which is the covariant \texttt{Self} type that is only known to be \emph{some subclass} of \texttt{C}. The conformance would be rejected unless \texttt{C} were declared final.
With Condition~3, \texttt{Self.T == Self} is satisfied because the static type \texttt{C} is used in place of \texttt{Self} during witness matching. + +The compiler therefore continued to accept the above code, because it worked prior to Swift~4.1. Unfortunately, it means that a call to \texttt{D().f()} via the witness thunk will still return an instance of \texttt{C}, and not \texttt{D} as expected. One day, we might remove this exception and close the soundness hole completely, breaking source compatibility for the above example until the developer makes it type safe by declaring \texttt{C} as final. For now, a good guideline to ensure type safety when mixing classes with protocols is \textsl{only final classes should conform to protocols with associated types}. + +\fi + +\section[]{Witness Thunk Signatures}\label{witnessthunksignature} + +\ifWIP + +Now we turn our attention to the construction of the data recorded in the \texttt{Witness} type. This is done with the aid of the \texttt{RequirementEnvironment} class, which implements the ``builder'' pattern. + +Building the witness thunk signature is an expensive operation. The algorithms below only depend on the conformance being checked, the generic signature of a protocol requirement, and whether the witness requires the use of a covariant \texttt{Self} type. These three pieces of information can be used as a uniquing key to cache the results of these algorithms. Conformance checking might need to consider a number of protocol requirements, each requirement having multiple candidate witnesses that have to be checked to find the best one. In the common case, many protocol requirements will share a generic signature---for example, any requirement of a protocol \texttt{P} without generic parameters of its own has the simple generic signature \texttt{<Self where Self:\ P>}. Therefore this caching can eliminate a fair amount of duplicated work.
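+
+For instance (an illustrative sketch; the protocol and its members are invented), all requirements without their own generic parameters share one signature, so their cached results can be reused:
+\begin{Verbatim}
+protocol P {
+  func f()          // generic signature <Self where Self: P>
+  func g() -> Int   // same signature, so the cache entry is shared
+  func h<T>(_: T)   // its own signature: <Self, T where Self: P>
+}
+\end{Verbatim}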
+ +The \textbf{witness substitution map} is built by the constraint solver when matching the interface type of a witness to the interface type of a requirement. A description of this process is outside the scope of this manual. + +The \textbf{requirement substitution map} is built by mapping the requirement's \texttt{Self} parameter either to the witness thunk's \texttt{Self} parameter (if the witness has a covariant class \texttt{Self} type), or to the concrete conforming type otherwise. All other generic parameters of the requirement map over to generic parameters of the witness thunk, possibly at a different depth. The requirement's \texttt{Self} conformance is always a concrete conformance, even in the covariant \texttt{Self} case, because \texttt{Self} is subject to a superclass requirement in that case. All other conformance requirements of the requirement's generic signature remain abstract. + +The \textbf{witness thunk generic signature} is constructed by stitching together the generic signature of the conformance context with the generic signature of the protocol requirement. + +\begin{algorithm}[Build the requirement to witness thunk substitution map] As input, takes a normal conformance~\texttt{N}, the generic signature of a protocol requirement~\texttt{G}, and a flag indicating if the witness has a covariant class \texttt{Self} type,~\texttt{F}. Outputs a substitution map for \texttt{G}. +\begin{enumerate} +\item Initialize \texttt{R} to an empty list of replacement types. +\item Initialize \texttt{C} to an empty list of conformances. +\item (Remapping) First compute the depth at which non-\texttt{Self} generic parameters of \texttt{G} appear in the witness thunk signature. Let $\texttt{G}'$ be the generic signature of \texttt{N}, and let \texttt{D} be one greater than the depth of the last generic parameter of $\texttt{G}'$. If $\texttt{G}'$ has no generic parameters, set $\texttt{D}=0$. If \texttt{F} is set, increment \texttt{D} again.
+\item (Self replacement) If \texttt{F} is set, record the replacement $\ttgp{0}{0} := \ttgp{0}{0}$ in \texttt{R}. Otherwise, let \texttt{T} be the type of \texttt{N}, and record the replacement $\ttgp{0}{0} := \texttt{T}$. +\item (Remaining replacements) Any remaining generic parameters of \texttt{G} must have a depth of 1. For each remaining generic parameter \ttgp{1}{i}, record the replacement $\ttgp{1}{i}~:=~\ttgp{D}{i}$. +\item (Self conformance) If \texttt{F} is set, build a substitution map $\texttt{S}$ for $\texttt{G}'$ mapping each generic parameter \ttgp{d}{i} to \ttgp{(d+1)}{i}. Apply this substitution map to \texttt{N} to get a specialized conformance, and record this specialized conformance in \texttt{C}. +\item (Self conformance) Otherwise, if \texttt{F} is not set, just record \texttt{N} in \texttt{C}. +\item (Remaining conformances) Any remaining conformance requirements in \texttt{G} have a subject type rooted in a generic parameter at depth~1. For each remaining conformance requirement \texttt{T:~P}, record an abstract conformance to \texttt{P} in \texttt{C}. Abstract conformances do not store a conforming type, but if they did, the same remapping process would be applied here. +\item (Return) Build a substitution map for \texttt{G} from \texttt{R} and \texttt{C}. +\end{enumerate} +\end{algorithm} + +\begin{algorithm}[Build the witness thunk generic signature] As input, takes a normal conformance~\texttt{N}, the generic signature of a protocol requirement~\texttt{G}, and a flag indicating if the witness has a covariant class \texttt{Self} type,~\texttt{F}. Outputs a generic signature for the witness thunk. +\begin{enumerate} +\item Initialize \texttt{P} to an empty list of generic parameter types. +\item Initialize \texttt{R} to an empty list of generic requirements. +\item (Remapping) First compute the depth at which non-\texttt{Self} generic parameters of \texttt{G} appear in the witness thunk signature.
Let $\texttt{G}'$ be the generic signature of \texttt{N}, and let \texttt{D} be one greater than the depth of the last generic parameter of $\texttt{G}'$. If $\texttt{G}'$ has no generic parameters, set $\texttt{D}=0$. If \texttt{F} is set, increment \texttt{D} again. +\item If \texttt{F} is set, we must first introduce a generic parameter and superclass requirement for the covariant \texttt{Self} type: +\begin{enumerate} +\item (Self parameter) Add the generic parameter \ttgp{0}{0} to \texttt{P}. This generic parameter will represent the covariant \texttt{Self} type. +\item (Remap Self type) Build a substitution map \texttt{S} for $\texttt{G}'$ mapping each generic parameter \ttgp{d}{i} to \ttgp{(d+1)}{i}. Apply this substitution map to the type of \texttt{N}, and call the result \texttt{T}. +\item (Self requirement) Add a superclass requirement \texttt{\ttgp{0}{0}:\ T} to \texttt{R}. +\item (Context generic parameters) For each generic parameter \ttgp{d}{i} in $\texttt{G}'$, add the generic parameter \ttgp{(d+1)}{i} to \texttt{P}. +\item (Context generic requirements) For each requirement of $\texttt{G}'$, apply \texttt{S} to the requirement and add the substituted requirement to \texttt{R}. +\end{enumerate} +\item If \texttt{F} is not set, the generic parameters and requirements of the conformance context carry over unchanged: +\begin{enumerate} +\item (Context generic parameters) Add all generic parameters of $\texttt{G}'$ to \texttt{P}. +\item (Context generic requirements) Add all generic requirements of $\texttt{G}'$ to \texttt{R}. +\end{enumerate} +\item (Remaining generic parameters) All non-\texttt{Self} generic parameters of \texttt{G} must have a depth of 1. For each remaining generic parameter \ttgp{1}{i}, add \ttgp{D}{i} to \texttt{P}. +\item (Trivial case) If no generic parameters have been added to \texttt{P} so far, the witness thunk generic signature is empty. Return.
+ +\item (Remaining generic requirements) For each generic requirement of \texttt{G}, apply the requirement-to-witness-thunk substitution map to the requirement, and add the substituted requirement to \texttt{R}. +\item (Return) Build a minimized generic signature from \texttt{P} and \texttt{R} and return the result. + +\end{enumerate} +\end{algorithm} + +\vfill +\eject + +\begin{example} If neither the conforming type nor the witness is generic, and there is no covariant \texttt{Self} parameter, the witness thunk signature is trivial. +\begin{Verbatim} +protocol Animal { + associatedtype CommodityType: Commodity + func produce() -> CommodityType +} + +struct Chicken: Animal { + func produce() -> Egg {...} +} +\end{Verbatim} +\begin{description} +\item[Witness thunk signature] None. +\item[Witness generic signature] None. +\item[Witness substitution map] None. +\item[Requirement generic signature] \vphantom{a} +\begin{quote} +\texttt{<Self where Self:\ Animal>} +\end{quote} +\item[Requirement substitution map] The protocol requirement does not have its own generic parameter list, but it still inherits a generic signature from the protocol declaration. +\[ +\SubstMapC{ +\SubstType{Self}{Chicken} +}{ +\SubstConf{Self}{Chicken}{Animal} +} +\] +\end{description} +\end{example} + +\vfill +\eject + +\begin{example} Generic conforming type. +\begin{Verbatim} +protocol Habitat { + associatedtype AnimalType: Animal + func adopt(_: AnimalType) +} + +struct Barn<AnimalType: Animal, StallType>: Habitat { + func adopt(_: AnimalType) {...} +} +\end{Verbatim} +\begin{description} +\item[Witness thunk signature] \vphantom{a} +\begin{quote} +\texttt{<\ttgp{0}{0}, \ttgp{0}{1} where \ttgp{0}{0}:\ Animal>} +\end{quote} +\item[Witness generic signature] \vphantom{a} +\begin{quote} +\texttt{<AnimalType, StallType where AnimalType:\ Animal>} +\end{quote} +\item[Witness substitution map] This is actually the identity substitution map because each generic parameter is replaced with its canonical form.
+\[ +\SubstMapC{ +\SubstType{AnimalType}{\ttgp{0}{0}}\\ +\SubstType{StallType}{\ttgp{0}{1}} +}{ +\SubstConf{AnimalType}{\ttgp{0}{0}}{Animal} +} +\] + +\item[Requirement generic signature] \vphantom{a} +\begin{quote} +\texttt{<Self where Self:\ Habitat>} +\end{quote} +\item[Requirement substitution map] \phantom{a} +\[ +\SubstMapC{ +\SubstType{Self}{Barn<\ttgp{0}{0}, \ttgp{0}{1}>} +}{ +\SubstConf{Self}{Barn<\ttgp{0}{0}, \ttgp{0}{1}>}{Habitat} +} +\] +\end{description} +\end{example} + +\vfill +\eject + +\begin{example} Conditional conformance. +\begin{Verbatim} +struct Dictionary<Key: Hashable, Value> {...} + +extension Dictionary: Equatable where Value: Equatable { + static func ==(lhs: Self, rhs: Self) -> Bool {...} +} +\end{Verbatim} +\begin{description} +\item[Witness thunk signature] \vphantom{a} +\begin{quote} +\texttt{<\ttgp{0}{0}, \ttgp{0}{1} where \ttgp{0}{0}:\ Hashable, \ttgp{0}{1}:\ Equatable>} +\end{quote} +\item[Witness generic signature] \vphantom{a} +\begin{quote} +\texttt{<Key, Value where Key:\ Hashable, Value:\ Equatable>} +\end{quote} +\item[Witness substitution map] This is again the identity substitution map because each generic parameter is replaced with its canonical form. +\[ +\SubstMapLongC{ +\SubstType{Key}{\ttgp{0}{0}}\\ +\SubstType{Value}{\ttgp{0}{1}} +}{ +\SubstConf{Key}{\ttgp{0}{0}}{Hashable}\\ +\SubstConf{Value}{\ttgp{0}{1}}{Equatable} +} +\] + +\item[Requirement generic signature] \vphantom{a} +\begin{quote} +\texttt{<Self where Self:\ Equatable>} +\end{quote} +\item[Requirement substitution map] \vphantom{a} +\[ +\SubstMapC{ +\SubstType{Self}{Dictionary<\ttgp{0}{0}, \ttgp{0}{1}>} +}{ +\SubstConf{Self}{Dictionary<\ttgp{0}{0}, \ttgp{0}{1}>}{Equatable}\\ +\text{with conditional requirement \texttt{\ttgp{0}{1}:\ Equatable}} +} +\] +\end{description} +\end{example} + +\vfill +\eject + +\begin{example} Witness is in a protocol extension.
+\begin{Verbatim} +protocol Shape { + var children: [any Shape] { get } +} + +protocol PrimitiveShape: Shape {} + +extension PrimitiveShape { + var children: [any Shape] { return [] } +} + +struct Empty: PrimitiveShape {} +\end{Verbatim} +\begin{description} +\item[Witness thunk signature] None. +\item[Witness generic signature] \vphantom{a} +\begin{quote} +\texttt{<Self where Self:\ PrimitiveShape>} +\end{quote} +\item[Witness substitution map] \vphantom{a} +\[ +\SubstMapC{ +\SubstType{Self}{Empty} +}{ +\SubstConf{Self}{Empty}{PrimitiveShape} +} +\] + +\item[Requirement generic signature] \vphantom{a} +\begin{quote} +\texttt{<Self where Self:\ Shape>} +\end{quote} +\item[Requirement substitution map] \phantom{a} +\[ +\SubstMapC{ +\SubstType{Self}{Empty} +}{ +\SubstConf{Self}{Empty}{Shape} +} +\] +\end{description} +\end{example} + +\vfill +\eject + +\begin{example} Conforming type is a generic class, and the witness is in a protocol extension. +\begin{Verbatim} +protocol Cloneable { + init(from: Self) + func clone() -> Self +} + +extension Cloneable { + func clone() -> Self { + return Self(from: self) + } +} + +class Box<Contents>: Cloneable { + var contents: Contents + + required init(from other: Self) { + self.contents = other.contents + } +} +\end{Verbatim} +\begin{description} +\item[Witness thunk signature] \vphantom{a} +\begin{quote} +\texttt{<\ttgp{0}{0}, \ttgp{1}{0} where \ttgp{0}{0}:\ Box<\ttgp{1}{0}>>} +\end{quote} +\item[Witness generic signature] \vphantom{a} +\begin{quote} +\texttt{<Self where Self:\ Cloneable>} +\end{quote} +\item[Witness substitution map] \vphantom{a} +\[ +\SubstMapC{ +\SubstType{Self}{\ttgp{0}{0}} +}{ +\SubstConf{Self}{Box<\ttgp{1}{0}>}{Cloneable} +} +\] + +\item[Requirement generic signature] \vphantom{a} +\begin{quote} +\texttt{<Self where Self:\ Cloneable>} +\end{quote} +\item[Requirement substitution map] \phantom{a} +\[ +\SubstMapC{ +\SubstType{Self}{\ttgp{0}{0}} +}{ +\SubstConf{Self}{Box<\ttgp{1}{0}>}{Cloneable} +} +\] +\end{description} +\end{example} + +\vfill +\eject + +\begin{example} Requirement is generic.
+\begin{Verbatim} +protocol Q {} + +protocol P { + func f<A: Q>(_: A) +} + +struct Outer<T> { + struct Inner<U>: P { + func f<A: Q>(_: A) {} + } +} +\end{Verbatim} +\begin{description} +\item[Witness thunk signature] \vphantom{a} +\begin{quote} +\texttt{<\ttgp{0}{0}, \ttgp{1}{0}, \ttgp{2}{0} where \ttgp{2}{0}:\ Q>} +\end{quote} +\item[Witness generic signature] \vphantom{a} +\begin{quote} +\texttt{<T, U, A where A:\ Q>} +\end{quote} +\item[Witness substitution map] \vphantom{a} +\[ +\SubstMapC{ +\SubstType{T}{\ttgp{0}{0}}\\ +\SubstType{U}{\ttgp{1}{0}}\\ +\SubstType{A}{\ttgp{2}{0}} +}{ +\SubstConf{A}{\ttgp{2}{0}}{Q} +} +\] + +\item[Requirement generic signature] \vphantom{a} +\begin{quote} +\texttt{<Self, A where Self:\ P, A:\ Q>} +\end{quote} +\item[Requirement substitution map] \phantom{a} +\[ +\SubstMapC{ +\SubstType{Self}{Outer<\ttgp{0}{0}>.Inner<\ttgp{1}{0}>}\\ +\SubstType{A}{\ttgp{2}{0}} +}{ +\SubstConf{A}{\ttgp{2}{0}}{Q} +} +\] +\end{description} +\end{example} + +\vfill +\eject + +\begin{example} Witness is more generic than the requirement. +\begin{Verbatim} +protocol P { + associatedtype A: Equatable + associatedtype B: Equatable + + func f(_: A, _: B) +} + +struct S<A: Equatable>: P { + typealias B = Int + + func f<T: Equatable, U: Equatable>(_: T, _: U) {} +} +\end{Verbatim} +The type witness for \texttt{A} is the generic parameter \texttt{A}, and the type witness for \texttt{B} is the concrete type \texttt{Int}. +The witness \texttt{S.f()} for \texttt{P.f()} is generic, and can be called with any two types that conform to \texttt{Equatable}. Since the type witnesses for \texttt{A} and \texttt{B} are both \texttt{Equatable}, a fixed specialization of \texttt{S.f()} witnesses \texttt{P.f()}.
+ +\begin{description} +\item[Witness thunk signature] \vphantom{a} +\begin{quote} +\texttt{<\ttgp{0}{0} where \ttgp{0}{0}:\ Equatable>} +\end{quote} +\item[Witness generic signature] \vphantom{a} +\begin{quote} +\texttt{<A, T, U where A:\ Equatable, T:\ Equatable, U:\ Equatable>} +\end{quote} +\item[Witness substitution map] \vphantom{a} +\[ +\SubstMapC{ +\SubstType{A}{\ttgp{0}{0}}\\ +\SubstType{T}{\ttgp{0}{0}}\\ +\SubstType{U}{Int} +}{ +\SubstConf{A}{\ttgp{0}{0}}{Equatable}\\ +\SubstConf{T}{\ttgp{0}{0}}{Equatable}\\ +\SubstConf{U}{Int}{Equatable} +} +\] + +\item[Requirement generic signature] \vphantom{a} +\begin{quote} +\texttt{<Self where Self:\ P>} +\end{quote} +\item[Requirement substitution map] \phantom{a} +\[ +\SubstMapC{ +\SubstType{Self}{S<\ttgp{0}{0}>} +}{ +\SubstConf{Self}{S<\ttgp{0}{0}>}{P} +} +\] +\end{description} +\end{example} + +\fi + +\section[]{Source Code Reference} \iffalse @@ -264,4 +881,4 @@ \section{Source Code Reference} \fi -\end{document} \ No newline at end of file +\end{document} diff --git a/docs/Generics/chapters/compilation-model.tex b/docs/Generics/chapters/compilation-model.tex index 812a2d0895b8b..675d1a2b6ece1 100644 --- a/docs/Generics/chapters/compilation-model.tex +++ b/docs/Generics/chapters/compilation-model.tex @@ -17,11 +17,11 @@ \chapter{Compilation Model}\label{compilation model} Executables must define a \emph{main function}, which is the entry point invoked when the executable is run. There are three mechanisms for doing so: \begin{enumerate} \item If only a single source file was provided, this file becomes the \emph{main source file} of the module. If there are multiple source files and one of them is named \texttt{main.swift}, then this file becomes the main source file. The main source file is special, in that it can contain statements at the top level, outside of a function body. Top-level statements are collected into \emph{top-level code declarations}, and the frontend generates a main function which executes each top-level code declaration in source order.
Source files other than the main source file cannot contain statements at the top level. -\item In the absence of a main source file, a struct, enum or class declaration can instead be annotated with the \texttt{@main} attribute, in which case the declaration must contain a static method named \texttt{main()}. This method becomes the main entry point. This attribute was introduced in Swift 5.3~\cite{se0281}. -\item The \texttt{@NSApplicationMain} and \texttt{@UIApplicationMain} attributes are an older way to specify the main entry point on Apple platforms. If one of these attributes is attached to a class conforming to the \texttt{NSApplicationMain} or \texttt{UIApplicationMain} protocol, a main entry point is generated which calls the \texttt{NSApplicationMain()} or \texttt{UIApplicationMain()} system framework function. +\item In the absence of a main source file, a struct, enum or class declaration can instead be annotated with the \texttt{@main} attribute, in which case the declaration must contain a static method named \texttt{main()}. This method becomes the main entry point. This attribute was introduced in \IndexSwift{5.3}Swift 5.3~\cite{se0281}. +\item The older \texttt{@NSApplicationMain} and \texttt{@UIApplicationMain} attributes, deprecated since \IndexSwift{5.a@5.10}Swift 5.10~\cite{se0383}, provided a similar mechanism specific to Apple platforms. Attaching one of these attributes to a class conforming to \texttt{NSApplicationDelegate} or \texttt{UIApplicationDelegate}, respectively, will generate a main entry point which calls the \texttt{NSApplicationMain()} or \texttt{UIApplicationMain()} system framework function.
\end{enumerate} -To build a \index{framework}framework (Apple jargon for a \index{shared library}shared library), the driver is invoked with the \texttt{-emit-library} and \texttt{-emit-module} flags instead, which generates a shared library, together with the serialized module file consumed by the compiler when importing the framework (Section~\ref{module system}): +Invoking the driver with the \IndexFlag{emit-library}\texttt{-emit-library} and \IndexFlag{emit-module}\texttt{-emit-module} flags instructs it to generate a shared library, together with the serialized module file consumed by the compiler when importing the library (\SecRef{module system}): \begin{Verbatim} $ swiftc algorithm.swift utils.swift -module-name SudokuSolver -emit-library -emit-module @@ -31,16 +31,16 @@ \chapter{Compilation Model}\label{compilation model} The \IndexDefinition{Swift frontend}Swift frontend itself is single-threaded, but the driver can benefit from multi-core concurrency by running multiple \IndexDefinition{frontend job}frontend jobs in parallel. Each frontend job compiles one or more source files; these are the \IndexDefinition{primary file}\emph{primary source files} of the frontend job. All non-primary source files are the \IndexDefinition{secondary file}\emph{secondary source files} of the frontend job. The assignment of primary source files to each frontend job is determined by the \emph{compilation mode}: \begin{itemize} \item The \IndexFlag{wmo}\texttt{-wmo} driver flag selects \IndexDefinition{whole module optimization}\emph{whole module mode}, typically used for \index{release build}release builds. In this mode, the driver schedules a single frontend job. The primary files of this job are all the source files in the main module, and there are no secondary files. In whole module mode, the frontend is able to perform more aggressive optimization across source file boundaries, hence its usage for release builds. 
-\item The \IndexFlag{disable-batch-mode}\texttt{-disable-batch-mode} driver flag selects \IndexDefinition{single file mode}\emph{single file mode}, with one frontend job per source file. In this mode, each frontend job has a single primary file, with all other files being secondary files. Single file mode was the default for \index{debug build}debug builds until Swift~4.1, however these days it is only used for testing the compiler. +\item The \IndexFlag{disable-batch-mode}\texttt{-disable-batch-mode} driver flag selects \IndexDefinition{single file mode}\emph{single file mode}, with one frontend job per source file. In this mode, each frontend job has a single primary file, with all other files being secondary files. Single file mode was the default for \index{debug build}debug builds until \IndexSwift{4.1}Swift~4.1; however, these days it is only used for testing the compiler. Single file mode incurs unavoidable overhead in the form of duplicated work between frontend jobs; if two source files reference the same declaration in a third source file, the two frontend jobs will both need to parse and type check this declaration as there is no caching across frontend jobs (the next two sections detail how the frontend deals with secondary files, with delayed parsing and the request evaluator respectively). \item The \IndexFlag{enable-batch-mode}\texttt{-enable-batch-mode} driver flag selects \IndexDefinition{batch mode}\emph{batch mode}, which is a happy medium between whole module and single file mode. In batch mode, the list of source files is partitioned into fixed-size batches, up to the maximum batch size. The source files in each batch become the primary files of each frontend job. -By compiling multiple primary files in a single frontend job, batch mode amortizes the cost of parsing and type checking work performed on secondary files. At the same time, it still schedules multiple frontend jobs for parallelism on multi-core systems.
\index{history}Batch mode was first introduced in Swift 4.2, and is now the default for debug builds. +By compiling multiple primary files in a single frontend job, batch mode amortizes the cost of parsing and type checking work performed on secondary files. At the same time, it still schedules multiple frontend jobs for parallelism on multi-core systems. Batch mode was first introduced in \IndexSwift{4.2}Swift 4.2, and is now the default for debug builds. \end{itemize} -Note that each source file is a primary source file of exactly one frontend job, and within a single frontend job, the primary files and secondary files together form the full list of source files in the module. A single source file is therefore the minimum unit of parallelism. By default, the number of concurrent frontend jobs is determined by the number of CPU cores; this can be overridden with the \IndexFlag{j}\texttt{-j} driver flag. If there are more frontend jobs than can be run simultaneously, the driver queues them and kicks them off as other frontend jobs complete. In batch mode and single file mode, the driver can also perform an \index{incremental build}\emph{incremental build} by re-using the result of previous compilations, providing an additional compile-time speedup. Incremental builds are described in Section~\ref{request evaluator}. +Note that each source file is a primary source file of exactly one frontend job, and within a single frontend job, the primary files and secondary files together form the full list of source files in the module. A single source file is therefore the minimum unit of parallelism. By default, the number of concurrent frontend jobs is determined by the number of CPU cores; this can be overridden with the \IndexFlag{j}\texttt{-j} driver flag. If there are more frontend jobs than can be run simultaneously, the driver queues them and kicks them off as other frontend jobs complete. 
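+
+For concreteness, here is a sketch of how the three modes might be selected on the command line (file names are hypothetical; the flags are the ones described above):
+\begin{Verbatim}
+$ swiftc -wmo a.swift b.swift c.swift                      # whole module mode
+$ swiftc -enable-batch-mode a.swift b.swift c.swift        # batch mode
+$ swiftc -disable-batch-mode -j 2 a.swift b.swift c.swift  # single file mode
+\end{Verbatim}
+The first invocation schedules one frontend job for all three files; the second partitions the files into batches; the third schedules one frontend job per file, running at most two at a time.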
In batch mode and single file mode, the driver can also perform an \index{incremental build}\emph{incremental build} by re-using the result of previous compilations, providing an additional compile-time speedup. Incremental builds are described in \SecRef{request evaluator}. The \index[flags]{###@\texttt{-\#\#\#}}\verb|-###| driver flag performs a ``dry run'' which prints all commands to run without actually doing anything. In the below example, the driver schedules three frontend jobs, with each job having a single primary source file and two secondary files. The final command is the linker invocation, which combines the output of each frontend job into our binary executable. \begin{Verbatim} @@ -52,16 +52,16 @@ \chapter{Compilation Model}\label{compilation model} \end{Verbatim} \paragraph{Compilation pipeline.} -The Swift frontend implements a classic multi-stage compiler pipeline, shown in Figure~\ref{compilerpipeline}: +\FigRef{compilerpipeline} shows a high-level view of the Swift frontend; this resembles the classic multi-pass compiler design, described in \cite{muchnick1997advanced} or \cite{cooper2004engineering} for example: \begin{itemize} -\item \IndexDefinition{parser}\textbf{Parse:} First, all source files are parsed to form the \IndexDefinition{abstract syntax tree}\index{AST|see{abstract syntax tree}}\index{syntax tree|see{abstract syntax tree}}abstract syntax \index{tree}tree. -\item \IndexDefinition{Sema}\textbf{Sema:} Semantic analysis type-checks and validates the abstract syntax tree. -\item \IndexDefinition{SILGen}\textbf{SILGen:} The type-checked syntax tree is lowered to \IndexDefinition{raw SIL}``raw SIL.'' SIL is the Swift Intermediate Language, described in \cite{sil} and \cite{siltalk}. +\item \IndexDefinition{parser}\textbf{Parse:} Source files are parsed, building the \IndexDefinition{abstract syntax tree}\index{AST|see{abstract syntax tree}}\index{syntax tree|see{abstract syntax tree}}abstract syntax \index{tree}tree. 
+\item \IndexDefinition{Sema}\textbf{Sema:} Semantic analysis is performed, producing a type-checked syntax tree. (We'll see shortly the first two stages are not completely sequential.) +\item \IndexDefinition{SILGen}\textbf{SILGen:} The type-checked syntax tree is lowered to \IndexDefinition{raw SIL}``raw SIL.'' \IndexDefinition{SIL}SIL is the Swift Intermediate Language, described in \cite{sil} and \cite{siltalk}. \item \IndexDefinition{SIL optimizer}\textbf{SILOptimizer:} The raw SIL is transformed into \IndexDefinition{canonical SIL}``canonical SIL'' by a series of \IndexDefinition{SIL mandatory pass}\emph{mandatory passes}, which analyze the control flow graph and emit diagnostics; for example, \IndexDefinition{definite initialization}\emph{definite initialization} ensures that all storage locations are initialized. -When the \IndexFlag{O}\texttt{-O} command line flag is specified, the canonical SIL is further optimized by a series of \IndexDefinition{SIL performance pass}\emph{performance passes} with the goal of improving run-time performance and reducing code size. -\item \IndexDefinition{IRGen}\textbf{IRGen:} The optimized SIL is then transformed into LLVM IR. (LLVM is, of course, the project formerly known as the ``Low Level Virtual Machine \cite{llvm}.'') -\item \index{LLVM}\textbf{LLVM:} Finally, the LLVM IR is handed off to LLVM, which performs various lower level optimizations before generating machine code. +When the \IndexFlag{O}\texttt{-O} command line flag is specified, the canonical SIL is optimized by a series of \IndexDefinition{SIL performance pass}\emph{performance passes} to improve run-time performance and code size. +\item \IndexDefinition{IRGen}\textbf{IRGen:} The optimized SIL is then transformed into LLVM IR. +\item \index{LLVM}\textbf{LLVM:} Finally, the LLVM IR is handed off to LLVM, which performs various lower level optimizations before generating machine code. 
(LLVM is, of course, the project formerly known as the ``Low Level Virtual Machine \cite{llvm}.'') \end{itemize} \begin{figure}\captionabove{The compilation pipeline}\label{compilerpipeline} @@ -97,26 +97,26 @@ \chapter{Compilation Model}\label{compilation model} \item \IndexFlag{S}\texttt{-S} prints the \index{assembly language}assembly output by LLVM. \end{itemize} -Each pipeline phase can emit \index{warning}warnings and \index{error}errors, collectively known as \index{diagnostic}\emph{diagnostics}. The parser attempts to recover from errors; the presence of parse errors does not prevent Sema from running. On the other hand, if Sema emits errors, compilation stops; SILGen does not attempt to lower an invalid abstract syntax tree to SIL (but SILGen can emit its own diagnostics, including those that result from lazy type checking of declarations in secondary files). +Each pipeline phase can emit \index{warning}warnings and \index{error}errors, collectively known as \IndexDefinition{diagnostic}\emph{diagnostics}. The parser attempts to recover from errors; the presence of parse errors does not prevent Sema from running. On the other hand, if Sema emits errors, compilation stops; SILGen does not attempt to lower an invalid abstract syntax tree to SIL (but SILGen can emit its own diagnostics, including those that result from lazy type checking of declarations in secondary files). \index{TBD} \index{textual interface} -The compilation pipeline will vary slightly depending on what the driver and frontend were asked to produce. When the frontend is instructed to emit a serialized module file only, and not an object file, compilation stops after the SIL optimizer. When generating a textual interface file or TBD file, compilation stops after Sema. (Textual interfaces are discussed in Section~\ref{module system}. 
A TBD file is a list of symbols in a shared library, which can be consumed by the linker and is faster to generate than the shared library itself; we're not going to talk about them here.) +The compilation pipeline will vary slightly depending on what the driver and frontend were asked to produce. When the frontend is instructed to emit a serialized module file only, and not an object file, compilation stops after the SIL optimizer. When generating a textual interface file or TBD file, compilation stops after Sema. (Textual interfaces are discussed in \SecRef{module system}. A TBD file is a list of symbols in a shared library, which can be consumed by the linker and is faster to generate than the shared library itself; we're not going to talk about them here.) \paragraph{Frontend flags.} \index{frontend flag} -The command line flags listed above are understood by both the driver and the frontend; the driver passes them down to the frontend. Various other flags used for compiler development and debugging and only known to the frontend. If the driver is invoked with the \IndexFlag{frontend}\texttt{-frontend} flag as the first command line flag, then instead of scheduling frontend jobs, the driver spawns a single frontend job, passing it the rest of the command line without further processing: +The flags we listed above for dumping various stages of compiler output are understood by both the driver and the frontend; the driver passes them down to the frontend. Various other flags, used for compiler development and debugging, are only known to the frontend. 
If the driver is invoked with the \IndexFlag{frontend}\texttt{-frontend} flag as the first command line flag, then instead of scheduling frontend jobs, the driver spawns a single frontend job, passing it the rest of the command line without further processing: \begin{Verbatim} $ swiftc -frontend -typecheck -primary-file a.swift b.swift \end{Verbatim} -Another mechanism for passing flags to the frontend is the \IndexFlag{Xfrontend}\texttt{-Xfrontend} flag. When this flag appears in a command-line invocation of the driver, the driver schedules job as usual, but the command line argument that comes immediately after is passed directly to the frontend: +Another mechanism for passing flags to the frontend is the \IndexFlag{Xfrontend}\texttt{-Xfrontend} flag. When this flag appears in a command-line invocation of the driver, the driver schedules jobs as usual, but the command line argument that comes immediately after is passed directly to each frontend job: \begin{Verbatim} $ swiftc a.swift b.swift -Xfrontend -dump-requirement-machine \end{Verbatim} \section{Name Lookup}\label{name lookup} -\IndexDefinition{name lookup}Name lookup is the process of resolving identifiers to declarations. The Swift compiler does not have a distinct ``name binding'' phase; instead, name lookup is queried from various points in the frontend process. Broadly speaking, there are two kinds of name lookup: \IndexDefinition{unqualified lookup}\emph{unqualified lookup} and \IndexDefinition{qualified lookup}\emph{qualified lookup}. An unqualified lookup resolves a single \index{identifier}identifier ``\texttt{foo}'', while qualified lookup resolves an identifier ``\texttt{bar}'' relative to a base, such as a member reference expression ``\texttt{foo.bar}''. There are also three important variants of these two fundamental kinds, for looking up top-level declartions in other modules, resolving operators, and performing dynamic lookups of Objective-C methods. 
+\IndexDefinition{name lookup}Name lookup is the process of resolving identifiers to declarations. The Swift compiler does not have a distinct ``name binding'' phase; instead, name lookup is queried from various points in the frontend process. Broadly speaking, there are two kinds of name lookup: \IndexDefinition{unqualified lookup}\emph{unqualified lookup} and \IndexDefinition{qualified lookup}\emph{qualified lookup}. An unqualified lookup resolves a single \index{identifier}identifier ``\texttt{foo}'', while qualified lookup resolves an identifier ``\texttt{bar}'' relative to a base, such as a \index{member reference expression}member reference expression ``\texttt{foo.bar}''. There are also three important variants of these two fundamental kinds, for looking up top-level declarations in other modules, resolving operators, and performing dynamic lookups of Objective-C methods. \paragraph{Unqualified lookup.} An unqualified lookup is always performed relative to the \index{source location}source location where the \index{identifier}identifier actually appears. The source location may either be in a primary or secondary file. @@ -125,14 +125,13 @@ \section{Name Lookup}\label{name lookup} Unqualified lookup first finds the innermost scope containing the source location, and proceeds to walk the scope tree up to the root, searching each parent node for bindings named by the given identifier. If the lookup reaches the root node, a \IndexDefinition{top-level lookup}\emph{top-level lookup} is performed next. This will look for top-level declarations named by the given identifier, first in all source files of the main module, followed by all imported modules. -The \IndexFlag{dump-scope-maps}\texttt{-dump-scope-maps} frontend flag dumps the scope map for each source file in the main module. Listing~\ref{dump scope map example} shows a simple program together with its scope map. 
- -\begin{listing}\captionabove{Example \texttt{-dump-scope-maps} output}\label{dump scope map example} +The \IndexFlag{dump-scope-maps}\texttt{-dump-scope-maps} frontend flag dumps the scope map for each source file in the main module. For example, with this program: \begin{Verbatim}
func id<T>(_ t: T) -> T {
  let x = t
  return x
}
\end{Verbatim} +We get the scope map below: \begin{Verbatim}[fontsize=\scriptsize,numbers=none] ASTSourceFileScope 0x14c131908, [1:1 - 5:1] 'id.swift' `-AbstractFunctionDeclScope 0x14c1392c0, [1:1 - 4:1] 'id(_:)' @@ -143,10 +142,11 @@ \section{Name Lookup}\label{name lookup} `-PatternEntryDeclScope 0x14c139450, [2:7 - 4:1] entry 0 'x' `-PatternEntryInitializerScope 0x14c139450, [2:11 - 2:11] entry 0 'x' \end{Verbatim} -\end{listing} + +Unqualified lookup is an important part of type resolution (\SecRef{identtyperepr}). \paragraph{Qualified lookup.} -A qualified lookup looks within a list of type declarations for members with a given name. Starting from an initial list of type declarations, qualified lookup also visits the superclass of a class declaration, and conformed protocols. The more primitive operation performed at each step is called a \index{direct lookup}\emph{direct lookup}, which searches inside a single type declaration and its extensions only, by consulting the type declaration's \index{member lookup table}\emph{member lookup table}. Direct lookup is explained in detail in Section~\ref{extension binding}. +A qualified lookup searches in a list of type declarations for a member with the given name. Qualified lookup recursively visits the conformed protocols of each struct, enum and class declaration, and the superclass of each class declaration. If the found member was from a protocol or superclass, we apply a substitution map, which will be described in \SecRef{member type repr}. 
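+To illustrate the traversal, here is a small sketch (the declarations are hypothetical); resolving the two member references below requires visiting the superclass and a conformed protocol, respectively:
+\begin{Verbatim}
+protocol Cookable {}
+extension Cookable { func cook() {} }
+class Food { func eat() {} }
+class Soup: Food, Cookable {}
+
+let soup = Soup()
+soup.eat()   // found on the superclass, Food
+soup.cook()  // found in an extension of the protocol Cookable
+\end{Verbatim}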
The primitive operation, which searches inside a single type declaration and its extensions only, is called \index{direct lookup}\emph{direct lookup} (\SecRef{direct lookup}). \paragraph{Module lookup.} \IndexDefinition{module lookup}A qualified lookup where the base is a module declaration searches for a top-level declaration in the given module and any other modules that it re-exports via \texttt{@\_exported import}. @@ -196,9 +196,9 @@ \section{Name Lookup}\label{name lookup} let fn = { ($0 ++ $1) as Bool } \end{Verbatim} \end{listing} -Listing~\ref{customops} shows the definition of some custom operators and precedence groups. Note that the overload of \texttt{++} inside struct \texttt{Chicken} returns \texttt{Int}, and the overload of \texttt{++} inside struct \texttt{Sausage} returns \texttt{Bool}. The closure value stored in \texttt{fn} applies \texttt{++} to two anonymous closure parameters, \verb|$0| and \verb|$1|. While they do not have declared types, by simply coercing the \emph{return type} to \texttt{Bool}, we are able to unambiguously pick the overload of \texttt{++} declared in \texttt{Sausage}. (Whether this is good style is left to the reader to judge.) +\ListingRef{customops} shows the definition of some custom operators and precedence groups. Note that the overload of \texttt{++} inside struct \texttt{Chicken} returns \texttt{Int}, and the overload of \texttt{++} inside struct \texttt{Sausage} returns \texttt{Bool}. The closure value stored in \texttt{fn} applies \texttt{++} to two anonymous closure parameters, \verb|$0| and \verb|$1|. While they do not have declared types, by simply coercing the \emph{return type} to \texttt{Bool}, we are able to unambiguously pick the overload of \texttt{++} declared in \texttt{Sausage}. (Whether this is good style is left to the reader to judge.) -Initially, infix operators defined their precedence as an integer value; \index{history}Swift~3 introduced named precedence groups \cite{se0077}. 
The global lookup for operator functions dates back to when all operator functions were declared at the top level. Swift~3 also introduced the ability to declare operator functions as members of types, but the global lookup behavior was retained \cite{se0091}. +Initially, infix operators defined their precedence as an integer value; \IndexSwift{3.0}Swift~3 introduced named precedence groups \cite{se0077}. The global lookup for operator functions dates back to when all operator functions were declared at the top level. Swift~3 also introduced the ability to declare operator functions as members of types, but the global lookup behavior was retained \cite{se0091}. \section{Delayed Parsing}\label{delayed parsing} @@ -248,7 +248,7 @@ \section{Delayed Parsing}\label{delayed parsing} \end{listing} \begin{example}\label{anyobjectdelayedparseex} -Listing~\ref{anyobjectdelayedparse} shows an example of this behavior. This program consists of three files. Suppose that the driver kicks off three frontend jobs, with a single primary file for each frontend job. +\ListingRef{anyobjectdelayedparse} shows an example of this behavior. This program consists of three files. Suppose that the driver kicks off three frontend jobs, with a single primary file for each frontend job. The frontend jobs each do the following: \begin{itemize} @@ -260,7 +260,7 @@ \section{Delayed Parsing}\label{delayed parsing} \section{Request Evaluator}\label{request evaluator} -The \IndexDefinition{request evaluator}\emph{request evaluator} generalizes the idea behind delayed parsing to all of type checking. For various reasons, the classic compiler design, where a single semantic analysis pass walks declarations in source order, is not well-suited for Swift: +The \IndexDefinition{request evaluator}\emph{request evaluator} generalizes the idea behind delayed parsing to all of type checking. 
As with parsing, the classic compiler design, where a single semantic analysis pass walks declarations in source order, is not well-suited for Swift: \begin{itemize} \item Declarations may be written in any order within a Swift source file, without being \index{forward reference}forward declared (unlike \index{Pascal}Pascal or \index{C}C). Expressions and type annotations can also reference declarations in other source files without restriction. Finally, certain kinds of circular references are permitted. @@ -273,7 +273,7 @@ \section{Request Evaluator}\label{request evaluator} Thus, the work of type checking is split up into small, fine-grained \IndexDefinition{request}\emph{requests} which are evaluated on demand, instead of sequentially. There is still a semantic analysis pass that visits the declarations of each primary file in source order, but it merely kicks off requests and emits diagnostics. -Concretely, the request evaluator is a framework for performing queries performed against the \index{abstract syntax tree}abstract syntax tree. A \emph{request} packages a list of input parameters together with an \IndexDefinition{evaluation function}\emph{evaluation function}. With the exception of emitting diagnostics, the evaluation function should be referentially transparent. Only the request evaluator should directly invoke the evaluation function; the request evaluator caches the result of the evaluation function for subsequent requests. As well as caching results, the request evaluator implements automatic cycle detection, and dependency tracking for incremental builds. +Concretely, the request evaluator is a framework for performing queries against the \index{abstract syntax tree}abstract syntax tree. A \emph{request} packages a list of input parameters together with an \IndexDefinition{evaluation function}\emph{evaluation function}. With the exception of emitting diagnostics, the evaluation function should be referentially transparent. 
Only the request evaluator should directly invoke the evaluation function; the request evaluator caches the result of the evaluation function for subsequent requests. The request evaluator also detects request cycles automatically, and tracks dependency information for incremental builds. \IndexDefinition{type-check source file request} \IndexDefinition{AST lowering request} @@ -284,10 +284,10 @@ \section{Request Evaluator}\label{request evaluator} The Swift frontend defines several hundred request kinds; for our purposes, the most important are: \begin{itemize} \item The \Request{type-check source file request} visits each declaration in a primary source file. It is responsible for kicking off enough requests to ensure that SILGen can proceed if all requests succeeded without emitting diagnostics. -\item The \Request{AST lowering request} is the entry point into SILGen, generating SIL from the abstract syntax tree for a source file. +\item The \Request{AST lowering request} is the entry point into \index{SILGen}SILGen, generating SIL from the abstract syntax tree for a source file. \item The \Request{unqualified lookup request} and \Request{qualified lookup request} perform the two kinds of name lookup described in the previous section. -\item The \Request{interface type request} is explained in Chapter~\ref{decls}. -\item The \Request{generic signature request} is explained in Chapter~\ref{building generic signatures}. +\item The \Request{interface type request} is explained in \ChapRef{decls}. +\item The \Request{generic signature request} is explained in \ChapRef{building generic signatures}. \end{itemize} \begin{example} @@ -297,7 +297,7 @@ \section{Request Evaluator}\label{request evaluator} func cook() -> Food {} struct Food {} \end{Verbatim} -Notice how the initial value expression of the variable references the function, and the function's return type is the struct declared immediately after, so the inferred type of the variable is then this struct. 
This plays out with the request evaluator: +Notice how the \index{initial value expression}initial value expression of the variable references the function, and the function's return type is the struct declared immediately after, so the inferred type of the variable is then this struct. This plays out with the request evaluator: \begin{enumerate} \item The \Request{type-check source file request} begins by visiting the declaration of \texttt{food} and performing various semantic checks. \item One of these checks evaluates the \Request{interface type request} with the declaration of \texttt{food}. This is a variable declaration, so the evaluation function will type check the initial value expression and return the type of the result. @@ -313,7 +313,7 @@ \section{Request Evaluator}\label{request evaluator} \end{enumerate} \end{example} -The \Request{type-check source file request} is special, because it does not return a value; it is evaluated for the side effect of emitting diagnostics, whereas most other requests return a value. The implementation of the \Request{type-check source file request} guarantees that if no diagnostics were emitted, then SILGen can generate valid SIL for all declarations in a primary file. However, the next example shows that SILGen can still evaluate other requests which result in diagnostics being emitted in secondary files. +The \Request{type-check source file request} is special, because it does not return a value; it is evaluated for the side effect of emitting diagnostics, whereas most other requests return a value. The implementation of the \Request{type-check source file request} guarantees that if no diagnostics were emitted, then \index{SILGen}SILGen can generate valid SIL for all declarations in a primary file. However, the next example shows that SILGen can encounter invalid declarations, and diagnose errors in secondary files. 
\begin{example} Suppose we run a frontend job with the below primary file: @@ -329,10 +329,10 @@ \section{Request Evaluator}\label{request evaluator} } \end{Verbatim} -Our frontend job does not emit any diagnostics in the semantic analysis pass, because the \texttt{contents} stored property of \texttt{Box} is not actually referenced while type checking the primary file \texttt{a.swift}. However when SILGen runs, it needs to determine whether the parameter of type \texttt{Box} to the \texttt{open()} function needs to be passed directly in registers, or via an address by computing the \emph{type lowering} for the \texttt{Box} type. Type lowering recursively visits the stored properties of \texttt{Box} and computes their type lowering; this evaluates the \index{interface type request}\Request{interface type request} for the \texttt{contents} property of \texttt{Box}, which emits a diagnostic because the identifier \index{identifier}``\texttt{DoesNotExist}'' does not resolve to a valid type. This also means that SILGen must be prepared to deal with a potentially invalid abstract syntax tree. +Our frontend job does not emit any diagnostics in the semantic analysis pass, because the \texttt{contents} stored property of \texttt{Box} is not actually referenced while type checking the primary file \texttt{a.swift}. However, when SILGen runs, it needs to determine whether the parameter of type \texttt{Box} to the \texttt{open()} function is passed directly in registers or via an address, by computing the \emph{type lowering} for the \texttt{Box} type. The type lowering procedure recursively computes the type lowering of each stored property of \texttt{Box}; this evaluates the \index{interface type request}\Request{interface type request} for the \texttt{contents} property of \texttt{Box}, which emits a diagnostic because the identifier \index{identifier}``\texttt{DoesNotExist}'' does not resolve to a valid type. 
The interface type of the stored property then becomes the \index{error type}error type. \end{example} -The request evaluator framework was first introduced in \index{history}Swift~4.2 \cite{reqeval}. In subsequent releases, various ad-hoc mechanisms were gradually converted into request evaluator requests, with resulting gains to compiler performance, stability, and implementation maintainability. +The request evaluator framework was first introduced in \IndexSwift{4.2}Swift~4.2 \cite{reqeval}. In subsequent releases, various ad-hoc mechanisms were gradually converted into request evaluator requests, with resulting gains to compiler performance, stability, and implementation maintainability. \paragraph{Cycles.} In a language supporting \index{forward reference}forward references, it is possible to write a program that is syntactically well-formed, and where all identifiers resolve to valid declarations, but is nonetheless invalid because of circularity. The classic example of this is a pair of classes where each class \index{circular inheritance}inherits from the other: \begin{Verbatim} @@ -383,7 +383,7 @@ \section{Incremental Builds}\label{incremental builds} \item Do an incremental build, which rebuilds some subset of source files in the input program. If a source file was rebuilt but the resulting object file is identical to the one saved in Step~1, the incremental build performed \emph{wasted work}. \item Finally, do another clean build, which yet again rebuilds all source files in the input program. If a source file was rebuilt and the resulting object file is different to the one saved in Step~1, the incremental build was \emph{incorrect}. \end{enumerate} -This highlights the difficulty of the incremental compilation problem. Rebuilding \emph{too many} files is an annoyance; rebuilding \emph{too few} files is an error. A correct but ineffective implementation would rebuild all source files every time. 
The opposite approach of only rebuilding the subset of source files that have changed since the last compiler invocation is also too aggressive. To see why it is incorrect, consider the program shown in Listing~\ref{incrlisting1}. Let's say the programmer builds the program, adds the overload \verb|f: (Int) -> ()|, then builds it again. The new overload is more specific, so the call \texttt{f(123)} in \texttt{b.swift} now refers to the new overload; therefore, \texttt{b.swift} must also be rebuilt. +This highlights the difficulty of the incremental compilation problem. Rebuilding \emph{too many} files is an annoyance; rebuilding \emph{too few} files is an error. A correct but ineffective implementation would rebuild all source files every time. The opposite approach of only rebuilding the subset of source files that have changed since the last compiler invocation is also too aggressive. To see why it is incorrect, consider the program shown in \ListingRef{incrlisting1}. Let's say the programmer builds the program, adds the overload \verb|f: (Int) -> ()|, then builds it again. The new overload is more specific, so the call \texttt{f(123)} in \texttt{b.swift} now refers to the new overload; therefore, \texttt{b.swift} must also be rebuilt. \begin{listing}\captionabove{Rebuilding a file after adding a new overload}\label{incrlisting1} \begin{Verbatim} // a.swift @@ -439,7 +439,7 @@ \section{Incremental Builds}\label{incremental builds} \end{listing} \begin{example} -To understand how request caching interacts with dependency recording, consider the program shown in Listing~\ref{dependencyexample}. Suppose the driver decides to compile \emph{both} \texttt{a.swift} and \texttt{b.swift} in the same frontend job (in fact, the issue at hand can only appear in \index{batch mode}batch mode, when a frontend job has more than one primary file). First, the \Request{type-check source file request} runs with the source file \texttt{a.swift}. 
+To understand how request caching interacts with dependency recording, consider the program shown in \ListingRef{dependencyexample}. Suppose the driver decides to compile \emph{both} \texttt{a.swift} and \texttt{b.swift} in the same frontend job (in fact, the issue at hand can only appear in \index{batch mode}batch mode, when a frontend job has more than one primary file). First, the \Request{type-check source file request} runs with the source file \texttt{a.swift}. \begin{enumerate} \item While type checking the body of \texttt{breakfast()}, the type checker evaluates the \Request{unqualified lookup request} with the identifier ``\texttt{soup}.'' \item This records the identifier ``\texttt{soup}'' in the requires list of each active request. There is one active request, the \Request{type-check source file request} for \texttt{a.swift}. @@ -470,17 +470,17 @@ \section{Module System}\label{module system} The frontend represents a module by a \IndexDefinition{module declaration}\emph{module declaration} containing one or more \IndexDefinition{file unit}\emph{file units}. The list of source files in a compiler invocation form the \IndexDefinition{main module}\emph{main module}. The main module is special, because its \index{abstract syntax tree}abstract syntax tree is constructed directly by parsing source code; the file units are \IndexDefinition{source file}\emph{source files}. There are three other kinds of modules: \begin{itemize} -\item Serialized modules, containing one or more \IndexDefinition{serialized AST file unit}\emph{serialized AST file units}. When the main module imports another module written in Swift, the frontend reads a serialized module that was previously built. +\item \textbf{Serialized modules} containing one or more \IndexDefinition{serialized AST file unit}\emph{serialized AST file units}. When the main module imports another module written in Swift, the frontend reads a serialized module that was previously built. 
-\item Imported modules, consisting of one or more \IndexDefinition{Clang file unit}\emph{Clang file units}. These are the modules implemented in C, Objective-C or C++. +\item \textbf{Imported modules} consisting of one or more \IndexDefinition{Clang file unit}\emph{Clang file units}. These are the modules implemented in C, Objective-C or C++. -\item The builtin module, containing types and intrinsics implemented by the compiler itself. +\item \textbf{The builtin module} with exactly one file unit, containing types and intrinsics implemented by the compiler itself. \end{itemize} -The main module depends on other modules via the \texttt{import} keyword, which parses as an \IndexDefinition{import declaration}\emph{import declaration}. After parsing, one of the first stages in semantic analysis loads all modules imported the main module. The standard library is defined in the \texttt{Swift} module, which is imported automatically unless the frontend was invoked with the \IndexFlag{parse-stdlib}\texttt{-parse-stdlib} flag, used when building the standard library itself. As for the builtin module, it is ordinarily not visible, but the \texttt{-parse-stdlib} flag also causes it to be implicitly imported (Section~\ref{misc types}). +The main module depends on other modules via the \texttt{import} keyword, which parses as an \IndexDefinition{import declaration}\emph{import declaration}. After parsing, one of the first stages in semantic analysis loads all modules imported by the main module. The standard library is defined in the \texttt{Swift} module, which is imported automatically unless the frontend was invoked with the \IndexFlag{parse-stdlib}\texttt{-parse-stdlib} flag, used when building the standard library itself. As for the builtin module, it is ordinarily not visible, but the \texttt{-parse-stdlib} flag also causes it to be implicitly imported (\SecRef{misc types}). 
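+For example, semantic analysis for a source file like the following (the module name \texttt{CoreSoup} is hypothetical) loads two modules, one explicitly and one implicitly:
+\begin{Verbatim}
+import CoreSoup  // explicit import declaration
+
+// The Swift module is imported implicitly, so standard
+// library declarations such as print() are visible:
+print("hello")
+\end{Verbatim}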
\paragraph{Serialized modules.} The \IndexFlag{emit-module}\texttt{-emit-module} flag instructs the compiler to generate a \index{binary module|see{serialized module}}\IndexDefinition{serialized module}serialized module. Serialized module files use the ``\texttt{.swiftmodule}'' file name extension. Serialized modules are stored in a binary format, closely tied to the specific version of the Swift compiler (when building a shared library for distribution, it is better to publish a textual interface instead, as described at the end of this section). -Name lookup into a serialized module lazily constructs declarations by deserializing records from this binary format as needed. Deserialized declarations generally look like parsed and fully type-checked declarations, but they sometimes contain less information. For example, in Chapter~\ref{generic declarations}, we will see various syntactic representations of generic parameter lists, \texttt{where} clauses, and so on. Since this information is only used when type checking the declaration, it is not serialized. Instead, deserialized declarations only need to store a generic signature, described in Chapter~\ref{genericsig}. +Name lookup into a serialized module lazily constructs declarations by deserializing records from this binary format as needed. Deserialized declarations generally look like parsed and fully type-checked declarations, but they sometimes contain less information. For example, in \SecRef{requirements}, we will describe various syntactic representations of requirements, such as \texttt{where} clauses. Since this information is only used when type checking the declaration, it is not serialized. Instead, deserialized declarations only need to store a generic signature, described in \ChapRef{genericsig}. \index{expression} \index{statement} @@ -512,13 +512,13 @@ \section{Module System}\label{module system} \begin{enumerate} \item Non-\texttt{@inlinable} function bodies are skipped. 
Bodies of \texttt{@inlinable} functions are printed verbatim, including comments, except that \verb|#if| conditions are evaluated. \item Various synthesized declarations, such as type alias declarations from associated type inference, witnesses for derived conformances such as \texttt{Equatable}, and so on, are written out explicitly. -\item Opaque return types also require special handling (Section~\ref{reference opaque archetype}). +\item Opaque return types also require special handling (\SecRef{reference opaque archetype}). \end{enumerate} Note that (1) above means the textual interface format is target-specific; a separate textual interface needs to be generated for each target platform, alongside the shared library itself. When a module defined by a textual interface is imported for the first time, a frontend job parses and type checks the textual interface, and generates a serialized module file which is then consumed by the original frontend job. Serialized module files generated in this manner are cached, and can be reused between invocations of the same compiler version. -The \texttt{@inlinable} attribute was introduced in Swift 4.2~\cite{se0193}. The Swift \index{ABI}ABI was formally stabilized in Swift 5.0, when the standard library became part of the operating system on Apple platforms. Library evolution support and textual interfaces became user-visible features in Swift 5.1~\cite{se0260}. +The \texttt{@inlinable} attribute was introduced in \IndexSwift{4.2}Swift 4.2~\cite{se0193}. The Swift \index{ABI}ABI was formally stabilized in \IndexSwift{5.0}Swift 5, when the standard library became part of the operating system on Apple platforms. Library evolution support and textual interfaces became user-visible features in \IndexSwift{5.1}Swift 5.1~\cite{se0260}. 
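+To illustrate which function bodies appear in a textual interface, consider a library containing these two functions (hypothetical declarations); only the first has its body printed:
+\begin{Verbatim}
+@inlinable public func reheat() -> Int {
+  return 123  // body appears verbatim in the interface
+}
+
+public func simmer() -> Int {
+  return 456  // body is omitted from the interface
+}
+\end{Verbatim}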
\section{Source Code Reference}\label{compilation model source reference} @@ -533,7 +533,7 @@ \section{Source Code Reference}\label{compilation model source reference} \begin{quote} \url{https://github.com/apple/swift} \end{quote} -The major components of the Swift frontend live in their own subdirectories of the main repository. The entities modeling the abstract syntax tree are defined in \SourceFile{lib/AST/} and \SourceFile{include/swift/AST/}; among these, types and declarations are important for the purposes of this book, and will be covered in Chapter~\ref{types} and Chapter~\ref{decls}. The core of the SIL intermediate language is implemented in \SourceFile{lib/SIL/} and \SourceFile{include/swift/SIL/}. +The major components of the Swift frontend live in their own subdirectories of the main repository. The entities modeling the abstract syntax tree are defined in \SourceFile{lib/AST/} and \SourceFile{include/swift/AST/}; among these, types and declarations are important for the purposes of this book, and will be covered in \ChapRef{types} and \ChapRef{decls}. The core of the SIL intermediate language is implemented in \SourceFile{lib/SIL/} and \SourceFile{include/swift/SIL/}. Each stage of the compilation pipeline has its own subdirectory: \begin{itemize} @@ -571,7 +571,7 @@ \subsection*{Request Evaluator} \item \texttt{RequestFlags::Uncached}: indicates that the result of the evaluation function should not be cached. \item \texttt{RequestFlags::Cached}: indicates that the result of the evaluation function should be cached by the request evaluator, which uses a per-request kind \texttt{DenseMap} for this purpose. \item \texttt{RequestFlags::SeparatelyCached}: the result of the evaluation function should be cached by the request implementation itself, as described below. 
-\item \texttt{RequestFlags::DependencySource}, \texttt{DependencySink}: if one of these is set, the request kind becomes a dependency source or sink, as described in Section~\ref{incremental builds}. +\item \texttt{RequestFlags::DependencySource}, \texttt{DependencySink}: if one of these is set, the request kind becomes a dependency source or sink, as described in \SecRef{incremental builds}. \end{itemize} Separate caching can be more performant if it allows the cached value to be stored directly inside of an AST node, instead of requiring the request evaluator to consult a side table. For example, many requests taking a declaration as input store the result directly inside of the \texttt{Decl} instance or some subclass thereof. @@ -635,7 +635,7 @@ \subsection*{Name Lookup} Flags passed as part of an \texttt{UnqualifiedLookupDescriptor}. \begin{itemize} \item \texttt{UnqualifiedLookupFlags::TypeLookup}: if set, lookup ignores declarations other than type declarations. This is used in type resolution. -\item \texttt{UnqualifiedLookupFlags::AllowProtocolMembers}: if set, lookup finds members of protocols and protocol extensions. Generally should always be set, except to avoid request cycles in cases where it is known the result of the lookup cannot appear in a protocol or protocol extensions. +\item \texttt{UnqualifiedLookupFlags::AllowProtocolMembers}: if set, lookup finds members of protocols and \IndexSource{protocol extension}protocol extensions. Generally should always be set, except to avoid request cycles in cases where it is known that the result of the lookup cannot appear in a protocol or protocol extension. \item \texttt{UnqualifiedLookupFlags::IgnoreAccessControl}: if set, lookup ignores access control. Generally should never be set, except when recovering from errors in diagnostics. \item \texttt{UnqualifiedLookupFlags::IncludeOuterResults}: if set, lookup does not stop after finding results in an innermost scope, but always proceeds to a top-level lookup.
\end{itemize} @@ -643,7 +643,7 @@ \subsection*{Name Lookup} \index{declaration context} \IndexSource{qualified lookup} \apiref{DeclContext}{class} -Declaration contexts will be introduced in Chapter~\ref{decls}, and the \texttt{DeclContext} class in Section~\ref{declarationssourceref}. +Declaration contexts will be introduced in \ChapRef{decls}, and the \texttt{DeclContext} class in \SecRef{declarationssourceref}. \begin{itemize} \item \texttt{lookupQualified()} has various overloads, which perform a qualified name lookup into one of various combinations of types or declarations. The ``\texttt{this}'' parameter---the \texttt{DeclContext~*} on which the method is called---determines the visibility of declarations found via lookup through imports and access control; it is not the base type of the lookup. \end{itemize} @@ -658,7 +658,7 @@ \subsection*{Name Lookup} \IndexSource{direct lookup} \apiref{NominalTypeDecl}{class} -Nominal type declarations will be introduced in Chapter~\ref{decls}, and the \texttt{NominalTypeDecl} class in Section~\ref{declarationssourceref}. The implementation of direct lookup and lazy member loading is discussed in Section~\ref{extensionssourceref}. +Nominal type declarations will be introduced in \ChapRef{decls}, and the \texttt{NominalTypeDecl} class in \SecRef{declarationssourceref}. The implementation of direct lookup and lazy member loading is discussed in \SecRef{extensionssourceref}. \begin{itemize} \item \texttt{lookupDirect()} performs a direct lookup, which only searches the nominal type declaration itself and its extensions, ignoring access control. \end{itemize} @@ -690,6 +690,7 @@ \subsection*{Module System} \item \texttt{getFiles()} returns an array of \texttt{FileUnit}. \item \texttt{isMainModule()} answers if this is the main module. \end{itemize} +See \SecRef{conformancesourceref} and \SecRef{extensionssourceref} for the global conformance lookup operations defined on \texttt{ModuleDecl}.
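The request-evaluator caching strategy described earlier in this section can be illustrated with a short sketch. The following Python toy is not the compiler's C++ API; the \texttt{Evaluator} shape, the request-kind string, and the \texttt{superclass\_of} evaluation function are all invented for illustration. It memoizes results in a per-request-kind table (mimicking the per-kind \texttt{DenseMap}) and tracks in-flight requests to detect cycles:

```python
# Toy model of a request evaluator: results are memoized in a per-request-kind
# table, and in-flight requests are tracked to detect request cycles.
# Illustrative only; not the Swift compiler's implementation.

class Evaluator:
    def __init__(self):
        self.caches = {}     # request kind -> {input: result}
        self.active = set()  # in-flight (kind, input) pairs, for cycle detection

    def __call__(self, kind, value, compute, cached=True):
        # Cached requests consult the side table first.
        if cached and value in self.caches.get(kind, {}):
            return self.caches[kind][value]
        key = (kind, value)
        if key in self.active:
            raise RuntimeError(f"request cycle: {key}")
        self.active.add(key)
        try:
            result = compute(value)
        finally:
            self.active.discard(key)
        if cached:
            self.caches.setdefault(kind, {})[value] = result
        return result

evaluator = Evaluator()
calls = []

def superclass_of(name):
    # Stand-in evaluation function over a hypothetical class hierarchy.
    calls.append(name)
    return {"Derived": "Base"}.get(name)

a = evaluator("SuperclassRequest", "Derived", superclass_of)
b = evaluator("SuperclassRequest", "Derived", superclass_of)
print(a, b, len(calls))  # Base Base 1 -- the second request hits the cache
```

In this analogy, separate caching would correspond to the evaluation function storing its result on the AST node itself, with the evaluator skipping the side table entirely.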
\apiref{FileUnit}{class} Abstract base class representing a file unit. \IndexSource{primary file} diff --git a/docs/Generics/chapters/completion.tex b/docs/Generics/chapters/completion.tex index 2d9225a6fc05a..56eeb9b7d48c5 100644 --- a/docs/Generics/chapters/completion.tex +++ b/docs/Generics/chapters/completion.tex @@ -6,20 +6,20 @@ \chapter{Completion}\label{completion} \IndexDefinition{Knuth-Bendix algorithm}% \index{completion!z@\igobble|seealso{Knuth-Bendix algorithm}} -\lettrine{K}{nuth-Bendix completion} is the central algorithm in the Requirement Machine. Completion attempts to construct a \index{convergent rewrite system}convergent rewrite system from a list of rewrite rules, and a convergent rewrite system allows us to decide if two terms have the same reduced form in a finite number of steps, solving the word problem. As we saw in the previous chapter, our initial rewrite rules are defined by the explicit requirements of a generic signature and its protocol dependencies. A desirable property of this mapping was given by Theorem~\ref{derivation to path}: a \emph{derived} requirement defines a rewrite path over these rewrite rules representing explicit requirements. All of this means that completion gives us a \emph{decision procedure} for the \index{derived requirement}derived requirements formalism: the question of whether any given derived requirement is satisfied---that is, if there exists a valid derivation built from explicit requirements---is easily solved by term reduction in a convergent rewrite system. This is the foundation on which we build both \index{generic signature query}generic signature queries and \index{requirement minimization}minimization. +\lettrine{K}{nuth-Bendix completion} is the central algorithm in the Requirement Machine. 
Completion attempts to construct a \index{convergent rewrite system}convergent rewrite system from a list of rewrite rules, and a convergent rewrite system allows us to decide if two terms have the same reduced form in a finite number of steps, solving the word problem. As we saw in the previous chapter, our initial rewrite rules are defined by the explicit requirements of a generic signature and its protocol dependencies. A desirable property of this mapping was given by \ThmRef{derivation to path}: a \emph{derived} requirement defines a rewrite path over these rewrite rules representing explicit requirements. All of this means that completion gives us a \emph{decision procedure} for the \index{derived requirement}derived requirements formalism: the question of whether any given derived requirement is satisfied---that is, if there exists a valid derivation built from explicit requirements---is easily solved by term reduction in a convergent rewrite system. This is the foundation on which we build both \index{generic signature query}generic signature queries and \index{requirement minimization}minimization. \paragraph{The algorithm.} We'll give a self-contained description first, with much of the rest of the chapter devoted to examples. Our description can be supplemented with any text on rewrite systems, such as \cite{book2012string} or \cite{andallthat}. The algorithm is somewhat clever; to really ``get it'' might require several attempts. \index{Donald~Knuth}Donald~E.~Knuth and \index{Peter Bendix}Peter Bendix described the algorithm for term rewrite systems in a 1970 paper \cite{Knuth1983}; a correctness proof was later given by \index{Gerard Huet@G\'erard Huet}G\'erard Huet in \cite{HUET198111}. In our application, the terms are elements of a free monoid, so we have a string rewrite system; this special case was studied in \cite{narendran}. A survey of related techniques appears in \cite{BUCHBERGER19873}. 
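Before working through the formal pieces, it may help to see the shape of the procedure on a toy string rewrite system. The following Python sketch is a deliberately naive completion loop, not the Requirement Machine's implementation: it enumerates overlaps between left-hand sides, forms each critical pair, reduces both sides, and orients any remaining difference under the shortlex reduction order. The rule trie, rewrite loops, and simplification passes of the real algorithm are all omitted, and the monoid presentation at the end is chosen purely for illustration:

```python
# Naive Knuth-Bendix completion for string rewrite systems (illustration only).
import itertools

def shortlex_less(a, b):
    # The reduction order: shorter terms first, ties broken lexicographically.
    return (len(a), a) < (len(b), b)

def reduce(term, rules):
    # Rewrite until no rule's left-hand side occurs in the term.
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            i = term.find(lhs)
            if i >= 0:
                term = term[:i] + rhs + term[i + len(lhs):]
                changed = True
    return term

def critical_pairs(r1, r2):
    (u1, v1), (u2, v2) = r1, r2
    # Overlap of the second kind: a suffix of u1 equals a prefix of u2, so the
    # overlap term x*w*z can be rewritten in two different ways.
    for k in range(1, min(len(u1), len(u2))):
        if u1[-k:] == u2[:k]:
            x, z = u1[:-k], u2[k:]
            yield v1 + z, x + v2
    # Overlap of the first kind: u2 occurs inside u1 (first occurrence only,
    # which is enough for this toy).
    i = u1.find(u2)
    if i >= 0:
        yield v1, u1[:i] + v2 + u1[i + len(u2):]

def complete(rules):
    rules = list(rules)
    added = True
    while added:
        added = False
        for r1 in list(rules):
            for r2 in list(rules):
                for t1, t2 in critical_pairs(r1, r2):
                    a, b = reduce(t1, rules), reduce(t2, rules)
                    if a != b:  # non-trivial critical pair: orient and add
                        rules.append((b, a) if shortlex_less(a, b) else (a, b))
                        added = True
    return rules

# Toy presentation: a*a = b*b*b = (a*b)^2 = identity (a group of order six).
rules = complete([("aa", ""), ("bbb", ""), ("abab", "")])
words = ["".join(w) for n in range(4) for w in itertools.product("ab", repeat=n)]
forms = sorted({reduce(w, rules) for w in words})
print(forms)  # ['', 'a', 'ab', 'b', 'ba', 'bb'] -- six normal forms
```

Completion here derives rules such as $bab \Rightarrow a$ that are not among the three defining relations, after which every word over the alphabet reduces to one of six normal forms, one per element of the presented group.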
-The entry point into the Knuth-Bendix completion procedure is Algorithm~\ref{knuthbendix}, but we break off four smaller pieces before we get there, so that only the top-level loop remains: +The entry point into the Knuth-Bendix completion procedure is \AlgRef{knuthbendix}, but we break off four smaller pieces before we get there, so that only the top-level loop remains: \begin{itemize} -\item Algorithm~\ref{overlap trie lookup} finds all rules that overlap with a fixed rule at a fixed position. -\item Algorithm~\ref{find overlapping rule algo} finds all pairs of rules that overlap at any position. -\item Algorithm~\ref{critical pair algo} builds a critical pair from a pair of overlapping rules. -\item Algorithm~\ref{add rule derived algo} resolves a critical pair. +\item \AlgRef{overlap trie lookup} finds all rules that overlap with a fixed rule at a fixed position. +\item \AlgRef{find overlapping rule algo} finds all pairs of rules that overlap at any position. +\item \AlgRef{critical pair algo} builds a critical pair from a pair of overlapping rules. +\item \AlgRef{add rule derived algo} resolves a critical pair. \end{itemize} -We begin with Algorithm \ref{critical pair algo}~and~\ref{add rule derived algo}, proceeding from the inside out. The twin concepts of overlapping rule and critical pair are fundamental to the algorithm, and they provide the theoretical justification for the rest. +We begin with Algorithms \ref{critical pair algo}~and~\ref{add rule derived algo}, proceeding from the inside out. The twin concepts of overlapping rule and critical pair are fundamental to the algorithm, and they provide the theoretical justification for the rest. -\paragraph{Local confluence.} We would like our \index{reduction relation}reduction relation $\rightarrow$ to satisfy the \index{Church-Rosser property}Church-Rosser property: if $x\sim y$ are two equivalent terms, then $x\rightarrow z$ and $y\rightarrow z$ for some term $z$. 
By Theorem~\ref{church rosser theorem}, this is equivalent to $\rightarrow$ being \index{confluence}confluent, meaning any two \index{positive rewrite path}positive rewrite paths diverging from a common source can be extended to meet each other. This is difficult to verify directly, but a 1941 paper by Max~Newman~\cite{newman} shows there is a simpler equivalent condition when the reduction relation is \index{terminating reduction relation}terminating. +\paragraph{Local confluence.} We would like our \index{reduction relation}reduction relation $\rightarrow$ to satisfy the \index{Church-Rosser property}Church-Rosser property: if $x\sim y$ are two equivalent terms, then $x\rightarrow z$ and $y\rightarrow z$ for some term $z$. By \ThmRef{church rosser theorem}, this is equivalent to $\rightarrow$ being \index{confluence}confluent, meaning any two \index{positive rewrite path}positive rewrite paths diverging from a common source can be extended to meet each other. This is difficult to verify directly, but a 1941 paper by Max~Newman~\cite{newman} shows there is a simpler equivalent condition when the reduction relation is \index{terminating reduction relation}terminating. \begin{definition} A reduction relation $\rightarrow$ is \IndexDefinition{local confluence}\emph{locally confluent}, if whenever $s_1$ and $s_2$ are two positive rewrite steps with $\Src(s_1)=\Src(s_2)$, there exists a term $z$ such that $\Dst(s_1)\rightarrow z$ and $\Dst(s_2)\rightarrow z$. 
\end{definition} @@ -40,7 +40,7 @@ \chapter{Completion}\label{completion} &v_1xv_2 \end{tikzcd} \] -We can also visualize an orthogonal critical pair using the ``pictorial'' notation for rewrite steps we devised in Section~\ref{rewrite graph}: +We can also visualize an orthogonal critical pair using the ``pictorial'' notation for rewrite steps we devised in \SecRef{rewrite graph}: \[ \begin{array}{cc} \text{$s_1$ first:}& @@ -266,16 +266,16 @@ \chapter{Completion}\label{completion} As input, takes terms $t_1$ and $t_2$, and a rewrite path $p$ with $\Src(p)=t_1$ and $\Dst(p)=t_2$. Records a rewrite loop, and possibly adds a new rule, returning true if a rule was added. \begin{enumerate} \item (Fast path) If $t_1=t_2$, $p$ is already a loop; record it and return false. -\item (Left) Apply Algorithm~\ref{term reduction trie algo} to $t_1$, to reduce $t_1\rightarrow t_1^\prime$ with rewrite path $p_1$. -\item (Right) Apply Algorithm~\ref{term reduction trie algo} to $t_2$, to reduce $t_2\rightarrow t_2^\prime$ with rewrite path $p_2$. -\item (Compare) Use Algorithm~\ref{rqm reduction order} to compare $t_1^\prime$ with $t_2^\prime$. +\item (Left) Apply \AlgRef{term reduction trie algo} to $t_1$, to reduce $t_1\rightarrow t_1^\prime$ with rewrite path $p_1$. +\item (Right) Apply \AlgRef{term reduction trie algo} to $t_2$, to reduce $t_2\rightarrow t_2^\prime$ with rewrite path $p_2$. +\item (Compare) Use \AlgRef{rqm reduction order} to compare $t_1^\prime$ with $t_2^\prime$. \item (Trivial) \index{trivial critical pair}If $t_1^\prime=t_2^\prime$, record a loop $p_1^{-1}\circ p\circ p_2$ with basepoint $t_1^\prime=t_2^\prime$, and return false. \item (Smaller) If $t_2^\prime>>}. This is an absolute limit, so we will arbitrarily reject user-written requirements with deeply-nested concrete types. The default value is 30. 
+\item \IndexFlag{requirement-machine-max-concrete-nesting} \texttt{-requirement-machine-max-concrete-nesting} controls the maximum nesting of concrete types, to prevent \index{substitution simplification}substitution simplification from constructing an infinite type like \texttt{G>>}. As with the limit on rule length, we add the maximum nesting depth of user-written rules to get the actual limit. The default value is 30. \end{itemize} -In a runaway critical pairs scenario, it can take several seconds for completion to reach the rule count limit. The rule length limit enables earlier detection of situations where completion has clearly gone off the rails. The rule length limit being relative instead of just a total ban on terms of length 12 allows various pathological cases to succeed which would otherwise be needlessly rejected. We can type check a protocol representing the monoid $\mathbb{Z}/14\mathbb{Z}$ without fear: +While the first of the three is sufficient to detect non-termination, it takes a second or two for completion to record that many rules. The other two limits improve user experience in this case by rejecting clearly invalid programs sooner. The rule length limit being relative instead of just a total ban on terms of length 12 allows various pathological cases to succeed which would otherwise be needlessly rejected. + +The following protocol, for example, represents the monoid $\mathbb{Z}/14\mathbb{Z}$ and defines a rule of length 14, so the absolute rule length limit is really $14+12=26$. Completion does not add any longer rules so we accept it without issues: \begin{Verbatim} protocol Z14 { associatedtype A: Z14 @@ -410,8 +412,6 @@ \chapter{Completion}\label{completion} } \end{Verbatim} -A future improvement would be to change the concrete nesting limit to also be relative to the complexity of user-written requirements. 
There is no technical reason not to support deeply-nested concrete types here, it is only needed to catch runaway substitution simplification. - If completion fails when building a rewrite system for \index{requirement minimization}minimization, we have a source location associated with some protocol or generic declaration. An error is diagnosed at this source location, and we proceed with minimization producing an empty list of requirements. If completion fails on a rewrite system built from an existing generic signature or \index{protocol component}protocol component, there is no source location we can use for diagnostics; the compiler dumps the entire rewrite system and aborts with a fatal error. The latter scenario is unusual; if we successfully constructed a generic signature from user-written requirements, we should be able to build a rewrite system for it again. \paragraph{Debugging flags} @@ -463,7 +463,7 @@ \section{Rule Simplification}\label{rule reduction} In our implementation, we don't \emph{actually} delete rules, because we use the index of each rule as a stable reference elsewhere; instead, we set a pair of rule flags, \index{left-simplified rule}\textbf{left-simplified} and \index{right-simplified rule}\textbf{right-simplified}, and delete the rule from the \index{rule trie}\index{trie}rule trie. We've seen these flags mentioned already, so now we reveal their purpose. This will motivate the subsequent theory, setting the stage for the remaining two sections of this chapter. -\paragraph{Left simplification.} If the left-hand side of a rewrite rule $u_1\Rightarrow v_1$ can be reduced by another rewrite rule $u_2\Rightarrow v_2$, then $u_1=xu_2z$ for some $x$, $z\in A^*$, so we have an \index{overlapping rules}overlap of the first kind in the sense of Definition~\ref{overlappingrules}. 
Once we resolve all critical pairs, we don't need the first rule at all; we know that in a convergent rewrite system, both ways of reducing the overlap term $u_1:=xu_2z$ produce the same result: +\paragraph{Left simplification.} If the left-hand side of a rewrite rule $u_1\Rightarrow v_1$ can be reduced by another rewrite rule $u_2\Rightarrow v_2$, then $u_1=xu_2z$ for some $x$, $z\in A^*$, so we have an \index{overlapping rules}overlap of the first kind in the sense of \DefRef{overlappingrules}. Once we resolve all critical pairs, we don't need the first rule at all; we know that in a convergent rewrite system, both ways of reducing the overlap term $u_1:=xu_2z$ produce the same result: \[ \begin{tikzcd} &u_1 @@ -513,7 +513,7 @@ \section{Rule Simplification}\label{rule reduction} \begin{enumerate} \item (Initialize) Let \texttt{N} be the number of local rules in our rewrite system, and set $i:=0$. \item (Check) If $i=\texttt{N}$, return. Otherwise, let $u\Rightarrow v$ be the $i$th local rule. -\item (Reduce) Apply Algorithm~\ref{term reduction trie algo} to $v$ to get a rewrite path $p_v$. If $p_v$ is the \index{empty rewrite path}empty rewrite path $1_{v}$, the right-hand side $v$ is already reduced, so go to Step~7. +\item (Reduce) Apply \AlgRef{term reduction trie algo} to $v$ to get a rewrite path $p_v$. If $p_v$ is the \index{empty rewrite path}empty rewrite path $1_{v}$, the right-hand side $v$ is already reduced, so go to Step~7. \item (Record) Let $v^\prime=\Dst(p_v)$. Add a new rewrite rule $u\Rightarrow v^\prime$ to the list of local rules, and insert it into the rule trie with the key $u$, replacing the old rule $u\Rightarrow v$. \item (Relate) Add the rewrite loop $(u\Rightarrow v)\circ p\circ(v^\prime\Rightarrow u)$ with basepoint $u$, relating the old rule $u\Rightarrow v$ with the new rule $u\Rightarrow v^\prime$. \item (Mark) Mark the old rule as \textbf{right-simplified}. 
@@ -521,21 +521,21 @@ \section{Rule Simplification}\label{rule reduction} \end{enumerate} \end{algorithm} -Our justification for the validity of these passes worked from the assumption that we had a convergent rewrite system; that is, that completion had already been performed. In practice, Algorithm~\ref{knuthbendix} repeatedly runs both passes during completion, once per round of \index{critical pair}critical pair resolution. This is advantageous, because we can subsequently avoid considering overlaps that involve simplified rules. This strategy remains sound as long as we perform left simplification after computing critical pairs, but \emph{before} resolving them, which might add new rules. This narrows the candidates for left simplification to those rules whose overlaps have already been considered. As for the right simplification pass, it is actually fine to run it at any point; we choose to run it after \index{resolving critical pair}resolving critical pairs. +Our justification for the validity of these passes worked from the assumption that we had a convergent rewrite system; that is, that completion had already been performed. In practice, \AlgRef{knuthbendix} repeatedly runs both passes during completion, once per round of \index{critical pair}critical pair resolution. This is advantageous, because we can subsequently avoid considering overlaps that involve simplified rules. This strategy remains sound as long as we perform left simplification after computing critical pairs, but \emph{before} resolving them, which might add new rules. This narrows the candidates for left simplification to those rules whose overlaps have already been considered. As for the right simplification pass, it is actually fine to run it at any point; we choose to run it after \index{resolving critical pair}resolving critical pairs. 
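As a companion to the two passes just described, here is a small Python sketch on a toy string rewrite system, with rules represented as \texttt{(lhs, rhs)} string pairs. This is an illustration only: the real implementation marks rules with the \textbf{left-simplified} and \textbf{right-simplified} flags rather than deleting them, and it records rewrite loops relating old and new rules, both of which are omitted here:

```python
# Toy model of the two simplification passes. Right simplification rewrites
# each rule's right-hand side using the rule set; left simplification drops
# any rule whose left-hand side another rule can reduce. Illustration only.

def rewrite_once(term, rules, skip=None):
    # Apply the first rule (other than rules[skip]) whose lhs occurs in term.
    for j, (lhs, rhs) in enumerate(rules):
        if j == skip:
            continue
        i = term.find(lhs)
        if i >= 0:
            return term[:i] + rhs + term[i + len(lhs):]
    return None

def reduce(term, rules):
    while (t := rewrite_once(term, rules)) is not None:
        term = t
    return term

def simplify(rules):
    # Right simplification: fully reduce every right-hand side.
    rules = [(lhs, reduce(rhs, rules)) for lhs, rhs in rules]
    # Left simplification: a rule whose lhs another rule reduces is redundant
    # once its critical pairs have been resolved.
    return [(lhs, rhs) for j, (lhs, rhs) in enumerate(rules)
            if rewrite_once(lhs, rules, skip=j) is None]

# 'aaa => a' is left-simplified away by 'aa => e', and the right-hand side
# of 'b => aa' is right-simplified to the empty string.
print(simplify([("aa", ""), ("aaa", "a"), ("b", "aa")]))
# [('aa', ''), ('b', '')]
```

The sketch also shows why the order of operations matters less for the right pass: reducing a right-hand side never removes a rule, whereas the left pass discards rules outright and so must wait until their overlaps have been considered.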
\paragraph{Related concepts.} -We previously saw in Section~\ref{minimal requirements} that the same-type requirements in a generic signature are subject to similar conditions of being left-reduced and right-reduced. There is a connection here, because as we will see in Section~\ref{requirement builder}, the minimal requirements of a generic signature are ultimately constructed from the rules of a reduced rewrite system. However, there are a few notational differences: +We previously saw in \SecRef{minimal requirements} that the same-type requirements in a generic signature are subject to similar conditions of being left-reduced and right-reduced. There is a connection here, because as we will see in \SecRef{requirement builder}, the minimal requirements of a generic signature are ultimately constructed from the rules of a reduced rewrite system. However, there are a few notational differences: \begin{itemize} -\item The roles of ``left'' and ``right'' are reversed because requirements use a different convention; in a reduced same-type requirement $\FormalReq{U == V}$, we have $\texttt{U} < \texttt{V}$, whereas in a rewrite rule $u\Rightarrow v$ we have $v<u$. @@ -55,12 +55,12 @@ \chapter{Conformance Paths}\label{conformance paths} \SubstConf{T}{String}{Collection} } \] -The substitution map $\Sigma$ satisfies the requirements of our generic signature. To see why, we apply $\Sigma$ to both sides of the requirement $\FormalReq{U == T.[Collection]SubSequence}$: +The substitution map $\Sigma$ satisfies the requirements of our generic signature. To see why, we apply $\Sigma$ to both sides of the requirement $\SameReq{U}{T.[Collection]SubSequence}$: \begin{gather*} \texttt{U} \otimes \Sigma\\ \texttt{T.[Collection]SubSequence} \otimes \Sigma \end{gather*} -The first substituted type is the replacement type for \texttt{U} in $\Sigma$, which is \texttt{Substring}.
The second substituted type can be computed with what we learned about dependent member type substitution from Section~\ref{abstract conformances}. Recalling that a dependent member type is a \index{type witness}type witness of an \index{abstract conformance}abstract conformance, we proceed as follows: +The first substituted type is the replacement type for \texttt{U} in $\Sigma$, which is \texttt{Substring}. The second substituted type can be computed with what we learned about dependent member type substitution from \SecRef{abstract conformances}. Recalling that a dependent member type is a \index{type witness}type witness of an \index{abstract conformance}abstract conformance, we proceed as follows: \begin{gather*} \texttt{T.[Collection]SubSequence} \otimes \Sigma\\ \qquad {} = \bigl(\AssocType{[Collection]SubSequence} \otimes \ConfReq{T}{Collection}\bigr) \otimes \Sigma\\ @@ -80,14 +80,14 @@ \chapter{Conformance Paths}\label{conformance paths} \[\texttt{U.[Sequence]Iterator} \otimes \Sigma = \AssocType{[Sequence]Iterator} \otimes (\ConfReq{U}{Sequence} \otimes \Sigma)\] However, $\ConfReq{U}{Sequence}$ does not appear in our generic signature, thus the corresponding conformance is not directly stored inside $\Sigma$. To understand what happens next, we need some terminology. -\paragraph{Root conformances.} The \IndexDefinition{root conformance}\emph{root conformances} of a substitution map are those directly stored in the substitution map. The \IndexDefinition{root abstract conformance}\emph{root abstract conformances} of a generic signature are those corresponding to the explicit conformance requirements of the generic signature. These two concepts are closely related via the \index{identity substitution map}identity substitution map $1_G$: the root conformances of $1_G$ are the root abstract conformances of its input generic signature $G$. 
In our example, $\ConfReq{T}{Collection}$ is a root abstract conformance of our generic signature, but $\ConfReq{U}{Sequence}$ is not, because the latter represents a derived conformance requirement, not explicitly stated in the generic signature (we will present a derivation later). +\paragraph{Root conformances.} The \IndexDefinition{root conformance}\emph{root conformances} of a \index{substitution map}substitution map are those directly stored in the substitution map. The \IndexDefinition{root abstract conformance}\emph{root abstract conformances} of a generic signature are those corresponding to the explicit conformance requirements of the generic signature. These two concepts are closely related via the \index{identity substitution map}identity substitution map $1_G$: the root conformances of $1_G$ are the root abstract conformances of its \index{input generic signature}input generic signature $G$. In our example, $\ConfReq{T}{Collection}$ is a root abstract conformance of our generic signature, but $\ConfReq{U}{Sequence}$ is not, because the latter represents a derived conformance requirement, not explicitly stated in the generic signature (we will present a derivation later). The general case of applying a substitution map to an abstract conformance is called \emph{local conformance lookup}. Without knowing anything about local conformance lookup, we can already deduce the result of $\ConfReq{U}{Sequence}\otimes \Sigma$ only using concepts introduced previously. We know the abstract conformance $\ConfReq{U}{Sequence}$ can be obtained by the \emph{global} conformance lookup $\protosym{Sequence}\otimes\texttt{U}$. We also know that $\texttt{U}\otimes \Sigma=\texttt{Substring}$. 
Therefore, from the \index{associative operation}associativity of ``$\otimes$'', we see that: \begin{gather*} \ConfReq{U}{Sequence}\otimes \Sigma\\ -\qquad {} = \bigl(\Proto{Sequence}\otimes\texttt{U}\bigr)\otimes \Sigma\\ -\qquad {} = \Proto{Sequence}\otimes\bigl(\texttt{U}\otimes \Sigma\bigr)\\ -\qquad {} = \Proto{Sequence}\otimes\texttt{Substring}\\ +\qquad {} = \bigl(\protosym{Sequence}\otimes\texttt{U}\bigr)\otimes \Sigma\\ +\qquad {} = \protosym{Sequence}\otimes\bigl(\texttt{U}\otimes \Sigma\bigr)\\ +\qquad {} = \protosym{Sequence}\otimes\texttt{Substring}\\ \qquad {} = \ConfReq{Substring}{Sequence}. \end{gather*} So, to be consistent with the rest of our theory, local conformance lookup must output the conformance $\ConfReq{Substring}{Sequence}$, given $\ConfReq{U}{Sequence}$ and $\Sigma$. As an operation on substitution maps, local conformance lookup is also limited in what it can do. Starting from one of the root conformances in a substitution map, the only way to derive new conformances is by associated conformance projection, possibly repeated multiple times. @@ -134,7 +134,7 @@ \chapter{Conformance Paths}\label{conformance paths} \end{quote} \end{figure} -\paragraph{Conformance paths.} Our example is based on a handful of concrete conformances, simplified from their real definitions in the Swift standard library. Figure~\ref{associated conformance examples} lists the type witnesses and \index{associated conformance}associated conformances of each one: +\paragraph{Conformance paths.} Our example is based on a handful of concrete conformances, simplified from their real definitions in the Swift standard library.
\FigRef{associated conformance examples} lists the type witnesses and \index{associated conformance}associated conformances of each one: \begin{gather*} \ConfReq{String}{Collection}\\ \ConfReq{Substring}{Collection}\\ @@ -182,13 +182,13 @@ \chapter{Conformance Paths}\label{conformance paths} First, let's assume we already have the means to obtain a conformance path for an abstract conformance. A conformance path only depends on the generic signature, and not the contents of the substitution map. Local conformance lookup \index{conformance path evaluation}\emph{evaluates} this conformance path with the given substitution map $\Sigma$. This evaluation operation can be understood with our type substitution algebra; this justifies our right-to-left notation: \[s_n\otimes\bigl(\cdots\otimes \bigl(s_1 \otimes (s_0 \otimes \Sigma)\bigr) \cdots \bigr)\] -We now show the algorithms for local conformance lookup and dependent member type substitution. In the next section, we will build up some more theory in anticipation of revealing Algorithm~\ref{find conformance path algorithm} for actually finding conformance paths. +We now show the algorithms for local conformance lookup and dependent member type substitution. In the next section, we will build up some more theory in anticipation of revealing \AlgRef{find conformance path algorithm} for actually finding conformance paths. \begin{algorithm}[Local conformance lookup]\label{local conformance lookup algorithm} As input, takes a substitution map $\Sigma$, and an abstract conformance $\ConfReq{T}{P}$. \begin{enumerate} \item Let \texttt{C} be an invalid conformance. This will be the return value. -\item Find a conformance path $s_n\otimes \cdots \otimes s_1\otimes s_0$ for $\ConfReq{T}{P}$ using Algorithm~\ref{find conformance path algorithm}. +\item Find a conformance path $s_n\otimes \cdots \otimes s_1\otimes s_0$ for $\ConfReq{T}{P}$ using \AlgRef{find conformance path algorithm}. \item (Initialize) Let $i := 1$. 
\item (Root) Suppose $s_0$ is the root abstract conformance $\ConfReq{$\texttt{T}_0$}{$\texttt{P}_0$}$. Project the root conformance corresponding to $\ConfReq{$\texttt{T}_0$}{$\texttt{P}_0$}$ from $\Sigma$, and assign the result to \texttt{C}. \item (Check) If $i=n+1$, return \texttt{C}. @@ -197,14 +197,14 @@ \chapter{Conformance Paths}\label{conformance paths} \end{enumerate} \end{algorithm} -In Step~6, \texttt{C} must be a conformance to $\texttt{P}_{i-1}$ for the projection to make sense; also, we expect that on the last iteration, $\texttt{P}_n$ is equal to $\texttt{P}$, the conformed protocol of the original abstract conformance. This gives us a validity condition on conformance paths, which we will explore in the next section. For now, we again assume that Algorithm~\ref{find conformance path algorithm} gives us such a path. +In Step~6, \texttt{C} must be a conformance to $\texttt{P}_{i-1}$ for the projection to make sense; also, we expect that on the last iteration, $\texttt{P}_n$ is equal to $\texttt{P}$, the conformed protocol of the original abstract conformance. This gives us a validity condition on conformance paths, which we will explore in the next section. For now, we again assume that \AlgRef{find conformance path algorithm} gives us such a path. \begin{algorithm}[Dependent member type substitution]\label{dependent member type substitution} -As input, takes a \index{dependent member type}dependent member type \texttt{T.[P]A} and a substitution map $\Sigma$. The dependent member type is understood to be a valid type parameter in the substitution map's input generic signature. Outputs the substituted type $\texttt{T.[P]A}\otimes\Sigma$. +As input, takes a \index{dependent member type}dependent member type \texttt{T.[P]A} and a substitution map $\Sigma$. The dependent member type is understood to be a \index{valid type parameter}valid type parameter in the substitution map's \index{input generic signature}input generic signature. 
Outputs the substituted type $\texttt{T.[P]A}\otimes\Sigma$. \begin{enumerate} \item Let \texttt{A} be the associated type declaration referenced by the dependent member type. (This algorithm does not support \index{unbound dependent member type}unbound dependent member types.) \item Let \texttt{P} be the protocol containing this associated type declaration. \item Let \texttt{T} be the base type parameter of the dependent member type. (This could be another dependent member type, or a generic parameter.) -\item (Lookup) Construct the abstract conformance $\ConfReq{T}{P}$ and invoke Algorithm~\ref{local conformance lookup algorithm} to perform the local conformance lookup $\ConfReq{T}{P}\otimes\Sigma$. +\item (Lookup) Construct the abstract conformance $\ConfReq{T}{P}$ and invoke \AlgRef{local conformance lookup algorithm} to perform the local conformance lookup $\ConfReq{T}{P}\otimes\Sigma$. \item (Project) Apply the type witness projection $\AssocType{[P]A}$ to this conformance and return the result. \end{enumerate} \end{algorithm} @@ -220,7 +220,7 @@ \section{Validity and Existence}\label{conformance paths exist} We claimed this conformance path represents $\ConfReq{U}{Sequence}$, but simplifying it in the above manner gives us $\ConfReq{T.SubSequence}{Sequence}$, which is not identical. We can justify this by noting that both conformances have the same conformed protocol, and the two subject types \texttt{T.SubSequence} and \texttt{U} belong to the same equivalence class, due to the explicit same-type requirement of our generic signature. -Thus, just like the reduced type equality relation on type parameters (Section~\ref{typeparams}), we can define an \index{equivalence relation}equivalence relation on abstract conformances. We say that two abstract conformances are equivalent if they name the same protocol, and their subject types are \index{reduced type equality}equivalent. 
A \IndexDefinition{reduced abstract conformance}\emph{reduced abstract conformance} is then one whose subject type is a reduced type parameter. Every equivalence class of abstract conformances contains a unique reduced abstract conformance, and two abstract conformances are equivalent if their reduced abstract conformances are identical. +Thus, just like the \index{reduced type equality}reduced type equality relation on type parameters (\SecRef{type params}), we can define an \index{equivalence relation}equivalence relation on abstract conformances. We say that two abstract conformances are equivalent if they name the same protocol, and their subject types are \index{reduced type equality}equivalent. A \IndexDefinition{reduced abstract conformance}\emph{reduced abstract conformance} is then one whose subject type is a reduced type parameter. Every equivalence class of abstract conformances contains a unique reduced abstract conformance, and two abstract conformances are equivalent if their reduced abstract conformances are identical. In our example, $\ConfReq{U}{Sequence}$ and $\ConfReq{T.SubSequence}{Sequence}$ are two abstract conformances that belong to the same equivalence class. The former is reduced, while the latter is not (and the former is the reduced abstract conformance of the latter). @@ -230,7 +230,7 @@ \section{Validity and Existence}\label{conformance paths exist} \item Does every valid abstract conformance have at least one conformance path? \item How do we find a conformance path for a valid abstract conformance? \end{enumerate} -We will answer (1) and (2) first, and present an algorithm for (3) in the next section. We begin with an algorithm for directly evaluating a conformance path to an abstract conformance, without a substitution map. This is called \emph{simplifying} a conformance path. Studying the preconditions of this algorithm leads to a notion of validity for conformance paths. 
This algorithm is also later used by Algorithm~\ref{find conformance path algorithm}. +We will answer (1) and (2) first, and present an algorithm for (3) in the next section. We begin with an algorithm for directly evaluating a conformance path to an abstract conformance, without a substitution map. This is called \emph{simplifying} a conformance path. Studying the preconditions of this algorithm leads to a notion of validity for conformance paths. This algorithm is also later used by \AlgRef{find conformance path algorithm}. \begin{algorithm}[Simplifying a conformance path]\label{invertconformancepath} Takes a generic signature and a conformance path $s_n\otimes\cdots\otimes s_1\otimes s_0$. Outputs an abstract conformance. This conformance will have the same conformed protocol as the last step of the conformance path. \begin{enumerate} @@ -248,28 +248,19 @@ \section{Validity and Existence}\label{conformance paths exist} \item Every subsequent step $s_i$ is an associated conformance projection defined in the protocol $\texttt{P}_{i-1}$. \end{itemize} -Recall that an \index{valid abstract conformance}abstract conformance $\ConfReq{T}{P}$ is valid in a generic signature $G$ if the conformance requirement $\ConfReq{T}{P}$ can be \index{derived requirement}derived in $G$. To show that a valid conformance path simplifies to a valid abstract conformance, we construct a special kind of derivation, called a \IndexDefinition{primitive derivation}\emph{primitive derivation}, from the conformance path. A primitive derivation can only contain three kinds of derivation steps: -\begin{enumerate} -\item \IndexStep{GenSig}\textsc{GenSig} steps, deriving explicit conformance requirements: $\vdash \ConfReq{T}{P}$. -\item \IndexStep{ReqSig}\textsc{ReqSig} steps, deriving \index{associated conformance requirement}associated conformance requirements: $\vdash \ConfReq{Self.U}{Q}$. 
-\item \IndexStep{Conf}\textsc{Conf} steps, deriving conformance requirements from (1), (2) and prior \textsc{Conf} steps: $\ConfReq{T}{P},\,\ConfReq{Self.U}{Q}\vdash\ConfReq{T.U}{Q}$. -\end{enumerate} -The first step of the conformance path becomes a \textsc{GenSig} derivation step (in fact, this is where we rely the assumption that the conformance path begins with a \index{root abstract conformance}\emph{root} abstract conformance, and not an arbitrary abstract conformance, for otherwise we could not make use of the \textsc{GenSig} derivation step): +Recall that an \index{valid abstract conformance}abstract conformance $\ConfReq{T}{P}$ is valid in a generic signature $G$ if the conformance requirement $\ConfReq{T}{P}$ can be \index{derived requirement}derived in $G$. To show that a valid conformance path simplifies to a valid abstract conformance, we construct a special kind of derivation, called a \IndexDefinition{primitive derivation}\emph{primitive derivation}, from the conformance path. + +The first step of the conformance path translates to a \textsc{Conf} derivation step (recall that a conformance path begins with a \index{root abstract conformance}\emph{root} abstract conformance, and not an arbitrary abstract conformance, for otherwise we could not make use of the \textsc{Conf} derivation step): \begin{gather*} \vdash\ConfReq{$\texttt{T}_0$}{$\texttt{P}_0$}\tag{1} \end{gather*} -If the conformance path has length 1, we're done; we have our primitive derivation. Otherwise, say the second step in the conformance path is $\AssocConf{Self.[$\texttt{P}_0$]$\texttt{A}_1$}{$\texttt{P}_1$}$. We extend the primitive derivation with two additional derivation steps: +If the conformance path has length 1, we're done; we have our primitive derivation. 
Otherwise, for each remaining step in the conformance path, we add an \textsc{AssocConf} step for the corresponding associated conformance requirement:
\begin{gather*}
-\vdash\ConfReq{Self.[$\texttt{P}_0$]$\texttt{A}_1$}{$\texttt{P}_1$}_{\texttt{P}_1}\tag{2}\\
-(1),\,(2)\vdash\ConfReq{$\texttt{T}_0$.$\texttt{A}_1$}{$\texttt{P}_1$}\tag{3}
-\end{gather*}
-We can repeat this process for each remaining step in the conformance path. If the $i$th element of the conformance path is $\AssocConf{Self.[$\texttt{P}_{i-1}$]$\texttt{A}_i$}{$\texttt{P}_i$}$, we first construct a primitive derivation for the conformance path $s_{i-1}\otimes\cdots\otimes s_0$. This primitive derivation will have $2i-1$ derivation steps. Then, we introduce this associated conformance requirement with a \textsc{ReqSig} derivation step, and combine it with the derivation thus far using a \textsc{Conf} derivation step:
-\begin{gather*}
-\ldots\vdash\ConfReq{$\texttt{T}_0$.$\texttt{A}_1$...$\texttt{A}_{i-1}$}{$\texttt{P}_{i-1}$}\tag{$2i-1$}\\
-\vdash\ConfReq{Self.$\texttt{A}_i$}{$\texttt{P}_i$}_{\texttt{P}_{i-1}}\tag{$2i$}\\
-(2i-1),\,(2i)\vdash\ConfReq{$\texttt{T}_0$.$\texttt{A}_1$...$\texttt{A}_{i-1}$.$\texttt{A}_i$}{$\texttt{P}_i$}\tag{$2i+1$}
+\ldots\\
+(n),\,\ConfReq{Self.$\texttt{A}_{n+1}$}{$\texttt{P}_{n+1}$}_{\texttt{P}_n}\vdash\ConfReq{$\texttt{T}_0$.$\texttt{A}_1$...$\texttt{A}_{n}$.$\texttt{A}_{n+1}$}{$\texttt{P}_{n+1}$}\tag{$n+1$}\\
+\ldots
\end{gather*}
+The derivation is well-formed at every step, and the subject type and conformed protocol of the final derived conformance requirement are the same as those of the abstract conformance output by \AlgRef{invertconformancepath}. This shows the aforesaid algorithm outputs a valid abstract conformance, as was claimed.

Now, recall our favorite conformance path:
\[\AssocConf{Self}{Sequence} \otimes \AssocConf{Self.SubSequence}{Collection} \otimes \ConfReq{T}{Collection}\]
@@ -282,74 +273,92 @@ \section{Validity and Existence}\label{conformance paths exist}
\vdash\ConfReq{Self}{Sequence}_\texttt{Collection}\tag{4}\\
(3),\,(4)\vdash\ConfReq{T.SubSequence}{Sequence}\tag{5}
\end{gather*}
-We saw that $\ConfReq{T.SubSequence}{Sequence}$ is equivalent to $\ConfReq{U}{Sequence}$ via the same-type requirement $\FormalReq{U == T.SubSequence}$. This means we can extend the above primitive derivation with a \IndexStep{Same}\textsc{Same} derivation step to get a derivation for $\ConfReq{U}{Sequence}$:
+We saw that $\ConfReq{T.SubSequence}{Sequence}$ is equivalent to $\ConfReq{U}{Sequence}$ via the same-type requirement $\SameReq{U}{T.SubSequence}$. This means we can extend the above primitive derivation with a \IndexStep{SameConf}\textsc{SameConf} derivation step to get a derivation for $\ConfReq{U}{Sequence}$:
\begin{gather*}
-\vdash\FormalReq{U == T.SubSequence}\tag{6}\\
+\vdash\SameReq{U}{T.SubSequence}\tag{6}\\
(5),\,(6)\vdash\ConfReq{U}{Sequence}\tag{7}
\end{gather*}

\paragraph{Existence.} The above procedure always works in general. If we receive a conformance path for $\ConfReq{T}{P}$, and the conformance path simplifies to $\ConfReq{$\texttt{T}^\prime$}{P}$ for a possibly different type parameter $\texttt{T}^\prime$,
-then we can construct a primitive derivation for $\ConfReq{$\texttt{T}^\prime$}{P}$. We also know that \texttt{T} and $\texttt{T}^\prime$ must be equivalent, so there exists a derivation for the same-type requirement $\FormalReq{T == $\texttt{T}^\prime$}$.
From these two derivations, we get a derivation for $\ConfReq{T}{P}$.
+then we can construct a primitive derivation for $\ConfReq{$\texttt{T}^\prime$}{P}$. We also know that \texttt{T} and $\texttt{T}^\prime$ must be equivalent, so there exists a derivation for the same-type requirement $\SameReq{T}{$\texttt{T}^\prime$}$. From these two derivations, we get a derivation for $\ConfReq{T}{P}$.
+
+Now, we will go in the other direction. First, observe that a primitive derivation defines a conformance path, as follows: the initial \textsc{Conf} derivation step becomes the root abstract conformance at the start of the path, and each subsequent \textsc{AssocConf} derivation step becomes an associated conformance projection. Then, the next theorem shows a derivation of a conformance requirement $\ConfReq{T}{P}$ always splits into two parts: a primitive derivation of $\ConfReq{$\texttt{T}^\prime$}{P}$, together with a same-type requirement $\SameReq{T}{$\texttt{T}^\prime$}$. Since a derived conformance requirement defines an abstract conformance, and a primitive derivation defines a conformance path, we can conclude that every abstract conformance has a conformance path.
+
+We now look at more ways of building new derivations from existing ones; we will use these results in the proof of \ThmRef{conformance paths theorem}. Recall that from $\SameReq{T}{U}$ and $\ConfReq{U}{P}$, we can derive the same-type requirement $\SameReq{T.[P]A}{U.[P]A}$ with a \textsc{SameDecl} step, where protocol \texttt{P} declares an associated type \texttt{A}. By iterated application of this step, the construction extends to \emph{any} valid type parameter \texttt{Self.V} in the protocol generic signature $G_\texttt{P}$, to obtain a derived requirement $\SameReq{T.V}{U.V}$.
+
+\begin{lemma}\label{general member type}
+Let $G$ be a well-formed generic signature, and suppose that $G\vDash\SameReq{T}{U}$ for type parameters \texttt{T} and \texttt{U}.
If $\texttt{T}^\prime$ is any valid type parameter having \texttt{T} as a prefix, then $G\vDash\SameReq{$\texttt{T}^\prime$}{$\texttt{U}^\prime$}$, where $\texttt{U}^\prime$ is the type parameter obtained by replacing \texttt{T} with \texttt{U} in $\texttt{T}^\prime$. +\end{lemma} +\begin{proof} +We proceed by \index{induction}induction on the \index{type parameter length}length of $\texttt{T}^\prime$, building the desired same-type requirement by repeated application of the \IndexStep{SameDecl}\textsc{SameDecl} or \IndexStep{SameName}\textsc{SameName} derivation step. + +\smallskip -Now, we will go in the other direction. First, observe that a primitive derivation defines a conformance path, as follows: the initial \textsc{GenSig} derivation step becomes the root abstract conformance at the start of the path, and each subsequent \textsc{ReqSig} derivation step becomes an associated conformance projection. Then, the next theorem shows a derivation of a conformance requirement $\ConfReq{T}{P}$ always splits into two parts: a primitive derivation $\ConfReq{$\texttt{T}^\prime$}{P}$, together with a same-type requirement $\FormalReq{T == $\texttt{T}^\prime$}$. Since a derived conformance requirement defines an abstract conformance, and a primitive derivation defines a conformance path, we can conclude that every abstract conformance has a conformance path. +\index{base case}\emph{Base case:} When $\texttt{T}^\prime$ has the same length as \texttt{T}, they must in fact be identical, since the latter is a prefix of the former, and thus $\texttt{U}^\prime$ is also the same as \texttt{U}. We have the necessary derivation $G\vDash\SameReq{T}{U}$ by assumption. + +\index{inductive step}\emph{Inductive step:} We write $\texttt{T}^\prime$ as a dependent member type \texttt{$\texttt{T}^{\prime\prime}$.[P]A} or \texttt{$\texttt{T}^{\prime\prime}$.A}, with base type $\texttt{T}^{\prime\prime}$ and associated type~\texttt{A}. 
We have $G\vDash\texttt{T}^\prime$ by assumption, and so $G\vDash\texttt{T}^{\prime\prime}$ by \PropRef{prefix prop}. By the inductive hypothesis, we have a derivation $G\vDash\SameReq{$\texttt{T}^{\prime\prime}$}{$\texttt{U}^{\prime\prime}$}$. + +Note that $G\vDash\texttt{T}^\prime$ also implies that $G\vDash\ConfReq{$\texttt{T}^{\prime\prime}$}{P}$. If $\texttt{T}^\prime$ is \index{bound dependent member type}bound, we apply a \textsc{SameDecl} step to $\SameReq{$\texttt{T}^{\prime\prime}$}{$\texttt{U}^{\prime\prime}$}$ and $\ConfReq{$\texttt{T}^{\prime\prime}$}{P}$: +\begin{gather*} +\ldots\vdash\ConfReq{$\texttt{T}^{\prime\prime}$}{P}\tag{1}\\ +\ldots\vdash\SameReq{$\texttt{T}^{\prime\prime}$}{$\texttt{U}^{\prime\prime}$}\tag{2}\\ +(1),\,(2)\vdash\SameReq{$\texttt{T}^{\prime\prime}$.[P]A}{$\texttt{U}^{\prime\prime}$.[P]A}\tag{3} +\end{gather*} +If $\texttt{T}^\prime$ is unbound, we just change the final step to a \textsc{SameName}: +\begin{gather*} +(1),\,(2)\vdash\SameReq{$\texttt{T}^{\prime\prime}$.A}{$\texttt{U}^{\prime\prime}$.A}\tag{3} +\end{gather*} +In both cases, we get the desired derivation $G\vDash\SameReq{$\texttt{T}^\prime$}{$\texttt{U}^\prime$}$. +\end{proof} \begin{theorem}\label{conformance paths theorem} Let $G$ be a valid generic signature, with \texttt{T} a type parameter and \texttt{P} some protocol. If $G\vDash\ConfReq{T}{P}$, then there exists at least one conformance path for $\ConfReq{T}{P}$. In other words, there exists a type parameter $\texttt{T}^\prime\in\TypeObj{G}$, such that: \begin{enumerate} \item $G\vDash\ConfReq{$\texttt{T}^\prime$}{P}$, via a primitive derivation. -\item $G\vDash\FormalReq{T == $\texttt{T}^\prime$}$. +\item $G\vDash\SameReq{T}{$\texttt{T}^\prime$}$. \end{enumerate} \end{theorem} \begin{proof} -The proof relies on a few results developed in Section~\ref{generic signature validity}. Let's call the two requisite derivations $D_1$ and $D_2$. 
We do a \index{structural induction}structural induction on the derivation for $\ConfReq{T}{P}$, building up $D_1$ and $D_2$ step by step. We only need to consider the derivation steps which produce new conformance requirements: +The proof relies on a few results developed in \SecRef{generic signature validity}. Let's call the two requisite derivations $D_1$ and $D_2$. We do a \index{structural induction}structural induction on the derivation for $\ConfReq{T}{P}$, building up $D_1$ and $D_2$ step by step. We only need to consider the derivation steps which produce new conformance requirements: \begin{enumerate} -\item \IndexStep{GenSig}\textsc{GenSig} steps: $\vdash\ConfReq{T}{P}$. -\item \IndexStep{Same}\textsc{Same} steps: $\FormalReq{T == U},\,\ConfReq{U}{P}\vdash\ConfReq{T}{P}$. -\item \IndexStep{Conf}\textsc{Conf} steps: $\ConfReq{U}{P},\,\ConfReq{Self.V}{Q}\vdash\ConfReq{U.V}{Q}$. +\item \IndexStep{Conf}\textsc{Conf} steps: $\vdash\ConfReq{T}{P}$. +\item \IndexStep{SameConf}\textsc{SameConf} steps: $\SameReq{T}{U},\,\ConfReq{U}{P}\vdash\ConfReq{T}{P}$. +\item \IndexStep{AssocConf}\textsc{AssocConf} steps: $\ConfReq{U}{P},\,\ConfReq{Self.V}{Q}\vdash\ConfReq{U.V}{Q}$. \end{enumerate} -\noindent \textbf{First case.} Notice that a \textsc{GenSig} step is the \index{base case}base case of our structural induction, because there are no assumptions on the left-hand side of $\vdash$. We set $\texttt{T}^\prime := \texttt{T}$. A derivation of a single explicit conformance requirement is already primitive by our definition, so $D_1$ is just: +\noindent \textbf{First case.} Notice that a \textsc{Conf} step is the \index{base case}base case of our structural induction, because there are no assumptions on the left-hand side of $\vdash$. We set $\texttt{T}^\prime := \texttt{T}$. 
A derivation of a single explicit conformance requirement is already primitive by our definition, so $D_1$ is just:
\begin{gather*}
\vdash\ConfReq{T}{P}\tag{1}
\end{gather*}
-To construct $D_2$, we recall that $G$ is valid, thus $G\vDash\texttt{T}$, since \texttt{T} appears in the explicit requirement $\ConfReq{T}{P}$. We then derive $\FormalReq{T == T}$ via an \IndexStep{Equiv}\textsc{Equiv} derivation step:
+To construct $D_2$, we recall that $G$ is well-formed, thus $G\vDash\texttt{T}$, since \texttt{T} appears in $\ConfReq{T}{P}$. We then derive $\SameReq{T}{T}$ via an \IndexStep{Ident}\textsc{Ident} derivation step:
\begin{gather*}
\ldots\vdash\texttt{T}\tag{1}\\
-(2)\vdash\FormalReq{T == T}\tag{2}
+(1)\vdash\SameReq{T}{T}\tag{2}
\end{gather*}

-\noindent \textbf{Second case.} We have a \textsc{Same} derivation step:
-\[\FormalReq{T == U},\,\ConfReq{U}{P}\vdash\ConfReq{T}{P}\]
-By the \index{inductive step}inductive hypothesis, we can assume the derivation of $\ConfReq{U}{P}$ has already been split up into two derivations: a primitive derivation $D_1^\prime$ of a conformance requirement $\ConfReq{$\texttt{U}^\prime$}{P}$, and a derivation $D_2^\prime$ of a same-type requirement $\FormalReq{U == $\texttt{U}^\prime$}$. We set $D_1:=D_1^\prime$. Then, we construct $D_2$ from $D_2^\prime$ by adding an \textsc{Equiv} derivation step:
+\noindent \textbf{Second case.} We have a \textsc{SameConf} derivation step:
+\[\SameReq{T}{U},\,\ConfReq{U}{P}\vdash\ConfReq{T}{P}\]
+By the \index{inductive step}inductive hypothesis, we can assume the derivation of $\ConfReq{U}{P}$ has already been split up into two derivations: a primitive derivation $D_1^\prime$ of $G\vDash\ConfReq{$\texttt{U}^\prime$}{P}$, and a derivation $D_2^\prime$ of $G\vDash\SameReq{U}{$\texttt{U}^\prime$}$. We set $D_1:=D_1^\prime$.
Then, we construct $D_2$ from $D_2^\prime$ by adding an \textsc{Equiv} derivation step:
\begin{gather*}
-\ldots\FormalReq{T == U}\tag{1}\\
-\ldots\FormalReq{U == $\texttt{U}^\prime$}\tag{2}\\
-(1),\,(2)\vdash\FormalReq{T == $\texttt{U}^\prime$}\tag{3}
+\ldots\SameReq{T}{U}\tag{1}\\
+\ldots\SameReq{U}{$\texttt{U}^\prime$}\tag{2}\\
+(1),\,(2)\vdash\SameReq{T}{$\texttt{U}^\prime$}\tag{3}
\end{gather*}

-\noindent \textbf{Third case.} The trickiest scenario is when we have a \textsc{Conf} derivation step:
+\noindent \textbf{Third case.} The trickiest scenario is when we have an \textsc{AssocConf} derivation step:
\[\ConfReq{U}{P},\,\ConfReq{Self.V}{Q}\vdash\ConfReq{U.V}{Q}\]
-By induction, we again assume the derivation of $\ConfReq{U}{P}$ has been split up into $D_1^\prime$ and $D_2^\prime$ as above. We construct $D_1$ from $D_1^\prime$ by adding a \textsc{Conf} derivation step:
+By induction, we again assume the derivation of $\ConfReq{U}{P}$ has been split up into $D_1^\prime$ and $D_2^\prime$ as above. We construct $D_1$ from $D_1^\prime$ by adding an \textsc{AssocConf} derivation step for the same associated conformance requirement:
\begin{gather*}
\ldots\ConfReq{$\texttt{U}^\prime$}{P}\tag{1}\\
-\vdash\ConfReq{Self.V}{Q}_\texttt{P}\tag{2}\\
-(1),\,(2)\vdash\ConfReq{$\texttt{U}^\prime$.V}{Q}\tag{3}
+(1),\,\ConfReq{Self.V}{Q}_\texttt{P}\vdash\ConfReq{$\texttt{U}^\prime$.V}{Q}\tag{2}
\end{gather*}
-This is a primitive derivation by construction. Now it remains to derive a same-type requirement $\FormalReq{U.V == $\texttt{U}^\prime$.V}$, giving us the desired derivation $D_2$. To do that, first note that since $G$ is valid and $G\vDash\ConfReq{T}{P}$, then $G_\texttt{P}$ is also valid by Proposition~\ref{protocol generic signature valid}.
-
-Then, we observe that the conditions of Proposition~\ref{general member type} are satisfied:
-\begin{itemize}
-\item The derivation $D_1$ gives us $G\vDash\ConfReq{$\texttt{U}^\prime$}{P}$.
-\item The derivation $D_2^\prime$ gives us $G\vDash\FormalReq{U == $\texttt{U}^\prime$}$.
-\item The validity of $G_\texttt{P}$ gives us $G_\texttt{P}\vDash\texttt{Self.V}$, since \texttt{Self.V} appears in the explicit requirement $\ConfReq{Self.V}{Q}$ of \texttt{P}.
-\end{itemize}
-Thus, $G\vDash\FormalReq{U.V == $\texttt{U}^\prime$.V}$. This is our new derivation $D_2$, completing the induction.
+This is a primitive derivation by construction. We derive the same-type requirement $G\vDash\SameReq{U.V}{$\texttt{U}^\prime$.V}$, giving us $D_2$. To do that, we note that $G$ is well-formed, so $G\vDash\texttt{U.V}$; applying \LemmaRef{general member type} to this and to the derivation $D_2^\prime$ of $G\vDash\SameReq{U}{$\texttt{U}^\prime$}$ yields the desired derivation, completing the induction.
\end{proof}

\section{The Conformance Path Graph}\label{finding conformance paths}

-Theorem~\ref{conformance paths theorem} gives us a way to construct a conformance path from a derivation of a conformance requirement, but this does not immediately lead to an effective algorithm, for two reasons. First, the derived requirements formalism is a purely theoretical tool; we don't actually directly build derivations in the implementation. Second, a conformance requirement may have multiple derivations and thus multiple conformance paths. We must be able to deterministically choose the ``best'' conformance path in some sense, since conformance paths are part of the \index{ABI}ABI in the form of symbol \index{mangling}mangling.
+\ThmRef{conformance paths theorem} gives us a way to construct a conformance path from a derivation of a conformance requirement, but this does not immediately lead to an effective algorithm, for two reasons. First, the derived requirements formalism is a purely theoretical tool; we don't actually directly build derivations in the implementation. Second, a conformance requirement may have multiple derivations and thus multiple conformance paths.
We must be able to deterministically choose the ``best'' conformance path in some sense, since conformance paths are part of the \index{ABI}ABI in the form of symbol \index{mangling}mangling. -To address these issues, we reformulate finding a conformance path as a graph theory problem. We construct a graph where the paths through the graph are the conformance paths of a generic signature. This graph might be infinite, but we can visit these paths in a certain order. We simplify each path to an abstract conformance with Algorithm~\ref{invertconformancepath}, and compare this with the abstract conformance whose conformance path we were asked to produce. If we have a match, we're done. Otherwise, we check the next path, and so on. Theorem~\ref{conformance paths theorem} now reveals its worth, because it ensures we must eventually find a conformance path which simplifies to our abstract conformance. Thus, our search must end after a finite number of steps. +To address these issues, we reformulate finding a conformance path as a graph theory problem. We construct a graph where the paths through the graph are the conformance paths of a generic signature. This graph might be infinite, but we can visit these paths in a certain order. We simplify each path to an abstract conformance with \AlgRef{invertconformancepath}, and compare this with the abstract conformance whose conformance path we were asked to produce. If we have a match, we're done. Otherwise, we check the next path, and so on. \ThmRef{conformance paths theorem} now reveals its worth, because it ensures we must eventually find a conformance path which simplifies to our abstract conformance. Thus, our search must end after a finite number of steps. 
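The search just described can be made concrete with a small model. The following Python sketch is a toy, not the compiler's implementation: it hard-codes the running example's generic signature (\texttt{T}:~\texttt{Collection}, \texttt{U == T.SubSequence}), and the protocol contents, root conformances, and reduced-type computation are all illustrative stand-ins for the real generic signature queries. Ties between equal-length paths are broken by insertion order rather than the requirement order.

```python
from collections import deque

# Toy model of the running example: <T, U> where T: Collection,
# U == T.SubSequence. Names are illustrative stand-ins, not the
# compiler's actual representation.
ROOTS = [("T", "Collection")]  # explicit conformance requirements of G
ASSOC_CONF = {  # associated conformance requirements of each protocol
    "Collection": [("Self", "Sequence"), ("Self.SubSequence", "Collection")],
    "Sequence": [("Self.Iterator", "IteratorProtocol")],
    "IteratorProtocol": [],
}

def reduce_type(t):
    # Hard-coded reduced types: U == T.SubSequence from the signature,
    # and Self.SubSequence.SubSequence == Self.SubSequence in Collection.
    while True:
        for old, new in (("T.SubSequence", "U"), ("U.SubSequence", "U")):
            if t == old or t.startswith(old + "."):
                t = new + t[len(old):]
                break
        else:
            return t

def apply_step(conf, step):
    # Associated conformance projection: applying [Self.V: Q] to [T: P]
    # yields [T.V: Q], with the subject replaced by its reduced type.
    subject, _proto = conf
    path_suffix, proto = step
    return (reduce_type(subject + path_suffix[len("Self"):]), proto)

def find_conformance_path(target):
    # Breadth-first enumeration of the conformance path graph: shorter
    # paths are visited first, so the first hit is a shortest path, and
    # already-seen (hence non-reduced) conformances are pruned.
    target = (reduce_type(target[0]), target[1])
    queue = deque((root, [root]) for root in ROOTS)
    seen = set()
    while queue:
        conf, path = queue.popleft()
        if conf == target:
            return path
        if conf in seen:
            continue
        seen.add(conf)
        for step in ASSOC_CONF[conf[1]]:
            queue.append((apply_step(conf, step), path + [step]))
    return None
```

Asking for a path for $\ConfReq{U}{Sequence}$ returns the three-step path from the text, and an equivalent query such as $\ConfReq{T.SubSequence}{Sequence}$ reduces to the same target, so it returns the same path.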
\begin{definition} The \IndexDefinition{conformance path graph}\emph{conformance path graph} of a generic signature $G$ is the \index{directed graph}directed graph defined as follows: @@ -360,7 +369,10 @@ \section{The Conformance Path Graph}\label{finding conformance paths} That is, if $\ConfReq{T}{P}$ is an abstract conformance and $\AssocConf{Self.V}{Q}$ is an \index{associated conformance requirement}associated conformance requirement of \texttt{P}, we form the abstract conformance $\AssocConf{Self.V}{Q}\otimes \ConfReq{T}{P}=\ConfReq{T.V}{Q}$. If \texttt{U} is the reduced type of \texttt{T.V}, there is an edge with \index{source vertex}source vertex $\ConfReq{T}{P}$ and \index{destination vertex}destination vertex $\ConfReq{U}{Q}$. \end{itemize} -Crucially, a conformance path is actually a path, in the graph theoretical sense, whose source vertex is a root abstract conformance. Theorem~\ref{conformance paths theorem} can be interpreted as a statement about the conformance path graph, namely that every vertex is reachable by a path from a root vertex. +\vfill +\eject + +Crucially, a conformance path is actually a path, in the graph theoretical sense, whose source vertex is a root abstract conformance. \ThmRef{conformance paths theorem} can be interpreted as a statement about the conformance path graph, namely that every vertex is reachable by a path from a root vertex. \end{definition} Let's construct the conformance path graph for our running example. 
We have a single conformance path of length~1:
@@ -379,7 +391,7 @@ \section{The Conformance Path Graph}\label{finding conformance paths}
p_{32} = \AssocConf{Self}{Sequence}\otimes p_{22}\\
p_{33} = \AssocConf{Self.SubSequence}{Collection}\otimes p_{22}
\end{gather*}
-We do not need to consider the successors of $p_{33}$, because $p_{22}$ and $p_{33}$ simplify to two equivalent abstract conformances, by the same-type requirement in the \texttt{Collection} protocol, $\FormalReq{Self.SubSequence == Sub.SubSequence.SubSequence}$:
+We do not need to consider the successors of $p_{33}$, because $p_{22}$ and $p_{33}$ simplify to two equivalent abstract conformances, by the same-type requirement in the \texttt{Collection} protocol, $\SameReq{Self.SubSequence}{Self.SubSequence.SubSequence}$:
\begin{gather*}
p_{22}=\ConfReq{T.SubSequence}{Collection}\\
p_{33}=\ConfReq{T.SubSequence.SubSequence}{Collection}
@@ -407,8 +419,8 @@ \section{The Conformance Path Graph}\label{finding conformance paths}
p_4&\Rightarrow&\ConfReq{T.SubSequence.Iterator}{IteratorProtocol}
\end{array}
\]
-Figure~\ref{conformance path graph example} shows the graph; notice how $\ConfReq{U}{Collection}$ has an edge looping back to itself.
-\begin{figure}\captionabove{Conformance path graph for Listing~\ref{conformance paths listing}}\label{conformance path graph example}
+\FigRef{conformance path graph example} shows the graph; notice how $\ConfReq{U}{Collection}$ has an edge looping back to itself.
+\begin{figure}\captionabove{Conformance path graph for \ListingRef{conformance paths listing}}\label{conformance path graph example}
\begin{center}
\begin{tikzpicture}[sibling distance=5cm, level distance=1.6cm,
edge from parent path={[->] (\tikzparentnode) .. controls +(0,-1) and +(0,1) ..
(\tikzchildnode.north)}] @@ -437,7 +449,7 @@ \section{The Conformance Path Graph}\label{finding conformance paths} \end{figure} \paragraph{D\'ej\`a vu.} -If this looks familiar, recall that we already studied another directed graph associated with a generic signature, the \index{type parameter graph}type parameter graph of Section~\ref{type parameter graph}. Both graphs are generated by a generic signature, and describe equivalence classes of the reduced type equality relation. It is instructive to compare the two: +If this looks familiar, recall that we already studied another directed graph associated with a generic signature, the \index{type parameter graph}type parameter graph of \SecRef{type parameter graph}. Both graphs are generated by a generic signature, and describe equivalence classes of the reduced type equality relation. It is instructive to compare the two: \begin{quote} \begin{tabular}{lll} \toprule @@ -460,15 +472,15 @@ \section{The Conformance Path Graph}\label{finding conformance paths} \item (Initialize) If $x$ and $y$ have the same length, we compare their elements. Let $i:=0$ and $\texttt{N}:=|x|$. \item (Equal) If $i=\texttt{N}$, we didn't find any differences, so $x=y$. Return ``$=$''. \item (Subscript) Let $x_i$, $y_i$ be the $i$th elements of $x$ and $y$, respectively. -\item (Compare) Treating $x_i$ and $y_i$ as requirements, compare them with \index{requirement order}Algorithm~\ref{requirement order}. Return the result if it is ``$<$'' or ``$>$''. +\item (Compare) Treating $x_i$ and $y_i$ as requirements, compare them with \index{requirement order}\AlgRef{requirement order}. Return the result if it is ``$<$'' or ``$>$''. \item (Next) Increment $i$ and go back to Step~4. \end{enumerate} \end{algorithm} -Note that this is a linear order, because in Step~6, Algorithm~\ref{requirement order} cannot return ``$\bot$'', since protocol conformance requirements are always linearly ordered with respect to each other. 
Much like the type parameter order of Algorithm~\ref{type parameter order}, this is a special case of a shortlex order, which we generalize in Section~\ref{rewritesystemintro}. +Note that this is a linear order, because in Step~6, \AlgRef{requirement order} cannot return ``$\bot$'', since protocol conformance requirements are always linearly ordered with respect to each other. Much like the type parameter order of \AlgRef{type parameter order}, this is a special case of a shortlex order, which we generalize in \SecRef{rewritesystemintro}. -\paragraph{Enumeration.} The conformance path graph from our previous example has a finite set of vertices, but this is not true in general. However, it is true that each vertex only has finitely many \index{successor}successors, as there can only be finitely many associated conformance requirements. (This is also true in the \index{type parameter graph}type parameter graph, where successors are given by associated \emph{type} declarations). A graph with this property is said to be \index{locally finite graph}\emph{locally finite}. A locally finite graph only has finitely many paths of any fixed length $n\in\mathbb{N}$ (this can be shown by induction on path length). This implies that the conformance path order is \index{well-founded order}well-founded (by the same argument as Proposition~\ref{well founded type order}). It follows that every equivalence class of conformance paths contains a reduced conformance path. Hence, enumerating conformance paths in increasing order performs a \index{breadth-first search}breadth-first search which visits every path, and always visits the reduced conformance paths first. +\paragraph{Enumeration.} The conformance path graph from our previous example has a finite set of vertices, but this is not true in general. However, it is true that each vertex only has finitely many \index{successor}successors, as there can only be finitely many associated conformance requirements. 
(This is also true in the \index{type parameter graph}type parameter graph, where successors are given by associated \emph{type} declarations). A graph with this property is said to be \index{locally finite graph}\emph{locally finite}. A locally finite graph only has finitely many paths of any fixed length $n\in\mathbb{N}$ (this can be shown by induction on path length). This implies that the conformance path order is \index{well-founded order}well-founded (by the same argument as \PropRef{well founded type order}). It follows that every equivalence class of conformance paths contains a reduced conformance path. Hence, enumerating conformance paths in increasing order performs a \index{breadth-first search}breadth-first search which visits every path, and always visits the reduced conformance paths first. -Theorem~\ref{conformance paths theorem} provides a necessary termination condition for our search. This is actually also sufficient; the only other potential source of non-termination is the reduced type computation using the \IndexDefinition{getReducedType()@\texttt{getReducedType()}}\texttt{getReducedType()} generic signature query, but the theory of rewrite systems will give us that guarantee in Section~\ref{rewritesystemintro}. Thus, we can always find a conformance path in a finite number of steps. This exhaustive enumeration is relatively inefficient, but we can improve upon it with two modifications: +\ThmRef{conformance paths theorem} provides a necessary termination condition for our search. This is actually also sufficient; the only other potential source of non-termination is the reduced type computation using the \Index{getReducedType()@\texttt{getReducedType()}}\texttt{getReducedType()} generic signature query, but the theory of rewrite systems will give us that guarantee in \SecRef{rewritesystemintro}. Thus, we can always find a conformance path in a finite number of steps. 
This exhaustive enumeration is relatively inefficient, but we can improve upon it with two modifications: \begin{enumerate} \item While searching for a conformance path, we visit all preceding conformance paths. If we cache the result of simplifying each conformance path, we can reuse these results if a subsequent lookup requests an earlier conformance path, and avoid restarting the search. \item If we encounter a conformance path equivalent to one we've already seen, the new conformance path must not be reduced, since we visit paths in increasing order. Thus its successors are not reduced either, and do not need to be considered. This happened in our example when we encountered $p_{33}$ after already visiting $p_{22}$. @@ -497,7 +509,7 @@ \section{The Conformance Path Graph}\label{finding conformance paths} \begin{enumerate} -\item (Simplify) Invoke Algorithm~\ref{invertconformancepath} to simplify $c$ to an abstract conformance. Denote this abstract conformance by $\ConfReq{$\texttt{T}_c$}{$\texttt{P}_c$}$. +\item (Simplify) Invoke \AlgRef{invertconformancepath} to simplify $c$ to an abstract conformance. Denote this abstract conformance by $\ConfReq{$\texttt{T}_c$}{$\texttt{P}_c$}$. \item (Reduce) Replace $\texttt{T}_c$ with its reduced type, to get a reduced abstract conformance. @@ -515,7 +527,7 @@ \section{The Conformance Path Graph}\label{finding conformance paths} \end{algorithm} \paragraph{Algorithmic complexity.} -Despite the memoization performed above, Algorithm~\ref{find conformance path algorithm} requires exponential time in the worst case. We can demonstrate this by constructing a generic signature with $2^{n-1}$ unique conformance paths of length $n$. Consider the protocol generic signature $G_\texttt{P}$ with the following protocol \texttt{P}: +Despite the memoization performed above, \AlgRef{find conformance path algorithm} requires exponential time in the worst case. 
We can demonstrate this by constructing a generic signature with $2^{n-1}$ unique conformance paths of length $n$. Consider the protocol generic signature $G_\texttt{P}$ with the following protocol \texttt{P}: \begin{Verbatim} protocol P { associatedtype A: P @@ -532,11 +544,11 @@ \section{The Conformance Path Graph}\label{finding conformance paths} } \end{Verbatim} -In Section~\ref{minimal conformances}, we will show the algorithm for finding a minimal set of conformance requirements in a generic signature. The algorithm is based on the idea that we can calculate a finite set of \emph{conformance equations} which completely describe the potentially-infinite conformance path graph. While some details would need to be worked out, it should be possible to one day construct conformance paths directly from conformance equations, instead of the current approach of exhaustive enumeration. +In \SecRef{minimal conformances}, we will show the algorithm for finding a minimal set of conformance requirements in a generic signature. The algorithm is based on the idea that we can calculate a finite set of \emph{conformance equations} which completely describe the potentially-infinite conformance path graph. While some details would need to be worked out, it should be possible to one day construct conformance paths directly from conformance equations, instead of the current approach of exhaustive enumeration. \section{Recursive Conformances}\label{recursive conformances} -We saw a generic signature with an infinite type parameter graph in Section~\ref{type parameter graph}, and the previous section mentioned the possibility of an infinite conformance path graph. Now, we will show that both graphs are infinite if either one is infinite, and then attempt to better understand generic signatures where this is the case. 
+We saw a generic signature with an infinite type parameter graph in \SecRef{type parameter graph}, and the previous section mentioned the possibility of an infinite conformance path graph. Now, we will show that both graphs are infinite if either one is infinite, and then attempt to better understand generic signatures where this is the case. \begin{proposition}\label{infinite signature lemma} For a \index{generic signature}generic signature $G$, the following are equivalent: \begin{enumerate} @@ -548,7 +560,7 @@ \section{Recursive Conformances}\label{recursive conformances} \begin{proof} For $(1)\Rightarrow(2)$, note that the set of generic parameter types is always finite, so it suffices to only consider reduced \index{dependent member type}dependent member types. Suppose we're given an infinite set of reduced dependent member types; we must produce an infinite set of abstract conformances. Each dependent member type \texttt{T.[P]A} is equivalent to an ordered pair consisting of a type witness projection $\AssocType{[P]A}$ and an abstract conformance $\ConfReq{T}{P}$; the first element of the pair is drawn from a finite set, so a counting argument shows that the mapping that takes the second element of each pair must give us an infinite set of abstract conformances. -Furthermore, these abstract conformances must be reduced, meaning their subject types are reduced. To see why, note that whenever \texttt{T.[P]A} is a reduced dependent member type, its base type \texttt{T} must be reduced as well (otherwise, if $G\vDash\FormalReq{$\texttt{T}^\prime$ == T}$ with $\texttt{T}^\prime<\texttt{T}$, we could construct from this a derivation of $\FormalReq{$\texttt{T}^\prime$.[P]A == T.[P]A}$ with $\texttt{$\texttt{T}^\prime$.[P]A} < \texttt{T.[P]A}$, contradicting the asumption that \texttt{T.[P]A} is reduced). +Furthermore, these abstract conformances must be reduced, meaning their subject types are reduced. 
To see why, note that whenever \texttt{T.[P]A} is a reduced dependent member type, its base type \texttt{T} must be reduced as well (otherwise, if $G\vDash\SameReq{$\texttt{T}^\prime$}{T}$ with $\texttt{T}^\prime<\texttt{T}$, we could construct from this a derivation of $\SameReq{$\texttt{T}^\prime$.[P]A}{T.[P]A}$ with $\texttt{$\texttt{T}^\prime$.[P]A} < \texttt{T.[P]A}$, contradicting the assumption that \texttt{T.[P]A} is reduced). A similar argument establishes $(2)\Rightarrow(1)$. We're given an infinite set of reduced abstract conformances, and we must produce an infinite set of reduced type parameters. Each abstract conformance $\ConfReq{T}{P}$ uniquely determines an ordered pair, consisting of a \index{protocol declaration}protocol declaration $\protosym{P}$ and a type parameter \texttt{T}. The set of protocol declarations is finite, so again, taking the second element of each pair gives us an infinite set of reduced type parameters. \end{proof} @@ -571,7 +583,7 @@ \section{Recursive Conformances}\label{recursive conformances} A conformance path defines a path in the protocol dependency graph: we map each abstract conformance to its conformed protocol, observing that the edge relation in both graphs is associated conformance projection. Also, an infinite generic signature must have conformance paths of arbitrary length, as there are only finitely many conformance paths of any \emph{fixed} length. The protocol dependency graph is finite, so the protocol dependency path induced by a sufficiently-long conformance path then has a \index{cycle}cycle. -A \IndexDefinition{recursive conformance requirement}\emph{recursive conformance requirement} is an associated conformance requirement that is part of a cycle; hence, a necessary condition for writing down an infinite generic signature is that the protocol dependency graph must contain a cycle.
Thus, prior to Swift 4.1 introducing recursive conformance requirements \cite{se0157}, the protocol dependency graph was required to be acyclic, and every generic signature was necessarily finite. +A \IndexDefinition{recursive conformance requirement}\emph{recursive conformance requirement} is an associated conformance requirement that is part of a cycle; hence, a necessary condition for writing down an infinite generic signature is that the protocol dependency graph must contain a cycle. Thus, prior to \IndexSwift{4.1}Swift 4.1 introducing recursive conformance requirements \cite{se0157}, the protocol dependency graph was required to be acyclic, and every generic signature was necessarily finite. \smallskip @@ -589,7 +601,7 @@ \section{Recursive Conformances}\label{recursive conformances} \path [->,every loop/.style={min distance=13mm}] (N) edge [loop left] (); \end{tikzpicture} \end{wrapfigure} -The protocol dependency graph for Listing~\ref{protocol dependency graph listing} is shown on the right. It has two distinct cycles; the first joins $\protosym{Q}$ with $\protosym{R}$, and the second is a loop at $\protosym{N}$. This gives us three recursive associated conformance requirements, by the above definition: +The protocol dependency graph for \ListingRef{protocol dependency graph listing} is shown on the right. It has two distinct cycles; the first joins $\protosym{Q}$ with $\protosym{R}$, and the second is a loop at $\protosym{N}$. This gives us three recursive associated conformance requirements, by the above definition: \begin{gather*} \AssocConf{Self.[N]A}{N}\\ \AssocConf{Self.[Q]B}{R}\\ @@ -620,16 +632,16 @@ \section{Recursive Conformances}\label{recursive conformances} \end{Verbatim} \end{listing} -Now consider the protocol generic signature $G_\texttt{Q}$ from Listing~\ref{protocol dependency graph listing}. 
Figure~\ref{infinite tree graph} shows the conformance path graph for this generic signature, an infinite tree with the abstract conformance $\ConfReq{Self}{Q}$ as the root. One possible path from the root is the conformance path for $\ConfReq{Self.B.D.B.C}{N}$ in $G_\texttt{Q}$: +Now consider the protocol generic signature $G_\texttt{Q}$ from \ListingRef{protocol dependency graph listing}. \FigRef{infinite tree graph} shows the conformance path graph for this generic signature, an infinite tree with the abstract conformance $\ConfReq{Self}{Q}$ as the root. One possible path from the root is the conformance path for $\ConfReq{Self.B.D.B.C}{N}$ in $G_\texttt{Q}$: \[ \AssocConf{Self.C}{N} \otimes \AssocConf{Self.B}{R} \otimes \AssocConf{Self.D}{Q} \otimes \AssocConf{Self.B}{R} \otimes \ConfReq{Self}{Q} \] Writing down the list of protocols from each step (remember that conformance paths are read from right to left!) we get a path in the protocol dependency graph; the path visits $\protosym{Q}$ and $\protosym{R}$ twice, exhibiting the existence of a cycle: \[\protosym{Q}\longrightarrow\protosym{R}\longrightarrow\protosym{Q}\longrightarrow\protosym{R}\longrightarrow\protosym{N}\] -We will encounter protocol dependency graphs again in Chapter~\ref{rqm basic operation}, when we describe the construction of the rewrite system for a generic signature. +We will encounter protocol dependency graphs again in \ChapRef{rqm basic operation}, when we describe the construction of the rewrite system for a generic signature. -\begin{figure}\captionabove{Conformance path graph for $G_\texttt{Q}$ from Listing~\ref{protocol dependency graph listing}}\label{infinite tree graph} +\begin{figure}\captionabove{Conformance path graph for $G_\texttt{Q}$ from \ListingRef{protocol dependency graph listing}}\label{infinite tree graph} \begin{center} \begin{tikzpicture}[sibling distance=3.8cm, level distance=1.5cm, edge from parent path={[->] (\tikzparentnode) .. controls +(0,-1) and +(0,1) .. 
(\tikzchildnode.north)}] @@ -672,7 +684,7 @@ \section{Recursive Conformances}\label{recursive conformances} \newcommand{\SelfAToN}{\AssocConf{Self.A}{N}} -\begin{figure}\captionabove{Conformance path graph for protocol \texttt{N} from Listing~\ref{protocol dependency graph listing}}\label{ray conformance path graph} +\begin{figure}\captionabove{Conformance path graph for protocol \texttt{N} from \ListingRef{protocol dependency graph listing}}\label{ray conformance path graph} \begin{center} \begin{tikzpicture}[level distance=1.5cm, edge from parent path={[->] (\tikzparentnode) .. controls +(0,-1) and +(0,1) .. (\tikzchildnode.north)}] @@ -693,22 +705,22 @@ \section{Recursive Conformances}\label{recursive conformances} \end{center} \end{figure} -The protocol generic signature $G_\texttt{N}$ with the protocol \texttt{N} from Listing~\ref{protocol dependency graph listing} is then the quintessential infinite generic signature, in a sense, because its conformance path graph is just one ray, shown in Figure~\ref{ray conformance path graph}. Let $\Sigma_{\ConfReq{T}{N}}$ be a \index{protocol substitution map}protocol substitution map for a conformance to \texttt{N}. This is a a substitution map with input generic signature $G_\texttt{N}$, which we can apply to each abstract conformance of $G_\texttt{N}$, obtaining a sequence of concrete conformances: +The protocol generic signature $G_\texttt{N}$ with the protocol \texttt{N} from \ListingRef{protocol dependency graph listing} is then the quintessential infinite generic signature, in a sense, because its conformance path graph is just one ray, shown in \FigRef{ray conformance path graph}. Let $\Sigma_{\ConfReq{T}{N}}$ be a \index{protocol substitution map}protocol substitution map for a conformance to \texttt{N}. 
This is a substitution map with input generic signature $G_\texttt{N}$, which we can apply to each abstract conformance of $G_\texttt{N}$, obtaining a sequence of concrete conformances: \begin{gather*} \ConfReq{\ttgp{0}{0}}{N}\otimes\Sigma_{\ConfReq{T}{N}}=\ConfReq{T}{N}\\ \ConfReq{\ttgp{0}{0}.Body}{N}\otimes\Sigma_{\ConfReq{T}{N}}=\SelfAToN\otimes\ConfReq{T}{N}\\ \ConfReq{\ttgp{0}{0}.Body.Body}{N}\otimes\Sigma_{\ConfReq{T}{N}}=\SelfAToN\otimes\SelfAToN\otimes\ConfReq{T}{N}\\ \ldots \end{gather*} -Local conformance lookup associates a substituted conformance with each abstract conformance. The substituted conformances are not necessarily distinct, but the mapping is compatible with associated conformance projection, as we saw in Section~\ref{associated conformances}. To understand the structure we obtain here, we define yet another directed graph. +Local conformance lookup associates a substituted conformance with each abstract conformance. The substituted conformances are not necessarily distinct, but the mapping is compatible with associated conformance projection, as we saw in \SecRef{associated conformances}. To understand the structure we obtain here, we define yet another directed graph. \begin{definition} -The \IndexDefinition{conformance evaluation graph}\emph{conformance evaluation graph} of a substitution map is the following directed graph: +The \IndexDefinition{conformance evaluation graph}\emph{conformance evaluation graph} of a substitution map is the following \index{directed graph}directed graph: \begin{itemize} -\item The vertices are the substituted conformances obtained by applying the substitution map to each abstract conformance of its input generic signature. -\item The edge relation is given by associated conformance projection. +\item The \index{vertex}vertices are the substituted conformances obtained by applying the substitution map to each abstract conformance of its \index{input generic signature}input generic signature. 
+\item The \index{edge}edge relation is given by associated conformance projection. \end{itemize} -Intuitively, the conformance path graph encodes all conformance paths of a \emph{generic signature}, while the conformance evaluation graph encodes the substituted conformances one can obtain from a \emph{substitution map}. +The conformance path graph encodes all conformance paths of a \emph{generic signature}, while the conformance evaluation graph encodes the substituted conformances one can obtain from a \emph{substitution map}. \end{definition} In fact, this mapping from the conformance path graph to the conformance evaluation graph, given by performing a local conformance lookup on each abstract conformance, is a special kind of mapping between directed graphs. \begin{definition} @@ -722,7 +734,7 @@ \section{Recursive Conformances}\label{recursive conformances} That is, if two vertices are joined by an edge in $E_1$, their image must be joined by an edge in $E_2$ (and if both vertices map to the same vertex, then $E_2$ contains a loop at this vertex). An immediate consequence of this definition is that a graph homomorphism also maps paths in $G$ to paths in $H$. \end{definition} -Let's now reconsider the application of a protocol substitution map $\Sigma_{\ConfReq{T}{N}}$, where \texttt{T} is an arbitrary concrete type, to each abstract conformance of $G_{\texttt{N}}$. In light of the above, that we're actually looking at is a graph homomorphism from the conformance path graph of $G_{\texttt{N}}$, to the conformance evaluation graph of $\Sigma_{\ConfReq{T}{N}}$. We saw that the conformance path graph of $G_{\texttt{N}}$ is a ray, and local conformance lookup maps each successive vertex to the result of applying $\SelfAToN$ some number of times to the root conformance $\ConfReq{T}{N}$.
There are three possibilities, each one corresponding to different outcomes of the repeated application of $\SelfAToN$ to $\ConfReq{T}{N}$: +Let's now reconsider the application of a \index{protocol substitution map}protocol substitution map $\Sigma_{\ConfReq{T}{N}}$, where \texttt{T} is an arbitrary concrete type, to each abstract conformance of $G_{\texttt{N}}$. In light of the above, what we're actually looking at is a graph homomorphism from the conformance path graph of $G_{\texttt{N}}$ to the conformance evaluation graph of $\Sigma_{\ConfReq{T}{N}}$. We saw that the conformance path graph of $G_{\texttt{N}}$ is a ray, and local conformance lookup maps each successive vertex to the result of applying $\SelfAToN$ some number of times to the root conformance $\ConfReq{T}{N}$. There are three possibilities, each one corresponding to different outcomes of the repeated application of $\SelfAToN$ to $\ConfReq{T}{N}$: \begin{enumerate} \item We eventually end up back at $\ConfReq{T}{N}$; the ray is mapped to a cycle, and the conformance evaluation graph is finite. \item Every application of $\SelfAToN$ gives us a new conformance we have not seen before; the ray is mapped to another ray, and we have an infinite conformance evaluation graph. @@ -893,7 +905,7 @@ \section{Recursive Conformances}\label{recursive conformances} \end{quote} \paragraph{Non-terminating substitutions.} -We gave a termination proof for Algorithm~\ref{find conformance path algorithm}; we can always find a conformance path in a finite number of steps. However, we cannot make the same guarantee about Algorithm~\ref{dependent member type substitution} for evaluating of a conformance path. \index{limitation} This example demonstrates \index{non-terminating computation}non-terminating substitution: +We gave a termination proof for \AlgRef{find conformance path algorithm}; we can always find a conformance path in a finite number of steps.
However, we cannot make the same guarantee about \AlgRef{dependent member type substitution} for evaluating a conformance path. \index{limitation!non-terminating type substitution} This example demonstrates \index{non-terminating computation}non-terminating substitution: \begin{Verbatim} struct S: N { typealias A = F } \end{Verbatim} @@ -958,7 +970,7 @@ \section{Recursive Conformances}\label{recursive conformances} \AssocType{[N]A} \otimes \bigl( \SelfAToN \otimes \ConfReq{S}{N} \bigr) \\ \qquad {} = \AssocType{[N]A} \otimes \bigl( \ConfReq{F<\ttgp{0}{0}>}{N} \otimes \Sigma_\texttt{S} \bigr) \end{gather*} -It appears that our attempt to evaluate $\texttt{\ttgp{0}{0}.[N]A}\otimes \Sigma_\texttt{F}\otimes\Sigma_\texttt{S}$ gets stuck in a loop. This exposes a hole in our theory, because it shows that local conformance lookup---and our type substitution operator ``$\otimes$''---are actually \index{partial function}\emph{partial} functions, which do not always have a well-defined result. Similarly, the substitution maps $\Sigma_\texttt{S}$ and $\Sigma_\texttt{F}$ do not have conformance evaluation graphs.
(While for the most part this theoretical hole doesn't matter, in \SecRef{rewritesystemintro}, we will sketch out an alternative way of formalizing ``$\otimes$'' which avoids this difficulty.) Indeed, at the time of writing, the compiler actually terminates with a stack overflow while attempting to perform this type substitution. In the future, the compiler should detect this situation and diagnose it, instead of crashing. A savvy compiler engineer will immediately suggest at least two possible approaches for doing so: \begin{enumerate} @@ -1077,7 +1089,7 @@ \section{The Halting Problem}\label{tag systems} Evaluation now stops; the length of ``$a$'' is smaller than the deletion number of 2. Now consider the evaluation steps where the input string consists entirely of ``$a$''; in order, we have ``$aaaa$'', ``$aa$'', and ``$a$''. This is the Collatz sequence for 4, which is 4, 2, 1. If we had instead started with the input string ``$aaaaa$'', we would get the Collatz sequence for 5, and so on. \paragraph{Swift type substitution.} -To encode this tag system in Swift, we begin with a protocol declaration, together with the five concrete types conforming to \texttt{P} shown in Listing~\ref{collatz listing}: +To encode this tag system in Swift, we begin with a protocol declaration, together with the five concrete types conforming to \texttt{P} shown in \ListingRef{collatz listing}: \begin{Verbatim} protocol P { associatedtype A: P @@ -1343,7 +1355,7 @@ \section{The Halting Problem}\label{tag systems} We reach the halting state, because the Collatz sequence for $n=2$ reaches 1. \paragraph{Further discussion.} -The discussion of generic function types in Section~\ref{misc types} alluded to the undecidability of type inference in the System~F formalism. In Section~\ref{conditional conformance}, we looked at an example of a non-terminating conditional conformance check in Swift, and cited a similar example in \index{Rust}Rust. 
Later, Section~\ref{word problem} will show that Swift reduced type equality is also undecidable in the general case, but we will provide a termination guarantee by restricting the problem. Countless other instances of undecidable type checking problems are described in the literature, exploiting clever and varied tricks to encode arbitrary computation in terms of types. We will cite just two more: \index{Java}Java generics were shown to be Turing-complete in \cite{java_undecidable}, and \index{TypeScript}TypeScript in \cite{tscollatz}; the latter example also encodes the Collatz sequence, but in a completely different manner than we did in this section. Finally, the Collatz conjecture, which fundamentally has no relation to type systems or programming languages, is discussed in \cite{collatzbook} and \cite{wolframtag}. +The discussion of generic function types in \SecRef{misc types} alluded to the undecidability of type inference in the System~F formalism. In \SecRef{conditional conformance}, we looked at an example of a non-terminating conditional conformance check in Swift, and cited a similar example in \index{Rust}Rust. Later, \SecRef{word problem} will show that Swift reduced type equality is also undecidable in the general case, but we will provide a termination guarantee by restricting the problem. Countless other instances of undecidable type checking problems are described in the literature, exploiting clever and varied tricks to encode arbitrary computation in terms of types. We will cite just two more: \index{Java}Java generics were shown to be Turing-complete in \cite{java_undecidable}, and \index{TypeScript}TypeScript in \cite{tscollatz}; the latter example also encodes the Collatz sequence, but in a completely different manner than we did in this section. Finally, the Collatz conjecture, which fundamentally has no relation to type systems or programming languages, is discussed in \cite{collatzbook} and \cite{wolframtag}. 
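+The tag-system evaluation described above is easy to check at the value level, outside the type system. The sketch below simulates a 2-tag system in ordinary Swift, using the classical productions from De~Mol's Collatz construction ($a \to bc$, $b \to a$, $c \to aaa$); whether the chapter's five concrete types encode exactly these rules is an assumption, but they reproduce the trace given in the text, where the strings consisting entirely of ``$a$'' spell out the Collatz sequence. The \texttt{collatzTrace} helper is hypothetical and exists only for this illustration:

```swift
// Value-level simulation of a 2-tag system. Assumption: the productions are
// the classical ones from De Mol's Collatz tag system (a -> bc, b -> a,
// c -> aaa), with deletion number 2.
let productions: [Character: String] = ["a": "bc", "b": "a", "c": "aaa"]
let deletionNumber = 2

// Hypothetical helper: runs the tag system on the string a^n and records the
// length of every intermediate string consisting entirely of "a". Per the
// text, these lengths spell out the Collatz sequence of n.
func collatzTrace(startingWith n: Int) -> [Int] {
    var word = String(repeating: "a", count: n)
    var trace: [Int] = []
    while true {
        if !word.isEmpty && word.allSatisfy({ $0 == "a" }) {
            trace.append(word.count)
        }
        // Halt when the string is shorter than the deletion number.
        guard word.count >= deletionNumber, let head = word.first else { break }
        word.removeFirst(deletionNumber)  // delete the first two symbols...
        word += productions[head]!        // ...and append the production
    }
    return trace
}

print(collatzTrace(startingWith: 4))  // [4, 2, 1], as in the worked example
```

+Starting from ``$aaaa$'', the all-``$a$'' strings visited are ``$aa$'' and then ``$a$'', matching the evaluation steps shown earlier; starting from $n=2$, the simulation reaches the halting state after tracing 2 and 1.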
\section{Source Code Reference} @@ -1364,14 +1376,14 @@ \section{Source Code Reference} Represents a conformance path as an array of one or more entries. The first entry corresponds to a conformance requirement in the generic signature; each subsequent entry is an associated conformance requirement. \apiref{GenericSignatureImpl}{class} -The \verb|getConformancePath()| method returns the conformance path for a type parameter and protocol declaration. For other methods, see Section~\ref{genericsigsourceref}. +The \verb|getConformancePath()| method returns the conformance path for a type parameter and protocol declaration. For other methods, see \SecRef{genericsigsourceref}. \IndexSource{local conformance lookup} \apiref{SubstitutionMap}{class} -The \verb|lookupConformance()| method implements Algorithm~\ref{local conformance lookup algorithm} for performing a local conformance lookup. For other methods, see Section~\ref{substmapsourcecoderef}. +The \verb|lookupConformance()| method implements \AlgRef{local conformance lookup algorithm} for performing a local conformance lookup. For other methods, see \SecRef{substmapsourcecoderef}. \apiref{getMemberForBaseType()}{function} -A static helper function in \verb|Type.cpp|, used by \verb|Type::subst()|. Implements Algorithm~\ref{dependent member type substitution}. +A static helper function in \verb|Type.cpp|, used by \verb|Type::subst()|. Implements \AlgRef{dependent member type substitution}. \subsection*{Finding Conformance Paths} @@ -1379,11 +1391,11 @@ \subsection*{Finding Conformance Paths} \begin{itemize} \item \SourceFile{lib/AST/RequirementMachine/GenericSignatureQueries.cpp} \end{itemize} -The \verb|getConformancePath()| method on \verb|GenericSignature| calls the method with the same name in \verb|RequirementMachine|. The latter implements Algorithm~\ref{find conformance path algorithm}. 
A pair of instance variables model the algorithm's persistent state: +The \verb|getConformancePath()| method on \verb|GenericSignature| calls the method with the same name in \verb|RequirementMachine|. The latter implements \AlgRef{find conformance path algorithm}. A pair of instance variables model the algorithm's persistent state: \begin{itemize} \item \verb|ConformancePaths| is the table of known conformance paths. \item \verb|CurrentConformancePaths| is the buffer of conformance paths at the currently enumerated length. \end{itemize} -Algorithm~\ref{find conformance path algorithm} traffics in reduced type parameters, while the actual implementation deals with instances of \verb|Term|. A term is the internal Requirement Machine representation of a type parameter, as you will see in Chapter~\ref{symbols terms rules}. This avoids round-trip conversions between \verb|Term| and \verb|Type| when computing reduced types that are only used for comparing against other reduced types. A \verb|Term| can function as a hash table key and is otherwise equivalent to a \verb|Type| here. +\AlgRef{find conformance path algorithm} traffics in reduced type parameters, while the actual implementation deals with instances of \verb|Term|. A term is the internal Requirement Machine representation of a type parameter, as you will see in \ChapRef{symbols terms rules}. This avoids round-trip conversions between \verb|Term| and \verb|Type| when computing reduced types that are only used for comparing against other reduced types. A \verb|Term| can function as a hash table key and is otherwise equivalent to a \verb|Type| here. 
-\end{document} \ No newline at end of file +\end{document} diff --git a/docs/Generics/chapters/conformances.tex b/docs/Generics/chapters/conformances.tex index b78dba34b68ec..c5f325effee05 100644 --- a/docs/Generics/chapters/conformances.tex +++ b/docs/Generics/chapters/conformances.tex @@ -17,16 +17,16 @@ \chapter{Conformances}\label{conformances} Formally, a conformance describes how a concrete type \emph{witnesses} each requirement of a protocol that it conforms to. There are three kinds of conformance: \begin{enumerate} \item An \textbf{invalid conformance} denotes that a type does not actually conform to the protocol. -\item An \textbf{abstract conformance} denotes that a type conforms to the protocol, but it is not known where this conformance was declared (Section~\ref{abstract conformances}). +\item An \textbf{abstract conformance} denotes that a type conforms to the protocol, but it is not known where this conformance was declared (\SecRef{abstract conformances}). \item A \textbf{concrete conformance} represents a conformance with a known definition. \end{enumerate} Concrete conformances break down further into four sub-kinds, the first two being our primary focus for now: \begin{enumerate} \item A \textbf{normal conformance} declares a conformance on a nominal type or extension. -\item A \textbf{specialized conformance} results from type substitution (Section~\ref{conformance subst}). -\item A \textbf{self conformance} describes a self-conforming protocol, which is only possible in a few very special cases (Section~\ref{selfconformingprotocols}). -\item An \textbf{inherited conformance} describes the conformance of a subclass when the conformance was declared on the superclass (Section~\ref{inheritedconformance}). +\item A \textbf{specialized conformance} results from type substitution (\SecRef{conformance subst}). 
+\item A \textbf{self conformance} describes a self-conforming protocol, which is only possible in a few very special cases (\SecRef{selfconformingprotocols}). +\item An \textbf{inherited conformance} describes the conformance of a subclass when the conformance was declared on the superclass (\SecRef{inheritedconformance}). \end{enumerate} \index{extension declaration}% @@ -34,7 +34,7 @@ \chapter{Conformances}\label{conformances} \index{inheritance clause}% \IndexDefinition{conformance}% \index{protocol conformance|see{conformance}}% -\paragraph{Normal conformances} Structs, enums and classes can conform to protocols. A normal conformance represents the \emph{declaration} of a conformance. Normal conformances are declared in the inheritance clause of a nominal type or extension:\index{horse} +\paragraph{Normal conformances.} Structs, enums and classes can conform to protocols. A normal conformance represents the \emph{declaration} of a conformance. Normal conformances are declared in the inheritance clause of a nominal type or extension:\index{horse} \begin{Verbatim} struct Horse: Animal {...} @@ -57,10 +57,10 @@ \chapter{Conformances}\label{conformances} \item \textbf{The type:} the declared interface type of the conforming context. \item \textbf{The protocol:} this is the protocol being conformed to. \item \textbf{The conforming context:} either a nominal type declaration (if the conformance is stated on the type) or an extension thereof (if the conformance is stated on an extension). -\item \textbf{The generic signature:} the generic signature of the conforming context. If the conformance context is a nominal type declaration or an unconstrained extension, this is the generic signature of the nominal type. If the conformance context is a constrained extension, this generic signature will have additional requirements, and the conformance becomes a conditional conformance. Conditional conformances are described in Section~\ref{conditional conformance}. 
-\item \textbf{Type witnesses:} a mapping from each associated type of the protocol to the concrete type witnessing the associated type requirement. This is an interface type written in terms of the generic signature of the conformance. Section~\ref{type witnesses} will talk about type witnesses. -\item \textbf{Associated conformances:} a mapping from the conformance requirements of the requirement signature to a conformance of the substituted subject type to the requirement's protocol. Section~\ref{associated conformances} will talk about associated conformances. -\item \textbf{Value witnesses:} for each value requirement of the protocol, the declaration witnessing the requirement. This declaration is either a member of the conforming nominal type, an extension of the conforming nominal type, or it is a default implementation from a protocol extension. The mapping is more elaborate than just storing the witness declaration; Chapter~\ref{valuerequirements} goes into the details. +\item \textbf{The generic signature:} the generic signature of the conforming context. If the conformance context is a nominal type declaration or an unconstrained extension, this is the generic signature of the nominal type. If the conformance context is a constrained extension, this generic signature will have additional requirements, and the conformance becomes a conditional conformance. Conditional conformances are described in \SecRef{conditional conformance}. +\item \textbf{Type witnesses:} a mapping from each associated type of the protocol to the concrete type witnessing the associated type requirement. This is an interface type written in terms of the generic signature of the conformance. \SecRef{type witnesses} will talk about type witnesses. +\item \textbf{Associated conformances:} a mapping from the conformance requirements of the requirement signature to a conformance of the substituted subject type to the requirement's protocol. 
\SecRef{associated conformances} will talk about associated conformances. +\item \textbf{Value witnesses:} for each value requirement of the protocol, the declaration witnessing the requirement. This declaration is either a member of the conforming nominal type, an extension of the conforming nominal type, or it is a \index{default witness}default witness from a \index{protocol extension}protocol extension. The mapping is more elaborate than just storing the witness declaration; \ChapRef{valuerequirements} goes into the details. \end{itemize} \section{Conformance Lookup}\label{conformance lookup} @@ -73,7 +73,7 @@ \section{Conformance Lookup}\label{conformance lookup} \begin{itemize} \item For the declared interface type of a \index{nominal type declaration}nominal type declaration, it consults the \index{conformance lookup table}conformance lookup table, and returns a \index{normal conformance}normal conformance to the protocol, if one exists. \item For another specialized type of a nominal type declaration, it returns a specialized conformance, as described in the next section. -\item For a type parameter, it returns an abstract conformance (Section~\ref{abstract conformances}). +\item For a type parameter, it returns an abstract conformance (\SecRef{abstract conformances}). \item In any case not covered by the above, it returns an invalid conformance. \end{itemize} @@ -96,45 +96,35 @@ \section{Conformance Lookup}\label{conformance lookup} We will look at more equations and commutative diagrams in the next section, after a brief interlude to discuss a conceptual difficulty. -\IndexDefinition{coherence} -\index{module declaration} -\index{dynamic cast} -\index{limitation} -\paragraph{Coherence} In reality, our diagram above hand-waves away a significant complication. 
Since a conformance can be declared on an extension, and the extended type might be defined in a different module, it is possible that two modules may define the same conformance in two different ways. Global conformance lookup is not guaranteed to be \emph{coherent}. - -For example, imagine if there were two different conformances of some concrete type \texttt{K} to \texttt{Hashable}. Then it would be possible for two different modules to construct values of type \texttt{Set} with incompatible hash functions; passing such a value from one module to the other would result in undefined behavior. - -For now, there's no real answer to this dilemma. The compiler rejects duplicate conformance definitions if an existing conformance is statically visible, so this scenario cannot occur with \texttt{Int} and \texttt{Hashable} for instance, because the conformance of \texttt{Int} to \texttt{Hashable} in the standard library is always visible, so any attempt to define a new conformance would be diagnosed as an error. - -However, if the concrete type \texttt{K} is defined in some common module, and two separately-compiled modules both define a conformance of \texttt{K} to \texttt{Hashable}, a module that imports all three will observe both conformances statically, with unpredictable results. - -A similar scenario can occur with library evolution. Suppose a library publishes the concrete type \texttt{K}, and a third party defines a conformance of \texttt{K} to \texttt{Hashable}. If the library vendor then adds their own conformance of \texttt{K} to \texttt{Hashable}, the previously-compiled client might encounter incorrect behavior at runtime. - -The global conformance lookup operation as implemented by the compiler actually takes a module declaration as an input, along with the type and protocol. 
The intent behind passing the module was that it should be taken into account somehow, perhaps restricting the search to those conformances that are transitively visible via import declarations, with an error diagnostic in the case of a true ambiguity. At the time of writing, this module declaration is ignored. - -The runtime equivalent of a global conformance lookup is a \emph{dynamic cast} from a concrete type to an existential type. Dynamic casts suffer from a similar ambiguity issue. To be coherent, this dynamic cast operation would need to inspect something akin to a module import graph reified at the call site to be able to disambiguate duplicate conformances. - -In the absence of proper compiler support for addressing this problem, there is a rule of thumb that, if followed by Swift users, mostly guarantees coherence. The rule is that when defining a conformance on an extension, either the extended type or the protocol should be part of the current module. - -That is, the following is fine, because our own type conforms to a standard library protocol: +\paragraph{Coherence.} If a conformance is declared on an extension, the conforming type and the conformed protocol might be in different \index{module declaration}modules. We consider the three possibilities. 
First, we can conform our own type to a protocol from another module: \begin{Verbatim} struct MyType {...} extension MyType: Hashable {...} \end{Verbatim} -This is fine too, because a standard library type conforms to our own protocol: +We can also conform a type from another module to our own protocol: \begin{Verbatim} protocol MyProtocol {...} extension Int: MyProtocol {...} \end{Verbatim} - -However the next example is potentially problematic; we're defining the conformance of a standard library type to a standard library protocol, and nothing prevents some other module from declaring the same conformance: +Finally, in the most general case, both the conforming type and conformed protocol are declared in modules separate from ours. We call this a \IndexDefinition{retroactive conformance}\emph{retroactive conformance}: \begin{Verbatim} extension String.UTF8View: Hashable {...} +// warning: extension declares a conformance of imported type +// 'UTF8View' to imported protocol 'Hashable'; this will not +// behave correctly if the owners of 'Swift' introduce this +// conformance in the future +\end{Verbatim} +As of \IndexSwift{6.0}Swift~6, the compiler diagnoses this situation, enforcing what is sometimes known in other languages with similar capabilities as the \index{orphan rule|see{retroactive conformance}}\emph{orphan rule}. For source compatibility reasons, this is just a warning, but it ought to be an error. The \index{diagnostic!retroactive conformance}diagnostic can be silenced by explicitly stating the intent with the \texttt{@retroactive} attribute on the conformance, offering an escape hatch for those situations that warrant it (and our example certainly does not): +\begin{Verbatim} +extension String.UTF8View: @retroactive Hashable {...} \end{Verbatim} +Retroactive conformances are problematic because they allow one to declare \index{overlapping conformance}\index{overlapping conformance|seealso{retroactive conformance}}\emph{overlapping conformances}.
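+To make the danger concrete, here is a hypothetical sketch (the module names \texttt{A} and \texttt{B} are invented for illustration) of two modules independently declaring the same retroactive conformance:
+\begin{Verbatim}
+// Module A
+extension String.UTF8View: @retroactive Hashable {...}
+
+// Module B, compiled separately, with no knowledge of A
+extension String.UTF8View: @retroactive Hashable {...}
+\end{Verbatim}
+A program that imports both \texttt{A} and \texttt{B} ends up with two conformances of the same type to the same protocol, and conformance lookup has no principled way to choose between them.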
+ +The compiler rejects a conformance declaration if some existing conformance is statically visible, so this scenario cannot occur with \texttt{Int} and \texttt{Hashable} for instance, because any attempt to define an overlapping conformance will be caught. With retroactive conformances, this is no longer the case, so we cannot guarantee the \IndexDefinition{coherence}\emph{coherence} of global conformance lookup. Suppose a concrete type \texttt{K} is defined in some common module, and two unrelated modules each declare a conformance of \texttt{K} to \texttt{Hashable} unbeknownst to the other. If a module then imports all three, it becomes possible to construct two values of type \texttt{Set<K>} with incompatible hash functions; passing such a value across module boundaries will result in undefined behavior. -A conformance where neither the conforming type nor the protocol is part of the current module is called a \IndexDefinition{retroactive conformance}\emph{retroactive conformance}. Today, retroactive conformances are allowed without any restrictions, but an evolution proposal was accepted to have them generate a warning \cite{se0364}. +A similar scenario can occur with library evolution. Suppose a library author publishes the concrete type \texttt{K}, and a third party defines a retroactive conformance of \texttt{K} to \texttt{Hashable}. If the library vendor then adds their own conformance of \texttt{K} to \texttt{Hashable}, the previously-compiled client might encounter incorrect behavior at runtime. -Unfortunately, avoiding retroactive conformances does not completely solve the issue either, because there is another possible hole with class inheritance and library evolution. Consider a framework which defines an open class and a protocol: +Unfortunately, the orphan rule does not completely \index{limitation!orphan rule and subclassing}close the hole, because of class inheritance and library evolution.
Consider a framework which defines an open class and a protocol: \begin{Verbatim} public protocol MyProtocol {} open class BaseClass {} @@ -146,11 +136,15 @@ \section{Conformance Lookup}\label{conformance lookup} class DerivedClass: BaseClass {} extension DerivedClass: MyProtocol {} \end{Verbatim} -However, in the next version of the framework, the framework author might decide to conform \texttt{BaseClass} to \texttt{MyProtocol}. At this point, \texttt{DerivedClass} has two duplicate conformances to \texttt{MyProtocol}; the inherited conformance from \texttt{BaseClass}, and the local conformance of \texttt{DerivedClass}. +However, in the next version of the framework, the framework author might decide to conform \texttt{BaseClass} to \texttt{MyProtocol}. At this point, \texttt{DerivedClass} has two overlapping conformances to \texttt{MyProtocol}: the inherited conformance from \texttt{BaseClass}, and the local conformance of \texttt{DerivedClass}. + +It may be possible to extend the compiler some day to reason correctly about overlapping conformances. As implemented, the global conformance lookup operation takes a \index{module declaration}module declaration as an input, along with the type and protocol. The intent behind passing the module was that it should determine visibility of conformances via the \index{import declaration}import graph. At the time of writing, this module declaration is ignored. + +The runtime equivalent of a global conformance lookup is a \index{dynamic cast}\emph{dynamic cast} from a concrete type to an existential type. Dynamic casts would also need to be extended to deal with overlapping conformances, inspecting something akin to a module import graph reified at the call site. \section{Conformance Substitution}\label{conformance subst} -Now, we will define global conformance lookup with a \index{specialized type}specialized nominal type \texttt{T}.
We know from Section~\ref{contextsubstmap} that \texttt{T} splits up into the \index{declared interface type}declared interface type $\texttt{T}_d$ and \index{context substitution map}context substitution map, $\Sigma$. We begin by expanding out \texttt{T} in the global conformance lookup $\protosym{P}\otimes\texttt{T}$: +Now, we will define global conformance lookup with a \index{specialized type}specialized nominal type \texttt{T}. We know from \SecRef{contextsubstmap} that \texttt{T} splits up into the \index{declared interface type}declared interface type $\texttt{T}_d$ and \index{context substitution map}context substitution map, $\Sigma$. We begin by expanding out \texttt{T} in the global conformance lookup $\protosym{P}\otimes\texttt{T}$: \[ \protosym{P}\otimes\texttt{T} = \protosym{P} \otimes (\texttt{T}_d \otimes \Sigma) \] @@ -187,12 +181,12 @@ \section{Conformance Substitution}\label{conformance subst} \end{quote} For example, the conforming type of $\ConfReq{Array}{Sequence}$ is \texttt{Array}. -\paragraph{Canonical conformances} +\paragraph{Canonical conformances.} \IndexDefinition{canonical conformance} \index{canonical substitution map} Just like types and substitution maps, specialized conformances are immutable and uniquely-allocated. A specialized conformance is \emph{canonical} if the substitution map is canonical. We compute the canonical specialized conformance from an arbitrary specialized conformance by replacing its substitution map with a canonical substitution map. Normal and abstract conformances are always canonical. -\paragraph{More about substitution} As with substitution maps, we define the \index{output generic signature}output generic signature of a conformance. For a \index{normal conformance}normal conformance, this is the generic signature of the conforming context (the nominal itself, or an extension). 
For a \index{specialized conformance}specialized conformance, we take it to be the \index{output generic signature}output generic signature of the \index{conformance substitution map}conformance substitution map. We can denote the set of all conformances with output generic signature $G$ by \IndexSetDefinition{conf}{\ConfObj{G}}$\ConfObj{G}$. Now, if we have a normal conformance $\NormalConf\in\ConfObj{G}$ and a substitution map $\Sigma\in\SubMapObj{G}{G^\prime}$, then $\NormalConf\otimes\Sigma\in\ConfObj{G^\prime}$, by definition. +\paragraph{More about substitution.} As with substitution maps, we define the \index{output generic signature}output generic signature of a conformance. For a \index{normal conformance}normal conformance, this is the generic signature of the conforming context (the nominal itself, or an extension). For a \index{specialized conformance}specialized conformance, we take it to be the \index{output generic signature}output generic signature of the \index{conformance substitution map}conformance substitution map. We can denote the set of all conformances with output generic signature $G$ by \IndexSetDefinition{conf}{\ConfObj{G}}$\ConfObj{G}$. Now, if we have a normal conformance $\NormalConf\in\ConfObj{G}$ and a substitution map $\Sigma\in\SubMapObj{G}{G^\prime}$, then $\NormalConf\otimes\Sigma\in\ConfObj{G^\prime}$, by definition. With this notation, we can understand how type substitution applies a substitution map to a \emph{specialized} conformance. Suppose $\Sigma^\prime\in\SubMapObj{G^\prime}{G^{\prime\prime}}$. 
Then, applying $\Sigma^\prime$ to the specialized conformance $\NormalConf\otimes\Sigma$ outputs a new specialized conformance with the composed substitution map: \[(\NormalConf\otimes\Sigma)\otimes\Sigma^\prime=\NormalConf\otimes(\Sigma\otimes\Sigma^\prime)\] @@ -207,6 +201,17 @@ \section{Conformance Substitution}\label{conformance subst} \section{Type Witnesses}\label{type witnesses} +\iffalse +: +\begin{itemize} +\item The conforming type must declare a \emph{type witness} for each associated type. We will discuss type witnesses in \SecRef{type witnesses}. +\item The conforming type itself and its type witnesses must satisfy the protocol's associated requirements. We will discuss checking if requirements are satisfied in \SecRef{checking generic arguments}. +\end{itemize} + +\fi + +%TODO: \Request{type witness request}. + \IndexDefinition{type witness} \index{type alias declaration} \index{generic parameter declaration} @@ -222,7 +227,7 @@ \section{Type Witnesses}\label{type witnesses} \item Via \textbf{associated type inference}, where it is implicitly derived from the declaration of a witness to a value requirement. (This is actually an extremely complex topic which warrants an entire chapter in itself.) \item Via a \textbf{default type witness} on the associated type declaration, which is used if all else fails. \end{enumerate} -The conformance checker is responsible for resolving type witnesses and ensuring they satisfy the requirements of the protocol's requirement signature, as described earlier in Section~\ref{requirement sig}. The problem of checking whether concrete types satisfy generic requirements is covered in Section~\ref{checking generic arguments}. +The conformance checker is responsible for resolving type witnesses and ensuring they satisfy the requirements of the protocol's requirement signature, as described earlier in \SecRef{requirement sig}. 
The problem of checking whether concrete types satisfy generic requirements is covered in \SecRef{checking generic arguments}. \begin{listing}\captionabove{Different ways of declaring a type witness in a conformance}\label{type witness listing} \begin{Verbatim} @@ -258,7 +263,7 @@ \section{Type Witnesses}\label{type witnesses} \index{synthesized declaration} \begin{example} -Listing~\ref{type witness listing} illustrates all four possibilities. In \texttt{WithMemberType}, the type witness is the nested struct type named \texttt{T}. In the three remaining cases, the conformance checker actually synthesizes a type alias named \texttt{T} as a member of the conforming type. +\ListingRef{type witness listing} illustrates all four possibilities. In \texttt{WithMemberType}, the type witness is the nested struct type named \texttt{T}. In the three remaining cases, the conformance checker actually synthesizes a type alias named \texttt{T} as a member of the conforming type. In the case of \texttt{WithGenericParam}, the type alias type \texttt{T} has as its underlying type the generic parameter, also named \texttt{T}. Thus it might seem that the generic parameter \texttt{T} is a member type of \texttt{WithGenericParam}: \begin{Verbatim} @@ -267,13 +272,12 @@ \section{Type Witnesses}\label{type witnesses} } \end{Verbatim} -However, the member type \texttt{T} does not refer to the generic parameter declaration itself, but to the synthesized type alias declaration. If \texttt{WithGenericParam} did not declare a conformance to \texttt{P}, there would be no member type named \texttt{T}, and the type representation \texttt{WithGenericParam.T} would diagnose an error in type resolution, because generic parameter declarations are not visible as member types. +However, the member type \texttt{T} does not refer to the generic parameter declaration itself, but to the synthesized type alias declaration. 
If \texttt{WithGenericParam} did not declare a conformance to \texttt{P}, there would be no member type named \texttt{T}, and the type representation \texttt{WithGenericParam.T} would diagnose an error. Type resolution of member types is discussed in detail in \SecRef{member type repr}. \end{example} -\paragraph{Projection} +\paragraph{Projection.} \index{type witness}% \index{associated type declaration}% -\index{protocol substitution map}% Given a conformance and an associated type of the conformed protocol, we can ask the conformance for the corresponding type witness. The next section will explain how type substitution of dependent member types uses the type witnesses of a conformance, but first we need to develop the ``algebra'' of type witnesses. We denote a type witness projection by $\AssocType{[P]A}$, where \texttt{A} is some associated type declared in a protocol \texttt{P}. (Of course, $\pi$ here has nothing to do with the circle constant $\oldstylenums{3.1415926}\ldots\,$; rather, it means ``projection'', or perhaps ``protocol.'') @@ -283,7 +287,7 @@ \section{Type Witnesses}\label{type witnesses} The type witnesses of a normal conformance are interface types which may contain type parameters from the generic signature of the conforming context. In other words, if $\NormalConf\in\ConfObj{G}$, then $\AssocType{[P]A}\otimes\NormalConf\in\TypeObj{G}$. -In Listing~\ref{type witness listing}, we have the following: +In \ListingRef{type witness listing}, we have the following: \begin{gather*} \AssocType{[P]T}\otimes\ConfReq{WithMemberType}{P}=\texttt{WithMemberType.T}\\ \AssocType{[P]T}\otimes\ConfReq{WithGenericParam}{P}=\ttgp{0}{0}\\ @@ -305,7 +309,7 @@ \section{Type Witnesses}\label{type witnesses} \item We can look up the conformance of $\texttt{T}_d$ to \texttt{P} to get the normal conformance $\NormalConf$, apply $\Sigma$ to get the specialized conformance $\ConfReq{T}{P}$, and project the type witness $\AssocType{[P]A}$ to get $\texttt{W}\otimes\Sigma$. 
\item We can apply $\Sigma$ to $\texttt{T}_d$ to get \texttt{T}, look up the conformance of \texttt{T} to \texttt{P} to get the specialized conformance $\ConfReq{T}{P}$, and project the type witness $\AssocType{[P]A}$ to get $\texttt{W}\otimes\Sigma$. \end{enumerate} -Figure~\ref{type witness diagram} exhibits the above in the form of a \index{commutative diagram}commutative diagram; the three paths from $\texttt{T}_d$ to $\texttt{W}\otimes\Sigma$ correspond to the three possible evaluations above. +\FigRef{type witness diagram} exhibits the above in the form of a \index{commutative diagram}commutative diagram; the three paths from $\texttt{T}_d$ to $\texttt{W}\otimes\Sigma$ correspond to the three possible evaluations above. \begin{figure}\captionabove{Type witnesses of normal and specialized conformances}\label{type witness diagram} \begin{center} @@ -320,7 +324,7 @@ \section{Type Witnesses}\label{type witnesses} \begin{example} To make this concrete, say we look up the conformance of \texttt{Array<Int>} to \texttt{Sequence}, and then get the type witness for the \texttt{Element} associated type. The declared interface type of \texttt{Array} is \texttt{Array<\ttgp{0}{0}>}, and the type witness of the \texttt{Element} associated type in $\ConfReq{Array<\ttgp{0}{0}>}{Sequence}$ is the \ttgp{0}{0} generic parameter type. -Our specialized type is \texttt{Array<Int>} with context substitution map $\SubstMap{\SubstType{\ttgp{0}{0}}{Int}}$, so the type witness for \texttt{Element} in $\ConfReq{Array<Int>}{Sequence}$ is $\ttgp{0}{0}\otimes\SubstMap{\SubstType{\ttgp{0}{0}}{Int}}$, which is \texttt{Int}. Figure~\ref{type witness diagram example} shows the commutative diagram for this case. We can start at the top left and always end up at the bottom right, independent of which of the three paths we take.
+Our specialized type is \texttt{Array<Int>} with context substitution map $\SubstMap{\SubstType{\ttgp{0}{0}}{Int}}$, so the type witness for \texttt{Element} in $\ConfReq{Array<Int>}{Sequence}$ is $\ttgp{0}{0}\otimes\SubstMap{\SubstType{\ttgp{0}{0}}{Int}}$, which is \texttt{Int}. \FigRef{type witness diagram example} shows the commutative diagram for this case. We can start at the top left and always end up at the bottom right, independent of which of the three paths we take. \begin{figure}\captionabove{Type witnesses of the conformances of \texttt{Array<\ttgp{0}{0}>} and \texttt{Array<Int>} to \texttt{Sequence}}\label{type witness diagram example} \begin{center} \begin{tikzcd}[column sep=2.5cm,row sep=1cm] @@ -353,10 +357,10 @@ \section{Abstract Conformances}\label{abstract conformances} As every valid dependent member type is the type witness of an abstract conformance, we can define the application of a substitution map to a dependent member type by factoring the dependent member type in this manner: \[\texttt{T.[P]A}\otimes\Sigma=\bigl(\AssocType{[P]A} \otimes \ConfReq{T}{P}\bigr)\otimes\Sigma=\AssocType{[P]A}\otimes\bigl(\ConfReq{T}{P}\otimes\Sigma\bigr)\] -We saw how normal and specialized conformances are substituted in Section~\ref{conformance subst}. Now, it appears we need the ability to apply a substitution map to an abstract conformance. This operation is named \IndexDefinition{local conformance lookup}\emph{local conformance lookup} by analogy with \index{global conformance lookup}global conformance lookup. Local conformance lookup is \emph{compatible} with global conformance lookup, in this sense: a local conformance lookup with some \index{substitution map}substitution map $\Sigma$, type parameter \texttt{T} and protocol \texttt{P} must return the same conformance as performing a global conformance lookup of $\texttt{T}\otimes\Sigma$ to $\texttt{P}$.
This can be expressed as an equation: +We saw how normal and specialized conformances are substituted in \SecRef{conformance subst}. Now, it appears we need the ability to apply a substitution map to an abstract conformance. This operation is named \IndexDefinition{local conformance lookup}\emph{local conformance lookup} by analogy with \index{global conformance lookup}global conformance lookup. Local conformance lookup is \emph{compatible} with global conformance lookup, in this sense: a local conformance lookup with some \index{substitution map}substitution map $\Sigma$, type parameter \texttt{T} and protocol \texttt{P} must return the same conformance as performing a global conformance lookup of $\texttt{T}\otimes\Sigma$ to $\texttt{P}$. This can be expressed as an equation: \[(\protosym{P}\otimes\texttt{T})\otimes\Sigma=\protosym{P}\otimes(\texttt{T}\otimes\Sigma)\] -Local conformance lookup is not actually implemented in terms of global conformance lookup, though. In the simple case, when the abstract conformance corresponds to an explicit conformance requirement in the substitution map's input generic signature, local conformance lookup returns one of the conformances stored by the substitution map. In the general case, local conformance lookup derives the conformance via a \index{conformance path}\emph{conformance path}. This will be revealed in Chapter~\ref{conformance paths}. +Local conformance lookup is not actually implemented in terms of global conformance lookup, though. In the simple case, when the abstract conformance corresponds to an explicit conformance requirement in the substitution map's input generic signature, local conformance lookup returns one of the conformances stored by the substitution map. In the general case, local conformance lookup derives the conformance via a \index{conformance path}\emph{conformance path}. This will be revealed in \ChapRef{conformance paths}. 
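+As a quick sanity check of the compatibility equation, take $\texttt{T}=\ttgp{0}{0}$, and let $\Sigma=\SubstMap{\SubstType{\ttgp{0}{0}}{Array<Int>}}$ be a substitution map whose input generic signature states the requirement $\ConfReq{\ttgp{0}{0}}{Sequence}$, so that $\Sigma$ also stores the conformance $\ConfReq{Array<Int>}{Sequence}$. Both sides then evaluate to the same concrete conformance:
+\[(\protosym{Sequence}\otimes\ttgp{0}{0})\otimes\Sigma=\ConfReq{\ttgp{0}{0}}{Sequence}\otimes\Sigma=\ConfReq{Array<Int>}{Sequence}=\protosym{Sequence}\otimes(\ttgp{0}{0}\otimes\Sigma)\]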
\begin{listing}\captionabove{Applying a substitution map to a dependent member type}\label{dmt subst map listing} \begin{Verbatim} @@ -370,7 +374,7 @@ \section{Abstract Conformances}\label{abstract conformances} \end{Verbatim} \end{listing} \begin{example} -Listing~\ref{dmt subst map listing} shows an example of dependent member type substitution. We're going to work through how the compiler derives the type of the \texttt{iter} variable. +\ListingRef{dmt subst map listing} shows an example of dependent member type substitution. We're going to work through how the compiler derives the type of the \texttt{iter} variable. The type annotation references the \texttt{InnerIterator} member type alias with a base type of \texttt{Concatenation>}, so we need to apply the context substitution map of this base type to the underlying type of the type alias declaration. The generic signature of \texttt{Concatenation} is the following: @@ -406,7 +410,7 @@ \section{Abstract Conformances}\label{abstract conformances} So our final substituted type is \verb|IndexingIterator>>|. \end{example} \begin{example} -An attentive reader might remember from Section~\ref{buildingsubmaps} that the construction of the context substitution map of a specialized type is a little tricky, because we have to recursively compute the substituted subject type of each conformance requirement in the generic signature and then perform a global conformance lookup. In the previous example, the generic signature of \texttt{Concatenation} has two conformance requirements, and their original and substituted subject types are as follows: +An attentive reader might remember from \SecRef{buildingsubmaps} that the construction of the context substitution map of a specialized type is a little tricky, because we have to recursively compute the substituted subject type of each conformance requirement in the generic signature and then perform a global conformance lookup. 
In the previous example, the generic signature of \texttt{Concatenation} has two conformance requirements, and their original and substituted subject types are as follows: \begin{gather*} \texttt{Elements} \Rightarrow \texttt{Array>}\\ \texttt{Elements.[Sequence]Element} \Rightarrow \texttt{Array} @@ -433,7 +437,7 @@ \section{Abstract Conformances}\label{abstract conformances} \end{gather*} \end{example} -\paragraph{Protocol substitution maps} Recall the \index{protocol substitution map}protocol substitution map construction from Section~\ref{contextsubstmap}, which wraps a conformance $\ConfReq{T}{P}$ in a substitution map $\Sigma_{\ConfReq{T}{P}}$ for the protocol's generic signature \verb||. Here, \texttt{T} is any interface type, not necessarily a type parameter, so the conformance might be normal or specialized (but of course it can be abstract, too): +\paragraph{Protocol substitution maps.} If \texttt{T} is any interface type (not just a type parameter) conforming to some protocol \texttt{P}, the normal, specialized or abstract conformance $\ConfReq{T}{P}$ defines a substitution map for the protocol generic signature \verb|| in a natural way. This is the \emph{protocol substitution map}, denoted by $\Sigma_{\ConfReq{T}{P}}$: \[\Sigma_{\ConfReq{T}{P}} := \SubstMapC{ \SubstType{Self}{T} }{ @@ -443,41 +447,31 @@ \section{Abstract Conformances}\label{abstract conformances} \[\texttt{Self.[P]A}\otimes\Sigma_{\ConfReq{T}{P}}=\AssocType{[P]A}\otimes\bigl(\ConfReq{Self}{P}\otimes\Sigma\bigr)=\AssocType{[P]A}\otimes\ConfReq{T}{P}\] \section{Associated Conformances}\label{associated conformances} -The conformance requirements inside a protocol's requirement signature are known as \IndexDefinition{associated conformance requirement}\emph{associated conformance requirements} and the concrete conformance corresponding to one is an \index{associated conformance}\emph{associated conformance} (Section~\ref{associated conformances}). 
There is an interesting duality between substitution maps and (normal) conformances, illustrated in Table~\ref{substitution map conformance duality}. - -\IndexDefinition{associated conformance} -\index{requirement signature} -\index{conformance requirement} -\index{substitution map} -A substitution map records a replacement type for each generic parameter of a generic signature, and as you saw in the previous section, a normal conformance records a type witness for each associated type of a protocol. - -A substitution map also stores a conformance for each conformance requirement in its generic signature. A normal conformance stores an \emph{associated conformance} for each conformance requirement in the protocol's requirement signature. - -\index{inherited protocol} -\index{associated conformance requirement} -Recall from Section~\ref{requirement sig} that the printed representation of a requirement signature looks like a generic signature with a single \texttt{Self} generic parameter. For example, here is the abridged requirement signature of the standard library's \texttt{Collection} protocol: -\begin{quote} -\texttt{} -\end{quote} -The special case of an associated conformance requirement with a subject type of \texttt{Self} represents a protocol inheritance relationship, as you already saw in Section~\ref{requirement sig}. Other associated conformance requirements constrain the protocol's associated types. - -\begin{table}\captionabove{Duality between substitution maps and conformances}\label{substitution map conformance duality} +The \index{conformance requirement}conformance requirements stated in a protocol's \index{requirement signature}requirement signature are known as \index{request}\IndexDefinition{associated conformance requirement}\emph{associated conformance requirements} and the concrete conformance corresponding to one is an \IndexDefinition{associated conformance}\emph{associated conformance}. 
There is an interesting duality between \index{substitution map}substitution maps and conformances: \begin{center} -\begin{tabular}{ll} +\begin{tabular}{rcl} \toprule -\textbf{Substitution map}&\textbf{Normal conformance}\\ +\textbf{Generic signatures:}&&\textbf{Protocols:}\\ +\midrule +Generic parameter&$\Leftrightarrow$&Associated type\\ +Requirement in signature&$\Leftrightarrow$&Associated requirement\\ +\midrule +\textbf{Substitution maps:}&&\textbf{Conformances:}\\ \midrule -Input generic signature&Requirement signature\\ -Generic parameter&Associated type declaration\\ -Replacement type&Type witness\\ -Conformance requirement&Associated conformance requirement\\ -Conformance in substitution map&Associated conformance\\ +Replacement type&$\Leftrightarrow$&Type witness\\ +Replacement conformance&$\Leftrightarrow$&Associated conformance\\ \bottomrule \end{tabular} \end{center} -\end{table} +Generic signatures and requirement signatures are similar except that the former constrains generic parameters, while the latter constrains associated types. A substitution map records a replacement type for each generic parameter and a replacement conformance for each conformance requirement. A normal conformance records a type witness for each associated type and an associated conformance for each associated conformance requirement. + +Recall from \SecRef{requirement sig} that the printed representation of a requirement signature looks like a generic signature with a single \texttt{Self} generic parameter. For example, here is the abridged requirement signature of the standard library's \texttt{Collection} protocol: +\begin{quote} +\texttt{<Self where Self: Sequence, Self.Index: Comparable>} +\end{quote} +The special case of an associated conformance requirement with a subject type of \texttt{Self} represents a \index{inherited protocol}protocol inheritance relationship, as you already saw in \SecRef{requirement sig}. Other associated conformance requirements constrain the protocol's associated types.
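+To connect the terminology back to source code, here is a simplified stand-in for \texttt{Collection} (a hypothetical declaration; the real protocol states many more requirements):
+\begin{Verbatim}
+protocol MiniCollection: Sequence {  // states the requirement Self: Sequence
+  associatedtype Index: Comparable   // states the requirement Self.Index: Comparable
+}
+\end{Verbatim}
+The inheritance clause introduces the associated conformance requirement whose subject type is \texttt{Self}, while the constraint on the associated type introduces one whose subject type is \texttt{Self.Index}.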
-The conformance checker populates the associated conformance mapping in a normal conformance by computing the substituted subject type of each associated conformance requirement, and then performing a global conformance lookup with this subject type. This is analogous to the conformance lookup performed during the construction of a substitution map (Section~\ref{buildingsubmaps}). +The associated conformances of a normal conformance are populated lazily by the \IndexDefinition{associated conformance request}\Request{associated conformance request}. The request computes the substituted subject type of an associated conformance requirement by applying the \index{protocol substitution map}protocol substitution map for the conformance, and then performs a global conformance lookup with this subject type. This is the same idea as when we do conformance lookups to populate a substitution map (\SecRef{buildingsubmaps}). For example, in the conformance of \texttt{Array<\ttgp{0}{0}>} to \texttt{Collection}, the substituted subject type of the requirement $\ConfReq{Self}{Sequence}$ is just the conforming type: \[ @@ -494,9 +488,9 @@ \section{Associated Conformances}\label{associated conformances} \protosym{Sequence} \otimes \texttt{Array<\ttgp{0}{0}>} = \ConfReq{Array<\ttgp{0}{0}>}{Sequence}\\ \protosym{Comparable} \otimes \texttt{Int} = \ConfReq{Int}{Comparable} \end{gather*} -\paragraph{Notation} We're going to use the notation $\AssocConf{Self.Index}{Comparable}$ for associated conformance requirements. Note that they are quite different than abstract conformances, which use the notation $\ConfReq{T}{P}$.
The distinction is important; an abstract conformance describes a type parameter that is known to conform to a protocol in some \emph{generic} signature (possibly as a non-trivial consequence of other requirements), whereas an associated conformance requirement is a \emph{specific} requirement directly appearing in a protocol's \emph{requirement} signature. +\paragraph{Notation.} In the type substitution algebra, we will write an associated conformance requirement as $\AssocConf{Self.Index}{Comparable}$. Associated conformance requirements are quite different from abstract conformances, which use the notation $\ConfReq{T}{P}$. An abstract conformance states that a type parameter is known to conform to a protocol in some \emph{generic} signature (possibly as a non-trivial consequence of other requirements), whereas an associated conformance requirement is a \emph{specific} requirement directly appearing in a protocol's \emph{requirement} signature. -\paragraph{Projection} +\paragraph{Projection.} Projecting an associated conformance from a normal conformance can be understood as the action of an associated conformance requirement (from a protocol's requirement signature) on the left of a normal conformance (to this protocol): \[\AssocConf{Self.U}{Q} \otimes \NormalConf\] @@ -506,14 +500,14 @@ \section{Associated Conformances}\label{associated conformances} \qquad {} = \AssocConf{Self.U}{Q}\otimes\bigl(\NormalConf\otimes\Sigma\bigr)\\ \qquad {} = \bigl(\AssocConf{Self.U}{Q}\otimes\NormalConf\bigr)\otimes\Sigma \end{gather*} -Now we can project associated conformances from normal conformances and specialized conformances. Last but not least, we need to define associated conformance projection from an abstract conformance. Just as the type witnesses of an abstract conformance are dependent member types, associated conformances of abstract conformances are other abstract conformances. 
In Section~\ref{conformance paths exist}, we will show that \emph{all} abstract conformances can be defined this way: +Now we can project associated conformances from normal conformances and specialized conformances. Last but not least, we need to define associated conformance projection from an abstract conformance. Just as the type witnesses of an abstract conformance are dependent member types, associated conformances of abstract conformances are other abstract conformances. In \SecRef{conformance paths exist}, we will show that \emph{all} abstract conformances can be defined this way: \[ \AssocConf{Self.U}{Q} \otimes \ConfReq{T}{P} = \ConfReq{T.U}{Q} \] -If we define \IndexSetDefinition{assocconf}{\AssocConfObj{P}}$\AssocConfObj{P}$ as the set of all associated conformance requirements in a protocol \texttt{P}, then we get one final ``overload'' of the \index{$\otimes$}$\otimes$ binary operation: +If we define \IndexSetDefinition{assocconf}{\AssocConfObj{P}}$\AssocConfObj{P}$ as the set of all associated conformance requirements in a protocol \texttt{P}, then we get a new ``overload'' of the \index{$\otimes$}$\otimes$ \index{binary operation}binary operation: \[\AssocConfObj{P}\otimes\ConfPObj{P}{G}\longrightarrow\ConfObj{G}\] -A complete summary of the type substitution algebra appears in Appendix~\ref{notation summary}. +One more overload of $\otimes$ is described in \SecRef{checking generic arguments}. A complete summary of the type substitution algebra appears in \AppendixRef{notation summary}. \begin{listing}\caption{Different kinds of associated conformances}\label{associated conformance example} \begin{Verbatim} @@ -532,7 +526,7 @@ \section{Associated Conformances}\label{associated conformances} \end{listing} \begin{example} -The associated conformances of a normal conformance can themselves be any kind of conformance, including normal, specialized or abstract. Listing~\ref{associated conformance example} shows these possibilities. 
The protocol \texttt{P} states three associated conformance requirements, and each of the associated conformances of the normal conformance \verb|S: P| are a different kind of conformance: +The associated conformances of a normal conformance can themselves be any kind of conformance, including normal, specialized or abstract. \ListingRef{associated conformance example} shows these possibilities. The protocol \texttt{P} states three associated conformance requirements, and each of the associated conformances of the normal conformance \verb|S: P| is a different kind of conformance: \begin{quote} \begin{tabular}{lll} \toprule @@ -555,7 +549,7 @@ \section{Associated Conformances}\label{associated conformances} \end{gather*} The associated conformance projection operation actually turns around and reduces to a local conformance lookup into the substitution map, which gives us the final result: \[\ConfReq{\ttgp{0}{0}}{Equatable} \otimes \Sigma = \ConfReq{String}{Equatable}\] -This has some unexpected consequences, which are explored in Section~\ref{recursive conformances}. +This has some unexpected consequences, which are explored in \SecRef{recursive conformances}. \end{example} \section{Source Code Reference}\label{conformancesourceref} @@ -585,7 +579,7 @@ \section{Source Code Reference}\label{conformancesourceref} \index{nominal type declaration} \apiref{NominalTypeDecl}{class} -See also Section~\ref{declarationssourceref}. +See also \SecRef{declarationssourceref}. \begin{itemize} \item \texttt{getAllConformances()} returns a list of all conformances declared on this nominal type, its extensions, and inherited from its superclass, if any. \end{itemize} @@ -597,9 +591,9 @@ \section{Source Code Reference}\label{conformancesourceref} \index{module declaration} \IndexSource{global conformance lookup} \apiref{ModuleDecl}{class} -See also Section~\ref{compilation model source reference}.
+See also \SecRef{compilation model source reference} and \SecRef{extensionssourceref}. \begin{itemize} -\item \texttt{lookupConformance()} returns the conformance of a type to a protocol. This is the a global conformance lookup operation. +\item \texttt{lookupConformance()} returns the conformance of a type to a protocol. This is a global conformance lookup operation that does not check conditional requirements. To check conditional requirements, use \texttt{checkConformance()} described in \SecRef{extensionssourceref}. \end{itemize} \IndexSource{conformance} @@ -617,7 +611,7 @@ \section{Source Code Reference}\label{conformancesourceref} \IndexSource{concrete conformance} \apiref{ProtocolConformance}{class} -A concrete protocol conformance. This class is the root of a class hierarchy shown in Figure~\ref{conformancehierarchy}. Concrete protocol conformances are allocated in the AST context, and are always passed by pointer. See Section~\ref{extensionssourceref} for documentation about conditional conformance. +A concrete protocol conformance. This class is the root of a class hierarchy shown in \FigRef{conformancehierarchy}. Concrete protocol conformances are allocated in the AST context, and are always passed by pointer. See \SecRef{extensionssourceref} for documentation about conditional conformance. \begin{figure}\captionabove{The \texttt{ProtocolConformance} class hierarchy}\label{conformancehierarchy} \begin{center} @@ -638,14 +632,11 @@ \section{Source Code Reference}\label{conformancesourceref} \end{center} \end{figure} -\IndexSource{conforming type} -\IndexSource{type witness} -\IndexSource{associated conformance} \begin{itemize} -\item \texttt{getType()} returns the conforming type. +\item \texttt{getType()} returns the \IndexSource{conforming type}conforming type. \item \texttt{getProtocol()} returns the conformed protocol. -\item \texttt{getTypeWitness()} returns the type witness for an associated type.
-\item \texttt{getAssociatedConformance()} returns the associated conformance for a conformance requirement in the protocol's requirement signature. +\item \texttt{getTypeWitness()} returns the \IndexSource{type witness}type witness for an associated type. +\item \texttt{getAssociatedConformance()} returns the \IndexSource{associated conformance}associated conformance for a conformance requirement in the protocol's requirement signature. \item \texttt{subst()} returns the protocol conformance obtained by applying a substitution map to this conformance. \end{itemize} @@ -658,7 +649,6 @@ \section{Source Code Reference}\label{conformancesourceref} \begin{itemize} \item \texttt{getDeclContext()} returns the conforming declaration context, either a nominal type declaration or extension. \item \texttt{getGenericSignature()} returns the generic signature of the conforming context. -\item \texttt{finishSignatureConformances()} computes the associated conformances of this conformance. Not intended to be called directly. \end{itemize} \IndexSource{conformance substitution map} @@ -678,4 +668,6 @@ \section{Source Code Reference}\label{conformancesourceref} \apiref{LazyConformanceLoader}{class} Abstract base class implemented by different kinds of modules to fill out conformances. For \index{serialized module}serialized modules, this populates the mapping from requirements to witnesses by deserializing records. For \index{imported module}imported modules, this populates the mapping by inspecting \index{Clang}Clang declarations. +\apiref{AssociatedConformanceRequest}{class} +Request evaluator request for lazily looking up an \IndexSource{associated conformance request}associated conformance of a normal conformance that was parsed from source. Computes the substituted subject type of the requirement and calls global conformance lookup. 
\end{document} diff --git a/docs/Generics/chapters/declarations.tex b/docs/Generics/chapters/declarations.tex index 57af093b34295..c4b4e0a890c81 100644 --- a/docs/Generics/chapters/declarations.tex +++ b/docs/Generics/chapters/declarations.tex @@ -4,209 +4,510 @@ \chapter{Declarations}\label{decls} -\IndexDefinition{declaration} -\IndexDefinition{value declaration} -\IndexDefinition{interface type} -\index{extension declaration} -\index{top-level code declaration} -\index{identifier} -\index{expression} -\lettrine{D}{eclarations are the building blocks} of Swift programs. The different kinds of declarations are categorized into a taxonomy. A \emph{value declaration} is a named entity which can be directly referenced from an expression. Every value declaration also has an \emph{interface type}; roughly speaking, this is the type of the expression naming the declaration. Most declarations one encounters are value declarations, for instance structs and functions. There are some important exceptions, however; extensions, described in Chapter~\ref{extensions}, add members to a type but do not themselves have names. A \emph{top-level code declaration} is another kind of declaration that is not a value declaration; it holds the statements and expressions at the top level of a source file. - -\IndexDefinition{declared interface type} -\IndexDefinition{type declaration} -\index{metatype type} -A \emph{type declaration} is an important kind of value declaration. A type declaration declares a new type that you can write down in a type annotation; this is the \emph{declared interface type} of the type declaration. Since type declarations are value declarations, they also have an interface type, which is the type of an expression referencing the type declaration. When a type is used as a value, the type of the value is a metatype. A type declaration's interface type is therefore the metatype of its declared interface type. 
- -\index{struct declaration}% -\index{enum declaration}% -\index{class declaration}% -\index{nominal type declaration}% -\index{protocol declaration}% -\IndexDefinition{self interface type}% -\Index{protocol Self type@protocol \texttt{Self} type}% -Struct, enum and class declarations are called \emph{nominal type declarations}. Protocols are also nominal type declarations, but they are special enough it is best to think of them as a separate kind of entity. -The \emph{self interface type} of a nominal type or extension declaration is the type from which the \texttt{self} parameter type of a method is derived. In a struct, enum or class declaration, the self interface type and declared interface type coincide. In a protocol, they differ; the declared interface type is a nominal type, while the self interface type is the protocol \texttt{Self} type (Section~\ref{protocols}). - -\index{call expression} -\index{initial value expression} -\index{expression} -In the following, the nominal type declaration \texttt{Fish} is referenced twice, first as a type annotation, and then in an expression: +\lettrine{D}{eclarations} are the \IndexDefinition{declaration}building blocks of Swift programs. In \ChapRef{compilation model}, we started by representing the entire program as a \index{module declaration}module declaration, and said that a module declaration consists of \index{file unit}file units. The next level of detail is that each file unit holds a list of \IndexDefinition{top-level declaration}top-level declarations. The different kinds of declarations are categorized into a taxonomy, and we will survey this taxonomy, as we did with types in \ChapRef{types}. Our principal goal will be describing the syntactic representations for declaring generic parameters and stating requirements, which are common to all generic declarations; once we have that, we can proceed to \PartRef{part blocks}. 
+ +The declaration taxonomy has two important branches: +\begin{enumerate} +\item A \IndexDefinition{value declaration}\emph{value declaration} is one that can be referenced by name from an \index{expression}expression; this includes variables, functions, and such. Every value declaration has an \IndexDefinition{interface type}\emph{interface type}, which is the type assigned to an expression that names this declaration. +\item A \IndexDefinition{type declaration}\emph{type declaration} is one that can be referenced by name from within a \index{type representation}type representation. This includes structs, type aliases, and so on. A type declaration declares a type, called the \IndexDefinition{declared interface type}\emph{declared interface type} of the type declaration. +\end{enumerate} +Not all declarations are value declarations. An \index{extension declaration}extension declaration adds members to an existing nominal type declaration, as we'll see in \ChapRef{extensions}, but an extension does not itself have a name. A \IndexDefinition{top-level code declaration}\emph{top-level code declaration} holds the statements and expressions written at the top level of a source file, and again, it does not have a name, semantically. + +\paragraph{Declaration contexts.} +The containment relation between declarations works as follows: the parent of a declaration is not another declaration, but a related sort of entity, called a \IndexDefinition{declaration context}\emph{declaration context}. A declaration context is something that \emph{contains} declarations. Consider this example: +\begin{Verbatim} +func squares(_ nums: [Int]) -> [Int] { + return nums.map { x in x * x } +} +\end{Verbatim} +The \index{parameter declaration}parameter declaration ``\texttt{x}'' is a child of the closure expression ``\verb|{ x in x * x }|'', and not a direct child of the enclosing function declaration.
A \index{closure expression}closure expression is thus an example of a declaration context that is not a declaration. On the other hand, a parameter declaration is a declaration, but not a declaration context. + +\paragraph{Type declarations.} Types can be referenced from inside expressions, so every type declaration is also a value declaration. We can better understand this by looking at this simple example in detail: +\begin{Verbatim} +struct Horse {} +let myHorse: Horse = Horse() +\end{Verbatim} +The struct declaration \index{horse}\texttt{Horse} is referenced twice, first in the type representation on the left-hand side of ``\texttt{=}'' and then again in the \index{expression}\index{initial value expression}initial value expression on the right. In the type representation, we want the \emph{declared interface type} of our struct: this is the nominal type \texttt{Horse} whose values are \emph{instances} of this struct, because that's the type we're assigning to the value stored inside \texttt{myHorse}. The second reference, within the \index{call expression}call expression, is instead using the \emph{type itself} as a value, so we want the \emph{interface type} of the struct declaration, which is the \index{metatype type}metatype type \texttt{Horse.Type}. (Recall the diagram from \SecRef{more types}.) When a metatype is the callee in a call expression, we interpret it as looking up the member named \texttt{init}: +\begin{Verbatim} +struct Horse {} +let myHorseType: Horse.Type = Horse.self +let myHorse: Horse = myHorseType.init() +\end{Verbatim} +The interface type of a type declaration always wraps its declared interface type in a metatype type. (It sounds like a mouthful, but the idea is simple.) 
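+Nothing stops us from passing the metatype around as an ordinary value before calling \texttt{init} on it. A hypothetical snippet (it assumes \texttt{Horse} keeps its implicit no-argument initializer):
+\begin{Verbatim}
+struct Horse {}
+
+func stable(_ horseType: Horse.Type) -> Horse {
+  // horseType is a value of the metatype type Horse.Type.
+  return horseType.init()
+}
+
+let pony: Horse = stable(Horse.self)
+\end{Verbatim}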
+ +\paragraph{Nominal type declarations.} +\IndexDefinition{nominal type declaration}Introduced with the \texttt{struct}, \IndexDefinition{enum declaration}\texttt{enum} and \IndexDefinition{class declaration}\texttt{class} keywords; \IndexSwift{5.5}Swift~5.5 also added \texttt{actor}, which, for our needs, is just a class~\cite{se0306}. These are declaration contexts, and a declaration nested within a nominal type declaration is called a \IndexDefinition{member declaration}\emph{member declaration}. A function that is a member declaration is called a \IndexDefinition{method declaration}\emph{method}, a variable is called a \IndexDefinition{property declaration}\emph{property}, and a type declaration is called a \IndexDefinition{member type declaration}\emph{member type declaration}. + +Structs and classes can contain \IndexDefinition{stored property declaration}\emph{stored property declarations}. Struct values directly store their stored properties, while a class value is a reference to a heap-allocated \index{boxing}box. Enum values store one element among many; their declarations contain \IndexDefinition{enum element declaration}\emph{enum element declarations}, declared with \texttt{case}, instead of stored properties. + +Member declarations are visible to name lookup (\SecRef{name lookup}) both ``inside'' the nominal type declaration (unqualified lookup) and ``outside'' (qualified lookup). We're going to look at \ListingRef{unqualified lookup listing}, which uses two features we will cover in detail later: +\begin{itemize} +\item Nominal type declarations can conform to protocols (\ChapRef{conformances}). Members of conformed protocols and their extensions are visible from the conforming type. +\item A class can inherit from a \index{superclass type}superclass type (\ChapRef{classinheritance}). Members of the superclass and its extensions are visible from the subclass.
+\end{itemize} + +\begin{listing}\captionabove{Some behaviors of name lookup}\label{unqualified lookup listing} \begin{Verbatim} -struct Fish {} +class Form { static func callee1() {} } +protocol Shape { static func callee2() } +extension Shape { static func callee3() {} } -let myFish: Fish = Fish() +struct Square: Shape { + class Circle: Form { + static func caller() { + ... // unqualified lookup from here + } + } +} \end{Verbatim} -This is a very simple piece of code, but there's more going on than seems at first glance. The first occurrence of \texttt{Fish} is the type annotation for the variable declaration \texttt{myFish}, so the interface type of \texttt{myFish} becomes the nominal type \texttt{Fish}. The second occurrence is inside the initial value expression of \texttt{myFish}. The callee of the call expression \texttt{Fish()} is the type expression \texttt{Fish}, whose type is the metatype \texttt{Fish.Type}. A call of a metatype is transformed into a call of the \texttt{init} member, which names a constructor declaration. Constructors are called on an instance of the metatype of a type, and return an instance of the type. So the initial value expression has the type \texttt{Fish}, which matches the interface type of \texttt{myFish}. The constructor has the interface type \verb|(Fish.Type) -> () -> Fish|. +\end{listing} + +Suppose we're type checking the body of \texttt{caller()}, and some expression in the body refers to an identifier. We must resolve the identifier with unqualified lookup. 
All three of \texttt{callee1()}, \texttt{callee2()} and \texttt{callee3()} are in scope, so starting from the top-left corner, unqualified lookup must look through all of these declaration contexts: +\begin{center} +\begin{tikzpicture}[node distance=1cm] + +\node (caller) [class] {\texttt{\vphantom{p}func caller()}}; + +\node (Circle) [class, right=of caller] {\texttt{\vphantom{p}class Circle}}; +\node (Square) [class, right=of Circle] {\texttt{struct Square}}; + +\node (Form) [class, below=of Circle] {\texttt{\vphantom{p}class Form}}; +\node (Shape) [class, below=of Square] {\texttt{protocol Shape}}; -\paragraph{Nesting} -\IndexDefinition{declaration context} -\index{module declaration} -\index{function declaration} -\index{variable declaration} -\index{generic parameter declaration} -\index{closure expression} -\index{source file} -A \emph{declaration context} contains declarations. They overlap with declarations, but neither one implies the other. Module declarations, nominal type declarations, extension declarations and function declarations are also declaration contexts. Variables and generic parameter declarations are declarations, but not declaration contexts. Last but not least, closure expressions and source files are declaration contexts, but not declarations. Table~\ref{taxonomy examples} summarizes the above. -\begin{table}\captionabove{Classifying various entities in our taxonomy}\label{taxonomy examples} -\begin{tabular}{lcccc} +\node (extShape) [class, below=of Shape] {\texttt{extension Shape}}; + +\draw [arrow] (caller) -- (Circle); +\draw [arrow] (Circle) -- (Square); +\draw [arrow] (Square) -- (Shape); +\draw [arrow] (Circle) -- (Form); +\draw [arrow] (Shape) -- (extShape); + +\end{tikzpicture} +\end{center} +We will say more about name lookup in \ChapRef{typeresolution} and \SecRef{direct lookup}. + +A nominal type declaration declares a new type with its own name and identity (hence ``nominal''). 
The declared interface type of a nominal type declaration is called a \index{nominal type}nominal type, which we already met in \SecRef{fundamental types}: +\begin{Verbatim} +struct Universe { // declared interface type: Universe + struct Galaxy {} // declared interface type: Universe.Galaxy + + func solarSystem() { + struct Planet {} // declared interface type: Planet + } +} +\end{Verbatim} +The declared interface type of \texttt{Galaxy} is \texttt{Universe.Galaxy}, while the declared interface type of \texttt{Planet} is just \texttt{Planet}, with no parent type. This reflects the semantic difference; \texttt{Galaxy} is visible to qualified lookup as a member of \texttt{Universe}, while \texttt{Planet} is only visible to unqualified lookup within the scope of \texttt{solarSystem()}; we call it a \IndexDefinition{local type declaration}\emph{local type declaration}. Further discussion of nominal type nesting appears in \SecRef{nested nominal types}. + +\paragraph{Type alias declarations.} These are introduced by the \IndexDefinition{type alias declaration}\texttt{typealias} keyword. The \IndexDefinition{underlying type}underlying type is written on the right-hand side of ``\texttt{=}'': +\begin{Verbatim} +typealias Hands = Int // one hand is four inches +func measure(horse: Horse) -> Hands {...} +let metatype = Hands.self +\end{Verbatim} +The declared interface type of a type alias declaration is a \index{type alias type}type alias type. The canonical type of this type alias type is just the underlying type. Therefore, if we print the return type of \texttt{measure()} in a diagnostic message, we will print it as ``\texttt{Hands}'', but otherwise it behaves as if it were an \texttt{Int}. + +As with all type declarations, the interface type of a type alias declaration is the metatype of its declared interface type. In the above, the expression ``\texttt{Hands.self}'' has the metatype type \texttt{Hands.Type}.
This is a sugared type, canonically equal to \texttt{Int.Type}. + +While type aliases are declaration contexts, the only declarations a type alias can contain are generic parameter declarations, in the event the type alias is generic. + +\paragraph{Other type declarations.} +We've seen non-generic nominal type and type alias declarations, but of course they can also be generic. We will study the declarations of generic parameters, requirements, protocols and associated types next, and our foray into generics will begin in earnest. Here are all the type declaration kinds and their declared interface types, with a representative specimen of each: +\begin{center} +\begin{tabular}{ll} \toprule -\textbf{Entity}&\textbf{Declaration?}&\textbf{Value decl?}&\textbf{Type decl?}&\textbf{Decl context?}\\ +\textbf{Type declaration}&\textbf{Declared interface type}\\ +\midrule +Nominal type declaration:&Nominal type:\\ +\verb|struct Horse {...}|&\verb|Horse|\\ +\midrule +Type alias declaration:&Type alias type:\\ +\verb|typealias Hands = Int|&\verb|Hands|\\ \midrule -Module&\checkmark&$\checkmark$&$\checkmark$&\checkmark\\ -Source file&$\times$&$\times$&$\times$&\checkmark\\ -Nominal type&\checkmark&\checkmark&\checkmark&\checkmark\\ -Extension&\checkmark&$\times$&$\times$&\checkmark\\ -Generic parameter&\checkmark&\checkmark&\checkmark&$\times$\\ -Function&\checkmark&\checkmark&$\times$&\checkmark\\ -Variable&\checkmark&\checkmark&$\times$&$\times$\\ -Top-level code&$\checkmark$&$\times$&$\times$&\checkmark\\ -Closure expression&$\times$&$\times$&$\times$&\checkmark\\ +Generic parameter declaration:&Generic parameter type:\\ +\verb|<T>|&\verb|T| (or \rT)\\ +\midrule +Protocol declaration:&Protocol type:\\ +\verb|protocol Sequence {...}|&\verb|Sequence|\\ +\midrule +Associated type declaration:&Dependent member type:\\ +\verb|associatedtype Element|&\verb|Self.Element|\\ \bottomrule \end{tabular} -\end{table} +\end{center} -\IndexDefinition{local declaration context}%
-\index{subscript declaration}% -\IndexDefinition{initializer declaration context}% -Declarations and declaration contexts are nested within each other. The roots in this hierarchy are module declarations; all other declarations and declaration contexts point at a parent declaration context. Source files are always immediate children of module declarations. A \emph{local context} is any declaration context that is not a module, source file, type declaration or extension. Top-level code declarations, function declarations and closure expressions are three kinds of local contexts we've already seen. +\section{Generic Parameters}\label{generic params} -\index{initial value expression} -The three remaining kinds of local context are subscript declarations, enum element declarations and initializer contexts: -\begin{itemize} -\item Subscripts and enum elements are local contexts, because they contain their parameter declarations. -\item Subscript declarations can also be generic, so they need to contain their generic parameters. -\item Initializer contexts represent the initial value expression of a variable that is itself not a child of a local context. This ensures that any declarations appearing in the initial value expression of a variable are always children of a local context. -\end{itemize} +Various kinds of declarations can have a \IndexDefinition{generic parameter list}generic parameter list. We call them \IndexDefinition{generic declaration}\emph{generic declarations}. We start with those where the generic parameter list is written in source: nominal type declarations, \IndexDefinition{generic type alias}type aliases, \index{function declaration}functions and \index{constructor declaration}constructors, and \index{subscript declaration}subscripts. 
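+A few hypothetical specimens of such declarations (they do not appear elsewhere in this chapter):
+\begin{Verbatim}
+struct Stack<Element> {...}          // nominal type declaration
+typealias Pair<T> = (T, T)           // generic type alias
+func identity<T>(_ t: T) -> T {...}  // function declaration
+\end{Verbatim}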
A generic parameter list is \IndexDefinition{parsed generic parameter list}denoted in source with the \texttt{<...>} syntax following the name of the declaration: +\begin{Verbatim} +struct Outer<T> {...} +\end{Verbatim} +Each comma-separated element in this list is a \IndexDefinition{generic parameter declaration}\emph{generic parameter declaration}; this is a type declaration that declares a generic parameter type. Generic parameter declarations are visible to unqualified lookup in the entire source range of the parent declaration, the one that has the generic parameter list. When generic declarations nest, the inner generic declaration is effectively parameterized by all generic parameters, at every level of nesting. + +Any declaration that can have a generic parameter list is also a \index{declaration context}declaration context in our taxonomy, because it contains other declarations; namely, its generic parameter declarations. We say that a declaration context is a \IndexDefinition{generic context}\emph{generic context} if at least one parent context has a generic parameter list. -\IndexDefinition{local type declaration} -\IndexDefinition{top-level type declaration} -\IndexDefinition{nested type declaration} -\IndexDefinition{local type declaration} -\IndexDefinition{top-level function declaration} -\IndexDefinition{method declaration} -\IndexDefinition{local function declaration} -\IndexDefinition{global variable declaration} -\IndexDefinition{stored property declaration} -\IndexDefinition{local variable declaration} -There is special terminology for type declarations in different kinds of declaration contexts: +The name of a generic parameter declaration plays no role after \index{unqualified lookup}unqualified lookup.
Instead, to each generic parameter declaration, we assign a pair of integers (or more accurately, \index{natural number}natural numbers; they're non-negative), the \IndexDefinition{depth}\emph{depth} and the \IndexDefinition{index}\emph{index}: \begin{itemize} -\item A \emph{top-level type} is an immediate child of a source file. -\item A \emph{nested type} or \emph{member type} is an immediate child of a nominal type declaration or an extension. -\item A \emph{local type} is an immediate child of a local context. +\item The depth selects a generic parameter list; the generic parameters declared by the outermost generic parameter list are at depth zero, and we increment the depth by one for each nested generic parameter list. +\item The index selects a generic parameter within a generic parameter list; we number sibling generic parameter declarations consecutively starting from zero. \end{itemize} -Similarly, for functions: + +Let's write some nested generic declarations inside the \texttt{Outer} struct above. In the following, \texttt{two()} is generic over \texttt{T}~and~\texttt{U}, while \texttt{four()} is generic over~\texttt{T}, \texttt{V}, \texttt{W}~and~\texttt{X}: +\begin{Verbatim} +struct Outer<T> { + func two<U>(u: U) -> T {...} + + struct Both<V, W> { + func four<X>() -> X {...} + } +} +\end{Verbatim} + +When type resolution resolves the type representation ``\texttt{T}'' in the return type of \texttt{two()}, it outputs a generic parameter type that prints as ``\texttt{T}'' if it appears in a diagnostic, for example. This is a \index{sugared type}sugared type. Every generic parameter type also has a canonical form which only records the depth and index; we denote a canonical generic parameter type by ``\ttgp{d}{i}'', where \texttt{d} is the depth and \texttt{i} is the index. Two generic parameter types are canonically equal if they have the same depth and index.
This is sound, because the depth and index unambiguously identify a generic parameter within its lexical scope. + +Let's enumerate all generic parameters visible within \texttt{two()}, +\begin{center} +\begin{tabular}{llll} +\toprule +&\textbf{Depth}&\textbf{Index}&\textbf{Canonical type}\\ +\midrule +\texttt{T}&0&0&\ttgp{0}{0}\\ +\texttt{U}&1&0&\ttgp{1}{0}\\ +\bottomrule +\end{tabular} +\end{center} +and \texttt{four()}, +\begin{center} +\begin{tabular}{llll} +\toprule +&\textbf{Depth}&\textbf{Index}&\textbf{Canonical type}\\ +\midrule +\texttt{T}&0&0&\ttgp{0}{0}\\ +\texttt{V}&1&0&\ttgp{1}{0}\\ +\texttt{W}&1&1&\ttgp{1}{1}\\ +\texttt{X}&2&0&\ttgp{2}{0}\\ +\bottomrule +\end{tabular} +\end{center} +The generic parameter~\texttt{U} of \texttt{two()} has the same \index{declared interface type!generic parameter}declared interface type, \ttgp{1}{0}, as the generic parameter~\texttt{V} of \texttt{four()}. This is not a problem because the source ranges of their parent declarations, \texttt{two()} and \texttt{Both}, do not intersect. + +The numbering by depth can be seen in the \index{declared interface type!nested nominal type}declared interface type of a nested generic nominal type declaration. For example, the declared interface type of \texttt{Outer.Both} is the generic nominal type \texttt{Outer<\ttgp{0}{0}>.Both<\ttgp{1}{0}, \ttgp{1}{1}>}. + +\paragraph{Implicit generic parameters.} Sometimes the generic parameter list is not written in source. Every protocol declaration has a generic parameter list with a single generic parameter named \Index{protocol Self type@protocol \texttt{Self} type}\texttt{Self} (\SecRef{protocols}), and every extension declaration has a generic parameter list cloned from that of the extended type (\ChapRef{extensions}). These implicit generic parameters can be referenced by name within their scope, just like the generic parameter declarations in a parsed generic parameter list (\SecRef{identtyperepr}). 
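To illustrate implicit generic parameters in extensions, here is a small Swift sketch (the \texttt{Pair} type and its \texttt{swapped()} method are hypothetical names, not taken from the text): the extension's cloned generic parameters can be referenced by name without being redeclared.

```swift
struct Pair<First, Second> {
    var first: First
    var second: Second
}

// The extension has an implicit generic parameter list cloned from
// Pair<First, Second>; First and Second are visible by name here,
// even though the extension does not write a <...> list of its own.
extension Pair {
    func swapped() -> Pair<Second, First> {
        Pair<Second, First>(first: second, second: first)
    }
}
```

For example, `Pair(first: 1, second: "a").swapped()` produces a value of type `Pair<String, Int>`.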
+ +Function, constructor and subscript declarations can also declare \IndexDefinition{opaque parameter}\emph{opaque parameters} with the \texttt{some} keyword, possibly in combination with a generic parameter list: +\begin{Verbatim} +func pickElement<E>(_ elts: some Sequence<E>) -> E {...} +\end{Verbatim} +An opaque parameter simultaneously declares a parameter value, a generic parameter type that is the type of the value, and a requirement that this type must satisfy. Here, we can refer to ``\texttt{elts}'' from an expression inside the function body, but we cannot name the \emph{type} of ``\texttt{elts}'' in a type representation. From \index{expression}expression context, however, the type of an opaque parameter can be obtained via the \texttt{type(of:)} special form, which produces a metatype value. This allows for invoking static methods and such. + +We append the opaque parameters to the parsed generic parameter list, so they follow parsed generic parameters in index order. In \texttt{pickElement()}, the generic parameter \texttt{E} has canonical type~\rT, while the opaque parameter associated with ``\texttt{elts}'' has canonical type~\rU. Opaque parameter declarations also state a constraint type, which imposes a requirement on this unnamed generic parameter. We will discuss this in the next section. Note that when \texttt{some} appears in the return type of a function, it declares an \emph{opaque return type}, which is a related but different feature (\ChapRef{opaqueresult}). + +In \ChapRef{genericsig}, we will see that the generic signature of a declaration records all visible generic parameters, regardless of how they were declared. + +\section{Requirements}\label{requirements} + +The requirements of a generic declaration constrain the generic argument types that can be provided by the caller. This endows the generic declaration's type parameters with new capabilities, so they abstract over the concrete types that satisfy those requirements.
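A minimal sketch of this idea (the function \texttt{maxOf} is a hypothetical example, not from the text): a conformance requirement is exactly what grants the body of a generic function the right to use a protocol's operations on a type parameter.

```swift
// The conformance requirement T: Comparable endows the type parameter T
// with a new capability: values of type T can be compared with `<`.
func maxOf<T: Comparable>(_ a: T, _ b: T) -> T {
    a < b ? b : a
}
```

A caller must supply a generic argument satisfying the requirement; `maxOf(1, 2)` and `maxOf("a", "b")` are both accepted because `Int` and `String` conform to `Comparable`.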
We use the following encoding for requirements in the theory and implementation. + +\begin{definition}\label{requirement def} +A \IndexDefinition{requirement}\emph{requirement} is a triple consisting of a \emph{requirement kind}, a subject type, and one final piece of information that depends on the kind: \begin{itemize} -\item A \emph{top-level function} or \emph{global function} is an immediate child of a source file. -\item A \emph{method} is an immediate child of a nominal type declaration or an extension. -\item A \emph{local function} is an immediate child of a local context. +\item A \IndexDefinition{conformance requirement}\textbf{conformance requirement} $\ConfReq{T}{P}$ states that the replacement type for~\texttt{T} must conform to~\texttt{P}, which must be a protocol, protocol composition, or parameterized protocol type. +\item A \IndexDefinition{superclass requirement}\textbf{superclass requirement} $\ConfReq{T}{C}$ states that the replacement type for~\texttt{T} must be a subclass of some \index{class type}class type~\texttt{C}. +\item A \IndexDefinition{layout requirement}\textbf{layout requirement} $\ConfReq{T}{AnyObject}$ states that the replacement type for~\texttt{T} must be represented as a single reference-counted pointer at runtime. +\item A \IndexDefinition{same-type requirement}\textbf{same-type requirement} $\SameReq{T}{U}$ states that the replacement types for \texttt{T}~and~\texttt{U} must be \index{canonical type equality}canonically equal. \end{itemize} -And finally, for variables: +\end{definition} + +When looking at concrete instances of requirements in a self-contained snippet of code, there is no ambiguity in using the same notation for the first three kinds, because the type referenced by the right-hand side determines the requirement kind. 
When talking about requirements in the abstract, we will explicitly state that~\texttt{P} is some protocol, or~\texttt{C} is some class, before talking about $\ConfReq{T}{P}$ or $\ConfReq{T}{C}$. + +\paragraph{Constraint types.} +Before we introduce the trailing \texttt{where} clause syntax for stating requirements in a fully general way, let's look at the shorthand of stating a \IndexDefinition{constraint type}\emph{constraint type} in the \IndexDefinition{inheritance clause!generic parameter}inheritance clause of a \index{generic parameter declaration}generic parameter declaration: +\begin{Verbatim} +func allEqual<E: Equatable>(_ elements: [E]) {...} +\end{Verbatim} +The generic parameter declaration~\texttt{E} declares the generic parameter type~\rT, and states a constraint type. This is the protocol type~\texttt{Equatable}, so the stated requirement is the conformance requirement $\ConfReq{\rT}{Equatable}$. More generally, the constraint type is one of the following: +\begin{enumerate} +\item A \index{protocol type!constraint type}protocol type, like \texttt{Equatable}. +\item A \index{parameterized protocol type!constraint type}parameterized protocol type, like \texttt{Sequence<Int>}. +\item A \index{protocol composition type!constraint type}protocol composition type, like \texttt{Sequence \& MyClass}. +\item A \index{class type!constraint type}class type, like \texttt{NSObject}. +\item The \Index{AnyObject@\texttt{AnyObject}}\texttt{AnyObject} \index{layout constraint}\emph{layout constraint}, which restricts the possible concrete types to those represented as a single reference-counted pointer. +\end{enumerate} +In the first three cases, the stated requirement becomes a conformance requirement. Otherwise, it is a superclass or layout requirement. In all cases, the subject type of the requirement is the \index{declared interface type!generic parameter}declared interface type of the generic parameter.
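The five kinds of constraint type can be sketched as follows (hypothetical declarations, with the class \texttt{Animal} standing in for a class like \texttt{NSObject}):

```swift
class Animal {}

// 1. A protocol type states a conformance requirement.
func f1<T: Equatable>(_ x: T) -> T { x }

// 2. A parameterized protocol type (Swift 5.7 and later).
func f2<T: Sequence<Int>>(_ xs: T) -> Int { xs.reduce(0, +) }

// 3. A protocol composition type, which decomposes into one
//    requirement per member of the composition.
func f3<T: Sequence & Animal>(_ x: T) -> T { x }

// 4. A class type states a superclass requirement.
func f4<T: Animal>(_ x: T) -> T { x }

// 5. The AnyObject layout constraint.
func f5<T: AnyObject>(_ x: T) -> T { x }
```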
+ +\begin{example} +The constraint type of the generic parameter \texttt{B} in \texttt{open()} refers to the generic parameter \texttt{C}. This illustrates a property of the \index{scope tree}scope tree: generic parameters are always visible from the entire \index{source range}source range of a generic declaration, which includes the \index{generic parameter list}generic parameter list itself. +\begin{Verbatim} +class Box<Contents> { + var contents: Contents? = nil +} + +func open<B: Box<C>, C>(box: B) -> C { + return box.contents! +} +\end{Verbatim} +The declaration of \texttt{open()} thus states a single requirement, the superclass requirement $\ConfReq{\rT}{Box<\rU>}$. Here is a possible usage of \texttt{open()}, which we will leave unexplained for now; once we learn about substitution maps there will be many more examples: +\begin{Verbatim} +struct Vegetable {} +class FarmBox: Box<Vegetable> {} +let vegetable: Vegetable = open(box: FarmBox()) +\end{Verbatim} +\end{example} + +\paragraph{Opaque parameters.} +The \index{constraint type!opaque parameter}constraint type that follows the \texttt{some} keyword imposes a conformance, superclass or layout requirement on the \index{opaque parameter}opaque parameter, just like stating a constraint type in the inheritance clause of a generic parameter declaration does. The following two declarations both state the requirement $\ConfReq{\rU}{Sequence<\rT>}$: +\begin{Verbatim} +func pickElement<E>(_ elts: some Sequence<E>) -> E {...} +func pickElement<E, S: Sequence<E>>(_ elts: S) -> E {...} +\end{Verbatim} + +Constraint types can appear in various other positions, and in all cases, they state a requirement with some distinguished subject type: +\begin{enumerate} +\item In the inheritance clause of a protocol or associated type (\SecRef{protocols}). +\item Following the \texttt{some} keyword in return position, where it declares an opaque return type (\ChapRef{opaqueresult}).
+\item Following the \texttt{any} keyword that references an existential type (\ChapRef{existentialtypes}), with the exception that the constraint type cannot be a class by itself (for example, we allow ``\verb|any NSObject & Equatable|'', but ``\verb|any NSObject|'' is just ``\texttt{NSObject}''). +\end{enumerate} + +\paragraph{Trailing where clauses.} All of the above are special cases of requirements stated in a \IndexDefinition{where clause@\texttt{where} clause}\index{trailing where clause@trailing \texttt{where} clause|see{\texttt{where} clause}}\texttt{where} clause. This allows generality that cannot be expressed using the inheritance clause of a generic parameter alone. + +Each entry in a \texttt{where} clause names the subject type explicitly, so that \index{dependent member type}dependent member types can be subject to requirements; here, we state two requirements, $\ConfReq{\rT}{Sequence}$ and $\ConfReq{\rT.Element}{Comparable}$: +\begin{Verbatim} +func isSorted<S: Sequence>(_: S) where S.Element: Comparable {...} +\end{Verbatim} + +A \texttt{where} clause entry can also state a same-type requirement. Here, we state two conformance requirements $\ConfReq{\rT}{Sequence}$ and $\ConfReq{\rU}{Sequence}$, together with the same-type requirement $\SameReq{\rT.Element}{\rU.Element}$: +\begin{Verbatim} +func merge<S1: Sequence, S2: Sequence>(_: S1, _: S2) -> [S1.Element] + where S1.Element == S2.Element {...} +\end{Verbatim} + +Note that there is no way to refer to an opaque parameter type within the function's \Index{where clause@\texttt{where} clause}\texttt{where} clause, but every declaration using opaque parameters can always be rewritten into an equivalent one using named generic parameters, so no generality is lost. + +We saw in \ChapRef{types} that when the parser reads a type annotation in the source, it constructs a \index{type representation}type representation, a lower-level syntactic object which must be \index{type resolution}resolved to obtain a \index{type}type.
Similarly, requirements have a syntactic form, called a \IndexDefinition{requirement representation}\emph{requirement representation}. The parser constructs requirement representations while reading a \texttt{where} clause. The relationship between the syntactic and semantic entities is shown in this diagram: +\begin{center} +\begin{tikzpicture}[node distance=1cm] +\node (ReqRepr) [data] {Requirement representation}; +\node (TypeRepr) [data, below=of ReqRepr] {Type representation}; +\node (Req) [data, right=2cm of ReqRepr] {Requirement}; +\node (Type) [data, below=of Req] {Type}; + +\draw [arrow] (ReqRepr) -- (TypeRepr); +\draw [arrow] (Req) -- (Type); + +\path [arrow] (ReqRepr) edge [left] node {\footnotesize{contains}} (TypeRepr); +\path [arrow] (Req) edge [right] node {\footnotesize{contains}} (Type); + +\path [arrow] (ReqRepr) edge [above] node {\footnotesize{resolves to}} (Req); +\path [arrow] (TypeRepr) edge [below] node {\footnotesize{resolves to}} (Type); + +\end{tikzpicture} +\end{center} +There are only two kinds of requirement representations, because the ``\texttt{:}'' form cannot distinguish conformance, superclass and layout requirements until we resolve the type representation on the right-hand side: +\begin{enumerate} +\item A \IndexDefinition{constraint requirement representation}\textbf{constraint requirement representation} ``\texttt{T:\ C}'', where \texttt{T} and \texttt{C} are type representations. +\item A \IndexDefinition{same-type requirement representation}\textbf{same-type requirement representation} ``\texttt{T == U}'', where \texttt{T} and \texttt{U} are type representations. +\end{enumerate} + +Recall that the right-hand side of a conformance requirement may be a protocol type, protocol composition type, or parameterized protocol type. With a protocol composition type, we decompose the conformance requirement into a series of requirements, one for each member of the composition. 
+ +For example, if \texttt{MyClass} is a class, the requirement $\ConfReq{\rT}{Sequence~\&~MyClass}$ decomposes into $\ConfReq{\rT}{Sequence}$ and $\ConfReq{\rT}{MyClass}$, the latter being a superclass requirement. The empty protocol composition, written \Index{Any@\texttt{Any}}\texttt{Any}, is a trivial case; stating a conformance requirement to \texttt{Any} does nothing in a \texttt{where} clause, but it is allowed. Parameterized protocol types also decompose, as we'll see in~\SecRef{protocols}. + +The next chapter will introduce the derived requirements formalism. We will assume that only conformance requirements to protocol types remain, and similarly, that the subject types of requirements are type parameters, and not arbitrary types. \SecRef{requirement desugaring} will show how we eliminate these unnecessary forms of generality. + +\paragraph{Contextually-generic declarations.} A generic declaration nested inside of another generic declaration can state a \texttt{where} clause, without introducing new generic parameters of its own. This is called a \IndexDefinition{contextually-generic declaration}\emph{contextually-generic declaration}: +\begin{Verbatim} +enum LinkedList<Element> { + ... + + func sum() -> Element where Element: AdditiveArithmetic {...} +} +\end{Verbatim} +There is no semantic distinction between attaching a \texttt{where} clause to a member of a type, and moving the member to a \index{constrained extension}constrained extension (\SecRef{constrained extensions}), assuming such a transformation is possible, so the above is equivalent to the following: +\begin{Verbatim} +extension LinkedList where Element: AdditiveArithmetic { + func sum() -> Element {...} +} +\end{Verbatim} +\index{mangling} +However, for historical reasons, these two declarations have distinct \index{mangling!contextually-generic declaration}mangled symbol names, so the above is not an \index{ABI}ABI-compatible transformation.
+ +\medskip + +In \ChapRef{genericsig}, we will see that the generic signature of a declaration records all of its requirements, however they were stated in source. + +\paragraph{History.} The syntax described in this section has evolved over time: \begin{itemize} -\item A \emph{global variable} is an immediate child of a source file. -\item A \emph{property} is an immediate child of a nominal type declaration or an extension. -\item A \emph{local variable} is an immediate child of a local context. +\item The \texttt{where} clause used to be written within the ``\texttt{<}'' and ``\texttt{>}'', but was moved to the current \Index{where clause@\texttt{where} clause}``trailing'' position in \IndexSwift{3.0}Swift 3 \cite{se0081}. +\item Generic type aliases were introduced in \IndexSwift{3.0}Swift 3 \cite{se0048}. +\item Protocol compositions involving class types were introduced in \IndexSwift{4.0}Swift 4 \cite{se0156}. +\item Generic subscripts were introduced in \IndexSwift{4.0}Swift 4 \cite{se0148}. +\item Implementation limitations prevented the \texttt{where} clause from stating requirements that constrain outer generic parameters until Swift 3, and contextually-generic declarations were not allowed until \IndexSwift{5.3}Swift 5.3 \cite{se0261}. +\item Opaque parameter declarations were introduced in \IndexSwift{5.7}Swift 5.7 \cite{se0341}. \end{itemize} -\paragraph{Computing interface types} -\IndexDefinition{interface type request} -The \Request{interface type request} computes the interface type of a value declaration. The evaluation function switches over the declaration kind and takes different kinds of action, for example: +\section{Protocols}\label{protocols} + +The \texttt{protocol} keyword introduces a \IndexDefinition{protocol declaration}\emph{protocol declaration}, which is a special kind of nominal type declaration. 
The members of a protocol, with the exception of type aliases, are requirements that must be witnessed by corresponding members in each \index{conforming type}conforming type. In particular, the property, subscript and method declarations inside a protocol don't have bodies, but are otherwise represented in a similar way to the concrete case. A protocol declaration's declared interface type is a \index{protocol type}protocol type. + +Every protocol has an implicit generic parameter list with a single generic parameter named \IndexDefinition{protocol Self type@protocol \texttt{Self} type}\texttt{Self}, which abstracts over the conforming type. The declared interface type of \texttt{Self} is always~\rT, because protocols cannot be nested in other generic contexts (\SecRef{nested nominal types}), nor can they declare any other generic parameters. + +The \texttt{associatedtype} keyword introduces an \IndexDefinition{associated type declaration}\emph{associated type declaration}, which can only appear inside of a protocol. The declared interface type is a \index{dependent member type}dependent member type (\SecRef{fundamental types}). Specifically, the declared interface type of an associated type~\texttt{A} in a protocol~\texttt{P} is the \index{bound dependent member type}bound dependent member type denoted \texttt{Self.[P]A}, formed from the base type of~\texttt{Self} together with~\texttt{A}. + +Protocols can state \IndexDefinition{associated requirement}\emph{associated requirements} on their \texttt{Self} type and its dependent member types. The conforming type must declare a type witness for each associated type (\SecRef{type witnesses}), and the conforming type and its type witnesses must satisfy the protocol's associated requirements. We will review all the ways of stating associated requirements now. 
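To make the associated type machinery concrete, here is a sketch of a protocol with an associated type, and a conforming type that supplies a type witness for it (\texttt{Container} and \texttt{IntBag} are hypothetical names, not from the text):

```swift
protocol Container {
    // The declared interface type of Item is the bound dependent
    // member type Self.[Container]Item.
    associatedtype Item
    func first() -> Item?
}

struct IntBag: Container {
    var items: [Int]

    // The type witness for Item is inferred as Int from the
    // signature of this witness for the first() requirement.
    func first() -> Int? { items.first }
}
```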
+ +\paragraph{Protocol inheritance clauses.} +A protocol can have an \index{inheritance clause!protocol}inheritance clause with a list of one or more comma-separated \index{constraint type!protocol inheritance clause}constraint types. Each inheritance clause entry states an associated requirement with a subject type of \texttt{Self}. These are additional requirements the conforming type itself must satisfy in order to conform. + +An associated conformance requirement with a subject type of \texttt{Self} establishes a \index{protocol inheritance|see{inherited protocol}}\IndexDefinition{inherited protocol}\emph{protocol inheritance} relationship. The protocol stating the requirement is the \emph{derived protocol}, and the protocol on the right-hand side is the \emph{base protocol}. The derived protocol is said to \emph{inherit} from (or sometimes, \emph{refine}) the base protocol. A \index{qualified lookup!protocol inheritance}qualified lookup that begins at a derived protocol, or at one of its concrete conforming types, will also search through all of its base protocols.
+ +For example, the standard library's \texttt{Collection} protocol inherits from \texttt{Sequence} by stating the associated requirement $\ConfReq{Self}{Sequence}$: +\begin{Verbatim} +protocol Collection: Sequence {...} +\end{Verbatim} + +Protocols can restrict their conforming types to those with a reference-counted pointer representation by stating an \texttt{AnyObject} layout constraint in the inheritance clause: +\begin{Verbatim} +protocol BoxProtocol: AnyObject {...} +\end{Verbatim} + +Protocols can also limit their conforming types to subclasses of some superclass: +\begin{Verbatim} +class Plant {} +class Animal {} +protocol Duck: Animal {} +class MockDuck: Plant, Duck {} // error: not a subclass of Animal +\end{Verbatim} +A protocol is \IndexDefinition{class-constrained protocol}\emph{class-constrained} if the $\ConfReq{Self}{AnyObject}$ associated requirement is either explicitly stated, or a consequence of some other associated requirement. We'll say more about the semantics of protocol inheritance clauses and name lookup in \SecRef{requirement sig}, \SecRef{identtyperepr}, and \ChapRef{building generic signatures}. + +\paragraph{Primary associated types.} +A protocol can declare a list of \IndexDefinition{primary associated type}\emph{primary associated types} with a syntax resembling a generic parameter list: +\begin{Verbatim} +protocol IteratorProtocol<Element> { + associatedtype Element + mutating func next() -> Element? +} +\end{Verbatim} +While generic parameter lists introduce new generic parameter declarations, the entries in the primary associated type list reference \emph{existing} associated types declared in the protocol's body. + +A \index{parameterized protocol type}\emph{parameterized protocol type} can be formed from a reference to a protocol with primary associated types, together with a list of generic arguments, one for each primary associated type.
When written on the right-hand side of a conformance requirement, a parameterized protocol type decomposes into a conformance requirement to the protocol, followed by a series of same-type requirements. The following are equivalent: +\begin{Verbatim} +func sumOfSquares<I>(_: I) -> Int + where I: IteratorProtocol<Int> {...} +func sumOfSquares<I>(_: I) -> Int + where I: IteratorProtocol, I.Element == Int {...} +\end{Verbatim} +We give more details in \SecRef{requirement desugaring}. Parameterized protocol types and primary associated types were added to the language in \IndexSwift{5.7}Swift~5.7~\cite{se0346}. + +\paragraph{Associated requirements.} +An associated type declaration can have an \index{inheritance clause!associated type}inheritance clause, consisting of one or more comma-separated constraint types. Each entry defines a requirement on the declared interface type of the associated type declaration, so we get $\ConfReq{Self.Data}{Codable}$ and $\ConfReq{Self.Data}{Hashable}$ below: +\begin{Verbatim} +associatedtype Data: Codable, Hashable +\end{Verbatim} + +An associated type declaration can also have a trailing \Index{where clause@\texttt{where} clause}\texttt{where} clause, where associated requirements can be stated in full generality.
The standard library \texttt{Sequence} protocol demonstrates the various syntactic forms we just described: +\begin{Verbatim} +protocol Sequence { + associatedtype Iterator: IteratorProtocol + associatedtype Element where Element == Iterator.Element + + func makeIterator() -> Iterator +} +\end{Verbatim} +The associated conformance requirement on \texttt{Self.Iterator} could have been stated using a \texttt{where} clause instead: +\begin{Verbatim} +associatedtype Iterator where Iterator: IteratorProtocol +\end{Verbatim} +A \texttt{where} clause can also be attached to the protocol itself; there is no semantic difference between that and attaching it to an associated type: +\begin{Verbatim} +protocol Sequence where Iterator: IteratorProtocol, + Element == Iterator.Element {...} +\end{Verbatim} +Finally, we could make the \texttt{Self} qualification explicit: +\begin{Verbatim} +protocol Sequence where Self.Iterator: IteratorProtocol, + Self.Element == Self.Iterator.Element {...} +\end{Verbatim} + +In all cases we state two associated requirements. Our notation will append a subscript with the protocol name declaring the requirement: +\begin{gather*} +\ConfReq{Self.Iterator}{IteratorProtocol}_\texttt{Sequence}\\ +\SameReq{Self.Element}{Self.Iterator.Element}_\texttt{Sequence} +\end{gather*} + +Let's summarize all the ways of stating associated requirements in a protocol: \begin{itemize} -\item Given a function declaration, this request recursively evaluates itself to get the interface type of each parameter declaration, and constructs a function type from the interface type of each parameter declaration together with the function's return type. -\item For a parameter declaration, this request resolves the type representation written by the user. -\item For a type declaration, it constructs a metatype from the declared interface type of the type declaration. +\item The protocol can state an inheritance clause. 
Each entry defines a conformance, superclass or layout requirement with a subject type of \texttt{Self}. +\item An associated type declaration \texttt{A} can state an inheritance clause. Each entry defines a conformance, superclass or layout requirement with a subject type of \texttt{Self.[P]A}. +\item Arbitrary requirements can be stated in \Index{where clause@\texttt{where} clause}trailing \texttt{where} clauses, attached to the protocol or any of its associated types, in any combination. \end{itemize} - -\section{Type Declarations}\label{type declarations} -\IndexDefinition{nominal type declaration}% -\IndexDefinition{struct declaration}% -\IndexDefinition{enum declaration}% -\IndexDefinition{class declaration}% -\index{declared interface type}% -\paragraph{Struct, enum and class declarations} -These are the concrete nominal types. The declared interface type of a non-generic nominal type declaration is a nominal type. If the nominal type declaration is generic, the declared interface type is a generic nominal type where the generic arguments are the declaration's generic parameters. +A protocol's associated requirements are collected in its requirement signature, which we will see is dual to a generic signature in some sense (\SecRef{requirement sig}). How concrete types satisfy the requirement signature will be discussed in \ChapRef{conformances}. -Concrete nominal types can be nested inside of other declaration contexts, with a few limitations described in Section~\ref{nested nominal types}. The declared interface type reflects this nesting. For example, the declared interface type of \texttt{Outer.Inner} is the generic nominal type \texttt{Outer.Inner}: +\paragraph{Self requirements.} +The \index{value requirement}value requirements of a protocol cannot constrain \texttt{Self} or its associated types in their \Index{where clause@\texttt{where} clause}\texttt{where} clause. 
For example, the following protocol is rejected, because there is no way to implement the \texttt{minElement()} requirement in a concrete conforming type whose \texttt{Element} type is \emph{not} \texttt{Comparable}: \begin{Verbatim} -struct Outer { - struct Inner {} +protocol SetProtocol { + associatedtype Element + func minElement() -> Element where Element: Comparable // error } \end{Verbatim} -Structs, enums and classes can conform to protocols; how this is modeled is the topic of Chapter~\ref{conformances}. Classes can also inherit from other classes; Chapter~\ref{classinheritance} describes how inheritance interacts with generics. -\IndexDefinition{protocol declaration}% -\paragraph{Protocol declarations} -The declared interface type of a protocol declaration is the protocol type \texttt{P}. Protocols are the fourth kind of nominal type, but they behave differently in many ways, because they do not have concrete instances. Protocol declarations are described in Chapter~\ref{protocols}. +\paragraph{History.} +Older releases of Swift had a model where protocols and associated types could have inheritance clauses, but more general associated requirements did not exist. The trailing \texttt{where} clause syntax was extended to cover associated types and protocols in \IndexSwift{4.0}Swift~4~\cite{se0142}. -\IndexDefinition{type alias declaration}% -\IndexDefinition{underlying type}% -\paragraph{Type alias declarations} -Type aliases assign a new name to an underlying type. The declared interface type is a type alias type whose canonical type is the underlying type of the type alias. While type aliases are declaration contexts, the only kind of declaration they contain is their own generic parameter declarations. The special case of type aliases in protocols is discussed in Section~\ref{protocol type alias}. 
+\section{Functions}\label{function decls} -\IndexDefinition{generic parameter declaration}% -\paragraph{Generic parameter declarations} -Generic parameter declarations declare generic parameter types. They appear inside the generic parameter lists of generic declarations. The declared interface type of a generic parameter declaration is a sugared type that prints as the name of the declaration; the canonical type is the generic parameter type \ttgp{d}{i}, where \texttt{d} is the depth and \texttt{i} is the index. Generic parameter declarations are described in Chapter~\ref{generic declarations}. +A \IndexDefinition{function declaration}function declaration can appear at the top level of a source file, as a member of a nominal type or extension, where it is also called a method declaration, or as a \IndexDefinition{local function declaration}local function nested inside of another function. In this section, we will describe the computation of the interface type of a function declaration, and then conclude with a discussion of closure captures. -\IndexDefinition{associated type declaration}% -\index{bound dependent member type}% -\paragraph{Associated type declarations} -Associated type declarations appear inside protocols. The declared interface type of an associated type \texttt{A} is a bound dependent member type \texttt{Self.[P]A} referencing the declaration of \texttt{A}, with the \texttt{Self} generic parameter of the protocol as the base type. Associated type declarations are described in Section~\ref{protocols}. +The \IndexDefinition{interface type request}\Request{interface type request} computes the interface type of a function declaration. This is a \index{function type}function type or \index{generic function type}generic function type, constructed from the interface types of the function's parameter declarations, together with its return type.
If no return type is specified, it is assumed to be the empty tuple type~\texttt{()}: +\begin{Verbatim} +func f(x: Int, y: String) -> Bool {...} +// Interface type: (Int, String) -> Bool -\section{Function Declarations}\label{func decls} +func g() {...} +// Interface type: () -> () +\end{Verbatim} -\IndexDefinition{function declaration} -\index{generic function type} -\paragraph{Function declarations} -Functions can either appear at the top level, inside of a local context such as another function, or as a member of a type, called a method. If a function is itself generic or nested inside of a generic context, the interface type is a generic function type, otherwise it is a function type. +\paragraph{Method declarations.} +As well as the formal parameters declared in its parameter list, a method declaration also has an implicit \texttt{self} parameter, to receive the value on the left-hand side of the ``\texttt{.}'' in the member reference expression. The interface type of a method declaration is a function type which receives the \IndexDefinition{self parameter declaration}\texttt{self} parameter, and returns another function which then takes the method's formal parameters. The ``\texttt{->}'' operator denoting a function type is right-associative, so \verb|A -> B -> C| means \verb|A -> (B -> C)|: +\begin{Verbatim} +struct Universe { + func wormhole(x: Int, y: String) -> Bool {...} + // Interface type: (Universe) -> (Int, String) -> Bool -The interface type of a function is constructed from the interface types of the function's parameter declarations, and the function's return type. If the return type is omitted, it becomes the empty tuple type \texttt{()}. For methods, this function type is then wrapped in another level of function type representing the base of the call which becomes the \texttt{self} parameter of the method. 
+ static func bigBang() {} + // Interface type: (Universe.Type) -> () -> () -\index{self interface type} -\IndexDefinition{method self parameter} -The \texttt{self} parameter's type and parameter flags are constructed from the self interface type of the method's type declaration, and various attributes of the method: + mutating func teleport() {} + // Interface type: (inout Universe) -> () -> () +} +\end{Verbatim} +The interface type of the \texttt{self} parameter is derived as follows: \begin{itemize} -\item If the method is \texttt{mutating}, the \texttt{self} parameter becomes \texttt{inout}. -\item If the method returns the dynamic Self type, the \texttt{self} parameter type is wrapped in the dynamic Self type. -\item Finally, if the method is \texttt{static}, the \texttt{self} parameter is wrapped in a metatype. -\end{itemize} +\item We start with the \IndexDefinition{self interface type}\emph{self interface type} of the method's parent declaration context. In a struct, enum or class declaration, the self interface type is the declared interface type. In a protocol, the self interface type is the protocol \Index{protocol Self type@protocol \texttt{Self} type}\texttt{Self} type (\SecRef{protocols}). In an extension, the self interface type is that of the extended type. 
-This can be summarized as follows; note that \texttt{(Self)} parameter list means the self interface type of the method's type declaration, together with any additional parameter flags computed via the above: -\begin{quote} -\begin{tabular}{ccll} -\toprule -\textbf{Generic?}&\textbf{Method?}&\textbf{Interface type}\\ -\midrule -$\times$&$\times$&\texttt{(Params)\ -> Result}\\ -\checkmark&$\times$&\texttt{ (Params)\ -> Result}\\ -$\times$&\checkmark&\texttt{(Self) -> (Params)\ -> Result}\\ -\checkmark&\checkmark&\texttt{ (Self) -> (Params)\ -> Result}\\ -\bottomrule -\end{tabular} -\end{quote} +\item If the method is \IndexDefinition{static method declaration}\texttt{static}, we wrap the self interface type in a \index{metatype type}metatype. -\index{call expression} -The two levels of function type in the interface type of a method mirror the two-level structure of a method call expression \texttt{foo.bar(1, 2)}, shown in Figure~\ref{method call expr}: -\begin{itemize} -\item The self apply expression \texttt{foo.bar} applies the single argument \texttt{foo} to the method's \texttt{self} parameter. The type of the self apply expression is the method's inner function type. -\item The outer call applies the argument list \texttt{(1, 2)} to the inner function type. The type of the outer call expression is the method's return type. +\item If the method is \texttt{mutating}, we pass the \texttt{self} parameter \texttt{inout}. + +\item If the method is declared inside a class and it returns the \IndexDefinition{dynamic Self type@dynamic \texttt{Self} type}dynamic \texttt{Self} type, the type of \texttt{self} is the dynamic \texttt{Self} type (\SecRef{misc types}). 
\end{itemize} -\begin{figure}\captionabove{Two levels of function application in a method call \texttt{foo.bar(1, 2)}}\label{method call expr} +\begin{figure}[b!]\captionabove{The method call \texttt{universe.wormhole(x:~1, y:~"hi")}}\label{method call expr} \begin{center} \begin{tikzpicture}[% grow via three points={one child at (0.5,-0.7) and two children at (0.5,-0.7) and (0.5,-1.4)}, edge from parent path={[->] (\tikzparentnode.south) |- (\tikzchildnode.west)}] - \node [class] {\vphantom{p}call expression: \texttt{Result}} - child { node [class] {\vphantom{p}self apply expression: \texttt{(Int, Int) -> ()}} - child { node [class] {\vphantom{p}declaration reference expression: \texttt{Foo.bar}}} - child { node [class] {\vphantom{p}declaration reference expression: \texttt{foo}}}} + \node [class] {\vphantom{p}call: \texttt{universe.wormhole(x:~1, y:~"hi")}} + child { node [class] {\vphantom{p}call: \texttt{universe.wormhole}} + child { node [class] {\vphantom{p}declaration reference: \texttt{Universe.wormhole}}} + child { node [class] {\vphantom{p}declaration reference: \texttt{universe}}}} child [missing] {} child [missing] {} child { node [class] {\vphantom{p}argument list} - child { node [class] {\vphantom{p}integer literal expression: \texttt{1}}} - child { node [class] {\vphantom{p}integer literal expression: \texttt{2}}}} + child { node [class] {\vphantom{p}literal: \texttt{1}}} + child { node [class] {\vphantom{p}literal: \texttt{"hi"}}}} child [missing] {} child [missing] {} child [missing] {}; @@ -214,59 +515,237 @@ \section{Function Declarations}\label{func decls} \end{center} \end{figure} -\IndexDefinition{constructor declaration} -\paragraph{Constructor declarations} -Constructor declarations always appear as members of other types, and are named \texttt{init}. The interface type of a constructor takes a metatype and returns an instance of the constructed type, possibly wrapped in an \texttt{Optional}. 
+Let's compare the interface type of a method declaration with the structure of a method call expression in the abstract syntax tree, shown in \FigRef{method call expr}: +\begin{itemize} +\item The outer expression's callee, \texttt{universe.wormhole}, is itself a call expression, so we must evaluate this inner call expression first. -\begin{quote} -\begin{tabular}{cl} -\toprule -\textbf{Generic?}&\textbf{Interface type}\\ -\midrule -$\times$&\texttt{(Self.Type) -> (Params)\ -> Self}\\ -\checkmark&\texttt{ (Self.Type) -> (Params)\ -> Self}\\ -\bottomrule -\end{tabular} -\end{quote} +The inner call expression applies the argument \texttt{universe} to the \texttt{self} parameter of \texttt{Universe.wormhole()}. This represents the method lookup. The type of the inner call expression is \verb|(Int, String) -> Bool|. +\item The outer call expression applies the argument list \verb|(x: 1, y: "hi")| to the result of the inner call expression. This represents the method call itself. The type of the outer call expression is the method's return type, \texttt{Bool}. +\end{itemize} +The extra function call disappears in \index{SILGen}SILGen, where a method lowers to a SIL function that receives all formal parameters and the \texttt{self} parameter at once. + +Swift also allows partially-applied method references, such as \texttt{universe.wormhole}, which are values of function type. This form binds the \texttt{self} parameter but does not call the method. We eliminate this by a process called \index{eta conversion}\emph{eta conversion}, wrapping the method reference in a closure which simply forwards each parameter: +\begin{Verbatim} +{ x, y in universe.wormhole(x: x, y: y) } +\end{Verbatim} +The fully unapplied form, \texttt{Universe.wormhole}, desugars into the following: +\begin{Verbatim} +{ s in { x, y in s.wormhole(x: x, y: y) } } +\end{Verbatim} +Thus, SILGen does not need to support partially-applied and unapplied method references.
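The currying structure described above can be observed directly in the source language, because both the partially-applied and fully unapplied forms are first-class values. A small sketch; the body given for \texttt{wormhole()} here is a hypothetical stand-in, since the original declaration elides it:

```swift
struct Universe {
    func wormhole(x: Int, y: String) -> Bool {
        // hypothetical body, for illustration only
        return x > 0 && !y.isEmpty
    }
}

let universe = Universe()

// Partially-applied: binds self, leaving the formal parameters.
let bound: (Int, String) -> Bool = universe.wormhole

// Fully unapplied: a curried function taking self first.
let unbound: (Universe) -> (Int, String) -> Bool = Universe.wormhole

// Both spellings invoke the same method.
assert(bound(1, "hi") == unbound(universe)(1, "hi"))
```

The declared types of `bound` and `unbound` mirror the two levels of function type in the method's interface type.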
-\IndexDefinition{initializer interface type} -Class constructors also have a \emph{initializer interface type}, used when a subclass initializer delegates to an initializer in the superclass. The initializer interface type is the same as the interface type, except it takes a self value instead of a self metatype. +\begin{wrapfigure}[25]{l}{17.2em} +\begin{minipage}{17em} +\begin{Verbatim} +struct Example { + func instanceMethod() {} + static func staticMethod() {} + + struct Lookup { + func innerMethod() {} + func test() { + instanceMethod() // bad + staticMethod() // ok + innerMethod() // ok + } + } + + func anotherMethod(x: Int) { + struct Local { + func test() { + print(x) // bad + } + } + } +} +\end{Verbatim} +\end{minipage} +\end{wrapfigure} + +All function declarations must be followed by a body in the source language, except for protocol requirements. The body may contain statements, expressions, or other declarations. We will not give an exhaustive account of all statements and expressions, like we did with types and declarations. + +The example on the left illustrates name lookup from inside method declarations. + +In a method body, an unqualified reference to a member of the innermost nominal type declaration is interpreted as having an implicit ``\texttt{self.}'' qualification. Thus, instance methods can refer to other instance methods this way, and static methods can refer to other static methods. + +An unqualified reference to a member of an outer nominal type can only be made if the member is static, because there is no ``outer \texttt{self} value'' to invoke the method with; the nested type does not reference a value of its parent type at run time. + +For the same reason, methods inside \index{local type declaration}local types cannot refer to local variables declared inside parent declaration contexts. 
(Contrast this with \index{Java}Java inner classes for example, which can be declared as \texttt{static} or instance members of their parent class; a non-\texttt{static} inner class can then capture values from outer scopes, including ``\texttt{this}'', which plays the same role as Swift's ``\texttt{self}''.) + +\paragraph{Constructor declarations.} +\IndexDefinition{constructor declaration}Constructor declarations are introduced with the \texttt{init} keyword. The parent context of a constructor must be a nominal type or extension. + +The interface type of a constructor is like that of a static method that returns the new instance. A constructor also has an \IndexDefinition{initializer interface type}\emph{initializer interface type} which describes the type of an initializing entry point, where the instance is allocated by the caller: +\begin{Verbatim} +struct Universe { + init(age: Int) {...} + // Interface type is (Universe.Type) -> (Int) -> Universe + // Initializer interface type is (Universe) -> (Int) -> Universe +} +\end{Verbatim} +The interface type of the constructor's \texttt{self} parameter is the nominal type itself, and not the metatype. + +\paragraph{Destructor declarations.} +\IndexDefinition{destructor declaration}Destructor declarations are introduced with the \texttt{deinit} keyword. They can only appear inside classes. They have no formal parameters, no generic parameter list, no \texttt{where} clause, and no return type. + +\paragraph{Local contexts.} +A \IndexDefinition{local context}\emph{local context} is any declaration context that is not a module, source file, type declaration or extension. Swift allows variable, function and type declarations to appear in local context. The following are local contexts: +\begin{itemize} +\item \index{top-level code declaration}Top-level code declarations. +\item Function declarations. +\item \index{closure expression}Closure expressions.
+\item If a variable is not itself in local context (for example, it's a member of a nominal type declaration), then its \index{initial value expression}initial value expression defines a new local context. +\item \index{subscript declaration}Subscript and \index{enum element declaration}enum element declarations are local contexts, because they can contain parameter declarations (and also a generic parameter list, in the case of a subscript). +\end{itemize} + +Local functions and closures can \IndexDefinition{captured value}\emph{capture} references to other local declarations from outer scopes. We use the standard technique of \IndexDefinition{closure conversion}\emph{closure conversion} to lower functions with captured values into ones without. We can understand this process as introducing an additional parameter for each captured value, followed by a walk to replace references to captured values with references to the corresponding parameters in the function body. In Swift, this is part of \index{SILGen}SILGen's lowering process, and not a separate transformation on the abstract syntax tree. + +The \IndexDefinition{capture info request}\Request{capture info request} computes the list of declarations captured by the given function, and all of its nested local functions and closure expressions. + +\begin{wrapfigure}[9]{l}{10.6em} + \begin{minipage}{10.5em} +\begin{Verbatim} +func f() { + let x = 0, y = 0 + + func g() { + var z = 0 + print(x) + + func h() { + print(y, z) + } + } +} +\end{Verbatim} +\end{minipage} +\end{wrapfigure} + +Consider the three nested functions shown on the left. We proceed to compute their captures from the inside out. + +The innermost function~\texttt{h()} captures \texttt{y}~and~\texttt{z}. The middle function~\texttt{g()} captures~\texttt{x}. It also captures~\texttt{y}, because~\texttt{h()} captures~\texttt{y}, but it does not capture~\texttt{z}, because~\texttt{z} is declared by~\texttt{g()} itself.
Finally,~\texttt{f()} is declared at the top level, so it does not have any captures. + +We can summarize this as follows: +% FIXME \begin{quote} -\begin{tabular}{cl} +\qquad\qquad +\begin{tabular}{lll} \toprule -\textbf{Generic?}&\textbf{Initializer interface type}\\ +\textbf{Function}&\textbf{Captures}\\ \midrule -$\times$&\texttt{(Self) -> (Params)\ -> Self}\\ -\checkmark&\texttt{ (Self) -> (Params)\ -> Self}\\ +\texttt{f()}&$\varnothing$\\ +\texttt{g()}&$\{\texttt{x},\,\texttt{y}\}$\\ +\texttt{h()}&$\{\texttt{y},\,\texttt{z}\}$\\ \bottomrule \end{tabular} \end{quote} -\IndexDefinition{destructor declaration} -\paragraph{Destructor declarations} -Destructor declarations cannot have a generic parameter list, a \texttt{where} clause, or a parameter list. Formally they take no parameters and return \texttt{()}. +\bigskip -\begin{quote} -\begin{tabular}{cl} +\begin{algorithm}[Computing closure captures]\label{closure captures algorithm} +As input, takes the type-checked body of a \index{closure function}closure expression or \index{local function declaration}local function~$F$. Outputs the \index{set}set of captures of~$F$. +\begin{enumerate} +\item Initialize the return value with an \index{empty set}empty set, $C\leftarrow\varnothing$. (See \AppendixRef{math summary} for a description of the notation used in this algorithm.) +\item Recursively walk the type-checked body of $F$ and handle each element: +\item (Declaration references) If $F$ contains an expression that references some local variable or local function~$d$ by name, let $\mathsf{parent}(d)$ denote the parent declaration context of~$d$. This is either $F$ itself, or some outer local context, because we found $d$ by unqualified lookup from~$F$. + +If $\mathsf{parent}(d)\neq F$, set $C\leftarrow C\cup\{d\}$. 
+\item (Nested closures) If $F$ contains a nested closure expression or local function $F^\prime$, then a capture $d$ of $F^\prime$ is either a local declaration of~$F$, or it is also a capture of~$F$. + +Recursively compute the captures of $F^\prime$. For each $d$ captured by $F^\prime$ such that $\mathsf{parent}(d)\neq F$, set $C\leftarrow C\cup\{d\}$. + +\item (Local types) If $F$ contains a local type, do not walk into the children of the local type. Local types do not have captures. + +\item (Diagnose) After the recursive walk, consider each element $d\in C$. If the path of parent declaration contexts from $F$ to $d$ contains a nominal type declaration, we have an unsupported capture inside a local type. Diagnose an error. + +\item Return $C$. +\end{enumerate} +\end{algorithm} + +\begin{wrapfigure}[9]{r}{16.5em} +\begin{minipage}{16.5em} +\begin{Verbatim} +func f() { + let x = 0, y = 0, z = 0 + + func g() { print(x); h() } + func h() { print(y); g() } + func i() { print(z); h() } +} +\end{Verbatim} +\end{minipage} +\end{wrapfigure} + +Local functions can also reference each other recursively. Consider the functions shown on the right and notice how \texttt{g()} and \texttt{h()} are mutually recursive. At runtime, we cannot represent this by forming two closure contexts where each one retains the other, because then neither context will ever be released. + +We use a second algorithm to obtain the list of \IndexDefinition{lowered captures}\emph{lowered captures}, by replacing any captured local functions with their corresponding capture lists, repeating this until fixed point. The final list contains variable declarations only.
With our example, the captures and lowered captures of each function are as follows: + +\begin{center} +\begin{tabular}{lll} \toprule -\textbf{Generic?}&\textbf{Interface type}\\ +\textbf{Function}&\textbf{Captures}&\textbf{Lowered}\\ \midrule -$\times$&\texttt{(Self) -> ()\ -> ()}\\ -\checkmark&\texttt{ (Self) -> ()\ -> ()}\\ +\texttt{f()}&$\varnothing$&$\varnothing$\\ +\texttt{g()}&$\{\texttt{x},\,\texttt{h}\}$&$\{\texttt{x},\,\texttt{y}\}$\\ +\texttt{h()}&$\{\texttt{y},\,\texttt{g}\}$&$\{\texttt{x},\,\texttt{y}\}$\\ +\texttt{i()}&$\{\texttt{z},\,\texttt{h}\}$&$\{\texttt{x},\,\texttt{y},\,\texttt{z}\}$\\ \bottomrule \end{tabular} -\end{quote} +\end{center} + +(As a special case, if a set of local functions reference each other but capture no other state from the outer declaration context, their lowered captures will be empty, so no runtime context allocation is necessary.) + +\begin{algorithm}[Computing lowered closure captures] +As input takes the type-checked body of a \index{closure function}closure expression or \index{local function declaration}local function~$F$. Outputs the \index{set}set of variable declarations transitively captured by~$F$. +\begin{enumerate} +\item Initialize the set $C\leftarrow\varnothing$; this will be the return value. Initialize an empty worklist. Initialize an empty visited set. Add $F$ to the worklist. +\item If the worklist is empty, return $C$. Otherwise, remove the next function $F$ from the worklist. +\item If $F$ is in the visited set, go back to Step~2. Otherwise, add $F$ to the visited set. +\item Compute the captures of~$F$ using \AlgRef{closure captures algorithm} and consider each capture~$d$. If~$d$ is a local variable declaration, set $C\leftarrow C\cup\{d\}$. If~$d$ is a local function declaration, add~$d$ to the worklist. +\item Go back to Step~2. 
+\end{enumerate} +\end{algorithm} + +This completely explains captures of \texttt{let} variables, but mutable \texttt{var} variables and \texttt{inout} parameters merit further explanation. + +A \index{non-escaping function type}\emph{non-escaping} closure can capture a \texttt{var} or \texttt{inout} by simply capturing the memory address of the storage location. This is safe, because a non-escaping closure cannot outlive the dynamic extent of the storage location. + +An \index{escaping function type}\texttt{@escaping} closure can also capture a \texttt{var}, but this requires promoting the \texttt{var} to a \index{boxing}heap-allocated box, with all loads and stores of the variable indirecting through the box. An example like the one below can be found in every \index{Lisp}Lisp textbook. Each invocation of \texttt{counter()} allocates a new counter value on the heap, and returns three closures sharing the same mutable box; the box itself is completely hidden by the abstraction: +\begin{Verbatim} +func counter() -> (read: () -> Int, inc: () -> (), dec: () -> ()) { + var count = 0 // promoted to a box + return ({ count }, { count += 1 }, { count -= 1 }) +} +\end{Verbatim} + +Before \IndexSwift{3.0}Swift~3.0, \texttt{@escaping} closures were permitted to capture \texttt{inout} parameters as well. To make this safe, the contents of the \texttt{inout} parameter were first copied into a heap-allocated box, which was captured by the closure. The contents of this box were then copied back before the function returned to its caller. This was essentially equivalent to doing the following transform, where we introduced \verb|_n| by hand: +\begin{Verbatim} +func changeValue(_ n: inout Int) { + var _n = n // copy the value + + let escapingFn = { + _n += 1 // capture the box + } + + n = _n // write it back +} +\end{Verbatim} +In this scheme, if the closure outlives the dynamic extent of the \texttt{inout} parameter, any subsequent writes from within the closure are silently dropped.
This was a source of user confusion, so Swift~3.0 banned \texttt{inout} captures from escaping closures instead~\cite{se0035}. + +In SIL, a closure is represented abstractly, as the result of a \emph{partial application} operation. The mechanics of how the partially-applied function value actually stores its captures---the partially-applied arguments---are left up to IRGen. In IRGen, we allocate space for storing the captures (on the stack for a \index{non-escaping function type}non-escaping function type, or on the heap for an \texttt{@escaping} one), and then we emit a thunk, which takes a pointer to the context as an argument, unpacks the captured values from the context, and passes them as individual arguments to the original function. This thunk together with the context forms a \IndexDefinition{thick function}\emph{thick function} value which can then be passed around. + +If nothing is captured (or if all captured values are zero bytes in size), we can pass a null pointer as the context, without performing a heap allocation. If there is exactly one captured value and this value can be represented as a reference-counted pointer, we can also elide the allocation by passing the captured value as the context pointer. For example, if a closure's single capture is an instance of a \index{class type}class type, nothing is allocated. If the single capture is the heap-allocated box that wraps a \texttt{var}, we must still allocate the box for the \texttt{var}, but we avoid wrapping it in an additional context. + +\section{Storage}\label{other decls} -\section{Storage Declarations} \IndexDefinition{storage declaration} \index{l-value type} -Storage declarations represent the declaration of an l-value. Storage declarations can have zero or more associated accessor declarations. The accessor declarations are siblings of the storage declaration in the declaration context hierarchy. +Storage declarations represent locations that can be read and written.
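For instance, a stored property and a subscript are both storage declarations, each readable and writable through its accessors. A brief sketch with hypothetical names, ahead of the detailed treatment of each kind:

```swift
struct Matrix {
    // A variable declaration: stored, mutable storage.
    var columns: Int
    private var data: [Double]

    init(rows: Int, columns: Int) {
        self.columns = columns
        self.data = Array(repeating: 0.0, count: rows * columns)
    }

    // A subscript declaration: computed storage with get/set accessors.
    subscript(i: Int, j: Int) -> Double {
        get { data[i * columns + j] }
        set { data[i * columns + j] = newValue }
    }
}

var m = Matrix(rows: 2, columns: 2)
m[0, 1] = 3.5          // writes through the setter
let value = m[0, 1]    // reads through the getter
```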
-\IndexDefinition{variable declaration} -\paragraph{Variable declarations} The interface type of a variable is the stored value type, possibly wrapped in a reference storage type if the variable is declared as \texttt{weak} or \texttt{unowned}. The \emph{value interface type} of a variable is the storage type without any wrapping. +\paragraph{Parameter declarations.} Functions, enum elements and subscripts can have parameter lists; each parameter is represented by a \IndexDefinition{parameter declaration}parameter declaration. Parameter declarations are a kind of variable declaration. -For historical reasons, the interface type of a property (a variable appearing inside of a type) does not include the \texttt{Self} clause, the way that method declarations do. +\paragraph{Variable declarations.} \IndexDefinition{variable declaration}Variables that are not parameters are introduced with \texttt{var} and \texttt{let}. The interface type of a variable is the stored value type, possibly wrapped in a reference storage type if the variable is declared as \texttt{weak} or \texttt{unowned}. The \IndexDefinition{value interface type}\emph{value interface type} of a variable is the storage type without any wrapping. For historical reasons, the interface type of a property (a variable appearing inside of a type) does not include the \texttt{Self} clause, the way that method declarations do. \IndexDefinition{pattern binding declaration} \IndexDefinition{pattern binding entry} @@ -274,16 +753,15 @@ \section{Storage Declarations} \IndexDefinition{initial value expression} Variable declarations are always created alongside a \emph{pattern binding declaration} which represents the various ways in which variables can be bound to values in Swift. A pattern binding declaration consists of one or more \emph{pattern binding entries}. Each pattern binding entry has a \emph{pattern} and an optional \emph{initial value expression}. A pattern declares zero or more variables. 
-\begin{example} -A pattern binding declaration with a single entry, where the pattern declares a single variable: +Here is a pattern binding declaration with a single entry, where the pattern declares a single variable: \begin{Verbatim} let x = 123 \end{Verbatim} -Same as the above, except with a more complex pattern which declares a variable storing the first element of a tuple while discarding the second element: +A more complex pattern which declares a variable storing the first element of a tuple while discarding the second element: \begin{Verbatim} let (x, _) = (123, "hello") \end{Verbatim} -A pattern binding declaration with a single entry, where the pattern declares two variables \texttt{x} and \texttt{y}: +A pattern binding declaration with a single entry, but now the pattern declares two variables \texttt{x} and \texttt{y}: \begin{Verbatim} let (x, y) = (123, "hello") \end{Verbatim} @@ -300,8 +778,6 @@ \section{Storage Declarations} let x = 123 let y = "hello" \end{Verbatim} -\end{example} -\begin{example} When a pattern binding declaration appears outside of a local context, each entry must declare at least one variable, so we reject both of the following: \begin{Verbatim} let _ = 123 @@ -310,45 +786,20 @@ \section{Storage Declarations} let _ = "hello" } \end{Verbatim} -\end{example} + \index{typed pattern} \index{tuple pattern} -\begin{example} A funny quirk of the pattern grammar is that typed patterns and tuple patterns do not compose in the way one might think. 
If ``\texttt{let x:~Int}'' is a typed pattern declaring a variable \texttt{x} with type annotation \texttt{Int}, and ``\texttt{let (x, y)}'' is a tuple pattern declaring two variables \texttt{x} and \texttt{y}, you might expect ``\texttt{let~(x:~Int,~y:~String)}'' to declare two variables \texttt{x} and \texttt{y} with type annotations \texttt{Int} and \texttt{String} respectively; what actually happens is you get a tuple pattern declaring two variables named \texttt{Int} and \texttt{String} that binds a two-element tuple with \emph{labels} \texttt{x} and \texttt{y}: \begin{Verbatim} let (x: Int, y: String) = (x: 123, y: "hello") print(Int) // huh? prints 123 print(String) // weird! prints "hello" \end{Verbatim} -\end{example} -\IndexDefinition{parameter declaration}% -\IndexDefinition{value ownership kind}% -\index{autoclosure function type}% -\paragraph{Parameter declarations} Functions, enum elements and subscripts can have parameter lists; each parameter is represented by a parameter declaration. The interface type of a declaration with a parameter list is built by first computing the interface type of each parameter. Among other things, the parameter declaration stores the value ownership kind, the variadic flag, and the \texttt{@autoclosure} attribute. This is in fact the same exact information encoded in the parameter list of a function type. +\paragraph{Subscript declarations.} \IndexDefinition{subscript declaration}Subscripts are introduced with the \texttt{subscript} keyword. They can only appear as members of nominal types and extensions. The interface type of a subscript is a function type taking the index parameters and returning the storage type. The value interface type of a subscript is just the storage type. For historical reasons, the interface type of a subscript does not include the \texttt{Self} clause, the way that method declarations do.
Subscripts can either be instance or static members; static subscripts were introduced in \IndexSwift{5.1}Swift~5.1 \cite{se0254}. -\index{argument label}% -\IndexDefinition{default argument expression}% -\index{closure expression}% -Closure expressions also have parameter lists and thus parent parameter declarations. Parameter declarations of named declarations can have argument labels and default argument expressions, which are not encoded in a function type. These phenomena are only visible when directly calling a named declaration and not a closure value. - -\IndexDefinition{subscript declaration}% -\paragraph{Subscript declarations} Subscripts always appear as members of types, with a special declaration name. The interface type of a subscript is a function type taking the index parameters and returning the storage type. The value interface type of a subscript is just the storage type. For historical reasons, the interface type of a subscript does not include the \texttt{Self} clause, the way that method declarations do. -\begin{quote} -\begin{tabular}{clll} -\toprule -\textbf{Generic?}&\textbf{Interface type}\\ -\midrule -$\times$&\texttt{(Indices...)\ -> Value}\\ -$\checkmark$&\texttt{\ (Indices...)\ -> Value}\\ -\bottomrule -\end{tabular} -\end{quote} - -\IndexDefinition{accessor declaration} -\paragraph{Accessor declarations} - -The interface type of an accessor depends the accessor kind. For example, getters return the value, and setters take the new value as a parameter. Property accessors do not take any other parameters; subscript accessors also take the subscript's index parameters. There is a lot more to say about accessors and storage declarations, but unfortunately, you'll have to wait for the next book. +\paragraph{Accessor declarations.} +Each storage declaration has a \IndexDefinition{accessor declaration}set of accessor declarations, which are a special kind of function declaration. 
The accessor declarations are siblings of the storage declaration in the declaration context hierarchy. The interface type of an accessor depends on the accessor kind. For example, getters return the value, and setters take the new value as a parameter. Property accessors do not take any other parameters; subscript accessors also take the subscript's index parameters. There is a lot more to say about accessors and storage declarations, but it is beyond the scope of this book. \section{Source Code Reference}\label{declarationssourceref} @@ -368,7 +819,7 @@ \section{Source Code Reference}\label{declarationssourceref} \IndexSource{declaration} \apiref{Decl}{class} -Base class of declarations. Figure~\ref{declhierarchy} shows various subclasses, which correspond to the different kinds of declarations defined previously in this chapter. +Base class of declarations. \FigRef{declhierarchy} shows various subclasses, which correspond to the different kinds of declarations defined previously in this chapter. \begin{figure}\captionabove{The \texttt{Decl} class hierarchy}\label{declhierarchy} \begin{center} \begin{tikzpicture}[% grow via three points={one child at (0.5,-0.7) and two children at (0.5,-0.7) and (0.5,-1.4)}, edge from parent path={[->] (\tikzparentnode.south) |- (\tikzchildnode.west)}] @@ -464,7 +915,7 @@ \section{Source Code Reference}\label{declarationssourceref} \index{statement} \index{expression} \index{type representation} -\paragraph{Visitors} +\paragraph{Visitors.} If you need to exhaustively handle each kind of declaration, the simplest way is to switch over the kind, which is an instance of the \texttt{DeclKind} enum, like this: \begin{Verbatim} Decl *decl = ...; @@ -507,6 +958,8 @@ \section{Source Code Reference}\label{declarationssourceref} \item \texttt{getInterfaceType()} returns the declaration's interface type. \end{itemize} +\subsection*{Type Declarations} + \IndexSource{type declaration} \IndexSource{declared interface type} \apiref{TypeDecl}{class} @@ -525,7 +978,7 @@ \section{Source Code Reference}\label{declarationssourceref} Base class of nominal type declarations. Also a \texttt{DeclContext}.
\begin{itemize} \item \texttt{getSelfInterfaceType()} returns the type of the \texttt{self} value inside the body of this declaration. Different from the declared interface type for protocols, where the declared interface type is a nominal but the declared self type is the generic parameter \texttt{Self}. -\item \texttt{getDeclaredType()} returns the type of an instance of this declaration, without generic arguments. If the declaration is generic, this is an unbound generic type. If this declaration is not generic, this is a nominal type. This is occasionally used in diagnostics instead of the declared interface type, when the generic parameter types are irrelevant. +\item \texttt{getDeclaredType()} returns the type of an instance of this declaration, without generic arguments. If the declaration is generic, this is an \IndexSource{unbound generic type}unbound generic type. If this declaration is not generic, this is a nominal type. This is occasionally used in diagnostics instead of the declared interface type, when the generic parameter types are irrelevant. \end{itemize} \IndexSource{type alias declaration} @@ -537,41 +990,11 @@ \section{Source Code Reference}\label{declarationssourceref} \item \texttt{getUnderlyingType()} returns the underlying type of the type alias declaration, without wrapping it in type alias type sugar. \end{itemize} -\IndexSource{function declaration} -\IndexSource{method self parameter} -\apiref{AbstractFunctionDecl}{class} -Base class of function-like declarations. Also a \texttt{DeclContext}. -\begin{itemize} -\item \texttt{getImplicitSelfDecl()} returns the implicit \texttt{self} parameter, if there is one. -\item \texttt{getParameters()} returns the function's parameter list. -\item \texttt{getMethodInterfaceType()} returns the type of a method without the \texttt{Self} clause. -\item \texttt{getResultInterfaceType()} returns the return type of this function or method. 
-\end{itemize} - -\apiref{ParameterList}{class} -The parameter list of \texttt{AbstractFunctionDecl}, \texttt{EnumElementDecl} or \texttt{SubscriptDecl}. -\begin{itemize} -\item \texttt{size()} returns the number of parameters. -\item \texttt{get()} returns the \texttt{ParamDecl} at the given index. -\end{itemize} - -\IndexSource{constructor declaration} -\apiref{ConstructorDecl}{class} -Constructor declarations. -\begin{itemize} -\item \texttt{getInitializerInterfaceType()} returns the initializer interface type, used when type checking \texttt{super.init()} delegation. -\end{itemize} - -\IndexSource{storage declaration} -\apiref{AbstractStorageDecl}{class} -Base class for storage declarations. -\begin{itemize} -\item \texttt{getValueInterfaceType()} returns the type of the stored value, without \texttt{weak} or \texttt{unowned} storage qualifiers. -\end{itemize} +\subsection*{Declaration Contexts} \IndexSource{declaration context} \apiref{DeclContext}{class} -Base class for declaration contexts. The top-level \verb|isa<>|, \verb|cast<>| and \verb|dyn_cast<>| template functions also support dynamic casting from a \texttt{DeclContext *} to any of its subclasses. +Base class for declaration contexts. The top-level \verb|isa<>|, \verb|cast<>| and \verb|dyn_cast<>| template functions also support dynamic casting from a \texttt{DeclContext *} to any of its subclasses. See also \SecRef{genericsigsourceref}. \IndexSource{closure expression} \IndexSource{source file} @@ -606,6 +1029,180 @@ \section{Source Code Reference}\label{declarationssourceref} \item \texttt{getDeclaredInterfaceType()} delegates to the method on \texttt{NominalTypeDecl} or \texttt{ExtensionDecl} as appropriate. \item \texttt{getSelfInterfaceType()} is similar. \end{itemize} -Generics-related methods on \texttt{DeclContext} are described in Section~\ref{genericdeclsourceref}. 
-\end{document}
\ No newline at end of file
+Generic parameters and requirements:
+\begin{itemize}
+\item \texttt{isGenericContext()} answers true if either this generic context or one of its parents has a generic parameter list.
+\item \texttt{isInnermostContextGeneric()} answers if this declaration context itself has a generic parameter list. Compare with \texttt{isGenericContext()}.
+\end{itemize}
+
+\subsection*{Generic Contexts}
+
+Key source files:
+\begin{itemize}
+\item \SourceFile{include/swift/AST/GenericParamList.h}
+\item \SourceFile{include/swift/AST/Requirement.h}
+\item \SourceFile{lib/AST/GenericParamList.cpp}
+\item \SourceFile{lib/AST/NameLookup.cpp}
+\item \SourceFile{lib/AST/Requirement.cpp}
+\end{itemize}
+
+\IndexSource{generic context}
+\IndexSource{generic declaration}
+\IndexSource{parsed generic parameter list}
+\apiref{GenericContext}{class}
+Subclass of \texttt{DeclContext}. Base class for declaration kinds which can have a generic parameter list. See also \SecRef{genericsigsourceref}.
+\begin{itemize}
+\item \texttt{getParsedGenericParams()} returns the declaration's parsed generic parameter list, or \texttt{nullptr}.
+\item \texttt{getGenericParams()} returns the declaration's full generic parameter list, which includes any implicit generic parameters. Evaluates a \texttt{GenericParamListRequest}.
+\item \texttt{isGeneric()} answers if this declaration has a generic parameter list. This is equivalent to calling \texttt{DeclContext::isInnermostContextGeneric()}. Compare with \texttt{DeclContext::isGenericContext()}.
+\item \texttt{getGenericContextDepth()} returns the \IndexSource{depth}depth of the declaration's generic parameter list, or \texttt{(unsigned)-1} if neither this declaration nor any outer declaration is generic.
+\item \texttt{getTrailingWhereClause()} returns the declaration's trailing \texttt{where} clause, or \texttt{nullptr}.
+\end{itemize} + +Trailing \texttt{where} clauses are not preserved in serialized generic contexts. Most code should look at \texttt{GenericContext::getGenericSignature()} instead (\SecRef{genericsigsourceref}), except when actually building the generic signature. + + +\IndexSource{generic parameter list} +\apiref{GenericParamList}{class} +A generic parameter list. +\begin{itemize} +\item \texttt{getParams()} returns an array of generic parameter declarations. +\item \texttt{getOuterParameters()} returns the outer generic parameter list, linking multiple generic parameter lists for the same generic context. Only used for extensions of nested generic types. +\end{itemize} + +\IndexSource{protocol Self type@protocol \texttt{Self} type} +\apiref{GenericParamListRequest}{class} +This request creates the full generic parameter list for a declaration. Kicked off from \texttt{GenericContext::getGenericParams()}. +\begin{itemize} +\item For protocols, this creates the implicit \texttt{Self} parameter. +\item For functions and subscripts, calls \texttt{createOpaqueParameterGenericParams()} to walk the formal parameter list and look for \texttt{OpaqueTypeRepr}s. +\item For extensions, calls \texttt{createExtensionGenericParams()} which clones the generic parameter lists of the extended nominal itself and all of its outer generic contexts, and links them together via \texttt{GenericParamList::getOuterParameters()}. +\end{itemize} + +\IndexSource{generic parameter declaration} +\apiref{GenericTypeParamDecl}{class} +A generic parameter declaration. +\begin{itemize} +\item \IndexSource{depth}\texttt{getDepth()} returns the depth of the generic parameter declaration. +\item \IndexSource{index}\texttt{getIndex()} returns the index of the generic parameter declaration. +\item \texttt{getName()} returns the name of the generic parameter declaration. 
+\item \texttt{getDeclaredInterfaceType()} returns the \IndexSource{sugared type}sugared generic parameter type for this declaration, which prints as the generic parameter's name. +\item \texttt{isOpaque()} answers if this generic parameter is associated with an \IndexSource{opaque parameter}opaque parameter. +\item \texttt{getOpaqueTypeRepr()} returns the associated \texttt{OpaqueReturnTypeRepr} if this is an opaque parameter, otherwise \texttt{nullptr}. +\item \texttt{getInherited()} returns the generic parameter declaration's \IndexSource{inheritance clause}inheritance clause. +\end{itemize} + +Inheritance clauses are not preserved in serialized generic parameter declarations. Requirements stated on generic parameter declarations are part of the corresponding generic context's generic signature, so except when actually building the generic signature, most code uses \texttt{GenericContext::getGenericSignature()} instead (\SecRef{genericsigsourceref}). + +\apiref{GenericTypeParamType}{class} +A \IndexSource{generic parameter type}generic parameter type. +\begin{itemize} +\item \texttt{getDepth()} returns the depth of the generic parameter declaration. +\item \texttt{getIndex()} returns the index of the generic parameter declaration. +\item \texttt{getName()} returns the name of the generic parameter declaration if this is the sugared form, otherwise returns a string of the form ``\ttgp{d}{i}''. +\end{itemize} + +\IndexSource{where clause@\texttt{where} clause} +\apiref{TrailingWhereClause}{class} +The syntactic representation of a trailing \texttt{where} clause. +\begin{itemize} +\item \texttt{getRequirements()} returns an array of \texttt{RequirementRepr}. +\end{itemize} + +\IndexSource{requirement representation} +\apiref{RequirementRepr}{class} +The syntactic representation of a requirement in a trailing \texttt{where} clause. +\begin{itemize} +\item \texttt{getKind()} returns a \texttt{RequirementReprKind}. 
+\item \texttt{getFirstTypeRepr()} returns the first \texttt{TypeRepr} of a same-type requirement.
+\item \texttt{getSecondTypeRepr()} returns the second \texttt{TypeRepr} of a same-type requirement.
+\item \texttt{getSubjectTypeRepr()} returns the first \texttt{TypeRepr} of a constraint or layout requirement.
+\item \texttt{getConstraintTypeRepr()} returns the second \texttt{TypeRepr} of a constraint requirement.
+\item \texttt{getLayoutConstraint()} returns the layout constraint of a layout requirement.
+\end{itemize}
+
+\apiref{RequirementReprKind}{enum class}
+\begin{itemize}
+\item \texttt{RequirementReprKind::TypeConstraint}
+\item \texttt{RequirementReprKind::SameType}
+\item \texttt{RequirementReprKind::LayoutConstraint}
+\end{itemize}
+
+\apiref{WhereClauseOwner}{class}
+Represents a reference to some set of requirement representations which can be resolved to requirements, for example a trailing \texttt{where} clause. This is used by various requests, such as the \texttt{RequirementRequest} below, and the \texttt{InferredGenericSignatureRequest} in \SecRef{buildinggensigsourceref}.
+\begin{itemize}
+\item \texttt{getRequirements()} returns an array of \texttt{RequirementRepr}.
+\item \texttt{visitRequirements()} resolves each requirement representation and invokes a callback with the \texttt{RequirementRepr} and resolved \texttt{Requirement}.
+\end{itemize}
+
+\apiref{RequirementRequest}{class}
+Request which can be evaluated to resolve a single requirement representation in a \texttt{WhereClauseOwner}. Used by \texttt{WhereClauseOwner::visitRequirements()}.
+
+\IndexSource{protocol declaration}
+\IndexSource{primary associated type}
+\apiref{ProtocolDecl}{class}
+A protocol declaration.
+\begin{itemize}
+\item \texttt{getTrailingWhereClause()} returns the protocol \texttt{where} clause, or \texttt{nullptr}.
+\item \texttt{getAssociatedTypes()} returns an array of all associated type declarations in the protocol.
+\item \texttt{getPrimaryAssociatedTypes()} returns an array of all primary associated type declarations in the protocol.
+\item \texttt{getInherited()} returns the parsed inheritance clause.
+\end{itemize}
+
+Trailing \texttt{where} clauses and inheritance clauses are not preserved in serialized protocol declarations. Except when actually building the requirement signature, most code uses \texttt{ProtocolDecl::getRequirementSignature()} instead (\SecRef{genericsigsourceref}).
+
+\IndexSource{inherited protocol}
+The following four utility methods operate on the requirement signature, so they are safe to use on deserialized protocols:
+\begin{itemize}
+\item \texttt{getInheritedProtocols()} returns an array of all protocols directly inherited by this protocol, computed from the inheritance clause.
+\item \texttt{inheritsFrom()} determines if this protocol inherits from the given protocol, possibly transitively.
+\item \texttt{getSuperclass()} returns the protocol's superclass type.
+\item \texttt{getSuperclassDecl()} returns the protocol's superclass declaration.
+\end{itemize}
+
+\index{associated type declaration}
+\apiref{AssociatedTypeDecl}{class}
+An associated type declaration.
+\begin{itemize}
+\item \texttt{getTrailingWhereClause()} returns the associated type's trailing \texttt{where} clause, or \texttt{nullptr}.
+\item \texttt{getInherited()} returns the associated type's inheritance clause.
+\end{itemize}
+
+Trailing \texttt{where} clauses and inheritance clauses are not preserved in serialized associated type declarations. Requirements on associated types are part of a protocol's requirement signature, so except when actually building the requirement signature, most code uses \texttt{ProtocolDecl::getRequirementSignature()} instead (\SecRef{genericsigsourceref}).
+
+\subsection*{Other Declarations}
+
+\IndexSource{function declaration}
+\IndexSource{method self parameter}
+\apiref{AbstractFunctionDecl}{class}
+Base class of function-like declarations.
Also a \texttt{DeclContext}. +\begin{itemize} +\item \texttt{getImplicitSelfDecl()} returns the implicit \texttt{self} parameter, if there is one. +\item \texttt{getParameters()} returns the function's parameter list. +\item \texttt{getMethodInterfaceType()} returns the type of a method without the \texttt{Self} clause. +\item \texttt{getResultInterfaceType()} returns the return type of this function or method. +\end{itemize} + +\apiref{ParameterList}{class} +The parameter list of \texttt{AbstractFunctionDecl}, \texttt{EnumElementDecl} or \texttt{SubscriptDecl}. +\begin{itemize} +\item \texttt{size()} returns the number of parameters. +\item \texttt{get()} returns the \texttt{ParamDecl} at the given index. +\end{itemize} + +\IndexSource{constructor declaration} +\apiref{ConstructorDecl}{class} +Constructor declarations. +\begin{itemize} +\item \texttt{getInitializerInterfaceType()} returns the initializer interface type, used when type checking \texttt{super.init()} delegation. +\end{itemize} + +\IndexSource{storage declaration} +\apiref{AbstractStorageDecl}{class} +Base class for storage declarations. +\begin{itemize} +\item \texttt{getValueInterfaceType()} returns the type of the stored value, without \texttt{weak} or \texttt{unowned} storage qualifiers. +\end{itemize} + +\end{document} diff --git a/docs/Generics/chapters/derived-requirements-summary.tex b/docs/Generics/chapters/derived-requirements-summary.tex index 13ae28254e5af..3cf12dbb53f51 100644 --- a/docs/Generics/chapters/derived-requirements-summary.tex +++ b/docs/Generics/chapters/derived-requirements-summary.tex @@ -5,41 +5,84 @@ \chapter{Derived Requirements Summary}\label{derived summary} \index{$\vdash$} -\IndexStep{GenSig} -\IndexStep{ReqSig} -\IndexStep{AssocType} -\IndexStep{Conf} -\IndexStep{Equiv} -\IndexStep{Same} -\IndexStep{Member} -\IndexStep{Concrete} +Let $G$ be a \index{generic signature!summary}generic signature. 
We generate the theory of~$G$ by repeated application of inference rules, starting from a finite set of elementary statements. A derivation proves that a statement belongs to the theory by giving a list of derivation steps where the assumptions in each step are conclusions of previous steps. Nomenclature:
+\begin{center}
+\begin{tabular}{ll}
+\toprule
+\textbf{Symbol}&\textbf{Description}\\
+\midrule
+\texttt{T}, \texttt{U}, and \texttt{V}&\index{type parameter!summary}type parameters\\
+\texttt{Self.U} and \texttt{Self.V}&type parameters rooted in the \index{protocol Self type!summary}protocol \texttt{Self} type\\
+\texttt{X}&a concrete type\\
+\texttt{C}&a concrete \index{class type!summary}class type\\
+$\Xprime$ and $\Cprime$&obtained from \texttt{X} and \texttt{C} by replacing \texttt{Self} with \texttt{T}\\
+\texttt{P} and \texttt{Q}&protocols\\
+\texttt{A}&the name of an \index{associated type declaration!summary}associated type of \texttt{P}\\
+\texttt{[P]A}&an associated type declaration of \texttt{P}\\
+\texttt{T.[P]A} and \texttt{T.A}&\index{bound dependent member type!summary}bound and \index{unbound dependent member type!summary}unbound dependent member type\\
+$\ConfReq{T}{P}$&a \index{conformance requirement!summary}conformance requirement\\
+$\SameReq{T}{U}$&a \index{same-type requirement!summary}same-type requirement between type parameters\\
+$\SameReq{T}{X}$&a concrete same-type requirement\\
+$\ConfReq{T}{C}$&a \index{superclass requirement!summary}superclass requirement\\
+$\ConfReq{T}{AnyObject}$&a \index{layout requirement!summary}layout requirement\\
+$\ConfReq{Self.U}{Q}_\texttt{P}$&an \index{associated requirement!summary}associated requirement of protocol \texttt{P}\\
+\bottomrule
+\end{tabular}
+\end{center}
+See \index{derived requirement!summary}\index{valid type parameter!summary}\SecRef{derived req} and \SecRef{type params} for details.
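+To ground the notation, consider a small example (the function here is hypothetical, not taken from the preceding chapters):
+\begin{Verbatim}
+func f<T: Sequence>(_ t: T) {
+  var iter = t.makeIterator()  // iter has type T.Iterator
+  _ = iter.next()              // needs T.Iterator to conform to IteratorProtocol
+}
+\end{Verbatim}
+Writing $G$ for the generic signature of \texttt{f}, the explicit requirement gives the elementary statement $\ConfReq{T}{Sequence}$, and the requirement signature of \texttt{Sequence} contributes the associated requirement $\ConfReq{Self.Iterator}{IteratorProtocol}_\texttt{Sequence}$; a short derivation from these concludes $G\vdash\ConfReq{T.Iterator}{IteratorProtocol}$, which is what allows the call to \texttt{next()} to type check.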
+
+\index{elementary derivation step!summary}\paragraph{Elementary statements.}
+For \IndexStepTwo{Generic}{summary}each generic parameter \ttgp{d}{i} of~$G$:
+\begin{gather*}
+\GenericStepDef
+\end{gather*}
+For \IndexStepTwo{Conf}{summary}\IndexStepTwo{Same}{summary}\IndexStepTwo{Concrete}{summary}\IndexStepTwo{Super}{summary}\IndexStepTwo{Layout}{summary}each explicit requirement of~$G$ by \index{requirement kind!summary}kind:
+\begin{gather*}
+\ConfStepDef\\
+\SameStepDef\\
+\ConcreteStepDef\\
+\SuperStepDef\\
+\LayoutStepDef
+\end{gather*}
+
+\index{requirement signature!summary}
+\paragraph{Requirement signatures.}
+Assume $G\vdash\ConfReq{T}{P}$. For \IndexStepTwo{AssocName}{summary}\IndexStepTwo{AssocDecl}{summary}\IndexStepTwo{AssocBind}{summary}each associated type~\texttt{A} of~\texttt{P}:
+\begin{gather*}
+\AssocNameStepDef\\
+\AssocDeclStepDef\\
+\AssocBindStepDef
+\end{gather*}
+For each \IndexStepTwo{AssocConf}{summary}\IndexStepTwo{AssocSame}{summary}\IndexStepTwo{AssocConcrete}{summary}\IndexStepTwo{AssocSuper}{summary}\IndexStepTwo{AssocLayout}{summary}associated requirement of~\texttt{P} by kind:
+\begin{gather*}
+\AssocConfStepDef\\
+\AssocSameStepDef\\
+\AssocConcreteStepDef\\
+\AssocSuperStepDef\\
+\AssocLayoutStepDef
+\end{gather*}
+
+\paragraph{Equivalence.}
+Same-type requirements \IndexStepTwo{Reflex}{summary}\IndexStepTwo{Sym}{summary}\IndexStepTwo{Trans}{summary}generate an equivalence relation:
+\begin{gather*}
+\ReflexStepDef\\
+\SymStepDef\\
+\TransStepDef
+\end{gather*}
+
+\paragraph{Compatibility.}
+A derived conformance, concrete same-type, superclass, or layout requirement \IndexStepTwo{SameConf}{summary}\IndexStepTwo{SameConcrete}{summary}\IndexStepTwo{SameSuper}{summary}\IndexStepTwo{SameLayout}{summary}applies to all type parameters in an equivalence class:
+\begin{gather*}
+\SameConfStepDef\\
+\SameConcreteStepDef\\
+\SameSuperStepDef\\
+\SameLayoutStepDef
+\end{gather*}
+If two type parameters are equivalent, so are
\IndexStepTwo{SameName}{summary}\IndexStepTwo{SameDecl}{summary}all corresponding member types: \begin{gather*} -\vdash\ConfReq{T}{P}\tag{\textsc{GenSig}}\\ -\vdash\ConfReq{T}{C}\\ -\vdash\ConfReq{T}{AnyObject}\\ -\vdash\FormalReq{T == U}\\[\medskipamount] -\ConfReq{T}{P}\vdash\texttt{T.A}\tag{\textsc{AssocType}}\\ -\ConfReq{T}{P}\vdash\texttt{T.[P]A}\\ -\ConfReq{T}{P}\vdash\FormalReq{T.[P]A == T.A}\\[\medskipamount] -\vdash\ConfReq{Self.U}{Q}_\texttt{P}\tag{\textsc{ReqSig}}\\ -\vdash\ConfReq{Self.U}{C}_\texttt{P}\\ -\vdash\ConfReq{Self.U}{AnyObject}_\texttt{P}\\ -\vdash\FormalReq{Self.U == Self.V}_\texttt{P}\\[\medskipamount] -\ConfReq{T}{P},\,\ConfReq{Self.U}{Q}\vdash\ConfReq{T.U}{Q}\tag{\textsc{Conf}}\\ -\ConfReq{T}{P},\,\ConfReq{Self.U}{C}\vdash\ConfReq{T.U}{C}\\ -\ConfReq{T}{P},\,\ConfReq{Self.U}{AnyObject}\vdash\ConfReq{T.U}{AnyObject}\\ -\ConfReq{T}{P},\,\FormalReq{Self.U == Self.V}\vdash\FormalReq{T.U == T.V}\\[\medskipamount] -\texttt{T}\vdash\FormalReq{T == T}\tag{\textsc{Equiv}}\\ -\FormalReq{T == U}\vdash\FormalReq{U == T}\\ -\FormalReq{T == U},\,\FormalReq{U == V}\vdash\FormalReq{T == V}\\[\medskipamount] -\ConfReq{U}{P},\,\FormalReq{T == U}\vdash\ConfReq{T}{P}\tag{\textsc{Same}}\\ -\ConfReq{U}{C},\,\FormalReq{T == U}\vdash\ConfReq{T}{C}\\ -\ConfReq{U}{AnyObject},\,\FormalReq{T == U}\vdash\ConfReq{T}{AnyObject}\\[\medskipamount] -\ConfReq{U}{P},\,\FormalReq{T == U}\vdash\FormalReq{T.A == U.A}\tag{\textsc{Member}}\\ -\ConfReq{U}{P},\,\FormalReq{T == U}\vdash\FormalReq{T.[P]A == U.[P]A}\\[\medskipamount] -\FormalReq{T == U}\vdash\FormalReq{G == G}\qquad\mbox{(arbitrary type constructor \texttt{G<>})}\tag{\textsc{Concrete}}\\ -\FormalReq{G == G}\vdash\FormalReq{T == U}\qquad\mbox{(arbitrary type constructor \texttt{G<>})} +\SameNameStepDef\\ +\SameDeclStepDef \end{gather*} -\end{document} \ No newline at end of file +\end{document} diff --git a/docs/Generics/chapters/existential-types.tex b/docs/Generics/chapters/existential-types.tex index 
c0f0bedc535e6..2b714d4819269 100644
--- a/docs/Generics/chapters/existential-types.tex
+++ b/docs/Generics/chapters/existential-types.tex
@@ -2,19 +2,21 @@
\begin{document}
-\chapter{Existential Types}\label{existentialtypes}
+\chapter[]{Existential Types}\label{existentialtypes}
\ifWIP
+Existential types whose constraint type is \texttt{Any} or \texttt{AnyObject} can also be written without the \texttt{any} keyword.
+
Mention \cite{rajexistential}
As every Swift developer knows, protocols serve a dual purpose in the language: as generic constraints, and as the types of values. The latter feature, formally known as existential types, is the topic of this chapter. An existential type can be thought of as a container for values which satisfy certain requirements. Existential types were borrowed from \index{Objective-C}Objective-C, and have been part of the Swift language since the beginning, in the form of protocol types and protocol compositions.
This feature has an interesting history. The protocols that could be used as types were initially restricted to those without associated types, or requirements with \texttt{Self} in non-covariant position (the latter rules out \texttt{Equatable} for example). This meant that the implementation of existential types was at first rather disjoint from generics. As existential types gained the ability to state more complex constraints over time, the two sides of protocols converged.
-Protocol compositions were originally written as \texttt{protocol} for a value of a type conforming to both protocols \texttt{P} and \texttt{Q}. The modern syntax for protocol compositions \texttt{P~\&~Q} was introduced in Swift 3 \cite{se0095}. Protocol compositions with superclass terms were introduced in Swift 4 \cite{se0156}. The spelling \texttt{any P} of an existential type, to distinguish from \texttt{P} the constraint type, was introduced in Swift 5.6 \cite{se0355}.
This was followed by Swift 5.7 allowing all protocols to be used as existential types \cite{se0309}, and introducing implicit opening of existential types \cite{se0352}, and constrained existential types \cite{se0353}.
+Protocol compositions were originally written as \texttt{protocol<P,~Q>} for a value of a type conforming to both protocols \texttt{P} and \texttt{Q}. The modern syntax for protocol compositions \texttt{P~\&~Q} was introduced in \IndexSwift{3.0}Swift 3 \cite{se0095}. Protocol compositions with superclass terms were introduced in \IndexSwift{4.0}Swift 4 \cite{se0156}. The spelling \texttt{any P} of an existential type, to distinguish from \texttt{P} the constraint type, was introduced in \IndexSwift{5.6}Swift 5.6 \cite{se0355}. This was followed by \IndexSwift{5.7}Swift 5.7 allowing all protocols to be used as existential types \cite{se0309}, and introducing implicit opening of existential types \cite{se0352}, and constrained existential types \cite{se0353}.
-An existential type is written with the \texttt{any} keyword followed by a constraint type, which is a concept previously defined in Section~\ref{constraints}. For aesthetic reasons, the \texttt{any} keyword can be omitted if the constraint type is \texttt{Any} or \texttt{AnyObject}, since \texttt{any~Any} or \texttt{any~AnyObject} looks funny.
+An existential type is written with the \texttt{any} keyword followed by a constraint type, which is a concept previously defined in \SecRef{requirements}. For aesthetic reasons, the \texttt{any} keyword can be omitted if the constraint type is \texttt{Any} or \texttt{AnyObject}, since \texttt{any~Any} or \texttt{any~AnyObject} looks funny.
For backwards compatibility, \texttt{any} can also be omitted if the protocols appearing in the constraint type do not have any associated types or requirements with \texttt{Self} in non-covariant position. \paragraph{Type representation} Existential types are instances of \texttt{ExistentialType}, which wraps a constraint type. Even in the cases where \texttt{any} can be omitted, type resolution will wrap the constraint type in \texttt{ExistentialType} when resolving a type in a context where the type of a value is expected. If the constraint type is a protocol composition with a superclass term, or a parameterized protocol type, arbitrary types can appear as structural components of the constraint type. This means that the constraint type of an existential type is subject to substitution by \texttt{Type::subst()}. For example, the interface type of the properties \texttt{foo} and \texttt{bar} below are existential types containing type parameters: @@ -38,7 +40,7 @@ \chapter{Existential Types}\label{existentialtypes} \fi -\section{Opened Existentials}\label{open existential archetypes} +\section[]{Opened Existentials}\label{open existential archetypes} \ifWIP @@ -101,7 +103,7 @@ \section{Opened Existentials}\label{open existential archetypes} \end{quote} In both signatures, the interface type of the existential is \texttt{\ttgp{1}{0}}. \end{example} -Recall from Chapter~\ref{genericenv} that there are three kinds of generic environments. We've seen primary generic environments, which are associated with generic declarations. We also saw opaque generic environments, which are instantiated from an opaque return declaration and substitution map, in Section~\ref{opaquearchetype}. Now, it's time to introduce the third kind, the opened generic environment. An opened generic environment is created from an opened existential signature of the first kind (with no parent generic signature). The archetypes of an opened generic environment are \emph{opened archetypes}. 
+Recall from \ChapRef{genericenv} that there are three kinds of generic environments. We've seen primary generic environments, which are associated with generic declarations. We also saw opaque generic environments, which are instantiated from an opaque return declaration and substitution map, in \SecRef{opaquearchetype}. Now, it's time to introduce the third kind, the opened generic environment. An opened generic environment is created from an opened existential signature of the first kind (with no parent generic signature). The archetypes of an opened generic environment are \emph{opened archetypes}.

\index{call expression}
When the expression type checker encounters a call expression where an argument of existential type is passed to a parameter of generic parameter type, the existential value is \emph{opened}, projecting the value and assigning it a new opened archetype from a fresh opened generic environment. The call expression is rewritten by wrapping the entire call in an \texttt{OpenExistentialExpr}, which stores two sub-expressions. The first sub-expression is the original call argument, which evaluates to the value of existential type. The payload value and opened archetype are scoped to the second sub-expression, which consumes the payload value. The call argument is replaced with an \texttt{OpaqueValueExpr}, which has the opened archetype type. The opened archetype also becomes the replacement type for the generic parameter in the call's substitution map.
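As a source-level sketch of this rewrite (the declarations here are invented for illustration):
\begin{Verbatim}
protocol P { func f() }

func callee<T: P>(_ t: T) { t.f() }

func caller(_ x: any P) {
  // The type checker opens 'x': the call is wrapped in an
  // OpenExistentialExpr, the argument becomes an OpaqueValueExpr
  // whose type is the opened archetype, and the call's substitution
  // map replaces T with the opened archetype of 'x'.
  callee(x)
}
\end{Verbatim}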
@@ -153,7 +155,7 @@ \section{Opened Existentials}\label{open existential archetypes} \fi -\section{Existential Layouts}\label{existentiallayouts} +\section[]{Existential Layouts}\label{existentiallayouts} \ifWIP @@ -162,11 +164,11 @@ \section{Existential Layouts}\label{existentiallayouts} \item[\texttt{getKind()}] Returns an element of the \texttt{ExistentialLayout::Kind} enum, which is one of \texttt{Class}, \texttt{Error}, or \texttt{Opaque}, corresponding to one of the below representations. \item[\texttt{requiresClass()}] Returns whether this existential type requires the stored concrete type to be a class, that is, whether it uses class representation. \item[\texttt{getSuperclass()}] Returns the existential's superclass bound, either explicitly stated in a protocol composition or declared on a protocol. -\item[\texttt{getProtocols()}] Returns the existential's protocol conformances. The protocols in this array are minimized with respect to protocol inheritance, and sorted in canonical protocol order (Definition~\ref{linear protocol order}). +\item[\texttt{getProtocols()}] Returns the existential's protocol conformances. The protocols in this array are minimized with respect to protocol inheritance, and sorted in canonical protocol order (\DefRef{linear protocol order}). \item[\texttt{getLayoutConstraint()}] Returns the existential's layout constraint, if there is one. This is the \texttt{AnyObject} layout constraint if the existential can store any Swift or \index{Objective-C}Objective-C class instance. If the superclass bound is further known to be a Swift-native class, this is the stricter \texttt{\_NativeClass} layout constraint. \end{description} -Some of the above methods might look familiar from the description of generic signature queries in Section~\ref{genericsigqueries}, or the local requirements of archetypes in Chapter~\ref{genericenv}. 
Indeed, for the most part, the same information can be recovered by asking questions about the existential's interface type in the opened existential signature, or if you have an opened archetype handy, by calling similar methods on the archetype. There is one important difference though. In a generic signature, the minimization algorithm drops protocol conformance requirements which are satisfied by a superclass bound. This is true with opened existential signatures as well. However, for historical reasons, the same transformation is not applied when computing an existential layout. This means that the list of protocols in \texttt{ExistentialLayout::getProtocols()} may include more protocols than the \texttt{getConformsTo()} query on the opened existential signature. It is the former list of protocols coming from the \texttt{ExistentialLayout} that informs the runtime representation of the existential type \texttt{any C \& P}. If \index{ABI}ABI stability was not a concern, this would be reworked to match the behavior of requirement minimization. +Some of the above methods might look familiar from the description of generic signature queries in \SecRef{genericsigqueries}, or the local requirements of archetypes in \ChapRef{genericenv}. Indeed, for the most part, the same information can be recovered by asking questions about the existential's interface type in the opened existential signature, or if you have an opened archetype handy, by calling similar methods on the archetype. There is one important difference though. In a generic signature, the minimization algorithm drops protocol conformance requirements which are satisfied by a superclass bound. This is true with opened existential signatures as well. However, for historical reasons, the same transformation is not applied when computing an existential layout. 
This means that the list of protocols in \texttt{ExistentialLayout::getProtocols()} may include more protocols than those returned by the \texttt{getConformsTo()} query on the opened existential signature. It is the former list of protocols coming from the \texttt{ExistentialLayout} that informs the runtime representation of the existential type \texttt{any C \& P}. If \index{ABI}ABI stability were not a concern, this would be reworked to match the behavior of requirement minimization.

\begin{example}
Consider these definitions:
@@ -253,7 +255,7 @@ \section{Existential Layouts}\label{existentiallayouts}
\end{tabular}
\end{quote}

-\section{Generalization Signatures}
+\section[]{Generalization Signatures}

\index{metatype type}
\index{runtime type metadata}
@@ -273,7 +275,7 @@ \section{Generalization Signatures}
print(concrete() == generic(Int.self)) // true
\end{Verbatim}
\end{listing}
-Listing~\ref{metadataunique} constructs the same metatype twice, once in a concrete function and then again in a generic function:
+\ListingRef{metadataunique} constructs the same metatype twice, once in a concrete function and then again in a generic function:
\begin{itemize}
\item The \texttt{concrete()} function encodes the type \texttt{(Int, Int)} using a compact mangled representation and passes it to a runtime entry point for instantiating metadata from a mangled type string. This entry point ultimately calls the tuple type constructor after demangling the input string.
\item The \texttt{generic()} function receives the type metadata for \texttt{Int} as an argument, and directly calls the tuple type constructor to build the type \texttt{(T, T)} with the substitution \texttt{T := Int}. Both functions return the same value of \texttt{Any.Type} because the two calls to the tuple type constructor return the same value.
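The ``concrete vs.\ generic'' pattern described above can be sketched in a few lines. This is a minimal reconstruction of the two functions discussed (the full listing is elided from this hunk), not the book's exact code:

```swift
// Both functions produce metadata for the same tuple type, so the
// uniqued runtime metadata pointers compare equal.
func concrete() -> Any.Type {
    // The compiler encodes (Int, Int) as a mangled string; the runtime
    // demangles it and calls the tuple type constructor.
    return (Int, Int).self
}

func generic<T>(_: T.Type) -> Any.Type {
    // The caller passes metadata for T; the tuple type constructor is
    // invoked directly with the substitution T := Int.
    return (T, T).self
}

print(concrete() == generic(Int.self)) // true
```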
@@ -308,7 +310,7 @@ \section{Generalization Signatures} \end{Verbatim} \end{listing} -As a first attempt at solving this problem, you might think to use the opened existential signature as the uniquing key for existential type metadata at runtime. Unfortunately, naively encoding the requirements of the opened existential signature does not give you uniqueness, because the opened existential signature also includes all generic parameters and requirements from the parent generic signature. Listing~\ref{generalizationexample} shows a ``concrete vs. generic'' example similar to the above, but with constrained existential types. +As a first attempt at solving this problem, you might think to use the opened existential signature as the uniquing key for existential type metadata at runtime. Unfortunately, naively encoding the requirements of the opened existential signature does not give you uniqueness, because the opened existential signature also includes all generic parameters and requirements from the parent generic signature. \ListingRef{generalizationexample} shows a ``concrete vs. generic'' example similar to the above, but with constrained existential types. The opened existential signature of \texttt{any P} in \texttt{concrete()} is: \begin{quote} @@ -348,7 +350,7 @@ \section{Generalization Signatures} \begin{quote} \texttt{<\ttgp{0}{0} where \ttgp{0}{0}:\ P, \ttgp{0}{0}.[P]X == ConcreteQ>} \end{quote} -This two-step process of applying a substitution map to the requirements of a generic signature, then building a new generic signature from the substituted generic requirements re-appears several times throughout the compiler. Requirement inference in Section~\ref{requirementinference} used this technique. It will also come up again in Chapter \ref{classinheritance}~and~\ref{valuerequirements}. 
In this case though, \textsl{it doesn't actually solve our problem!} Whatever transformation we do here needs to happen at runtime, since the implementation of \texttt{generic()} needs to be able to do it for an arbitrary type \texttt{T}. Teaching the runtime to build minimal canonical generic signatures from scratch is not practical since it would require duplicating a large portion of the compiler there.
+This two-step process of applying a substitution map to the requirements of a generic signature, then building a new generic signature from the substituted generic requirements re-appears several times throughout the compiler. Requirement inference in \SecRef{requirementinference} used this technique. It will also come up again in Chapters \ref{classinheritance}~and~\ref{valuerequirements}. In this case though, \textsl{it doesn't actually solve our problem!} Whatever transformation we do here needs to happen at runtime, since the implementation of \texttt{generic()} needs to be able to do it for an arbitrary type \texttt{T}. Teaching the runtime to build minimal canonical generic signatures from scratch is not practical since it would require duplicating a large portion of the compiler there.

Instead of using the ``most concrete'' opened existential signature as the uniquing key, the compiler constructs the ``most generic'' signature together with a substitution map. If the replacement types in this substitution map contain type parameters, they are filled in at runtime from the generic context when the existential type metadata is being constructed. The resulting generalization signature and substitution map serve as the uniquing key for the runtime instantiation of existential type metadata. This algorithm is implemented in \texttt{ExistentialGeneralization::get()}.

@@ -389,7 +391,7 @@ \section{Generalization Signatures}
\end{enumerate}

These are the necessary invariants that ensure uniqueness of existential type metadata.
-\begin{example} Let's look at Listing~\ref{generalizationexample} again. Starting with \texttt{concrete()}, applying Algorithm~\ref{existentialgeneralizationalgo} to the type \texttt{any~P} gives the generalized constraint type \texttt{any~P<\ttgp{0}{0},~\ttgp{0}{1}>} and the generalization signature \texttt{<\ttgp{0}{0}, \ttgp{0}{1}>} and the following substitution map: +\begin{example} Let's look at \ListingRef{generalizationexample} again. Starting with \texttt{concrete()}, applying \AlgRef{existentialgeneralizationalgo} to the type \texttt{any~P} gives the generalized constraint type \texttt{any~P<\ttgp{0}{0},~\ttgp{0}{1}>} and the generalization signature \texttt{<\ttgp{0}{0}, \ttgp{0}{1}>} and the following substitution map: \[ \SubstMap{ \SubstType{\ttgp{0}{0}}{ConcreteQ}\\ @@ -407,7 +409,7 @@ \section{Generalization Signatures} \end{example} \begin{example} -The generalization signature in the previous example does not have any generic requirements. In Listing~\ref{generalizationrequirements}, the existential type is a protocol composition containing a generic class type, which can introduce requirements in the generalization signature. Applying Algorithm~\ref{existentialgeneralizationalgo} to the type \texttt{any~Q~\&~G} produces the generalized constraint type \texttt{any~Q<\ttgp{0}{0}>~\&~G<\ttgp{0}{1}>} and the following generalization signature: +The generalization signature in the previous example does not have any generic requirements. In \ListingRef{generalizationrequirements}, the existential type is a protocol composition containing a generic class type, which can introduce requirements in the generalization signature. 
Applying \AlgRef{existentialgeneralizationalgo} to the type \texttt{any~Q~\&~G} produces the generalized constraint type \texttt{any~Q<\ttgp{0}{0}>~\&~G<\ttgp{0}{1}>} and the following generalization signature: \begin{quote} \texttt{<\ttgp{0}{0}, \ttgp{0}{1} where \ttgp{0}{1}:\ P, \ttgp{0}{1}.[P]X == \ttgp{0}{1}.[P]Y>} \end{quote} @@ -447,13 +449,13 @@ \section{Generalization Signatures} \fi -\section{Self-Conforming Protocols}\label{selfconformingprotocols} +\section[]{Self-Conforming Protocols}\label{selfconformingprotocols} \ifWIP A common source of confusion for beginners is that in general, protocols in Swift do not conform to themselves. The layperson's explanation of this is that an existential type is a ``box'' for storing a value with an unknown concrete type. If the box requires that the value's type conforms to a protocol, you can't fit the ``box itself'' inside of another box, because it has the wrong shape. This explanation will be made precise in this section. -For many purposes, implicit existential opening introduced in Swift 5.7 \cite{se0352} offers an elegant way around this problem: +For many purposes, implicit existential opening introduced in \IndexSwift{5.7}Swift 5.7 \cite{se0352} offers an elegant way around this problem: \begin{Verbatim} protocol Animal {...} @@ -554,7 +556,7 @@ \section{Self-Conforming Protocols}\label{selfconformingprotocols} \fi -\section{Source Code Reference} +\section[]{Source Code Reference} \iffalse @@ -599,4 +601,4 @@ \section{Source Code Reference} \fi -\end{document} \ No newline at end of file +\end{document} diff --git a/docs/Generics/chapters/extensions.tex b/docs/Generics/chapters/extensions.tex index 7ad265e19317b..8a6a4e99e2f25 100644 --- a/docs/Generics/chapters/extensions.tex +++ b/docs/Generics/chapters/extensions.tex @@ -4,11 +4,7 @@ \chapter{Extensions}\label{extensions} -\IndexDefinition{extension declaration} -\index{value declaration} -\index{qualified lookup} -\IndexDefinition{extended type} 
-\lettrine{E}{xtensions add members} to existing nominal type declarations. We refer to this nominal type declaration as the \emph{extended type} of the extension. The extended type may have been declared in the same source file, another source file of the main module, or in some other module. Extensions themselves are declarations, but they are \emph{not} value declarations in the sense of Chapter~\ref{decls}, meaning the extension itself cannot be referenced by name. Instead, the members of an extension are referenced as members of the extended type, visible from qualified name lookup. +\lettrine{E}{xtensions add members} to existing nominal type declarations. We refer to this nominal type declaration as the \IndexDefinition{extended type}\emph{extended type} of the extension. The extended type may have been declared in the same source file, another source file of the main module, or in some other module. Extensions themselves are \IndexDefinition{extension declaration}declarations, but they are \emph{not} \index{value declaration}value declarations in the sense of \ChapRef{decls}, meaning the extension itself cannot be referenced by name. Instead, the members of an extension are referenced as members of the extended type, visible to \index{qualified lookup}qualified name lookup. Consider a module containing a pair of struct declarations, \texttt{Outer} and \texttt{Outer.Middle}: \begin{Verbatim} @@ -25,22 +21,19 @@ \chapter{Extensions}\label{extensions} \end{Verbatim} If a third module subsequently imports both the first and second module, it will see the members \texttt{Outer.Middle.foo()} and \texttt{Outer.Middle.Inner} just as if they were defined inside \texttt{Outer.Middle} itself. -\index{interface type} -\index{normal conformance} \paragraph{Extensions and generics.} -Extensions have generic signatures, so the interface types of their members may involve the generic parameters of the extended type. 
The generic signature of an ``unconstrained'' extension is the same as that of the extended type. Extensions can impose additional requirements on their generic parameters via a \texttt{where} clause; this declares a \emph{constrained extension} with its own generic signature (Section~\ref{constrained extensions}). An extension can also state a conformance to a protocol, which is represented as normal conformance visible to global conformance lookup. If the extension is unconstrained, this is equivalent to a conformance on the extended type. If the extension is constrained, the conformance becomes a \emph{conditional conformance} (Section~\ref{conditional conformance}).
+The \index{generic parameter declaration}generic parameters of the extended type are visible in the \index{scope tree}scope of the extension's body. Each extension has a generic signature, which describes the \index{interface type}interface types of its members. The generic signature of an ``unconstrained'' extension is the same as that of the extended type. Extensions can impose additional requirements on their generic parameters via a \Index{where clause@\texttt{where} clause}\texttt{where} clause; this declares a \emph{constrained extension} with its own generic signature (\SecRef{constrained extensions}). An extension can also state a conformance to a protocol, which is represented as a \index{normal conformance}normal conformance visible to global conformance lookup. If the extension is unconstrained, this is essentially equivalent to stating a conformance on the extended type. If the extension is constrained, the conformance becomes a \emph{conditional conformance} (\SecRef{conditional conformance}).

-\index{generic parameter list}
-Let's begin by taking a closer look at generic parameter lists of extensions, by way of our nested type \texttt{Outer.Middle} and its extension above.
Recall how generic parameters of nominal types work from Chapter~\ref{generic declarations}: their names are lexically scoped to the body of the type declaration, and each generic parameter uniquely identified by its depth and index. We can represent the declaration context nesting and generic parameter lists of our nominal type declaration \texttt{Outer.Middle} with a diagram like the following:
+Let's begin by taking a closer look at \index{generic parameter list}generic parameter lists of extensions, by way of our nested type \texttt{Outer.Middle} and its extension above. Recall how generic parameters of nominal types work from \SecRef{generic params}: their names are lexically scoped to the body of the type declaration, and each generic parameter is uniquely identified by its \index{depth}depth and \index{index}index. We can represent the declaration context nesting and generic parameter lists of our nominal type declaration \texttt{Outer.Middle} with a diagram like the following:
\begin{quote}
\begin{tikzpicture}[node distance=5mm and 5mm,text height=1.5ex,text depth=.25ex]
\node (SourceFile) [SourceFile] {source file};

-\node (Outer) [below=of SourceFile,xshift=4em,decl] {struct \texttt{Outer}};
+\node (Outer) [below=of SourceFile,xshift=4em,decl, outer sep=0.2em] {struct \texttt{Outer}};
\node (OuterGP) [right=of Outer,xshift=2em,OtherEntity] {generic parameter list \texttt{}};

-\node (Middle) [below=of Outer,xshift=4em,decl] {struct \texttt{Middle}};
+\node (Middle) [below=of Outer,xshift=4em,decl, outer sep=0.2em] {struct \texttt{Middle}};
\node (MiddleGP) [right=of Middle,xshift=2em,OtherEntity] {generic parameter list \texttt{}};

\draw [arrow] (Middle.west) -| (Outer.south);
@@ -52,7 +45,7 @@ \chapter{Extensions}\label{extensions}
\end{tikzpicture}
\end{quote}
This is implemented by cloning the generic parameter declarations. Since every generic parameter in a single generic parameter list must have the same depth, an extension conceptually has multiple generic parameter lists, one for each level of depth (that is, the generic context nesting) of the extended type.
+We create the fiction that each generic parameter in the scope of the extended type is also visible inside the extension by \emph{cloning} generic parameter declarations. The cloned declarations have the same name, depth and index as the originals, but they are parented to the extension. This ensures that looking up a generic parameter inside an extension finds a generic parameter with the same depth and index as the one with the same name in the extended type. Since all generic parameters in a single generic parameter list have the same depth, an extension conceptually has multiple generic parameter lists, one for each level of depth (that is, the generic context nesting) of the extended type. This is represented by linking the generic parameter lists together via an optional ``outer list'' pointer. The innermost generic parameter list is ``the'' generic parameter list of the extension, and the head of the list; it is cloned from the extended type. Its outer list pointer is cloned from the extended type's parent generic context, if any, and so on. The outermost generic parameter list has a null outer list pointer.
In our extension of \texttt{Outer.Middle}, this looks like so: \begin{quote} @@ -60,7 +53,7 @@ \chapter{Extensions}\label{extensions} \node (SourceFile) [SourceFile] {source file}; -\node (Middle) at (0, -2) [xshift=6.5em,decl] {extension \texttt{Outer.Middle}}; +\node (Middle) at (0, -2) [xshift=6.5em,decl, outer sep=0.2em] {extension \texttt{Outer.Middle}}; \node (MiddleGP) [right=of Middle,xshift=2em,OtherEntity] {generic parameter list \texttt{}}; \node (OuterGP) [above=of MiddleGP,OtherEntity] {generic parameter list \texttt{}}; @@ -73,7 +66,7 @@ \chapter{Extensions}\label{extensions} \end{tikzpicture} \end{quote} -Unqualified name lookup traverses the outer list pointer when searching for generic parameter declarations by name. A generic parameter found inside an extension always has the same depth and index as the original generic parameter it was cloned from in the extended type. Now, consider the nested type \texttt{Inner} from our example, which declares a generic parameter list of its own: +Unqualified name lookup traverses the outer list pointer when searching for generic parameter declarations by name. 
Now, consider the nested type \texttt{Inner} from our example, which declares a generic parameter list of its own: \begin{Verbatim} extension Outer.Middle { struct Inner {} @@ -85,8 +78,8 @@ \chapter{Extensions}\label{extensions} \node (SourceFile) [SourceFile] {source file}; -\node (Middle) at (0, -2) [xshift=6.5em,decl] {extension \texttt{Outer.Middle}}; -\node (Inner) [below=of Middle,xshift=4em,decl] {struct \texttt{Inner}}; +\node (Middle) at (0, -2) [xshift=6.5em,decl, outer sep=0.2em] {extension \texttt{Outer.Middle}}; +\node (Inner) [below=of Middle,xshift=4em,decl, outer sep=0.2em] {struct \texttt{Inner}}; \node (InnerGP) [right=of Inner,xshift=2em,OtherEntity] {generic parameter list \texttt{}}; \node (MiddleGP) [right=of Middle,xshift=2em,OtherEntity] {generic parameter list \texttt{}}; @@ -101,29 +94,30 @@ \chapter{Extensions}\label{extensions} \end{tikzpicture} \end{quote} +The \index{declared interface type}declared interface type of an extension is the declared interface type of the extended type; the \index{self interface type}self interface type of an extension is the self interface type of the extended type. The two concepts coincide except when the extended type is a \index{protocol extension}protocol, in which case the declared interface type is the protocol type, whereas the self interface type is the \Index{protocol Self type@protocol \texttt{Self} type}protocol \texttt{Self} type. -\paragraph{Other behaviors.} Syntactically, the extended type is written in a type representation following the \texttt{extension} keyword. As the extended type must ultimately resolve to a nominal type, this is typically an identifier type representation. Extension binding uses a more primitive form of type resolution, for reasons explained in Section~\ref{extension binding}. 
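From the source-language perspective, the effect of cloning is simply that the extended type's generic parameters are usable by name inside the extension. A small sketch, with hypothetical parameter names \texttt{T} and \texttt{U} chosen for illustration:

```swift
struct Outer<T> {
    struct Middle<U> {}
}

// Inside this extension, T (depth 0, index 0) and U (depth 1, index 0)
// resolve to cloned generic parameter declarations parented to the
// extension, sharing the name, depth, and index of the originals.
extension Outer.Middle {
    func pair(_ t: T, _ u: U) -> (T, U) { (t, u) }
}

let m = Outer<Int>.Middle<String>()
print(m.pair(1, "x")) // (1, "x")
```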
Extension members generally behave as members of the extended type, with a few caveats:
+\paragraph{Other behaviors.} An extension member generally behaves as if it were declared inside the extended type itself, with some differences:

\begin{itemize}
-\item Members of \index{protocol extension}\emph{protocol extensions} are not requirements imposed on the concrete type in the way that members of a protocol declaration are; they actually have implementations.
+\item Members of \index{protocol extension}\emph{protocol extensions} are not requirements imposed on the concrete type in the way that members of a protocol declaration are; they actually have bodies.

-A protocol extension member with the same name and \index{interface type}interface type as a protocol requirement acts as a \IndexDefinition{default witness}\emph{default witness}, to be used if the conforming type does not provide its own witness for this requirement. The protocol extension below declares a default implementation of the protocol requirement \texttt{f()}, together with a utility function \texttt{g()}:
+A protocol extension member with the same name and \index{interface type}interface type as a protocol requirement acts as a \IndexDefinition{default witness}\emph{default witness}, to be used if the conforming type does not provide its own witness for this requirement:
\begin{Verbatim}
protocol P {
  func f()
}

extension P {
-  func f() {}
-  func g() {
-    f()
+  func f() {
+    print("default witness for P.f()")
  }
}

-struct S: P {}  // conformance uses the default witness for P.f()
+struct S: P {}
+
+S().f()  // calls the default witness for P.f()
\end{Verbatim}

\index{stored property declaration}
\index{struct declaration}
-\index{limitation}
\item Extensions of struct and class declarations cannot add stored properties, only computed properties:
\begin{Verbatim}
struct S {
-This is for several reasons. Semantically, all stored properties must be initialized by all declared constructors; if extensions could add stored properties unknown to a previously-declared constructor, this invariant would be broken. Another reason is that the stored property layout of a struct or class is computed when the struct or class declaration is emitted, and there is no mechanism for extensions to alter this layout after the fact. +All stored properties must be initialized by all declared constructors; extensions being able to introduce stored properties hithero-unknown to the original module would break this invariant. Another reason is that the stored property layout of a struct or class is computed when the struct or class declaration is emitted, and there is no mechanism for extensions to alter this layout after the fact. \index{enum declaration} \item Extensions cannot add new cases to enum declarations, for similar reasons as stored properties; doing so would complicate both the static exhaustiveness checking for \texttt{switch} statements and the in-memory layout computation of the enum. @@ -144,47 +138,34 @@ \chapter{Extensions}\label{extensions} \item When the extended type is a class, the methods of the extension are implicitly \texttt{final}, and the extension's methods are not permitted to override methods from the superclass. Non-\texttt{final} methods declared inside a class are dispatched through a vtable of function pointers attached to the runtime metadata of the class, and there is no mechanism for extensions to add new entries or replace existing entries inherited from the superclass. \index{nested type declaration} -\item The rules for nested types in extensions are the same as nested types inside nominal types (Section~\ref{nested nominal types}). 
That is, extensions of structs, enums and classes can declare nested structs, enums and classes; while protocol extensions cannot declare nested nominal types for the same reason that protocols cannot. Extensions themselves always appear at the top level of a source file, but the extended type can be a nominal type nested inside of another nominal type (or extension).
-
-\index{self interface type}
-\index{declared interface type}
-\Index{protocol Self type@protocol \texttt{Self} type}
-\item The declared interface type of an extension is the declared interface type of the extended type; the self interface type of an extension is the self interface type of the extended type. The two concepts coincide except when the extended type is a protocol, in which case the declared interface type is the protocol type, whereas the self interface type is the protocol \texttt{Self} type.
+\item The rules for nested types in extensions are the same as those for other nominal types (\SecRef{nested nominal types}). An extension of a struct, enum or class may contain nested structs, enums and classes, while a protocol extension cannot, just as a protocol itself cannot. Extensions themselves must be at the top level of a source file, but the extended type can be nested inside of another nominal type (or extension).
\end{itemize}

\section{Extension Binding}\label{extension binding}

-\IndexDefinition{extension binding}%
-\index{qualified lookup}%
-\index{name lookup}%
-Extension members are visible to qualified lookup because \emph{extension binding} establishes a two-way association between an extension declaration and its extended type. Extension binding resolves the extended type representation to obtain the extended type, and then adds the extension's members to the extended type's name lookup table. Extension binding runs very early in the type checking process, immediately after parsing and import resolution.
A complication is that the extended type of an extension can itself be declared inside of another extension. Extension binding cannot simply visit all extension declarations in a single pass in source order, because the order of declarations in a source file, and the order of source files within a module, should have no semantic effect. Instead, multiple passes are performed; a failure to bind an extension is not a fatal condition, and instead failed extensions are revisited later after subsequent extensions are successfully bound. This process iterates until fixed point. +The extended type of an extension is given as a \index{type representation}type representation following the \texttt{extension} keyword. Extension members are made available to \index{qualified lookup}qualified lookup by the process of \IndexDefinition{extension binding}\emph{extension binding}, which resolves the type representation of an extension to its extended type, and adds the extension's members to the extended type's name lookup table. + +A complication is that the extended type of an extension can itself be declared inside of another extension. Extension binding cannot simply visit all extension declarations in a single pass in source order, because of ordering dependencies between extensions and nested types. Instead, multiple passes are performed; a failure to bind an extension is not a fatal condition, and instead failed extensions are revisited later after subsequent extensions are successfully bound. This process iterates until fixed point. \begin{algorithm}[Extension binding]\label{extension binding algorithm} Takes a list of all extensions in the main module as input, in any order. \begin{enumerate} -\item Initialize the pending list to contain all extensions. -\item Initialize the delayed list to the empty list. -\item Initialize the flag, initially clear. +\item Initialize the pending list, adding all extension declarations. +\item Initialize the delayed list to be empty. 
+\item Initialize the flag to be clear. \item (Check) If the pending list is empty, go to Step~6. -\item (Resolve) Remove an extension from the pending list and attempt to resolve its extended type. If resolution succeeds, associate the extension with the resolved nominal type declaration and set the flag. If resolution fails, add the extension to the delayed list but do not emit any diagnostics and leave the flag unchanged. Go back to Step~4. -\item (Retry) If the flag is set, move the entire contents of the delayed list to the pending list, clear the flag, and go back to Step~4. Otherwise, return. +\item (Resolve) Remove an extension from the pending list and attempt to resolve its extended type. If resolution succeeds, associate the extension with the resolved nominal type declaration and set the flag. If resolution fails, add the extension to the delayed list but do not emit any diagnostics and leave the flag unchanged. Go to Step~4. +\item (Retry) If the flag is set, move the entire contents of the delayed list to the pending list, clear the delayed list, clear the flag, and go to Step~4. Otherwise, return. \end{enumerate} \end{algorithm} -\index{type-check source file request} -\index{primary file} -Any extensions with an unresolvable extended type will simply remain on the delayed list when this algorithm returns; the algorithm does not actually emit diagnostics. Unresolvable extended types are diagnosed later, when the \textbf{type-check source file request} visits the declarations in each primary file. Upon encountering an extension which failed extension binding, the type-check source file request resolves the extended type using ordinary type resolution. If this fails, we simply diagnose the unknown type. Otherwise, we're in a more subtle situation where the extended type is resolvable by ordinary type resolution, but not the limited form of type resolution used by extension binding, which is described below. 
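The fixed-point iteration above can be sketched as ordinary code. This is a toy model rather than the compiler's implementation: here an extension binds once its extended type's name is known, and a successfully bound extension makes the names of its nested types known in turn:

```swift
struct Ext {
    var extendedName: String      // name written after `extension`
    var declaredNames: [String]   // nested types this extension declares
}

func bindExtensions(_ all: [Ext],
                    topLevelTypes: Set<String>) -> (bound: Int, failed: Int) {
    var known = topLevelTypes
    var pending = all             // Step 1
    var delayed: [Ext] = []       // Step 2
    var bound = 0
    var flag = true               // Step 3 (set here only to enter the loop)
    while flag {
        flag = false
        while let ext = pending.popLast() {        // Steps 4-5 (Check, Resolve)
            if known.contains(ext.extendedName) {
                known.formUnion(ext.declaredNames) // binding succeeded
                bound += 1
                flag = true
            } else {
                delayed.append(ext)                // retry in a later pass
            }
        }
        if flag {                                  // Step 6 (Retry)
            pending = delayed
            delayed = []
        }
    }
    return (bound, delayed.count)
}
```

Extensions still on the delayed list when the loop terminates are exactly the ones whose extended type could not be resolved, to be diagnosed later.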
+The worklist-driven extension binding algorithm was introduced in \IndexSwift{5.0}Swift~5. Older compiler releases attempted to bind extensions in a single pass, something that could either succeed or fail depending on declaration order. This incorrect behavior was one of the most frequently reported bugs of all time \cite{sr631}. -The worklist-driven extension binding algorithm was introduced in \index{history}Swift~5.0. Older compiler releases attempted to bind extensions in a single pass, something that could either succeed or fail depending on declaration order. This incorrect behavior was one of the most frequently reported bugs of all time \cite{sr631}. +\paragraph{Invalid extensions.} If extension binding fails to resolve the extended type of an extension, it simply remains on the delayed list without any diagnostics emitted. Invalid extensions are \index{diagnostic!extension binding}diagnosed later, when the \index{type-check source file request}\Request{type-check source file request} visits all \index{primary file}primary files and attempts to resolve the extended types of any extensions again. -\index{limitation}% -\paragraph{Unsupported extensions.} -\index{synthesized declaration}% -\index{identifier type representation}% -\index{generic type alias}% -\index{type resolution}% -\index{associated type inference}% -Ordinary type resolution depends on generic signature construction and protocol conformance checking; however, we cannot do these things until after extension binding is complete, lest they depend on name lookup finding members of extensions. For this reason, extension binding must use a minimal form of type resolution that does not call on other parts of the type checker. In particular, extension binding cannot find declarations synthesized by the compiler, including type aliases created by associated type inference. 
Also, it cannot perform type substitution; this rules out extensions of generic type aliases whose underlying type is itself a type parameter. +Extension binding uses a more limited form of type resolution, because we only need to resolve the type representation to a \emph{type declaration} and not a \emph{type}. This type declaration must be a \index{nominal type}nominal type declaration, so the extended type is typically written as an \index{identifier type representation}\emph{identifier} or \index{member type representation}\emph{member} type representation (Sections \ref{identtyperepr}~and~\ref{member type repr}). Extension binding runs early in the type checking process, immediately after parsing and import resolution. We cannot build \index{generic signature}generic signatures or \index{conformance checker}check conformances in extension binding, because those requests assume that extension binding has already taken place, freely relying on qualified lookup to find members of arbitrary extensions. -This is the situation alluded to previously, where extension binding fails but the \textbf{type-check source file request} successfully resolves the extended type later. In this case, a couple of additional checks help tailor the diagnostic. If ordinary type resolution returned a type alias type \texttt{Foo} desugaring to a nominal type \texttt{Bar}, the type checker emits the ``extension of type \texttt{Foo} must be declared as an extension of \texttt{Bar}'' diagnostic. Even though we know what the extended type should be at this point, we must still diagnose an error; it is too late to bind the extension, because other name lookups may have already been performed, potentially missing out on finding members of this extension. 
+In particular, extension binding \index{limitation!extension binding}cannot find declarations \index{synthesized declaration}synthesized by the compiler, including type aliases created by \index{associated type inference}associated type inference. Also, it cannot perform type substitution; this rules out extensions of \index{generic type alias}generic type aliases whose underlying type is itself a type parameter. + +If extension binding fails but the \Request{type-check source file request} successfully resolves the extended type when it visits the extension later, we emit a special diagnostic, tailored by a couple of additional checks. If ordinary type resolution returned a \index{type alias type}type alias type \texttt{Foo} desugaring to a nominal type \texttt{Bar}, the type checker emits the ``extension of type \texttt{Foo} must be declared as an extension of \texttt{Bar}'' diagnostic. Even though we know what the extended type should be at this point, we must still diagnose an error; it is too late to bind the extension, because other name lookups may have already been performed, potentially missing out on finding members of this extension.
\begin{example}\label{bad extension 1} An invalid extension of an inferred type alias:\index{horse} \begin{Verbatim} @@ -212,14 +193,22 @@ \section{Extension Binding}\label{extension binding} // extension of `Int' extension G<Array<Int>> {...} \end{Verbatim} -Extension binding fails to resolve \texttt{G<Array<Int>>}, because this requires performing a type substitution that depends on the conformance \verb|Array<Int>: Sequence|: +Extension binding fails to resolve \texttt{G<Array<Int>>}, because this requires performing a type substitution that depends on the conformance $\ConfReq{Array<Int>}{Sequence}$: \begin{gather*} \texttt{T.Element}\otimes\SubstMapC{\SubstType{T}{Array<Int>}}{\SubstConf{T}{Array<Int>}{Sequence}}\\ \qquad {} =\texttt{Int} \end{gather*} \end{example} -The other case where extension binding fails but \textbf{type-check source file request} is able to resolve the extended type is when this type is not actually a nominal type. In this situation, the type checker emits the fallback ``non-nominal type cannot be extended'' diagnostic. +The other case where extension binding fails but the \Request{type-check source file request} is able to resolve the extended type is when this type is not actually a nominal type. In this situation, the type checker emits the fallback ``non-nominal type cannot be extended'' diagnostic: +\begin{Verbatim} +typealias Fn = () -> () + +// error: you wish +extension Fn { + ... +} +\end{Verbatim} The extension binding algorithm is quadratic in the worst case, where each pass over the pending list is only able to bind exactly one extension. However, only unrealistic code examples trigger this pathological behavior. Indeed, the first pass binds all extensions of types not nested inside of other extensions, and the second pass binds extensions of types nested inside extensions that were bound by the first pass, which covers the vast majority of cases in reasonable user programs.
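The second pass can be seen in a small sketch (the declaration names here are invented for illustration). The extension of \texttt{Outer.Inner} cannot be bound until the extension of \texttt{Outer} has been bound, because the nested type \texttt{Inner} is itself a member of an extension:
\begin{Verbatim}
// Bound in the second pass:
extension Outer.Inner {...}

struct Outer {}

// Bound in the first pass:
extension Outer {
  struct Inner {}
}
\end{Verbatim}
Declaration order does not matter here; the worklist algorithm simply retries the first extension once \texttt{Outer.Inner} becomes visible.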
@@ -246,28 +235,25 @@ \section{Extension Binding}\label{extension binding} \paragraph{Local types.} \index{local type declaration} -\index{limitation} +\index{limitation!conditional conformance and local types} Because extensions can only appear at the top level of a source file, the extended type must ultimately be visible from the top level. This allows extensions of types nested inside other top-level types, but precludes extensions of local types nested inside of functions or other local contexts, because there is ultimately no way to name a local type from the top level of a source file. (As a curious consequence, local types cannot conditionally conform to protocols, since the only way to declare a conditional conformance is with an extension!) -\section{Direct Lookup} +\section{Direct Lookup}\label{direct lookup} -Now we're going to take closer look at \IndexDefinition{direct lookup}\emph{direct lookup}, the primitive operation that looks for a \index{value declaration}value declaration with the given name inside of a nominal type and its extensions. This was first introduced in Section~\ref{name lookup} as the layer below qualified \index{name lookup}name lookup, which performs direct lookups into the base type, its conformed protocols, and superclasses. +Now we're going to take a closer look at \IndexDefinition{direct lookup}\emph{direct lookup}, the primitive operation that looks for a \index{value declaration}value declaration with the given name inside of a nominal type and its extensions. This was first introduced in \SecRef{name lookup} as the layer below qualified \index{name lookup}name lookup, which performs direct lookups into the base type, its conformed protocols, and superclasses. Nominal type declarations and extensions are \IndexDefinition{iterable declaration context}\emph{iterable declaration contexts}, meaning they contain member declarations.
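For instance, in this sketch (with invented names), both the struct declaration and its extension are iterable declaration contexts, each containing a single member:
\begin{Verbatim}
struct Horse {
  func neigh() {}   // member of the struct declaration
}

extension Horse {
  func gallop() {}  // member of the extension
}
\end{Verbatim}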
Before discussing direct lookup, let's consider what happens if we ask an iterable declaration context to list its members. This is a lazy operation which triggers work the first time it is called: \begin{itemize} -\item Iterable declaration contexts parsed from source are populated by \index{delayed parsing}delayed parsing; when the \index{parser}parser first reads a source file, the bodies of iterable declaration contexts are skipped, and only the source range is recorded. Asking for the list of members goes and parses the source range again, constructing declarations from their parsed representation (Section~\ref{delayed parsing}). -\index{serialized module} -\index{imported module} -\index{Objective-C} -\item Iterable declaration contexts from binary and imported modules are equipped with a \IndexDefinition{lazy member loader}\emph{lazy member loader} which serves a similar purpose. Asking the lazy member loader to provide a list of members will build Swift declarations from deserialized records or imported Clang declarations. The lazy member loader can also find declarations with a specific name, as explained below. +\item Iterable declaration contexts parsed from source are populated by \index{delayed parsing}delayed parsing; when the \index{parser}parser first reads a source file, the bodies of iterable declaration contexts are skipped, and only the source range is recorded. Asking for the list of members goes and parses the source range again, constructing declarations from their parsed representation (\SecRef{delayed parsing}). +\item Iterable declaration contexts from \index{serialized module}binary and \index{imported module}imported modules are equipped with a \IndexDefinition{lazy member loader}\emph{lazy member loader} which serves a similar purpose. Asking the lazy member loader to list all members will build the corresponding Swift declarations from deserialized records or imported Clang declarations. 
The lazy member loader can also find just those declarations with a \emph{specific} name, as explained below; this is the more common operation, since it is much more efficient. \end{itemize} \paragraph{Member lookup table.} -Every nominal type declaration has an associated \IndexDefinition{member lookup table}\emph{member lookup table}, which is used for direct lookup. This table maps each identifier to a list of value declarations with that name (multiple value declarations can share a name because Swift allows type-based overloading). The declarations in a member lookup table are understood to be members of one or more iterable declaration contexts, which are exactly the type declaration itself and all of its extensions. These iterable declaration contexts might originate from a mix of different module kinds. For example, the nominal type itself might be an Objective-C class from an imported Objective-C module, with one extension declared in a binary Swift module, and another extension defined in the main module, parsed from source. +Every nominal type declaration has an associated \IndexDefinition{member lookup table}\emph{member lookup table}, which is used for direct lookup. This table maps each identifier to a list of value declarations with that name (multiple value declarations can share a name because Swift allows type-based overloading). The declarations in a member lookup table are understood to be members of one or more iterable declaration contexts, which are exactly the type declaration itself and all of its extensions. These iterable declaration contexts might originate from a mix of different module kinds. For example, the nominal type itself might be an \index{Objective-C}Objective-C class from an imported Objective-C module, with one extension declared in a binary Swift module, and another extension defined in the main module, parsed from source. -The lookup table is populated lazily, in a manner resembling a state machine. 
The first direct lookup into a nominal type does populate the member lookup table with all members from any \emph{parsed} iterable declaration contexts, which might trigger delayed parsing. Each entry in the member lookup table stores a ``complete'' bit. Initially the ``complete'' bit of these entries is \emph{not} set, meaning that the entry only contains those members that were parsed from source. If any iterable declaration contexts originate from binary and imported modules, direct lookup then asks each lazy member loader to selectively load only those members with the given name. After the lazy member loaders do their work, the lookup table entry for this name is now complete, and the ``complete'' bit is set. Later when direct lookup finds a member lookup table entry with the ``complete'' bit set, it knows this entry is fully populated, and the stored list of declarations is returned immediately without querying the lazy member loaders. +The lookup table is populated lazily, in a manner resembling a state machine. Say we're asked to perform a direct lookup for some given name. If this is the first direct lookup, we populate the member lookup table with \emph{all} members from any \emph{parsed} iterable declaration contexts, which might trigger delayed parsing. Each entry in the member lookup table stores a ``complete'' bit. The ``complete'' bit of these initially-populated entries is \emph{not} set, meaning that the entry only contains those members that were parsed from source. If any iterable declaration contexts originate from binary and imported modules, direct lookup then asks each lazy member loader to selectively load only those members with the given name. (Parsed declaration contexts do not offer this level of laziness, because there is no way to parse only a subset of the members.) After the lazy member loaders do their work, the lookup table entry for this name is now complete, and the ``complete'' bit is set.
Later when direct lookup finds a member lookup table entry with the ``complete'' bit set, it knows this entry is fully populated, and the stored list of declarations is returned immediately without querying the lazy member loaders. -This \IndexDefinition{lazy member loading}\emph{lazy member loading} mechanism ensures that only those members which are actually referenced in a compilation session are loaded from serialized and imported iterable declaration contexts. Iterable declaration contexts parsed from source do not offer this level of fine-grained lazyness, because there is no way to parse only those members having a given name. +This \emph{lazy member loading} mechanism ensures that only those members which are actually referenced in a compilation session are loaded from serialized and imported iterable declaration contexts. \begin{listing}\captionabove{A class implemented in Objective-C, with an extension written in Swift}\label{lazy member listing} \begin{Verbatim} @@ -299,7 +285,7 @@ \section{Direct Lookup} \end{listing} \begin{example} -Listing~\ref{lazy member listing} shows an example of lazy member loading. Observe the following: +\ListingRef{lazy member listing} shows an example of lazy member loading. Observe the following: \begin{itemize} \item The \texttt{NSHorse} class itself is declared in Objective-C in a header file. Suppose this header file is part of the \texttt{HorseKit} module, imported by the two Swift source files. \item The Swift source file \texttt{b.swift} declares an extension of \texttt{NSHorse}. Let's say that this is a secondary file in the current frontend job. @@ -403,7 +389,7 @@ \section{Direct Lookup} \end{example} \paragraph{History.} -Lazy member loading was introduced in Swift~4.1 to avoid wasted work in deserializing or importing members that are never referenced. The speedup is most pronounced when multiple frontend jobs perform direct lookup into a common set of deserialized or imported nominal types. 
The overhead of loading all members of this common set of types was multiplied across frontend jobs prior to the introduction of lazy member loading. +Lazy member loading was introduced in \IndexSwift{4.1}Swift 4.1 to avoid wasted work in deserializing or importing members that are never referenced. The speedup is most pronounced when multiple frontend jobs perform direct lookup into a common set of deserialized or imported nominal types. The overhead of loading all members of this common set of types was multiplied across frontend jobs prior to the introduction of lazy member loading. \section{Constrained Extensions}\label{constrained extensions} @@ -421,11 +407,11 @@ \section{Constrained Extensions}\label{constrained extensions} \begin{Verbatim} extension Set where Element == Int {...} \end{Verbatim} -The generic signature of \texttt{Set} is \verb|<Element where Element: Hashable>|. Adding the same-type requirement $\FormalReq{Element == Int}$ makes the requirement $\ConfReq{Element}{Hashable}$ redundant since \texttt{Int} conforms to \texttt{Hashable}, so the extension's generic signature becomes \verb|<Element where Element == Int>|. +The generic signature of \texttt{Set} is \verb|<Element where Element: Hashable>|. Adding the same-type requirement $\SameReq{Element}{Int}$ makes the requirement $\ConfReq{Element}{Hashable}$ redundant since \texttt{Int} conforms to \texttt{Hashable}, so the extension's generic signature becomes \verb|<Element where Element == Int>|.
+This example demonstrates that while the requirements of the constrained extension abstractly imply those of the extended type, they are not a ``syntactic'' subset---the $\ConfReq{Element}{Hashable}$ requirement does not appear in the constrained extension's generic signature, because it became redundant when $\SameReq{Element}{Int}$ was added. -In the early days of Swift, this extension was not supported at all, because the compiler did not permit same-type requirements between generic parameters and concrete types; only dependent member types could be made concrete. This restriction was lifted in Swift~3, and many concepts described in this book, most importantly substitution maps and generic environments, were introduced as part of this effort. +In the early days of Swift, this extension was not supported at all, because the compiler did not permit same-type requirements between generic parameters and concrete types; only dependent member types could be made concrete. This restriction was lifted in \IndexSwift{3.0}Swift~3, and many concepts described in this book, most importantly substitution maps and generic environments, were introduced as part of this effort. \paragraph{Extending a generic nominal type.} An extension of a generic nominal type with generic arguments is shorthand for a \texttt{where} clause that constrains each generic parameter to a concrete type. The generic argument types must be fully concrete; they cannot reference the generic parameters of the extended type declaration. The previous example can be more concisely expressed using this syntax: \begin{Verbatim} @@ -436,7 +422,7 @@ \section{Constrained Extensions}\label{constrained extensions} typealias StringMap<Value> = Dictionary<String, Value> extension StringMap {...} \end{Verbatim} -This shorthand was introduced in Swift 5.7 \cite{se0361}. +This shorthand was introduced in \IndexSwift{5.7}Swift 5.7 \cite{se0361}.
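To spell out the desugaring described above, here are the two forms side by side (a sketch; the constrained form is the one from the earlier example):
\begin{Verbatim}
extension Set<Int> {...}

// ...is shorthand for:
extension Set where Element == Int {...}
\end{Verbatim}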
\paragraph{Extending a generic type alias.} A \index{generic type alias}generic type alias is called a \IndexDefinition{pass-through type alias}``pass-through type alias'' if it satisfies the following three conditions: @@ -454,7 +440,7 @@ \section{Constrained Extensions}\label{constrained extensions} \begin{Verbatim} struct Range<Bound: Comparable> {...} \end{Verbatim} -Prior to Swift 4.2, the standard library also had a separate \texttt{CountableRange} type with a different set of requirements: +Prior to \IndexSwift{4.2}Swift 4.2, the standard library also had a separate \texttt{CountableRange} type with a different set of requirements: \begin{Verbatim} struct CountableRange<Bound: Strideable> where Bound.Stride: SignedInteger {} @@ -464,7 +450,7 @@ \section{Constrained Extensions}\label{constrained extensions} typealias CountableRange<Bound: Strideable> = Range<Bound> where Bound.Stride: SignedInteger \end{Verbatim} -In Swift 4.1, the following were extensions of two different nominal types: +In \IndexSwift{4.1}Swift 4.1, the following were extensions of two different nominal types: \begin{Verbatim} extension Range {...} @@ -497,7 +483,9 @@ \section{Constrained Extensions}\label{constrained extensions} \section{Conditional Conformances}\label{conditional conformance} -A conformance declared on a nominal type declaration or an unconstrained extension implements the protocol's requirements for all specializations of the nominal type; we can call this an \emph{unconditional} conformance. In constrast, a conformance declared on a \emph{constrained} extension is what's known as a \IndexDefinition{conditional conformance}\emph{conditional conformance}, which implements the protocol requirements only for those specializations of the extended type which satisfy the requirements of the extension. For example, arrays have a natural notion of equality, defined in terms of the equality operation on the element type. However, there is no reason to require all arrays to store \texttt{Equatable} types.
Instead, the standard library defines a conditional conformance of \texttt{Array} to \texttt{Equatable} where the element type is \texttt{Equatable}: +A conformance written on a nominal type or unconstrained extension implements the protocol's requirements for all specializations of the nominal type, and we call this an \emph{unconditional} conformance. A conformance declared on a \emph{constrained} extension is what's known as a \IndexDefinition{conditional conformance}\emph{conditional conformance}, which implements the protocol requirements only for those specializations of the extended type which satisfy the requirements of the extension. Conditional conformances were introduced in \IndexSwift{4.2}Swift 4.2~\cite{se0143}. + +For example, arrays have a natural notion of equality, defined in terms of the equality operation on the element type. However, there is no reason to restrict all arrays to only storing \texttt{Equatable} types. Instead, the standard library defines a conditional conformance of \texttt{Array} to \texttt{Equatable} where the element type is \texttt{Equatable}: \begin{Verbatim} struct Array<Element> {...}
A nominal type can only conform to a protocol once, so in particular, \index{overlapping conformance}overlapping conditional conformances are not supported and we emit a \index{diagnostic!overlapping conformance}diagnostic: +\begin{Verbatim} +struct G<T> {} + +protocol P {} + +extension G: P where T == Int {} +extension G: P where T == String {} // error +\end{Verbatim} \paragraph{Computing conditional requirements.} -A \index{normal conformance}normal conformance stores a list of zero or more \IndexDefinition{conditional requirement}\emph{conditional requirements}; this is the ``difference'' between the requirements of the conforming context, and the requirements of the conforming type. A conditional conformance is a conformance with at least one conditional requirement; equivalently, it is a conformance declared on a constrained extension. Formally, this ``requirement difference'' operation takes two generic signatures, that of the conforming type, and the conforming context. We collect those requirements of the conforming context which are not already satisfied by the conforming type. -A simple example is the conditional conformance of \texttt{Dictionary} to \texttt{Equatable}: +A conditional conformance stores a list of \IndexDefinition{conditional requirement}\emph{conditional requirements}. If $G$ is the generic signature of the constrained extension where the conformance was declared, and $H$ is the generic signature of the conforming type, then certainly $G$ satisfies all requirements of $H$. If the converse is also true, we have an unconditional conformance. Otherwise, the conditional requirements of the conformance are precisely those requirements of $G$ not satisfied by $H$.
A simple example is the conditional conformance of \texttt{Dictionary} to \texttt{Equatable}: \begin{Verbatim} extension Dictionary: Equatable where Value: Equatable {...} \end{Verbatim} @@ -530,41 +525,36 @@ \section{Conditional Conformances}\label{conditional conformance} \end{quote} The first requirement is already satisfied by the generic signature of \texttt{Dictionary} itself; the second requirement is the conditional requirement of our conformance. -Algorithm~\ref{reqissatisfied}, for checking \index{generic arguments}generic arguments against the requirements of a nominal type's generic signature, is also used to compute conditional requirements. This algorithm takes a substituted requirement as input, which cannot contain type parameters. Here though, we need to find the requirements of a generic signature $G$ that are not satisfied by another generic signature $H$, rather than checking if a specific set of concrete types satisfy the requirements of $H$. To solve this problem, we reach for a generic environment and its archetypes. If $G$ is the generic signature of the extended type, and $H$ is the generic signature of the constrained extension, then the \index{archetype type}archetypes of the \index{generic environment}primary generic environment of $G$ abstractly represent ``the most general'' substitutions which satisfy the requirements of $G$. +We compute the conditional requirements by handing \AlgRef{check generic arguments algorithm} the requirements of $G$ together with the \index{forwarding substitution map}forwarding substitution map~$1_{\EquivClass{H}}$. The algorithm outputs a list of failed and unsatisfied requirements. They're not really ``failed'' or ``unsatisfied'' though; instead, the concatenation of these two lists is precisely the list of conditional requirements in our normal conformance. 
-\index{substituted requirement} -\begin{algorithm}[Computing conditional requirements]\label{conditional requirements algorithm} -As input, takes the generic signature of the conforming type, and the generic signature of the conforming context. We can assume the conforming context is a constrained extension, since otherwise the algorithm is trivial and always returns an empty list. -\begin{enumerate} -\item Let $G$ be the generic signature of the conforming type, and let $H$ be the generic signature of the constrained extension. -\item Initialize the output list of conditional requirements to be empty. -\item For each requirement of $H$, apply the \index{forwarding substitution map}forwarding substitution map $1_{\EquivClass{G}}$ to the requirement, to get a substituted requirement. -\item If the substituted requirement contains error types due to a substitution failure, add the original requirement to the output list. A substitution failure indicates that the requirement is unsatisfied, because it depends on some other unsatisfied conformance requirement. -\item Otherwise, we have a valid substituted requirement. Apply Algorithm~\ref{reqissatisfied} to check if the substituted requirement is satisfied. -\item If the substituted requirement is not satisfied, add it to the output list. -\item Otherwise the substituted requirement is satisfied, so it is not a conditional requirement. -\end{enumerate} -\end{algorithm} +The ``failed'' case corresponds to a conditional requirement whose subject type is not even a valid type parameter of $H$. 
The $\ConfReq{T.Element}{Hashable}$ requirement below has the subject type \texttt{T.Element}, which was defined by $\ConfReq{T}{Sequence}$, which is itself a conditional requirement, so $\ConfReq{G<T>}{Sequence}$ has two conditional requirements: \begin{Verbatim} +struct G<T> {} -\paragraph{Specialized conditional conformances.} -Applying a substitution map $\Sigma$ to a normal conformance $\ConfReq{$\texttt{T}_d$}{P}$ produces a \index{specialized conformance}specialized conformance $\ConfReq{$\texttt{T}_d$}{P}\otimes\Sigma$, which we denoted by $\ConfReq{T}{P}$, where $\texttt{T}=\texttt{T}_d\otimes\Sigma$, and $\texttt{T}_d$ is the declared interface type of the conforming nominal type declaration. If this underlying normal conformance is conditional, the conformance substitution map $\Sigma$ is the context substitution map of the conforming type \texttt{T} with respect to the constrained extension, and the input generic signature of $\Sigma$ is the generic signature of this extension. Thus, $\Sigma$ can be applied to the type parameters of the constrained extension, including the type parameters appearing in the constrained extension's requirements. The conditional requirements of a specialized conformance are then defined as the result of applying $\Sigma$ to the conditional requirements of the underlying normal conformance, making the following \index{commutative diagram}diagram commute: +extension G: Sequence where T: Sequence, T.Element: Hashable {...} +\end{Verbatim} +\paragraph{Specialized conditional conformances.} Building upon \SecRef{conformance lookup}, we now describe how substitution relates to conditional conformances.
If $\texttt{T}_d$ is the declared interface type of some nominal type declaration~$d$ that \emph{unconditionally} conforms to \texttt{P}, and $\texttt{T}=\texttt{T}_d\otimes\Sigma$ is a \index{specialized type}specialized type of~$d$ for some substitution map $\Sigma$, then looking up the conformance of \texttt{T} to $\protosym{P}$ returns a \index{specialized conformance}specialized conformance with \index{conformance substitution map}conformance substitution map~$\Sigma$: +\[\protosym{P}\otimes\texttt{T}=\ConfReq{$\texttt{T}_d$}{P}\otimes\Sigma\] +If $\ConfReq{$\texttt{T}_d$}{P}$ is conditional though, we cannot take~$\Sigma$ to be the \index{context substitution map!for a declaration context}context substitution map of~\texttt{T}. The \index{type witness}type witnesses of $\ConfReq{$\texttt{T}_d$}{P}$ might contain type parameters of the constrained extension, and not just the conforming type; however the context substitution map of \texttt{T} has the generic signature of the conforming type. We must set $\Sigma$ to be the context substitution map of \texttt{T} for the generic signature of the constrained extension, as in \SecRef{member type repr}. Indeed, this is the same problem as member type resolution when the referenced type declaration is declared in a constrained extension; we must perform some additional \index{global conformance lookup}global conformance lookups to populate the substitution map completely. + +The conditional requirements of the specialized conformance $\ConfReq{T}{P}$ are the \index{substituted requirement}substituted requirements obtained by applying $\Sigma$ to each conditional requirement of \ConfReq{$\texttt{T}_d$}{P}. 
This makes the following \index{commutative diagram}diagram commute for each conditional requirement~$R$: \begin{quote} -\newcommand{\GetConditionalRequirements}{\def\arraystretch{0.65}\arraycolsep=0pt\begin{array}{c}\text{get conditional}\\\text{requirements}\end{array}} +\newcommand{\GetConditionalRequirements}{\def\arraystretch{0.65}\arraycolsep=0pt\begin{array}{c}\text{get conditional}\\\text{requirement}\end{array}} \begin{tikzcd}[column sep=3cm,row sep=1cm] -\ConfReq{$\texttt{T}_d$}{P} \arrow[d, "\GetConditionalRequirements"{left}] \arrow[r, "\Sigma"] &\ConfReq{T}{P} \arrow[d, "\GetConditionalRequirements"] \\ -\text{original requirement} \arrow[r, "\Sigma"]&\text{substituted requirement} +\ConfReq{$\texttt{T}_d$}{P} \arrow[d, "\GetConditionalRequirements"{left}] \arrow[r, "\text{apply $\Sigma$}"] &\ConfReq{T}{P} \arrow[d, "\GetConditionalRequirements"] \\ +R \arrow[r, "\text{apply $\Sigma$}"]&R\otimes\Sigma \end{tikzcd} \end{quote} -Consider the $\ConfReq{Array<Int>}{Equatable}$ conformance from the standard library. While the generic signature of \texttt{Array} itself has no requirements, the generic signature of the constrained extension is \verb|<Element where Element: Equatable>|. The context substitution map of \texttt{Array<Int>} with respect to the constrained extension is: -\[\Sigma := \SubstMapLongC{\SubstType{Element}{Int}}{\SubstConf{Element}{Int}{Equatable}}\] -We can apply this substitution map to the normal conformance to get the specialized conditional conformance, denoted \ConfReq{Array<Int>}{Equatable}: +Consider the $\ConfReq{Array<\ttgp{0}{0}>}{Equatable}$ conformance from the standard library. The generic signature of \texttt{Array} is just \texttt{<\ttgp{0}{0}>} with no requirements, while the conformance context is the constrained extension with signature \texttt{<\ttgp{0}{0} where \ttgp{0}{0}:~Equatable>}.
The context substitution map of \texttt{Array<Int>} for the constrained extension is: +\[\Sigma := \SubstMapLongC{\SubstType{\ttgp{0}{0}}{Int}}{\SubstConf{\ttgp{0}{0}}{Int}{Equatable}}\] +We compose this substitution map with the normal conformance, and obtain a specialized conditional conformance, denoted \ConfReq{Array<Int>}{Equatable}: \[ -\ConfReq{Array<Element>}{Equatable} \otimes \Sigma = \ConfReq{Array<Int>}{Equatable} +\ConfReq{Array<\ttgp{0}{0}>}{Equatable} \otimes \Sigma = \ConfReq{Array<Int>}{Equatable} \] -The normal conformance has a single conditional requirement, \ConfReq{Element}{Equatable}. The conditional requirement of the specialized conformance is obtained by applying $\Sigma$: +The normal conformance has a single conditional requirement, $\ConfReq{\ttgp{0}{0}}{Equatable}$. We apply $\Sigma$ to get the conditional requirement of the specialized conformance: \[ -\ConfReq{Element}{Equatable} \otimes \Sigma = \ConfReq{Int}{Equatable} +\ConfReq{\ttgp{0}{0}}{Equatable} \otimes \Sigma = \ConfReq{Int}{Equatable} \] \paragraph{Global conformance lookup.} @@ -572,24 +562,95 @@ \section{Conditional Conformances}\label{conditional conformance} Global conformance lookup purposely does not check conditional requirements, allowing callers to answer two different questions: \begin{enumerate} \item Does this specialized type with its specific list of generic arguments conform to the protocol? -\item What requirements should these generic arguments satisfy to make the specialized type conform? +\item What requirements should a list of generic arguments satisfy to make the type conform? \end{enumerate} -It is up to the caller to apply Algorithm \ref{reqissatisfied} to each conditional requirement if the first interpretation is desired; if this holds, we say the type \emph{conditionally conforms}. For the second interpretation, the conditional requirements can be extracted for further processing.
+When the first interpretation is desired, a convenience entry point performs a global conformance lookup followed by \AlgRef{reqissatisfied} to check any conditional requirements.
 
-Recall that with a specialized type, global conformance lookup first finds the normal conformance, and then applies the context substitution map to produce a specialized conformance. If the conformance is conditional, the context substitution map is computed with respect to the constrained extension. For example, \texttt{Array} does not conditionally conform to \texttt{Equatable}, since \texttt{AnyObject} is not \texttt{Equatable}. Global conformance lookup will still construct a specialized conformance:
+Checking conditional requirements is more subtle than checking for the presence of invalid conformances in the conformance substitution map. It is true that their presence indicates a conditional requirement is unsatisfied; for example, \texttt{Array<AnyObject>} does not conditionally conform to \texttt{Equatable}, since \texttt{AnyObject} is not \texttt{Equatable}, and we get the following specialized conformance:
 \begin{gather*}
 \protosym{Equatable} \otimes \texttt{Array<AnyObject>}\\
-= \ConfReq{Array}{Equatable} \otimes \SubstMapLongC{\SubstType{Element}{AnyObject}}{\ConfReq{Element}{Equatable}\mapsto\text{invalid}}\\
+= \ConfReq{Array<\ttgp{0}{0}>}{Equatable} \otimes \SubstMapLongC{\SubstType{\ttgp{0}{0}}{AnyObject}}{\ConfReq{\ttgp{0}{0}}{Equatable}\mapsto\text{invalid}}\\
 = \ConfReq{Array<AnyObject>}{Equatable}
 \end{gather*}
-While the above conformance substitution map contains an invalid conformance, looking for invalid conformances in the conformance substitution map is not the correct way of checking conditional requirements. If the conforming type conditionally conforms, the conformance substitution map will be valid, but not vice versa, because the conditional requirements might not be conformance requirements.
The satisfiability of superclass, same-type, and layout requirement is not encoded in the conformance substitution map. In the below, $\ConfReq{Pair}{P}$ has a valid conformance substitution map, but the conditional requirement $\FormalReq{T == U}$ does not hold:
+However, the failure of conditional requirements of other kinds, such as \index{same-type requirement}same-type, \index{superclass requirement}superclass and \index{layout requirement}layout requirements, is not immediately apparent from the structure of the substitution map. In the example below, the conformance $\ConfReq{Pair}{Diagonal}$ might look fine at first sight because it does not contain any invalid conformances, but it does not satisfy the conditional requirement $\SameReq{\ttgp{0}{0}}{\ttgp{0}{1}}$:
 \begin{Verbatim}
-protocol P {}
+protocol Diagonal {}
 struct Pair<T, U> {}
-extension Pair: P where T == U {}
+extension Pair: Diagonal where T == U {}
+\end{Verbatim}
+Thus, conditional requirements must always be checked by~\AlgRef{reqissatisfied}, and not in some ``ad-hoc'' way, such as digging through the substitution map.
+
+\paragraph{Protocol inheritance.}
+\index{inherited protocol}
+Protocol inheritance is modeled as an \index{associated conformance requirement}associated conformance requirement on \texttt{Self}, so for instance \verb|Derived| has an associated conformance requirement $\ConfReq{Self}{Base}$:
+\begin{Verbatim}
+protocol Base {...}
+protocol Derived: Base {...}
+\end{Verbatim}
+When checking a conformance to \texttt{Derived}, the \index{conformance checker}conformance checker ensures that the conforming type satisfies the $\ConfReq{Self}{Base}$ requirement.
When the conformance to \texttt{Base} is unconditional, this always succeeds, because the conformance declaration also implies an unconditional conformance to the base protocol:
+\begin{Verbatim}
+struct Pair<T, U> {}
+extension Pair: Derived {...}
+// implies `extension Pair: Base {}'
+\end{Verbatim}
+The nominal type's \index{conformance lookup table}conformance lookup table synthesizes these implied conformances and makes them available to global conformance lookup. With conditional conformances, such implied conformances are not synthesized because there is no way to guess what the conditional requirements should be. The conformance checker still checks the associated conformance requirement on \texttt{Self} though, so the user must first explicitly declare a conformance to each base protocol when writing a conditional conformance.
+
+Suppose we wish for \texttt{Pair} to conform to \texttt{Derived} when $\SameReq{\ttgp{0}{0}}{Int}$:
+\begin{Verbatim}
+extension Pair: Derived where T == Int {...}
+\end{Verbatim}
+The compiler will \index{diagnostic!conditional conformance}diagnose an error unless there is also an \emph{explicit} conformance of \texttt{Pair} to \texttt{Base}. There are several possible ways to declare a conformance of \texttt{Pair} to \texttt{Base}. The simplest way is to conform unconditionally:
+\begin{Verbatim}
+extension Pair: Base {...}
+\end{Verbatim}
+Things get more interesting if the conformance to \texttt{Base} is also conditional, because then the conformance to \texttt{Derived} only makes sense if the conditional requirements of $\ConfReq{Pair}{Derived}$ imply the conditional requirements of $\ConfReq{Pair}{Base}$. We establish this condition as follows. There are three generic signatures in play here:
+\begin{enumerate}
+\item The signature of \texttt{Pair}; call it~$H$.
+\item The signature of the extension declaring conformance to \texttt{Base}; we call it~$G_1$.
+\item The signature of the extension declaring conformance to \texttt{Derived}; we call it~$G_2$.
+\end{enumerate}
+The conformance $\ConfReq{Pair}{Derived}$ satisfies the $\ConfReq{Self}{Base}$ associated conformance requirement of \texttt{Derived} if any specialization of \verb|Pair| which satisfies the conditional requirements of (3) also satisfies the conditional requirements of (2). In the \index{conformance checker}conformance checker, this falls out from the general case of checking associated requirements.
+
+To each associated requirement, we apply the \index{protocol substitution map}protocol substitution map for the normal conformance, followed by the forwarding substitution map for the generic signature of the conformance. In our example, this gives us the following substituted requirement:
+\[
+\ConfReq{Self}{Base}\otimes\Sigma_{\ConfReq{Pair}{Derived}}\otimes 1_{\EquivClass{G_2}} = \ConfReq{Pair<Int,$\archetype{U}$>}{Base}
+\]
+
+The next step asks \AlgRef{reqissatisfied} if the substituted requirement is satisfied. This issues a global conformance lookup and proceeds to check its conditional requirements:
+\[\protosym{Base}\otimes \texttt{Pair<Int,$\archetype{U}$>} = \ConfReq{\texttt{Pair<Int,$\archetype{U}$>}}{Base} \]
+Since $\texttt{Pair<Int,$\archetype{U}$>}\in\TypeObj{\EquivClass{G_2}}$, we get $\protosym{Base}\otimes \texttt{Pair<Int,$\archetype{U}$>}\in\ConfObj{\EquivClass{G_2}}$. The conditional requirements of this conformance are the requirements of $G_1$ substituted with the \index{primary archetype}primary archetypes of $G_2$. Knowing if they are satisfied or not determines the validity of the conformance to \texttt{Derived}.
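To make the general procedure concrete, the complete running example can be written out as follows, choosing \texttt{where T == Int} for the \texttt{Base} conformance as well (protocol and struct members elided):
\begin{Verbatim}
protocol Base {}
protocol Derived: Base {}

struct Pair<T, U> {}

extension Pair: Base where T == Int {}     // signature G1
extension Pair: Derived where T == Int {}  // signature G2
\end{Verbatim}
Here the conditional requirements of the \texttt{Derived} conformance trivially imply those of the \texttt{Base} conformance, since the two sets of requirements are identical.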
+ +We now look at three different definitions of the conformance $\ConfReq{Pair}{Base}$, and see how each one impacts the validity of the conformance $\ConfReq{Pair}{Derived}$: +\begin{enumerate} +\item +The conditional conformance of \texttt{Pair} to \texttt{Base} might also require that \texttt{T} is \texttt{Int}: +\begin{Verbatim} +extension Pair: Base where T == Int {...} +\end{Verbatim} +The conditional requirement here is $\SameReq{\ttgp{0}{0}}{Int}$, the same as in the \texttt{Derived} conformance, and after substitution we get: +\[\SameReq{\ttgp{0}{0}}{Int}\otimes \SubstMap{\SubstType{\ttgp{0}{0}}{Int},\,\SubstType{\ttgp{0}{1}}{$\archetype{U}$}}=\SameReq{Int}{Int}\] +This requirement is satisfied by \AlgRef{reqissatisfied}, so the conditional requirements of $\ConfReq{Pair}{Derived}$ imply the conditional requirements of $\ConfReq{Pair}{Base}$, and the conformance to \texttt{Derived} is valid. + +\item Instead, we could conform \texttt{Pair} to \texttt{Base} when \texttt{T} conforms to \texttt{Equatable}: +\begin{Verbatim} +extension Pair: Base where T: Equatable {...} +\end{Verbatim} +Now, $\ConfReq{Pair}{Base}$ has the conditional requirement $\ConfReq{\ttgp{0}{0}}{Equatable}$, and after substitution we get: +\[\ConfReq{\ttgp{0}{0}}{Equatable}\otimes\SubstMap{\SubstType{\ttgp{0}{0}}{Int},\,\SubstType{\ttgp{0}{1}}{$\archetype{U}$}}=\ConfReq{Int}{Equatable}\] +This is also satisfied, so once again our conditional conformance $\ConfReq{Pair}{Derived}$ is valid. 
+
+\item Finally, consider this declaration, which renders our conformance to \texttt{Derived} invalid:
+\begin{Verbatim}
+extension Pair: Base where U: Equatable {...}
 \end{Verbatim}
+The conditional requirement $\ConfReq{\ttgp{0}{1}}{Equatable}$ becomes $\ConfReq{$\archetype{U}$}{Equatable}$ after substitution:
+\[\ConfReq{\ttgp{0}{1}}{Equatable}\otimes \SubstMap{\SubstType{\ttgp{0}{0}}{Int},\,\SubstType{\ttgp{0}{1}}{$\archetype{U}$}}=\ConfReq{$\archetype{U}$}{Equatable}\]
+The archetype $\archetype{U}$ does not conform to \texttt{Equatable} because $\ConfReq{\ttgp{0}{1}}{Equatable}$ is not a \index{derived requirement}derived requirement of $G_2$, so the conditional requirement $\SameReq{\ttgp{0}{0}}{Int}$ of $\ConfReq{Pair}{Derived}$ does not imply the conditional requirement $\ConfReq{\ttgp{0}{1}}{Equatable}$ of $\ConfReq{Pair}{Base}$. For this reason, when $\ConfReq{Pair}{Base}$ is written as above, the compiler rejects our conformance $\ConfReq{Pair}{Derived}$.
+\end{enumerate}
 
-\begin{listing}[b!]\captionabove{Infinite recursion while building a specialized conditional conformance}\label{conditional conformance recursion}
+\paragraph{Termination.}
+Conditional conformances can express \index{non-terminating computation}non-terminating computation at compile time.
+The below code is taken from a bug report which remains \index{limitation!non-terminating conditional conformance}unfixed for the time being \cite{sr6724}:
 \begin{Verbatim}
 
 protocol P {}
 
@@ -608,11 +669,7 @@ \section{Conditional Conformances}\label{conditional conformance}
 func takesP<T: P>(_: T.Type) {}
 takesP(G.self)
 \end{Verbatim}
-\end{listing}
-
-\begin{example}
-Conditional conformances can express \index{non-terminating computation}non-terminating computation at compile time.
-The code shown in Listing~\ref{conditional conformance recursion} shows an example from a bug report which remains unfixed for the time being \cite{sr6724}.
The \texttt{takesP()} function has the below generic signature: +The \texttt{takesP()} function has the below generic signature: \begin{quote} \begin{verbatim} <τ_0_0 where τ_0_0: P> @@ -641,72 +698,93 @@ \section{Conditional Conformances}\label{conditional conformance} \[\ConfReq{\ttgp{0}{0}.[Q]A}{P}\otimes\Sigma=\ConfReq{G}{P}\] So \texttt{G} conditionally conforms to \texttt{P} if the conditional requirement $\ConfReq{G}{P}$ is satisfied; this conditional requirement is satisfied if \texttt{G} conditionally conforms to \texttt{P}. At this point, we are stuck in a loop, again. For now, the compiler crashes on this example while constructing the conformance substitution map; hopefully a future update to this book will describe the eventual resolution of this bug by imposing an iteration limit. -Our example only encodes an infinite loop, but conditional conformances can actually express arbitrary computation. This is shown for the \index{Rust}Rust programming language, which also has conditional conformances, in \cite{rustturing}. We'll also see another example of non-terminating compile-time computation in Section~\ref{recursive conformances}. -\end{example} +Our example only encodes an infinite loop, but conditional conformances can actually express arbitrary computation. This is shown for the \index{Rust}Rust programming language, which also has conditional conformances, in \cite{rustturing}. We'll also see another example of non-terminating compile-time computation in \SecRef{recursive conformances}. 
-\paragraph{Protocol inheritance.} -\index{inherited protocol} -Protocol inheritance is modeled by conformance requirements on \texttt{Self}, so for instance the requirement signature of \verb|P| has a conformance requirement $\ConfReq{Self}{Q}$: -\begin{Verbatim} -protocol P {...} -protocol Q: P {...} -\end{Verbatim} -When checking a conformance to \texttt{P}, the \index{conformance checker}conformance checker ensures that the conforming type satisfies the $\ConfReq{Self}{P}$ requirement. When the conformance to \texttt{Q} is unconditional, this always succeeds, because an unconditional conformance to a derived protocol also implies an unconditional conformance to the base protocol: +\paragraph{Soundness.} +Substitution maps in a valid program should satisfy a certain condition. +\begin{definition}\label{valid subst map} +If $\Sigma$ is a substitution map with \index{input generic signature}input generic signature~$G$ and \index{fully-concrete type}fully-concrete replacement types, then $\Sigma$ is \IndexDefinition{well-formed substitution map}\emph{well-formed} if for each derived requirement~$R$ of~$G$, the \index{substituted requirement}substituted requirement $R\otimes\Sigma$ is satisfied according to \AlgRef{reqissatisfied}. +\end{definition} +(The restriction to fully-concrete replacement types is not a real limitation; as we saw in \SecRef{checking generic arguments}, we can first compose $\Sigma$ with the \index{forwarding substitution map}forwarding substitution map~$1_{\EquivClass{H}}$ for some generic signature~$H$ if necessary). + +If the \index{derived requirement}derived requirements of $G$ are an infinite set, we cannot directly check if~$\Sigma$ is well-formed. \AlgRef{check generic arguments algorithm} only checks if $\Sigma$ satisfies each \emph{explicit} requirement of~$G$, and unfortunately, \index{limitation!conditional conformance soundness hole}this is not sufficient. 
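In Swift-flavored pseudocode (the names here are illustrative and do not correspond to actual compiler entry points), the check just described amounts to the following:
\begin{Verbatim}
// Pseudocode sketch: check a substitution map against the *explicit*
// requirements of its input generic signature G only.
func checkGenericArguments(_ G: GenericSignature,
                           _ sigma: SubstitutionMap) -> Bool {
  for R in G.requirements {          // explicit requirements only
    if !isSatisfied(apply(R, sigma)) {
      return false                   // some explicit requirement fails
    }
  }
  return true  // derived requirements were never visited
}
\end{Verbatim}
Because only the explicit requirements are visited, a substitution map can pass this check while still violating some derived requirement of~$G$.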
+ +An immediate counter-example appears when the substituted requirement depends on a conformance that does not satisfy the \index{associated requirement}associated requirements of its protocol. For example, we might declare a \texttt{Bad} type conforming to \texttt{Sequence} whose \texttt{Iterator} \index{type witness}type witness does not conform to \texttt{IteratorProtocol}: \begin{Verbatim} -struct G {} -extension G: Q {...} -// implies `extension G: P {}' +struct Bad: Sequence { + typealias Iterator = Int // error +} \end{Verbatim} -The nominal type's \index{conformance lookup table}conformance lookup table synthesizes these implied conformances and makes them available to global conformance lookup. With conditional conformances, the situation is rather different. The conformance lookup table cannot synthesize a \emph{conditional} conformance to a base protocol because there is no way to guess what the conditional requirements on that conformance should be. The conformance checker still checks the conformance requirement on \texttt{Self} though, which has the effect of requiring explicit declaration of conditional conformances to base protocols. +Now, the substitution map +$\Sigma:=\SubstMapC{\SubstType{\ttgp{0}{0}}{Bad}}{\SubstConf{\ttgp{0}{0}}{Bad}{Sequence}}$ satisfies all explicit requirements of its generic signature, but it does not satisfy the derived requirement $\ConfReq{\ttgp{0}{0}.[Sequence]Iterator}{IteratorProtocol}$. However, \emph{this} is not the soundness hole; we diagnose an error when we check the conformance, so the program is invalid. -Suppose we instead wish for \texttt{G} to conditionally conform to \texttt{Q} when \texttt{T == Int}: +We now show an example where we can write down a substitution map that is not well-formed, and yet \AlgRef{check generic arguments algorithm} does not produce any diagnostics at all, including from checking conformances. 
We start with two protocols:
 \begin{Verbatim}
-extension G: Q where T == Int {...}
+protocol Bar {
+  associatedtype Beer
+  func brew() -> Beer
+}
+
+protocol Pub {
+  associatedtype Beer
+  func pour() -> Beer
+}
 \end{Verbatim}
-The compiler will only accept this declaration if there is also an explicit conformance of \texttt{G} to \texttt{P}. There are several possible ways to declare a conformance of \texttt{G} to \texttt{P}. The simplest way is to conform unconditionally:
+We declare a function generic over types that conform to both \texttt{Bar} and \texttt{Pub}, so it has the generic signature \texttt{<\ttgp{0}{0} where \ttgp{0}{0}:~Bar, \ttgp{0}{0}:~Pub>}:
 \begin{Verbatim}
-extension G: P {...}
+func both<T: Bar & Pub>(_ t: T) -> (T.Beer, T.Beer) {
+  return (t.brew(), t.pour())
+}
 \end{Verbatim}
-Things get more interesting if the conformance to the base protocol is also conditional, because then the conformance $\ConfReq{G}{Q}$ is only valid if its conditional requirements imply the conditional requirements of $\ConfReq{G}{P}$ (if the latter is unconditional, this is of course vacuously true).
-
-There are three generic contexts in play here:
-\begin{enumerate}
-\item The nominal type declaration of the conforming type.
-\item The constrained extension for the base protocol conformance.
-\item The constrained extension for the derived protocol conformance.
-\end{enumerate}
-Specifically, tensure the conformance $\ConfReq{G}{Q}$ satisfies the requirement signature requirement $\ConfReq{Self}{P}$, the conformance checker must prove that any specialization of \verb|G| which satisfies the conditional requirements of (3) must \emph{also} satisfy the conditional requirements of (2). This is where the notion of a generic environment is useful.
Mapping the declared interface type of (1) into the generic environment of (3) gives us a type whose generic arguments satisfy the conditional requirements of (3) in the ``most general'' way:
-\[\texttt{G} \otimes \SubstMap{\SubstType{T}{Int},\,\SubstType{U}{$\archetype{U}$}} = \texttt{G}\]
-Next, we do a global conformance lookup of this type with the base protocol, which gives us a specialized conformance:
-\[\protosym{P}\otimes \texttt{G} = \ConfReq{G}{P} \otimes \SubstMap{\SubstType{T}{Int},\,
-\SubstType{U}{$\archetype{U}$}}\]
-Specialized conformances apply the conformance substitution map to their conditional requirements, so the above conformance gives us the conditional requirements of (2) but substituted with the archetypes of (3). Checking these requirements determines our final answer. Let's look at some possibilities for our base conformance $\ConfReq{G}{P}$, and how they impact the validity of the conformance to the derived protocol $\ConfReq{G}{Q}$:
-\begin{enumerate}
-\item
-The conditional conformance of \texttt{G} to \texttt{P} might require that \texttt{T} is \texttt{Int}:
+Now, we declare a \texttt{BrewPub} struct, and pass an instance of \texttt{BrewPub} to our function:
 \begin{Verbatim}
-extension G: P where T == Int {...}
+struct BrewPub<T> {}
+let result = both(BrewPub<Int>())
 \end{Verbatim}
-The conditional requirement is $\FormalReq{T == Int}$, and after substitution we get:
-\[\FormalReq{T == Int}\otimes \SubstMap{\SubstType{T}{Int},\,\SubstType{U}{$\archetype{U}$}}=\FormalReq{Int == Int}\]
-This trivially holds, therefore the conditional requirements of $\ConfReq{G}{Q}$ imply the conditional requirements of $\ConfReq{G}{P}$, so the former is valid.
+For this call expression to type check, \texttt{BrewPub} must conform to \texttt{Bar} and \texttt{Pub}; assume for a moment these conformances have already been declared.
Let $\Sigma$ be the \index{substitution map}substitution map for the call:
+\[
+\Sigma := \SubstMapLongC{\SubstType{\ttgp{0}{0}}{BrewPub<Int>}}{
+\SubstConf{\ttgp{0}{0}}{BrewPub<Int>}{Bar}\\
+\SubstConf{\ttgp{0}{0}}{BrewPub<Int>}{Pub}
+}
+\]
+Applying $\Sigma$ to \texttt{\ttgp{0}{0}.[Bar]Beer} and \texttt{\ttgp{0}{0}.[Pub]Beer} projects the \index{type witness}type witness from each conformance:
+\begin{gather*}
+\texttt{\ttgp{0}{0}.[Bar]Beer}\otimes\Sigma = \AssocType{[Bar]Beer}\otimes\ConfReq{BrewPub<Int>}{Bar}\\
+\texttt{\ttgp{0}{0}.[Pub]Beer}\otimes\Sigma = \AssocType{[Pub]Beer}\otimes\ConfReq{BrewPub<Int>}{Pub}
+\end{gather*}
+For $\Sigma$ to be well-formed, both type witnesses must be \index{canonical type equality}canonically equal, because in the generic signature of \texttt{both()}, the \index{bound dependent member type}bound dependent member types \texttt{\ttgp{0}{0}.[Bar]Beer} and \texttt{\ttgp{0}{0}.[Pub]Beer} are both equivalent to the \index{unbound dependent member type}unbound dependent member type \texttt{\ttgp{0}{0}.Beer}.
 
-\item Instead, we could conform \texttt{G} to \texttt{P} when \texttt{T} conforms to \texttt{Equatable}:
+If \texttt{BrewPub} conforms to these protocols unconditionally, the redeclaration checking rules prevent us from declaring the two conformances to have distinct type witnesses:
 \begin{Verbatim}
-extension G: P where T: Equatable {...}
-\end{Verbatim}
-Now, the conditional requirement of $\ConfReq{G}{P}$ is $\ConfReq{T}{Equatable}$, and after substitution we get:
-\[\ConfReq{G}{P}\otimes\SubstMap{\SubstType{T}{Int},\,\SubstType{U}{$\archetype{U}$}}=\ConfReq{Int}{Equatable}\]
-This also holds, so once again our conditional conformance $\ConfReq{G}{Q}$ is valid.
+extension BrewPub: Bar { + typealias Beer = Float + func brew() -> Float { return 0.0 } +} -\item Finally consider the following, which is invalid: +extension BrewPub: Pub { + typealias Beer = String // error: invalid redeclaration + func pour() -> String { return "" } +} +\end{Verbatim} +If the conformances are conditional though, neither type alias is visible from the other \index{constrained extension}constrained extension, so they are not rejected as redeclarations. All that remains is to pick conditional requirements such that our generic argument type \texttt{Int} satisfies both: \begin{Verbatim} -extension G: P where U: Equatable {...} +extension BrewPub: Bar where T: Equatable { + typealias Beer = Float + func brew() -> Float { return 0.0 } +} + +extension BrewPub: Pub where T: ExpressibleByIntegerLiteral { + typealias Beer = String + func pour() -> String { return "" } +} \end{Verbatim} -The conditional requirement is $\ConfReq{U}{Equatable}$, which becomes $\ConfReq{$\archetype{U}$}{Equatable}$ after substitution: -\[\ConfReq{U}{Equatable}\otimes \SubstMap{\SubstType{T}{Int},\,\SubstType{U}{$\archetype{U}$}}=\ConfReq{U}{Equatable}\] -The archetype $\archetype{U}$ appearing in our substitution map is from the generic environment of the extension declaring the conformance to \verb|Q|, where it was not subjected to any requirements. In particular, $\archetype{U}$ does not conform to \texttt{Equatable}. Thus, if $\ConfReq{G}{P}$ is declared as above, the conformance $\ConfReq{G}{Q}$ is rejected, because the conditional requirement $\FormalReq{T == Int}$ of $\ConfReq{G}{Q}$ does not imply the conditional requirement $\ConfReq{U}{Equatable}$ of $\ConfReq{G}{P}$. +When we call our \texttt{both()} function, the caller expects to receive a tuple of two elements both having the same reduced type \texttt{\ttgp{0}{0}.[Bar]Beer}. Inside the function, the type checker thinks that the calls to \texttt{brew()} and \texttt{pour()} return the same type. 
However, what actually happens is that one returns a \texttt{Float} while the other returns a \texttt{String}. This quickly results in undefined behavior. This soundness hole remains unfixed at the time of writing. There are two possible approaches to solving this problem: +\begin{enumerate} +\item We can tighten the rules around redeclarations of type aliases in constrained extensions, requiring that all such type aliases have canonically equal underlying types. This would reject the conditional conformances written above as invalid. +\item We can extend \AlgRef{check generic arguments algorithm} to look for incompatible conformances in the given substitution map. This would accept the conformances, but reject the call to \texttt{both()} written above. \end{enumerate} +\SecRef{critical pairs} sketches out a potential implementation of~(2) at the end of \ExRef{two protocols same assoc}. \section{Source Code Reference}\label{extensionssourceref} @@ -716,17 +794,14 @@ \section{Source Code Reference}\label{extensionssourceref} \item \SourceFile{lib/AST/Decl.cpp} \end{itemize} -\IndexSource{extension declaration}% \apiref{ExtensionDecl}{class} -Represents an extension declaration. +Represents an \IndexSource{extension declaration}extension declaration. \begin{itemize} -\item \texttt{getExtendedNominal()} returns the extended type declaration. Returns \verb|nullptr| if extension binding failed to resolve the extended type. Asserts if extension binding has not yet visited this extension. -\item \texttt{computeExtendedNominal()} actually evaluates the request which resolves the extended type to a nominal type declaration. Only used by extension binding. -\IndexSource{extended type} -\item \texttt{getExtendedType()} returns the written extended type, which might be a type alias type or a generic nominal type. -\item \texttt{getDeclaredInterfaceType()} returns the declared interface type of the extended type declaration. 
-\Index{protocol Self type@protocol \texttt{Self} type}
-\item \texttt{getSelfInterfaceType()} returns the self interface type of the extended type declaration. Different than the declared interface type for protocol extensions, where the declared interface type is the protocol type, but the self interface type is the protocol \texttt{Self} type.
+\item \texttt{getExtendedNominal()} returns the \IndexSource{extended type}extended type declaration. Returns \verb|nullptr| if extension binding failed to resolve the extended type. Asserts if extension binding has not yet visited this extension.
+\item \texttt{computeExtendedNominal()} actually evaluates the \IndexSource{extended nominal request}request which resolves the extended type to a nominal type declaration. Only used by extension binding.
+\item \texttt{getExtendedType()} returns the written \IndexSource{extended type}extended type, which might be a type alias type or a generic nominal type. This uses ordinary type resolution, so it only happens after extension binding. This is used to implement the syntax sugar described at the start of \SecRef{constrained extensions}.
+\item \texttt{getDeclaredInterfaceType()} returns the \IndexSource{declared interface type}declared interface type of the extended type declaration.
+\item \texttt{getSelfInterfaceType()} returns the \IndexSource{self interface type}self interface type of the extended type declaration. Different from the declared interface type for \IndexSource{protocol extension}protocol extensions, where the declared interface type is the protocol type, but the self interface type is the \IndexSource{protocol Self type@protocol \texttt{Self} type}protocol \texttt{Self} type.
 \end{itemize}
 
 \subsection*{Extension Binding}
 
@@ -741,7 +816,7 @@ \subsection*{Extension Binding}
 \IndexSource{extension binding}
 
 \apiref{bindExtensions()}{function}
-Takes a \texttt{ModuleDecl *}, which must be the main module, and implements Algorithm~\ref{extension binding algorithm}.
+Takes a \texttt{ModuleDecl *}, which must be the main module, and implements \AlgRef{extension binding algorithm}. \apiref{ExtendedNominalRequest}{class} The request evaluator request which resolves the extended type to a nominal type declaration. This calls out to a restricted form of type resolution which does not apply generic arguments or perform substitution. @@ -764,9 +839,9 @@ \subsection*{Direct Lookup and Lazy Member Loading} \IndexSource{direct lookup} \apiref{DirectLookupRequest}{class} -The request evaluator request implementing direct lookup. The entry point is the \texttt{NominalTypeDecl::lookupDirect()} method, which was introduced in Section~\ref{compilation model source reference}. This request is uncached, because the member lookup table effectively implements caching outside of the request evaluator. +The request evaluator request implementing direct lookup. The entry point is the \texttt{NominalTypeDecl::lookupDirect()} method, which was introduced in \SecRef{compilation model source reference}. This request is uncached, because the member lookup table effectively implements caching outside of the request evaluator. -You can read through the implementation of \verb|DirectLookupRequest::evaluate()| and the functions that it calls to understand how lazy member loading works: +To understand the implementation of \verb|DirectLookupRequest::evaluate()|, one can start with the following functions: \begin{itemize} \item \texttt{prepareLookupTable()} adds all members from extensions without lazy loaders, and members loaded so far from extensions with lazy loaders, without marking any entries as complete. \item \texttt{populateLookupTableEntryFromLazyIDCLoader()} asks a lazy member loader to load a single entry and adds it to the member lookup table. 
@@ -795,7 +870,7 @@ \subsection*{Constrained Extensions} \item \SourceFile{lib/Sema/TypeCheckDecl.cpp} \item \SourceFile{lib/Sema/TypeCheckGeneric.cpp} \end{itemize} -The \texttt{GenericSignatureRequest} was previously introduced in Section~\ref{buildinggensigsourceref}. It delegates to a pair of utility functions to implement special behaviors of extensions. +The \texttt{GenericSignatureRequest} was previously introduced in \SecRef{buildinggensigsourceref}. It delegates to a pair of utility functions to implement special behaviors of extensions. \apiref{collectAdditionalExtensionRequirements()}{function} Collects an extension's requirements from the extended type, which handles extensions of pass-through type aliases (\verb|extension CountableRange {...}|) and extensions of bound generic types (\verb|extension Array {...}|). @@ -806,9 +881,6 @@ \subsection*{Constrained Extensions} \subsection*{Conditional Conformances} -\IndexSource{conditional conformance} -\IndexSource{conditional requirement} - Key source files: \begin{itemize} \item \SourceFile{include/swift/AST/GenericSignature.h} @@ -817,22 +889,27 @@ \subsection*{Conditional Conformances} \item \SourceFile{lib/AST/ProtocolConformance.cpp} \item \SourceFile{lib/Sema/TypeCheckProtocol.cpp} \end{itemize} -The \verb|NormalProtocolConformance| and \verb|SpecializedProtocolConformance| classes were previously introduced in Section~\ref{conformancesourceref}. + +\apiref{ModuleDecl}{class} + +See also \SecRef{compilation model source reference} and \SecRef{conformancesourceref}. +\begin{itemize} +\item \texttt{checkConformance()} is a utility that first calls \texttt{lookupConformance()}, and then checks any conditional requirements using \texttt{checkRequirements()}, described in \SecRef{type resolution source ref}. +\end{itemize} + +The \verb|NormalProtocolConformance| and \verb|SpecializedProtocolConformance| classes were previously introduced in \SecRef{conformancesourceref}. 
\apiref{NormalProtocolConformance}{class} \begin{itemize} -\item \texttt{getConditionalRequirements()} returns an array of conditional requirements, which is empty if the conformance is unconditional. +\item \texttt{getConditionalRequirements()} returns an array of \IndexSource{conditional requirement}conditional requirements; this is non-empty exactly when this is a \IndexSource{conditional conformance}conditional conformance. \end{itemize} \apiref{SpecializedProtocolConformance}{class} \begin{itemize} -\item \texttt{getConditionalRequirements()} returns an array of substituted conditional requirements with the conformance substitution map applied, which is empty if the conformance is unconditional. +\item \texttt{getConditionalRequirements()} applies the \IndexSource{conformance substitution map}conformance substitution map to each conditional requirement of the underlying normal conformance. \end{itemize} \apiref{GenericSignatureImpl}{class} -See also Section~\ref{genericsigsourceref}. +See also \SecRef{genericsigsourceref}. \begin{itemize} -\item \texttt{requirementsNotSatisfiedBy()} returns an array of those requirements of this generic signature not satisfied by the given generic signature. This is used for computing conditional requirements of a \texttt{NormalProtocolConformance}. +\item \texttt{requirementsNotSatisfiedBy()} returns an array of those requirements of this generic signature not satisfied by the given generic signature. This is used for computing the conditional requirements of a \texttt{NormalProtocolConformance}. \end{itemize} -\apiref{TypeChecker::conformsToProtocol()}{function} -Utility wrapper around \texttt{ModuleDecl::lookupConformance()} which checks conditional requirements, returning an invalid conformance if they are not satisfied. 
- \end{document} diff --git a/docs/Generics/chapters/generic-declarations.tex b/docs/Generics/chapters/generic-declarations.tex deleted file mode 100644 index 457480d1b9742..0000000000000 --- a/docs/Generics/chapters/generic-declarations.tex +++ /dev/null @@ -1,499 +0,0 @@ -\documentclass[../generics]{subfiles} - -\begin{document} - -\chapter{Generic Declarations}\label{generic declarations} - -\IndexDefinition{generic declaration} -\index{generic parameter list} -\IndexDefinition{generic type alias} -\lettrine{V}{arious kinds of declarations} can have a generic parameter list and a trailing \texttt{where} clause, and by looking at their syntax and semantics, our foray into generics can begin in earnest. In this chapter, we will explore the so-called \emph{generic declarations}: -\begin{itemize} -\item classes, structs and enums, -\item type aliases, -\item functions, -\item constructors, -\item subscripts, -\item extensions. -\end{itemize} - -\IndexDefinition{parsed generic parameter list} -\Index{protocol Self type@protocol \texttt{Self} type} -\index{opaque parameter} -The \emph{parsed} generic parameter list of a declaration is what's written in source, with the familiar \texttt{<...>} syntax following the declaration name. The declaration's complete generic parameter list includes the parsed generic parameter list together with any implicit generic parameters: -\begin{enumerate} -\item Functions and subscripts may have a parsed generic parameter list, or they can declare opaque parameters with the \texttt{some} keyword, or both (Section~\ref{opaque parameters}). -\item Protocols always have a single implicit \texttt{Self} generic parameter, and no parsed generic parameter list (Section~\ref{protocols}). -\item Extensions always have an implicit set of generic parameters inherited from the extended type, and no parsed generic parameter list (Chapter~\ref{extension binding}). 
-\end{enumerate} -Parsed generic parameters, the protocol \texttt{Self} type, and the implicit generic parameters of an extension all have names that remain in scope for the entire source range of the generic declaration. Generic parameters introduced by opaque parameter declarations are unnamed; only the value declared by the opaque parameter has a name. - -\index{declaration context} -\IndexDefinition{generic context} -All generic declarations are declaration contexts, because they contain their generic parameter declarations. A \emph{generic context} is a declaration context where at least one parent context is a generic declaration. Note the subtle distinction in the meaning of ``generic'' when talking about declarations and declaration contexts; a declaration is generic only if it has generic parameters of its own, whereas a declaration context being a generic context is a transitive property inherited from the parent context. - -\IndexDefinition{depth} -\IndexDefinition{index} -Inside a generic context, unqualified name lookup will find all outer generic parameters. Each generic parameter is therefore uniquely identified within a generic context by its \emph{depth} and \emph{index}: -\begin{itemize} -\item The depth identifies a specific generic declaration, starting from zero for the top-level generic declaration and incrementing for each nested generic declaration. -\item The index identifies a generic parameter within a single generic parameter list. -\end{itemize} - -\index{sugared type} -The declared interface type of a generic parameter declaration is a sugared type that prints as the generic parameter name. The canonical type of this type only stores the depth and index. The notation for a canonical generic parameter type is \ttgp{d}{i}, where \texttt{d} is the depth and \texttt{i} is the index.
- -\begin{listing}\captionabove{Two nested generic declarations}\label{linkedlistexample} -\begin{Verbatim} -enum LinkedList<Element> { - case none - indirect case entry(Element, LinkedList<Element>) - - func mapReduce<T, A>(_ f: (Element) -> T, - _ m: (A, T) -> A, - _ a: A) -> A { - switch self { - case .none: - return a - case .entry(let x, let xs): - return m(xs.mapReduce(f, m, a), f(x)) - } - } -} -\end{Verbatim} -\end{listing} - -\begin{example} -Listing~\ref{linkedlistexample} declares a \texttt{LinkedList} type with a single generic parameter named \texttt{Element}, and a \texttt{mapReduce()} method with two generic parameters named \texttt{T} and \texttt{A}. All three generic parameters are visible from inside the method: -\begin{quote} -\begin{tabular}{|l|l|l|l|} -\hline -Name&Depth&Index&Canonical type\\ -\hline -\texttt{Element}&0&0&\ttgp{0}{0}\\ -\texttt{T}&1&0&\ttgp{1}{0}\\ -\texttt{A}&1&1&\ttgp{1}{1}\\ -\hline -\end{tabular} -\end{quote} -\end{example} - -\paragraph{History} -The list of generic declaration kinds at the beginning of this chapter has gained two new additions over time: generic type aliases were introduced in Swift 3 \cite{se0048}, while generic subscripts were introduced in Swift 4 \cite{se0148}. It is conceivable that Swift might get generic (computed) properties at some point, or even generic associated types (which would be a major redesign). Beyond that, there are probably limited opportunities for allowing more existing declaration kinds to be generic, but who knows. - -\section{Constraint Types}\label{constraints} - -\IndexDefinition{constraint type} -\index{requirement} -\IndexDefinition{inheritance clause} -\index{generic parameter declaration} -A generic requirement adds new capabilities to a generic parameter type, by restricting the possible substituted concrete types to those that provide this capability. The next section will introduce the trailing \texttt{where} clause syntax for stating generic requirements in a fully general way.
Before doing that, we'll take a look at the simpler mechanism of stating a \emph{constraint type} in the inheritance clause of a generic parameter declaration: -\begin{Verbatim} -func allEqual<T: Equatable>(_ elements: [T]) {...} -\end{Verbatim} -\index{protocol type}% -\index{protocol composition type}% -\index{parameterized protocol type}% -\index{class type}% -\Index{AnyObject@\texttt{AnyObject}}% -\index{layout constraint}% -\Index{Any@\texttt{Any}}% -A constraint type is one of the following: -\begin{enumerate} -\item A protocol type, like \texttt{Hashable}. -\item A parameterized protocol type, like \texttt{Sequence<Int>} (Section~\ref{protocols}). -\item A protocol composition, like \texttt{ShapeProtocol \& MyClass}. Protocol compositions were originally just compositions of protocol types, but they can include class types as of Swift 4 \cite{se0156}. -\item A class type, like \texttt{NSObject}. -\item The \texttt{AnyObject} \emph{layout constraint}, which restricts the possible concrete types to those represented as a single reference-counted pointer. -\item The empty protocol composition, written \texttt{Any}. Writing \texttt{Any} in a generic parameter's inheritance clause is pointless, but it is allowed for completeness. -\end{enumerate} -Constraint types can appear in various positions: -\begin{enumerate} -\item In the inheritance clause of a generic parameter declaration, which is the focus of this section. -\item On the right hand side of a conformance, superclass or layout requirement in a \texttt{where} clause, which you will see shortly. -\item In the inheritance clauses of protocols and associated types (Section~\ref{protocols}). -\item Following the \texttt{some} keyword in an opaque parameter (Section~\ref{opaque parameters}) or return type (Chapter~\ref{opaqueresult}). -\item Following the \texttt{any} keyword in an existential type (Chapter~\ref{existentialtypes}).
A single class type cannot be the constraint type of an existential; \texttt{any~NSObject} is just written as \texttt{NSObject}. Existential types where the constraint type is \texttt{AnyObject} and \texttt{Any} can also be written without the \texttt{any} keyword. -\end{enumerate} -\begin{example} -In Listing~\ref{dependentconstrainttype}, the generic parameter \texttt{B} of \texttt{open(box:)} references the generic parameter \texttt{C} from its constraint type; the lexical scope of \texttt{C} includes its own generic parameter list. -\end{example} -\begin{listing}\captionabove{The constraint type of \texttt{B} in \texttt{open(box:)} refers to \texttt{C}}\label{dependentconstrainttype} -\begin{Verbatim} -class Box<Contents> { - var contents: Contents -} - -func open<B: Box<C>, C>(box: B) -> C { - return box.contents -} - -struct Vegetables {} -class FarmBox: Box<Vegetables> {} -let vegetables: Vegetables = open(box: FarmBox()) -\end{Verbatim} -\end{listing} - -\section{Requirements}\label{trailing where clauses} -\IndexDefinition{where clause@\texttt{where} clause} -\index{trailing where clause@trailing \texttt{where} clause|see{\texttt{where} clause}} -\IndexDefinition{requirement representation} -\IndexDefinition{constraint requirement representation} -\IndexDefinition{same-type requirement representation} -\IndexDefinition{requirement} -A constraint type in the inheritance clause of a generic parameter declaration is sugar for a \texttt{where} clause requirement whose subject type is the generic parameter type: -\begin{Verbatim} -struct Set<Element: Hashable> {...} -struct Set<Element> where Element: Hashable {...} -\end{Verbatim} -The requirements in a \texttt{where} clause name the subject type explicitly, so that dependent member types can be constrained too, for example: -\begin{Verbatim} -func isSorted<S: Sequence>(_: S) where S.Element: Comparable {...} -\end{Verbatim} -Another generalization over generic parameter inheritance clauses is that \texttt{where} clauses can define same-type requirements: -\begin{Verbatim} -func 
merge<S: Sequence, T: Sequence>(_: S, _: T) -> [S.Element] - where S.Element: Comparable, S.Element == T.Element {...} -\end{Verbatim} -Formally, a \texttt{where} clause is a list of one or more \emph{requirement representations}. There are three kinds of requirement representations, with the first two kinds storing a pair of type representations, and the third storing a type representation and layout constraint: -\begin{enumerate} -\item \textbf{Constraint requirement representations}, written as \texttt{T:\ C}, where \texttt{T} and \texttt{C} are type representations, called the subject type and constraint type, respectively. -\item \textbf{Same-type requirement representations}, written as \texttt{T == U}, where \texttt{T} and \texttt{U} are type representations. -\item \textbf{Layout requirement representations}, written as \texttt{T:\ L} where \texttt{L} is a layout constraint. The only kind of layout constraint which can be written in the source language is \texttt{AnyObject}, but this is actually parsed as a constraint requirement representation. Bona-fide layout requirement representations only appear within the \texttt{@\_specialize} attribute. -\end{enumerate} -\index{requirement resolution} -Just as type resolution resolves type representations to types, \emph{requirement resolution} resolves requirement representations to \emph{requirements}. Requirements store types instead of type representations. Figure~\ref{typerequirementrepresentation} shows the correspondence.
-\begin{figure}\captionabove{Types and requirements, at the syntactic and semantic layers}\label{typerequirementrepresentation} -\begin{center} -\begin{tikzcd}[column sep=3cm,row sep=1cm] -\mathboxed{requirement representation} \arrow[d, "\text{contains}"{left}] \arrow[r, "\text{resolves to}"] &\mathboxed{requirement} \arrow[d, "\text{contains}"] \\ -\mathboxed{type representation} \arrow[r, "\text{resolves to}"]&\mathboxed{type} -\end{tikzcd} -\end{center} -\end{figure} - -\IndexDefinition{conformance requirement}% -\IndexDefinition{superclass requirement}% -\IndexDefinition{layout requirement}% -\IndexDefinition{same-type requirement}% -\IndexDefinition{requirement kind}% -Requirement resolution resolves each type representation to a type, and computes the requirement kind. The requirement kind encodes more detail than the requirement representation kind: -\begin{itemize} -\item \textbf{Conformance requirements} state that a type must conform to a protocol, protocol composition or parameterized protocol type. -\item \textbf{Superclass requirements} state that a type must either equal or be a subclass of the superclass type. -\item \textbf{Layout requirements} state that a type must satisfy a layout constraint. -\item \textbf{Same-type requirements} state that two interface types are equivalent under the \index{reduced type equality}\index{equivalent type parameter|see{reduced type equality}}reduced type equality relation (this concept was first introduced in Chapter~\ref{types} and will be detailed in Section~\ref{reducedtypes}). -\end{itemize} -Constraint requirement representations resolve to conformance, superclass and layout requirements; the exact kind of requirement is only known after type resolution resolves the constraint type by performing name lookups. Same-type requirement representations always resolve to same-type requirements.
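To make the four requirement kinds concrete, here is a single \texttt{where} clause stating one requirement of each kind. The \texttt{Animal}, \texttt{Pet}, and \texttt{adopt} declarations are hypothetical, invented for this illustration:
\begin{Verbatim}
class Animal {}
protocol Pet {}

func adopt<A: Sequence, B, C>(_: A, _: B, _: C)
  where A.Element: Pet,     // conformance requirement
        B: Animal,          // superclass requirement
        C: AnyObject,       // layout requirement
        A.Element == B      // same-type requirement
{...}
\end{Verbatim}
As noted above, the \texttt{C:~AnyObject} entry is parsed as a constraint requirement representation, but resolves to a layout requirement.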
- -The simpler syntax introduced in the previous section, where a constraint type can be written in the inheritance clause of a generic parameter declaration, also resolves to a requirement. The requirement's subject type is the generic parameter type. The requirement kind is always a conformance, superclass or layout requirement, never a same-type requirement. - -\paragraph{History} -The \texttt{where} clause syntax used to be part of the generic parameter list itself, but was moved to the modern \Index{where clause@\texttt{where} clause}``trailing'' form in Swift 3 \cite{se0081}. Implementation limitations prevented \texttt{where} clause requirements from constraining outer generic parameters until Swift 3. Once these implementation difficulties were solved, it no longer made sense to restrict a \texttt{where} clause to appear only on a declaration that has its own generic parameter list; this restriction was lifted in Swift 5.3 \cite{se0267}, allowing any declaration in a generic context to declare a \texttt{where} clause. - -For example, the following became valid: -\begin{Verbatim} -enum LinkedList<Element> { - ... - - func sum() -> Element where Element: AdditiveArithmetic {...} -} -\end{Verbatim} -There is no semantic distinction between attaching a \texttt{where} clause to a member of a type, or moving the member to a constrained extension, so the above is equivalent to the following: -\begin{Verbatim} -extension LinkedList where Element: AdditiveArithmetic { - func sum() -> Element {...} -} -\end{Verbatim} -\index{mangling} -Unfortunately, due to historical quirks in the name mangling scheme, the above is not an \index{ABI}ABI-compatible transformation. - -\index{value requirement} -\index{conforming type} -\paragraph{Protocol requirements} -There is still one situation where constraining outer generic parameters is prohibited, for usability reasons.
The \emph{value requirements} of a protocol (properties, subscripts and methods) cannot constrain \texttt{Self} or its associated types in their \Index{where clause@\texttt{where} clause}\texttt{where} clause. Since value requirements must be fulfilled by all concrete conforming types, if a value requirement's \texttt{where} clause imposed additional constraints on \texttt{Self}, it would be impossible for a concrete type which did not otherwise satisfy those constraints to declare a witness for this value requirement. Rather than allow defining a protocol which cannot be conformed to, the type checker diagnoses an error. -\begin{example} -The following protocol attempts to define an \texttt{Element} associated type with no requirements, and a \texttt{minElement()} method which requires that \texttt{Element} conform to the \texttt{Comparable} protocol: -\begin{Verbatim} -protocol SetProtocol { - associatedtype Element - - func minElement() -> Element where Element: Comparable -} -\end{Verbatim} -This is not allowed, because there is no way to implement the \texttt{minElement()} requirement in a concrete conforming type whose \texttt{Element} type is not \texttt{Comparable}. One way to fix the error is to move the \texttt{where} clause from the protocol method to the associated type, which would instead impose the requirement on all conforming types.\end{example} - -\section{Opaque Parameters}\label{opaque parameters} - -\index{opaque parameter} -\index{depth} -\index{index} -\index{parsed generic parameter list} -In the type of a function or subscript parameter, the \texttt{some} keyword declares an \emph{opaque parameter type}. The \texttt{some} keyword is followed by a constraint type. This introduces an unnamed generic parameter, and the constraint type imposes a conformance, superclass or layout requirement on this generic parameter. 
When a declaration has both a parsed generic parameter list and opaque parameters, the opaque parameters have the same depth as the parsed generic parameters, and appear after the parsed generic parameters in index order. - -\index{expression} -Opaque parameter types are unnamed, and therefore are not visible to type resolution. In particular, there is no way to refer to an opaque parameter type within the function's \Index{where clause@\texttt{where} clause}\texttt{where} clause, or from a type annotation on a declaration nested in the function's body. From expression context, however, the type of an opaque parameter can be obtained via the built-in \texttt{type(of:)} pseudo-function,\footnote{It looks like a function call, but the type checking behavior of \texttt{type(of:)} cannot be described by a Swift function type; it is not a real function.} which produces a metatype value. This allows for invoking static methods and such. -\begin{example} -These two definitions are equivalent: -\begin{Verbatim} -func merge<E>(_: some Sequence<E>, _: some Sequence<E>) -> [E] {} -func merge<E, S: Sequence<E>, T: Sequence<E>>(_: S, _: T) -> [E] {} -\end{Verbatim} -The constraint types here are parameterized protocol types, which are described in the next section. -\end{example} -Opaque parameter declarations were introduced in Swift 5.7 \cite{se0341}. Note that \texttt{some} appearing in the return type of a function declares an \emph{opaque return type}, which is a related but quite different feature (Chapter~\ref{opaqueresult}). - -\section{Protocol Declarations}\label{protocols} - -\index{protocol declaration} -\index{conforming type} -\IndexDefinition{protocol Self type@protocol \texttt{Self} type} -Protocols have an implicit generic parameter list with a single generic parameter named \texttt{Self}, which represents the conforming type of a concrete conformance.
Protocols can impose \IndexDefinition{associated requirement}\emph{associated requirements} on the \texttt{Self} type and its member types, and any concrete conformance to this protocol must satisfy those requirements. The associated requirements are collected in the protocol's requirement signature (Section~\ref{requirement sig}). Protocols can only appear at the top level of a source file, and structs, classes and enums cannot be nested inside protocols (Section~\ref{nested nominal types}). - -\IndexDefinition{primary associated type}% -\index{parameterized protocol type}% -\paragraph{Primary associated types} -A protocol can declare a list of \emph{primary associated types} with a syntax resembling a generic parameter list. While generic parameter lists introduce new generic parameter declarations, the entries in the primary associated type list reference \emph{existing} associated types declared in the protocol's body. - -A protocol with primary associated types can then be used as a parameterized protocol type. As a constraint type, a parameterized protocol type is equivalent to a conformance requirement between the subject type and the protocol, together with same-type requirements. The same-type requirements relate the primary associated types of the subject type with the arguments of the parameterized protocol type. - -\begin{example} -The standard library's iterator protocol has a primary associated type: -\begin{Verbatim} -protocol IteratorProtocol<Element> { - associatedtype Element - mutating func next() -> Element?
-} -\end{Verbatim} -We can then write a parameterized protocol type: -\begin{Verbatim} -func sumOfSquares<I: IteratorProtocol<Int>>(_: I) -> Int {...} -\end{Verbatim} -The above is equivalent to the following \emph{desugaring}, which will receive a more formal treatment in Section~\ref{requirement desugaring}: -\begin{Verbatim} -func sumOfSquares<I: IteratorProtocol>(_: I) -> Int - where I.Element == Int {...} -\end{Verbatim} -\end{example} -Parameterized protocol types and primary associated types were added to the language in Swift~5.7~\cite{se0346}. - -\index{associated type declaration}% -\Index{where clause@\texttt{where} clause}% -\index{inheritance clause}% -\paragraph{Associated type requirements} -Associated types can state one or more constraint types in their inheritance clause, in addition to an optional \texttt{where} clause. Constraint types in the inheritance clause resolve to requirements whose subject type is the associated type declaration's declared interface type---which you might recall is the dependent member type \texttt{Self.[P]A}, where \texttt{A} is the associated type declaration and \texttt{P} is the protocol. The standard library \texttt{Sequence} protocol demonstrates all of these features: -\begin{Verbatim} -protocol Sequence { - associatedtype Iterator: IteratorProtocol - associatedtype Element where Iterator.Element == Element - - func makeIterator() -> Iterator -} -\end{Verbatim} -The conformance requirement on \texttt{Iterator} could have been written with a \texttt{where} clause as well: -\begin{Verbatim} -associatedtype Iterator where Iterator: IteratorProtocol -\end{Verbatim} -Finally, a \texttt{where} clause can be attached to the protocol itself; there is no semantic difference between that and attaching it to an associated type: -\begin{Verbatim} -protocol Sequence where Iterator: IteratorProtocol, - Iterator.Element == Element {...} -\end{Verbatim} -Unlike generic parameters, associated type inheritance clauses allow multiple entries, separated by commas.
This is effectively equivalent to a single inheritance clause entry containing a protocol composition: -\begin{Verbatim} -associatedtype Data: Codable & Hashable -associatedtype Data: Codable, Hashable -\end{Verbatim} -\paragraph{Unqualified lookup inside protocols} -Within the entire source range of the protocol declaration, unqualified references to associated types, like \texttt{Element} and \texttt{Iterator} above, resolve to their declared interface type. This is a shorthand for accessing the associated type as a member type of the protocol \texttt{Self} type. The \texttt{Sequence} protocol above could instead have been declared as follows: -\begin{Verbatim} -protocol Sequence where Self.Iterator: IteratorProtocol, - Self.Iterator.Element == Self.Element {...} -\end{Verbatim} -\index{inheritance clause} -\IndexDefinition{inherited protocol} -\index{protocol inheritance|see{inherited protocol}} -\index{conforming type} -\paragraph{Protocol inheritance clauses} -Constraint types appearing in the protocol's inheritance clause become generic requirements on \texttt{Self} in the same manner that constraint types in generic parameter inheritance clauses become requirements on the generic parameter type. Requirements on \texttt{Self} are imposed by the conformance checker on concrete types conforming to the protocol. - -If the constraint type is another protocol, we call the protocol stating the requirement the \emph{derived protocol} and the protocol named by the constraint type the \emph{base protocol}. The derived protocol is said to \emph{inherit} from (or sometimes, \emph{refine}) the base protocol. Protocol inheritance can be observed in two ways. First, every concrete type conforming to the derived protocol must also conform to the base protocol. Second, qualified name lookup will search through inherited protocols when the lookup begins from the derived protocol or one of its concrete conforming types.
- -For example, the standard library's \texttt{Collection} protocol inherits from \texttt{Sequence}; therefore, any concrete type conforming to \texttt{Collection} must also conform to \texttt{Sequence}. If some type parameter \texttt{T} is known to conform to \texttt{Collection}, members of both the \texttt{Collection} and \texttt{Sequence} protocols will be visible to qualified name lookup on a value of type \texttt{T}. -\begin{Verbatim} -protocol Collection: Sequence {...} -\end{Verbatim} -Protocols can restrict their conforming types to those with a reference-counted pointer representation by stating an \texttt{AnyObject} layout constraint: -\begin{Verbatim} -protocol BoxProtocol: AnyObject {...} -\end{Verbatim} -Protocols can also impose a superclass requirement on their conforming types: -\begin{Verbatim} -class Plant {} -class Animal {} -protocol Duck: Animal {} -class MockDuck: Plant, Duck {} -// error: MockDuck is not a subclass of Animal -\end{Verbatim} - -\IndexDefinition{class-constrained protocol} -A protocol is \emph{class-constrained} if the \texttt{Self:~AnyObject} requirement can be proven from its inheritance clause; either directly stated, implied by a superclass requirement, or inherited from another protocol. Qualified name lookup understands a superclass in a protocol's inheritance clause, making the members of the superclass visible to all lookups that look into the protocol. - -We'll say more about the semantics of protocol inheritance clauses and name lookup in Sections \ref{requirement sig}~and~\ref{identtyperepr}. - -\paragraph{History} -In older releases of Swift, protocols could only constrain associated types by writing a constraint type in the associated type's inheritance clause, which limited the kinds of requirements that could be imposed on the associated type. The trailing \texttt{where} clause syntax was extended to cover associated types and protocols in Swift~4~\cite{se0142}.
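For instance, a same-type requirement between two associated types cannot be expressed in an inheritance clause at all; it needs the extended \texttt{where} clause syntax. The \texttt{Graph} protocol below is hypothetical, invented for this illustration:
\begin{Verbatim}
protocol Graph {
  associatedtype Node
  associatedtype Edge: Sequence where Edge.Element == Node
}
\end{Verbatim}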
- -\index{recursive conformance requirement}% -Another important generalization allowed an associated type to conform to the same protocol that it appears in, either directly or indirectly. The ability to declare a so-called \emph{recursive conformance} was introduced in Swift 4.1 \cite{se0157}. This feature has some profound implications, which are further explored in Section~\ref{type parameter graph}, \ref{recursive conformances}, \ref{monoidsasprotocols}, and \ref{recursive conformances redux}. - -\section{Source Code Reference}\label{genericdeclsourceref} - -Key source files: -\begin{itemize} -\item \SourceFile{include/swift/AST/Decl.h} -\item \SourceFile{include/swift/AST/DeclContext.h} -\item \SourceFile{include/swift/AST/GenericParamList.h} -\item \SourceFile{lib/AST/Decl.cpp} -\item \SourceFile{lib/AST/DeclContext.cpp} -\item \SourceFile{lib/AST/GenericParamList.cpp} -\end{itemize} -Other source files: -\begin{itemize} -\item \SourceFile{include/swift/AST/Types.h} -\item \SourceFile{lib/AST/NameLookup.cpp} -\end{itemize} - -\index{declaration context} -\IndexSource{generic context} -\IndexSource{generic declaration} -\IndexSource{parsed generic parameter list} -\apiref{DeclContext}{class} -See also Section~\ref{name lookup}, Section~\ref{declarationssourceref} and Section~\ref{genericsigsourceref}. -\begin{itemize} -\item \texttt{isGenericContext()} answers if this declaration context or one of its parent contexts has a generic parameter list. -\item \texttt{isInnermostContextGeneric()} answers if this declaration context is a generic context with its own generic parameter list, that is, if its declaration is a generic declaration. -\end{itemize} -\apiref{GenericContext}{class} -Base class for declarations which can be generic. See also Section~\ref{genericsigsourceref}. -\begin{itemize} -\item \texttt{getParsedGenericParams()} returns the declaration's parsed generic parameter list, or \texttt{nullptr}. 
-\item \texttt{getGenericParams()} returns the declaration's full generic parameter list, which includes any implicit generic parameters. Evaluates a \texttt{GenericParamListRequest}. -\item \texttt{isGeneric()} answers if this declaration has a generic parameter list. -\item \texttt{getGenericContextDepth()} returns the depth of the innermost generic parameter list, or \texttt{(unsigned)-1} if neither this declaration nor any outer declaration is generic. -\item \texttt{getTrailingWhereClause()} returns the trailing \texttt{where} clause, or \texttt{nullptr}. -\end{itemize} - -Trailing \texttt{where} clauses are not preserved in serialized generic contexts. Most code uses \texttt{GenericContext::getGenericSignature()} instead (Section~\ref{genericsigsourceref}), except when actually building the generic signature. - -\IndexSource{generic parameter list} -\apiref{GenericParamList}{class} -A generic parameter list. -\begin{itemize} -\item \texttt{getParams()} returns an array of generic parameter declarations. -\item \texttt{getOuterParameters()} returns the outer generic parameter list, linking multiple generic parameter lists for the same generic context. Only used for extensions of nested generic types. -\end{itemize} - -\IndexSource{protocol Self type@protocol \texttt{Self} type} -\apiref{GenericParamListRequest}{class} -This request creates the full generic parameter list for a declaration. Kicked off from \texttt{GenericContext::getGenericParams()}. -\begin{itemize} -\item For protocols, this creates the implicit \texttt{Self} parameter. -\item For functions and subscripts, calls \texttt{createOpaqueParameterGenericParams()} to walk the formal parameter list and look for \texttt{OpaqueReturnTypeRepr}s. -\item For extensions, calls \texttt{createExtensionGenericParams()} which clones the generic parameter lists of the extended nominal itself and all of its outer generic contexts, and links them together via \texttt{GenericParamList::getOuterParameters()}.
-\end{itemize} - -\IndexSource{generic parameter declaration} -\apiref{GenericTypeParamDecl}{class} -A generic parameter declaration. -\begin{itemize} -\item \texttt{getDepth()} returns the depth of the generic parameter declaration. -\item \texttt{getIndex()} returns the index of the generic parameter declaration. -\item \texttt{getName()} returns the name of the generic parameter declaration. -\item \texttt{getDeclaredInterfaceType()} returns the non-canonical generic parameter type for this declaration. -\item \texttt{isOpaque()} answers if this generic parameter is associated with an opaque parameter. -\item \texttt{getOpaqueTypeRepr()} returns the associated \texttt{OpaqueReturnTypeRepr} if this is an opaque parameter, otherwise \texttt{nullptr}. -\item \texttt{getInherited()} returns the generic parameter declaration's inheritance clause. -\end{itemize} - -Inheritance clauses are not preserved in serialized generic parameter declarations. Requirements stated on generic parameter declarations are part of the corresponding generic context's generic signature, so except when actually building the generic signature, most code uses \texttt{GenericContext::getGenericSignature()} instead (Section~\ref{genericsigsourceref}). - -\IndexSource{generic parameter type} -\IndexSource{depth} -\IndexSource{index} -\apiref{GenericTypeParamType}{class} -A generic parameter type. -\begin{itemize} -\item \texttt{getDepth()} returns the depth of the generic parameter declaration. -\item \texttt{getIndex()} returns the index of the generic parameter declaration. -\item \texttt{getName()} returns the name of the generic parameter declaration, only if this is a non-canonical type. -\end{itemize} - -\IndexSource{where clause@\texttt{where} clause} -\apiref{TrailingWhereClause}{class} -The syntactic representation of a trailing \texttt{where} clause. -\begin{itemize} -\item \texttt{getRequirements()} returns an array of \texttt{RequirementRepr}. 
-\end{itemize} - -\IndexSource{requirement representation} -\apiref{RequirementRepr}{class} -The syntactic representation of a requirement in a trailing \texttt{where} clause. -\begin{itemize} -\item \texttt{getKind()} returns a \texttt{RequirementReprKind}. -\item \texttt{getFirstTypeRepr()} returns the first \texttt{TypeRepr} of a same-type requirement. -\item \texttt{getSecondTypeRepr()} returns the second \texttt{TypeRepr} of a same-type requirement. -\item \texttt{getSubjectTypeRepr()} returns the first \texttt{TypeRepr} of a constraint or layout requirement. -\item \texttt{getConstraintTypeRepr()} returns the second \texttt{TypeRepr} of a constraint requirement. -\item \texttt{getLayoutConstraint()} returns the layout constraint of a layout requirement. -\end{itemize} - -\apiref{RequirementReprKind}{enum class} -\begin{itemize} -\item \texttt{RequirementReprKind::TypeConstraint} -\item \texttt{RequirementReprKind::SameType} -\item \texttt{RequirementReprKind::LayoutConstraint} -\end{itemize} - -\apiref{WhereClauseOwner}{class} -Represents a reference to some set of requirement representations which can be resolved to requirements, for example a trailing \texttt{where} clause. This is used by various requests, such as the \texttt{RequirementRequest} below, and the \texttt{InferredGenericSignatureRequest} in Section~\ref{buildinggensigsourceref}. -\begin{itemize} -\item \texttt{getRequirements()} returns an array of \texttt{RequirementRepr}. -\item \texttt{visitRequirements()} resolves each requirement representation and invokes a callback with the \texttt{RequirementRepr} and resolved \texttt{Requirement}. -\end{itemize} - -\apiref{RequirementRequest}{class} -Request which can be evaluated to resolve a single requirement representation in a \texttt{WhereClauseOwner}. Used by \texttt{WhereClauseOwner::visitRequirements()}. - -\IndexSource{protocol declaration} -\IndexSource{primary associated type} -\apiref{ProtocolDecl}{class} -A protocol declaration.
-\begin{itemize} -\item \texttt{getTrailingWhereClause()} returns the protocol \texttt{where} clause, or \texttt{nullptr}. -\item \texttt{getAssociatedTypes()} returns an array of all associated type declarations in the protocol. -\item \texttt{getPrimaryAssociatedTypes()} returns an array of all primary associated type declarations in the protocol. -\item \texttt{getInherited()} returns the parsed inheritance clause. -\end{itemize} - -Trailing \texttt{where} clauses and inheritance clauses are not preserved in serialized protocol declarations. Except when actually building the requirement signature, most code uses \texttt{ProtocolDecl::getRequirementSignature()} instead (Section~\ref{genericsigsourceref}). - -\IndexSource{inherited protocol} -The last four utility methods operate on the requirement signature, so are safe to use on deserialized protocols: -\begin{itemize} -\item \texttt{getInheritedProtocols()} returns an array of all protocols directly inherited by this protocol, computed from the inheritance clause. -\item \texttt{inheritsFrom()} determines if this protocol inherits from the given protocol, possibly transitively. -\item \texttt{getSuperclass()} returns the protocol's superclass type. -\item \texttt{getSuperclassDecl()} returns the protocol's superclass declaration. -\end{itemize} - -\index{associated type declaration} -\apiref{AssociatedTypeDecl}{class} -An associated type declaration. -\begin{itemize} -\item \texttt{getTrailingWhereClause()} returns the associated type's trailing \texttt{where} clause, or \texttt{nullptr}. -\item \texttt{getInherited()} returns the associated type's inheritance clause. -\end{itemize} - -Trailing \texttt{where} clauses and inheritance clauses are not preserved in serialized associated type declarations. 
Requirements on associated types are part of a protocol's requirement signature, so except when actually building the requirement signature, most code uses \texttt{ProtocolDecl::getRequirementSignature()} instead (Section~\ref{genericsigsourceref}). - -\end{document} \ No newline at end of file diff --git a/docs/Generics/chapters/generic-signatures.tex b/docs/Generics/chapters/generic-signatures.tex index 9b929466af016..59a8b5d177ed9 100644 --- a/docs/Generics/chapters/generic-signatures.tex +++ b/docs/Generics/chapters/generic-signatures.tex @@ -9,30 +9,32 @@ \chapter{Generic Signatures}\label{genericsig} \Index{where clause@\texttt{where} clause} \index{inheritance clause} \index{opaque parameter} -\lettrine{G}{eneric signatures describe} the interface between generic declarations and their usages. Every generic declaration has its own generic signature, constructed from the assortment of syntactic building blocks described by the previous chapter. When generic declarations nest, outer generic parameters are visible in the inner declaration, but the inner declaration can also introduce new generic parameters of its own, as well as impose new requirements (possibly on outer parameters). All of this suggests a ``flat'' representation. A generic signature thus records the following in one place: +\lettrine{G}{eneric signatures} are semantic objects that describe the type checking behavior of generic declarations. Declarations can nest, and outer generic parameters are visible inside inner declarations, so a generic signature is a ``flat'' representation that collects all \emph{generic parameter types} and \emph{requirements} that apply in the declaration's scope. This abstracts away the concrete syntax described in the previous chapter: \begin{itemize} -\item All generic parameter types visible from the declaration's body. 
This includes the generic parameters defined with an explicit generic parameter list \texttt{<...>} in source, as well as those implicitly introduced by opaque parameter declarations, like \texttt{some P}. The generic parameters of each outer generic declaration are also included. -\item A list of all generic requirements that apply to these generic parameters, which again includes requirements from outer declarations. We've seen three different syntactic forms for stating requirements: generic parameter inheritance clauses, trailing \texttt{where} clauses, and opaque parameters. A fourth, requirement inference, will be described in Section~\ref{requirementinference}. +\item The list of generic parameter types begins with all outer generic parameters, which are followed by generic parameters from the explicit generic parameter list \texttt{<...>}, together with any generic parameter types implicitly introduced by opaque parameter declarations, like the ``\texttt{some P}'' in ``\verb|func f(_: some P)|''. +\item The list of requirements in a generic signature includes the requirements stated by outer generic declarations, as well as any requirements from generic parameter inheritance clauses, the trailing \texttt{where} clause, and opaque parameters. A fourth mechanism, called requirement inference, will be described in \SecRef{requirementinference}. \end{itemize} -We're going to use this written notation for generic signatures: -\[\texttt{<}\underbrace{\texttt{T, U}}_{\text{generic parameters}}\texttt{ where }\underbrace{\texttt{T: Sequence, U: Equatable}}_{\text{requirements}}\texttt{>}\] +The following notation for generic signatures is used throughout this book, as well as in debugging output from the compiler: +\begin{center} +$\texttt{<}\underbrace{\texttt{T, U}}_{\text{generic parameters}}\texttt{ where }\underbrace{\texttt{T: Sequence, U: Equatable}}_{\text{requirements}}\texttt{>}$ +\end{center} -A \index{requirement}\emph{requirement} is a statement about a type parameter, called the \emph{subject type} of the requirement.
Requirements were introduced together with trailing \texttt{where} clauses in Section~\ref{trailing where clauses}; the requirements of a generic signature use the same representation but with a further invariant. They are \index{minimal requirement}\emph{minimal}: no requirement can be derived from any other requirement, or replaced with an equivalent but ``simpler'' requirement. +The requirements in a generic signature use the same semantic representation as the requirements in a trailing \texttt{where} clause, given by \DefRef{requirement def}, with a few additional properties. A generic signature always omits any redundant requirements from the list, and the remaining ones are written to be ``as simple as possible.'' We will describe this in \ChapRef{building generic signatures}, when we see how the generic signature of a declaration is built from the syntactic forms described in the previous chapter, but for now, we're just going to assume we're working with an existing generic signature that was given to us by this black box. -After some preliminaries, we will introduce a formal model for reasoning about type parameters and requirements in Section~\ref{derived req}, and develop it over the three subsequent sections. For now, we're just going to assume we're working with an existing generic signature that was given to us by the type checker or some other part of the compiler. Understanding how generic signatures are built, and how minimal requirements are actually derived from user-written requirements, is left to Chapter~\ref{building generic signatures}. +After some preliminaries, we will go on to introduce a formal system for reasoning about requirements and type parameters in \SecRef{derived req}. This makes precise the earlier concept of the \index{interface type}\emph{interface type} of a declaration---it contains valid type parameters of the declaration's generic signature. 
\SecRef{genericsigqueries} describes \emph{generic signature queries}, which are fundamental primitives in the implementation, used by the rest of the compiler to answer questions about generic signatures. These questions will be statements in our formal system. -\paragraph{Debugging} The \IndexFlag{debug-generic-signatures} \texttt{-debug-generic-signatures} frontend flag prints generic signatures of each declaration being type checked. Here is a simple program with three nested generic declarations: +\paragraph{Debugging.} The \IndexFlag{debug-generic-signatures}\texttt{-debug-generic-signatures} frontend flag gives us a glimpse into the generics implementation by printing the generic signature of each declaration being type checked. Here is a simple program with three nested generic declarations: \begin{Verbatim} struct Outer<T: Sequence> { struct Inner<U> { - func transform() -> (T, U) where T.Element == U { + func transform() where T.Element == U { ... } } } \end{Verbatim} -Notice how the generic signature at each level of nesting incorporates all information from the outer declaration's generic signature: +If we run the compiler with this flag, we see that the generic signature at each level of nesting incorporates all information from the outer declaration's generic signature: \begin{Verbatim} debug.(file).Outer@debug.swift:1:8 Generic signature: <T where T: Sequence> @@ -44,733 +46,1209 @@ \chapter{Generic Signatures}\label{genericsig} Generic signature: <T, U where T: Sequence, T.Element == U> \end{Verbatim} -\paragraph{Empty generic signature} -\IndexDefinition{empty generic signature} -\IndexDefinition{fully concrete type} -If a nominal type declaration is not a generic context (that is, neither it nor any parent context has any generic parameters), then its generic signature will have no generic parameters or generic requirements. This is called the \emph{empty generic signature}. Lacking any generic parameters, the empty generic signature more generally has no type parameters, either.
The valid interface types of the empty generic signature are the fully concrete types, that is, types that do not contain any type parameters. - -\paragraph{Canonical signatures} -\IndexDefinition{canonical generic signature} -\IndexDefinition{generic signature equality} -Generic signatures are immutable and uniqued, so two generic signatures with the same structure and the same sugared types are equal pointers. A generic signature is \emph{canonical} if all listed generic parameter types are canonical, and any types appearing in requirements are canonical. A canonical signature is computed from an arbitrary generic signature by replacing any sugared types appearing in the signature with canonical types. Two generic signatures are canonically equal if their canonical signatures are equal pointers. There is no notion of a ``reduced generic signature'' the way we have reduced types. The generic requirements in a generic signature are already reduced in this sense; the only variation allowed is type sugar. -\begin{example} -These two declarations have the same canonical generic signature: +This flag also allows us to observe \index{requirement minimization}\emph{requirement minimization}. Here are three functions, each one generic over a pair of types conforming to \texttt{Sequence}: \begin{Verbatim} -func allEqual1<T: Sequence, U: Sequence>(_: T, _: U) - -> Bool where T.Element == U.Element {} +func sameElt<S1: Sequence, S2: Sequence>(_ s1: S1, _ s2: S2) + where S1.Element == S2.Element {} + +func sameIter<S1: Sequence, S2: Sequence>(_ s1: S1, _ s2: S2) + where S1.Iterator == S2.Iterator {} -func allEqual2<A, B>(_: A, _: B) -> Bool - where A: Sequence, - B: Sequence, - B.Element == A.Element {} +func sameEltAndIter<S1: Sequence, S2: Sequence>(_ s1: S1, _ s2: S2) + where S1.Element == S2.Element, + S1.Iterator == S2.Iterator {} \end{Verbatim} -The first declaration's generic signature: +The first function expects two sequences with the same \texttt{Element} associated type, so ``\verb|sameElt(Array<Int>(), Set<Int>())|'' for example. The second requires that both have the same \texttt{Iterator} type.
The third requires both, but a consequence of how \texttt{Sequence} is declared in the standard library is that the second is a stronger condition; if two sequences have the same \texttt{Iterator} type, they will also have the same \texttt{Element} type, but not vice versa. In other words, the same-type requirement $\SameReq{S1.Element}{S2.Element}$ is \emph{redundant} in the trailing \texttt{where} clause of \texttt{sameEltAndIter()}. (We will be able to \emph{prove} it when we revisit this generic signature in \ExRef{same name rule example}.) When we compile this program with \texttt{-debug-generic-signatures}, we observe that the generic signature of \texttt{sameElt()} is completely distinct, while the other two have a generic signature that looks like this: \begin{quote} \begin{verbatim} -<T, U where T: Sequence, U: Sequence, T.Element == U.Element> +<S1, S2 where S1: Sequence, S2: Sequence, + S1.Iterator == S2.Iterator> \end{verbatim} \end{quote} +Requirement minimization is described in \SecRef{minimal requirements}. -The second declaration's generic signature: -\begin{quote} -\begin{verbatim} -<A, B where A: Sequence, B: Sequence, B.Element == A.Element> -\end{verbatim} -\end{quote} -The two generic signatures only differ by type sugar; namely, they use the corresponding sugared generic parameter types from their declaration. This makes them canonically equal but not equal pointers. The canonical generic signature of both is obtained by replacing generic parameters with their canonical types: +\paragraph{Canonical signatures.} +\IndexDefinition{canonical generic signature} +\IndexDefinition{generic signature equality} +Generic signatures are immutable and uniqued, so two generic signatures with the same structure and pointer-equal types are always pointer-equal generic signatures. While requirements in a generic signature are minimal and sorted in a certain order, they can still differ by type sugar. A generic signature is \emph{canonical} if all listed generic parameter types are canonical, and any types appearing in requirements are canonical.
A canonical signature is computed from an arbitrary generic signature by replacing any sugared types appearing in the signature with canonical types, and this gives us the notion of canonical equality of generic signatures; two generic signatures are canonically equal if their canonical signatures are pointer-equal. + +The generic signatures of \texttt{sameIter()} and \texttt{sameEltAndIter()} above are canonically equal, but not pointer-equal; while their generic parameters have the same \emph{names}, the sugared generic parameter types refer to actual \emph{declarations}, which are distinct. Their canonical signature looks like this: \begin{quote} \begin{verbatim} -<τ_0_0, τ_0_1 - where τ_0_0: Sequence, τ_0_1: Sequence, - τ_0_0.[Sequence]Element == τ_0_1.[Sequence]Element> +<τ_0_0, τ_0_1 where τ_0_0: Sequence, τ_0_1: Sequence, + τ_0_0.[Sequence]Iterator == τ_0_1.[Sequence]Iterator> \end{verbatim} \end{quote} -\end{example} + +\paragraph{Empty generic signature.} +\IndexDefinition{empty generic signature} +\IndexDefinition{fully-concrete type} +A generic declaration without a generic parameter list or trailing \texttt{where} clause inherits the generic signature from the parent context. If no outer parent context is generic, we get the \emph{empty generic signature} with no generic parameters or requirements. The interface types described by the empty generic signature are the fully concrete types, that is, types that do not contain any type parameters. + +\paragraph{Protocol generic signature.} +In \SecRef{protocols} we saw that every protocol declaration has a generic parameter named \texttt{Self} that conforms to the protocol itself. The \index{G P@$G_\texttt{P}$|see{protocol generic signature}}\Index{protocol Self type@protocol \texttt{Self} type}protocol \texttt{Self} type is always canonically equal to~\rT. 
We will denote the \IndexDefinition{protocol generic signature}generic signature of a protocol~\texttt{P} by $G_\texttt{P}$: +\[ +G_\texttt{P} := \verb|<Self where Self: P>| +\] \section{Requirement Signatures}\label{requirement sig} -Just as a generic signature encodes the contract between a generic declaration and its usages, a \IndexDefinition{requirement signature}\emph{requirement signature} is the contract between a protocol and its conforming types. Section~\ref{protocols} enumerated the various ways of writing requirements inside a protocol declaration: +Each protocol has a \IndexDefinition{requirement signature}\emph{requirement signature}, which collects the protocol's \index{associated type declaration}associated type declarations, and the \index{associated requirement}associated requirements imposed upon them. This abstracts away the concrete syntax from \SecRef{protocols}. There is a duality between generic signatures and requirement signatures, illustrated with the following diagram that shows each kind of entity with a concrete specimen to remind us of the notation: +\begin{center} +\begin{tabular}{cc} +\toprule +\textbf{Generic signature}&\textbf{Requirement signature}\\ +\midrule +Generic parameters:&Associated types:\\ +\ttgp{0}{1}&\texttt{associatedtype Iterator}\\ +\midrule +Generic requirements:&Associated requirements:\\ +$\ConfReq{\ttgp{0}{1}}{Sequence}$&$\ConfReq{Self.Iterator}{IteratorProtocol}_\texttt{Sequence}$\\ +\bottomrule +\end{tabular} +\end{center} +(Requirement signatures also describe the type alias members of the protocol; these are called \index{protocol type alias}\emph{protocol type aliases}. They're not covered by the formalism described in the next section, but we will discuss them in \SecRef{building rules}.)
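For instance, associated requirements like the two specimens above arise from a declaration such as the following sketch (a simplified stand-in; the standard library's actual \texttt{Sequence} declares more members than this):
\begin{Verbatim}
protocol MiniSequence {
  // states the associated conformance requirement on Self.Iterator
  associatedtype Iterator: IteratorProtocol
  // states the associated same-type requirement on Self.Element
  associatedtype Element where Element == Iterator.Element
}
\end{Verbatim}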
+ +If a generic signature $G$ states a conformance requirement $\ConfReq{T}{P}$, the requirement signature of~\texttt{P} generates additional structure, which we describe informally at first: \begin{itemize} -\item As a constraint type in the protocol's \index{inheritance clause}inheritance clause, which is a way of stating a requirement on the \Index{protocol Self type@protocol \texttt{Self} type}protocol \texttt{Self} type. -\item As a constraint type in the inheritance clause of some \index{associated type declaration}associated type \texttt{A}, which similarly becomes a requirement on the dependent member type \texttt{Self.[P]A}. -\item In the \Index{where clause@\texttt{where} clause}trailing \texttt{where} clause, either on an associated type, or equivalently the protocol itself, which allows stating arbitrary requirements. +\item For every associated type \texttt{A} of \texttt{P}, we can talk about the dependent member type \texttt{T.[P]A}, which abstracts over the \index{type witness}type witness in the conformance. +\item Because the associated requirements are satisfied by every concrete conforming type, they also hold ``in relation'' to the abstract type parameter~\texttt{T}. \end{itemize} +To interpret a generic signature, we must also consult the requirement signatures of some set of protocols that the generic signature ``depends on.'' The question of \emph{which} protocols exactly is settled in \SecRef{protocol component}, but for now, we can take our generic signatures to exist in the universe of all protocols visible to name lookup, declared in the source program and all serialized modules. This set of protocols is always finite. -A requirement signature collects all of the above requirements together. Just like with generic signatures, the requirements in a requirement signature are always in a minimal form. 
Unlike a generic signature though, there is no list of generic parameters; the only generic parameter is the implicit protocol \texttt{Self} type. +Remember how the \index{protocol generic signature}generic signature of a protocol~\texttt{P} just has a single conformance requirement~$\ConfReq{Self}{P}$; thus, it is the simplest generic signature that allows us to ``look inside'' the requirement signature of~\texttt{P}. -A concrete type conforming to a protocol must satisfy all requirements of the protocol's requirement signature: -\begin{enumerate} -\item The conforming type must declare a \emph{type witness} for each associated type. -\item The conforming type must conform to any \index{inherited protocol}inherited protocols, which are encoded as conformance requirements on \texttt{Self}. -\item Similarly, the conforming type must be a class if the protocol imposes a superclass or \texttt{AnyObject} requirement on \texttt{Self}. -\item Finally, if the subject type of a requirement is not \texttt{Self}, it must be a dependent member type. These requirements must be satisfied by the type witnesses of the conforming type. -\end{enumerate} -We'll have a lot to say about the representation of conformances, how they store type witnesses, and so on, in Chapter~\ref{conformances}. It is also worth mentioning that each one of the above checks (except for the first one concerning the existence of type witnesses) is actually an instance of the more general problem of checking whether concrete types satisfy generic requirements, which is something we'll cover in detail in Section~\ref{checking generic arguments}. +The \IndexFlag{debug-generic-signatures}\texttt{-debug-generic-signatures} frontend flag also prints the requirement signature of each protocol as it is type checked. While we print a requirement signature as a generic signature with the single generic parameter \texttt{Self}, requirement signatures and generic signatures are distinct in theory and implementation.
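In practice, both kinds of output can be produced by forwarding the frontend flag through the driver; for instance (a usage sketch, assuming the program is saved as \texttt{debug.swift}; the exact driver spelling may vary by toolchain):
\begin{Verbatim}
swiftc -Xfrontend -debug-generic-signatures -typecheck debug.swift
\end{Verbatim}
Here, \texttt{-Xfrontend} forwards the flag from the driver to the frontend, and \texttt{-typecheck} stops compilation after type checking.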
-Again, the requirement signature of a protocol defines a contract. One side is the concrete conforming type. The other side is a \emph{conformance requirement in a generic signature}. Let's look at the simplest case. The generic signature of a protocol, say \texttt{Sequence}, always has a single generic parameter \texttt{Self} together with a single conformance requirement $\ConfReq{Self}{Sequence}$: +Our examples often use the fact that the \texttt{Sequence} protocol states two associated requirements: +\begin{gather*} +\ConfReq{Self.Iterator}{IteratorProtocol}_\texttt{Sequence}\\ +\SameReq{Self.Element}{Self.Iterator.Element}_\texttt{Sequence} +\end{gather*} + +\begin{example}\label{motivating derived reqs} +To motivate the formal system in the next section, we consider how the associated requirements of a protocol can manifest in a generic signature. Here is a generic function that states a conformance requirement to \texttt{Sequence}---two, in fact: +\begin{Verbatim} +func firstTwoEqual(_ s1: S1, _ s2: S2) + where S1.Element == S2.Element, S1.Element: Equatable { + var iter1 = s1.makeIterator() + var iter2 = s2.makeIterator() + return iter1.next()! == iter2.next()! +} +\end{Verbatim} +Instead of the \index{sugared type}sugared \index{generic parameter types}generic parameter types ``\texttt{S1}'' and ``\texttt{S2}'', we will use canonical types to emphasize that type sugar has no semantic effect. 
Here is the generic signature of \texttt{firstTwoEqual()}: \begin{quote} \begin{verbatim} -<Self where Self: Sequence> +<τ_0_0, τ_0_1 where τ_0_0: Sequence, τ_0_1: Sequence, + τ_0_0.Element: Equatable, + τ_0_0.Element == τ_0_1.Element> \end{verbatim} \end{quote} -The requirement signature of \texttt{Sequence}, on the other hand, actually encodes information about the protocol: +We expect roughly the following to happen while type checking the function body: +\begin{enumerate} +\item ``\texttt{s1}'' has type $\rT$, which conforms to \texttt{Sequence}, so it has a \texttt{makeIterator()} method that we can call. +\item This method returns \texttt{Self.Iterator}. We substitute \texttt{Self} with $\rT$, and conclude that ``\texttt{iter1}'' has type \texttt{\rT.Iterator}. +\item Similarly, ``\texttt{iter2}'' has type \texttt{\rU.Iterator}. +\end{enumerate} +Then, consider the sub-expression ``\verb|iter1.next()!|'': +\begin{enumerate} +\item The requirement signature of \texttt{Sequence} tells us that \texttt{\rT.Iterator} conforms to \texttt{IteratorProtocol}, so we know it has a \texttt{next()} method. +\item This method returns \texttt{Optional<Self.Element>}, and we substitute \texttt{Self} with \texttt{\rT.Iterator} to get \texttt{Optional<\rT.Iterator.Element>}. +\item The forced unwrap expression has type \texttt{\rT.Iterator.Element}. +\end{enumerate} +A similar analysis shows that ``\verb|iter2.next()!|'' has type \texttt{\rU.Iterator.Element}. We now look at the interface type of the \texttt{==} operator in the \texttt{Equatable} protocol: \begin{quote} \begin{verbatim} -<Self where Self.Element == Self.Iterator.Element, - Self.Iterator: IteratorProtocol> + (Self, Self) -> Bool \end{verbatim} \end{quote} -Intuitively, the conformance requirement $\ConfReq{Self}{Sequence}$ in the generic signature of \texttt{Sequence} applies all requirements written in the requirement signature of \texttt{Sequence}, even though they're not written down in the generic signature.
While this might seem pointless at first---why not just encode all of these requirements in the \emph{generic} signature of \texttt{Sequence}?---the next section will make clear what is actually going on. +We see that for the expression ``\verb|iter1.next()! == iter2.next()!|'' to type check, the compiler must convince itself that the two requirements below are implied by the explicit requirements of our generic signature (together with the requirement signature of \texttt{Sequence}): +\begin{gather*} +\SameReq{\rT.Iterator.Element}{\rU.Iterator.Element}\\ +\ConfReq{\rT.Iterator.Element}{Equatable} +\end{gather*} +We will formalize what this actually means first, and eventually describe a \index{decision procedure}\emph{decision procedure}. It will answer affirmatively for either of the above two requirements, but will, for example, report a negative answer for the requirement below, because it is not a consequence of our signature: +\begin{gather*} +\ConfReq{\rT.Iterator}{Equatable} +\end{gather*} +Our decision procedure will also tell us that \texttt{\rT.Iterator.Element} is a valid type parameter, but something like \texttt{\rT.Element.Iterator} is not. +\end{example} -\paragraph{Debugging} -The \texttt{-debug-generic-signatures} frontend flag also prints the requirement signature of each protocol that is type checked. The written representation of a requirement signature looks like a generic signature over the protocol's single \texttt{Self} generic parameter. We use the same printed representations for both requirement signatures and generic signatures, but they are not interchangeable. A requirement signature is almost never going to be a valid generic signature, because the conformance requirement on \texttt{Self} is implicit in the requirement signature.
+\section{Derived Requirements}\label{derived req} -\paragraph{Protocol type aliases} -\index{protocol type alias} -Requirement signatures also store a compact description of all protocol type aliases defined within the protocol; these are used when resolving \texttt{where} clause requirements involving subject types that name protocol type aliases (Section~\ref{building rules}). Protocol type aliases are not shown by the \texttt{-debug-generic-signatures} flag. +This section and the one that follows will define the \IndexDefinition{derived requirement}\emph{derived requirements} and \IndexDefinition{valid type parameter}\emph{valid type parameters} of a generic signature. This will allow us to precisely state those questions the compiler must be able to answer about generic signatures while type checking generic declarations. -\section{Derived Requirements}\label{derived req} +We do this by working in a system of deductive reasoning, or a \IndexDefinition{formal system}\emph{formal system}, of the sort studied in mathematical logic \cite{curry}. We define our formal system in relation to a fixed generic signature~$G$, so each generic signature has its own corresponding formal system. A formal system has three constituent parts, which we describe somewhat informally: +\begin{enumerate} +\item A description of the possible \emph{statements} in the formal system, generated by finite combination from a fixed set of symbols. -\cite{combinatory} +Our statements are requirements, such as $\ConfReq{\rT.Element}{Equatable}$, and type parameters, such as \texttt{\rU.Iterator}; their syntactic structure is quite simple, as we've already seen. +\item A finite set of \IndexDefinition{elementary statement}\emph{elementary statements} which are assumed to be true. -What \emph{exactly} are these mysterious \index{dependent member type}dependent member types, the \IndexDefinition{type parameter}type parameters that are not top-level generic parameters? 
The definition offered so far---a dependent member type stores a base type parameter together with a reference to an identifier or associated type declaration---is unsatisfying, for several reasons. Primarily, it does not address the question of which \IndexDefinition{valid type parameter}dependent member types are valid, or where they come from. It turns out the answer to this question is intertwined with the notion of a \IndexDefinition{derived requirement}\emph{derived requirement}. +Our elementary statements are the \index{generic parameter type}generic parameters and \index{explicit requirement}explicit requirements of our generic signature~$G$; all generic parameters are valid type parameters, and all explicit requirements are trivially ``derived.'' -Just like type parameters are more general than generic parameters, we can talk about requirements that are ``known to be true'' as a more general concept than the minimal requirements directly stated in the generic signature. This section will define a set of \emph{derivation rules} which start with a base set of assumptions---the generic parameters and minimal requirements of a generic signature---and prove new type parameters and derived requirements. Understanding this formalism will motivate much of the rest of the book. +\item A specification of the \IndexDefinition{inference rule}\emph{inference rules} for deriving new statements from previous statements, in a manner that depends only on their formal syntactic structure. -Let's take the following generic signature (it's a canonical signature, hence the lack of generic parameter names, but that doesn't really matter): -\begin{quote} -\begin{verbatim} -<τ_0_0, τ_0_1 where τ_0_1: Sequence, - τ_0_0 == τ_0_1.[Sequence]Element> -\end{verbatim} -\end{quote} -What are the type parameters of this signature? We can start with the generic parameter types \ttgp{0}{0} and \ttgp{0}{1}.
These appear directly in the generic signature so we can write them down without further justification. Let's introduce some new notation: -\begin{gather*} -\vdash\ttgp{0}{0}\\ -\vdash\ttgp{0}{1} -\end{gather*} -This symbol \index{$\vdash$}\index{$\vdash$!z@\igobble|seealso{derived requirement}}$\vdash$ means that we proved the thing on the right, from the assumptions on the left (this is sometimes called the \IndexDefinition{turnstile operator}``turnstile operator''). In the above, the validity of the generic parameter types follows immediately from ``first principles''; so there is nothing on the left side of the $\vdash$. In general, the assumptions must be facts previously proved. - -To prove the existence of other type parameters, we need to use the conformance requirement $\ConfReq{\ttgp{0}{1}}{Sequence}$. This is one of the minimal requirements of the generic signature, so again we just re-state it, but we'll give it a number so that we don't have to ``prove'' it again: -\begin{gather} -\vdash\ConfReq{\ttgp{0}{1}}{Sequence}\tag{1} -\end{gather} -So far this is just abstract nonsense, but here's the trick. The \texttt{Sequence} protocol declares two associated types, and we know \ttgp{0}{1} conforms to \texttt{Sequence}. This gives us a pair of dependent member types: +We will describe inference rules for requirements and type parameters shortly. +\end{enumerate} + +The \IndexDefinition{theory}\emph{theory} of a formal system is the set of all statements we can prove within the system. This always includes the elementary statements, but it should not include \emph{all} possible statements. (A theory where \emph{everything} is true explains nothing.) + +The theory of a generic signature is the set union of its derived requirements and valid type parameters.
To demonstrate membership in this set, we write down a \IndexDefinition{derivation}\index{derivation|seealso{derived requirement}}\emph{derivation}---a finite sequence of steps that give a constructive proof of our desired statement as a consequence of the formal system: +\begin{enumerate} +\item A derivation necessarily begins by deriving one or more elementary statements. +\item This is followed by zero or more steps that derive new statements from previous statements via inference rules. +\item The derivation ends with the final step proving the conclusion. (However, we can also think of a derivation as proving multiple things, if we take \emph{all} conclusions from each intermediate step.) +\end{enumerate} +There can be many ways to prove the same thing, and we allow ``useless'' steps whose conclusions are unused, so derivations are not unique in general. The point of writing down a derivation is that we can \emph{check} the reasoning at each step, and convince ourselves the conclusion is a true statement in the theory. + +A remark for practitioners. We use our formal system to \emph{specify} various behaviors of the implementation, but the compiler itself does not directly encode derivations as data structures. (We learn in \ChapRef{symbols terms rules} that we reason about derived requirements and valid type parameters by translating the problem into a \emph{string rewrite system}.) + +\paragraph{Notation.} A \IndexDefinition{derivation step}derivation step will always be written on a single line, with the conclusion first; then on the right-hand side we write the ``kind'' of derivation step in small caps, followed by the list of assumptions. The conclusion and assumptions are statements: +\[\textsl{conclusion}\tag{\textsc{Kind} \textsl{assumption}}\] +An \emph{elementary derivation step} has no assumptions, so it proves an elementary statement. 
Any other kind of derivation step applies an inference rule to one or more assumptions, which are themselves conclusions of previous steps. When discussing a specific derivation step, we write the assumptions in place. When listing a derivation, we prefer to be concise so we \emph{number} each step and refer to the conclusion of a prior step by number: \begin{gather*} -(1)\vdash\texttt{\ttgp{0}{1}.[Sequence]Iterator}\\ -(1)\vdash\texttt{\ttgp{0}{1}.[Sequence]Element} +1.\ \textsl{an elementary statement of ``Pooh'' sort}\tag{\textsc{Pooh}}\\ +2.\ \textsl{a consequence of the above by the ``Piglet'' principle}\tag{\textsc{Piglet} 1} \end{gather*} -This time, our derivations actually made use of a previously-proven fact, albeit still in a rather trivial way. For the next step, we recall that the requirement signature of \texttt{Sequence} conforms \texttt{Self.[Sequence]Iterator} to \texttt{IteratorProtocol}. Just like the minimal requirements of a generic signature, the minimal requirements of a requirement signature can be immediately stated: -\begin{gather} -\vdash\ConfReq{Self.[Sequence]Iterator}{IteratorProtocol}_\texttt{Sequence}\tag{2} -\end{gather} -The above requirement applies to the requirement signature of \texttt{Sequence}. If we replace \texttt{Self} with \ttgp{0}{1} in (2), we get a derived requirement for our generic signature. This derivation is valid, because we established that \ttgp{0}{1} conforms to \texttt{Sequence} in (1). Thus, -\begin{gather} -(1),\,(2)\vdash\ConfReq{\ttgp{0}{1}.[Sequence]Iterator}{IteratorProtocol}\tag{3} -\end{gather} -Now, \texttt{IteratorProtocol} declares an associated type named \texttt{Element}, so we can derive a third type parameter: -\[(3)\vdash\texttt{\ttgp{0}{1}.[Sequence]Iterator.[IteratorProtocol]Element}\] -This last derivation is now non-trivial---it is a consequence of two requirements, one in our generic signature and one in the requirement signature of \texttt{Sequence}. Let's do one more. 
We begin by recalling the same-type requirement in our generic signature:
-\begin{gather}
-\vdash\FormalReq{\ttgp{0}{0} == \ttgp{0}{1}.[Sequence]Element}\tag{4}
-\end{gather}
-We also make use of that same-type requirement in the \texttt{Sequence} protocol; to avoid overfull hboxes, let's abbreviate the protocol qualification in those dependent member types:
-\begin{gather}
-\vdash\FormalReq{Self.[S]Element == Self.[S]Iterator.[I]Element}_\texttt{Sequence}\tag{5}
-\end{gather}
-Let's use this requirement signature requirement to derive a same-type requirement in our generic signature:
-\begin{gather}
-(1),\,(4)\vdash\FormalReq{\ttgp{0}{1}.[S]Element == \ttgp{0}{1}.[S]Iterator.[I]Element}\tag{6}
-\end{gather}
-Notice that the right hand side of $(5)$ is identical to the left hand side of $(6)$. We can derive one more same-type requirement:
-\begin{gather}
-(5),\,(6)\vdash\FormalReq{\ttgp{0}{0} == \ttgp{0}{1}.[S]Iterator.[I]Element}\tag{7}
-\end{gather}
-Requirement (7) is something which might be intuitively obvious, but it was not written down anywhere, and in fact the derivation makes use of \emph{every} requirement from both the generic signature itself and the requirement signature of \texttt{Sequence}.
+We use the ``turnstile operator'' $\vdash$ as a predicate. If $G$ is a generic signature and $D$ is a requirement or type parameter, \index{$\vdash$}\index{$\vdash$!z@\igobble|seealso{derivation}}$G\vdash D$ means $D$ is an element of the theory of $G$, as in ``if $G\vdash D$, then \ldots''.
-We'll now enumerate all derivation rules. It is important to realize that we're working in a single generic signature, but we can make use of requirements from the requirement signature of multiple protocols.
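+
+For example, if $G$ is a generic signature whose only explicit requirement is $\ConfReq{\ttgp{0}{0}}{Sequence}$, then both of the following statements are in the theory of $G$, as we will soon be able to derive:
+\begin{gather*}
+G\vdash\texttt{\ttgp{0}{0}.Iterator}\\
+G\vdash\ConfReq{\ttgp{0}{0}.Iterator}{IteratorProtocol}
+\end{gather*}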
+Our formal system has 6 kinds of elementary statements and 17 inference rules; this section together with the next will explain them all, with examples in-between: +\begin{enumerate} +\item We start with \index{conformance requirement}conformance requirements, and \index{same-type requirement}same-type requirements between type parameters. The theory completely describes the behavior of this subset of the language. We will also sketch out the remaining requirement kinds. +\item The next section defines some more inference rules so that the derived same-type requirements define an equivalence relation on type parameters. We will study the equivalence classes generated by this relation. +\item We initially only consider type parameters formed from unbound dependent member types. The next section will also extend our theory to explain bound dependent member types, but this won't reveal anything fundamentally new. +\end{enumerate} -\paragraph{Initial derivations} -Every \index{generic parameter type}generic parameter in a generic signature is immediately a valid type parameter for this generic signature, with no assumptions: -\[\vdash\ttgp{d}{i}\] -\IndexStepDefinition{GenSig}Every minimal requirement of our generic signature can be immediately derived: +\paragraph{Elementary statements.} +Let $G$ be a generic signature. We can \IndexStepDefinition{Generic}derive each generic parameter $\ttgp{d}{i}$ of $G$: \begin{gather*} -\vdash\ConfReq{T}{P}\tag{\textsc{GenSig}}\\ -\vdash\ConfReq{T}{C}\\ -\vdash\ConfReq{T}{AnyObject}\\ -\vdash\FormalReq{T == U} +\GenericStepDef \end{gather*} - -\paragraph{Dependent member types} \IndexStepDefinition{AssocType}All other type parameters are dependent member types, and they owe their existence to conformance requirements. 
If we have a \index{conformance requirement}conformance requirement $\ConfReq{T}{P}$, and the protocol \texttt{P} declares an \index{associated type declaration}associated type named \texttt{A}, we get a pair of valid type parameters from this conformance requirement. These are the \index{unbound dependent member type}unbound and \index{bound dependent member type}bound \index{dependent member type}dependent member type for \texttt{A} with base type \texttt{T}. We also want them to be equivalent, so we derive a same-type requirement to that effect: +Let $\ConfReq{T}{P}$ be an explicit \index{conformance requirement}conformance requirement of $G$, so \texttt{T} is some type parameter and \texttt{P} is a protocol. We can \IndexStepDefinition{Conf}derive this requirement: \begin{gather*} -\ConfReq{T}{P}\vdash\texttt{T.A}\tag{\textsc{AssocType}}\\ -\ConfReq{T}{P}\vdash\texttt{T.[P]A}\\ -\ConfReq{T}{P}\vdash\FormalReq{T.[P]A == T.A} +\ConfStepDef +\end{gather*} +Let $\SameReq{T}{U}$ be an explicit \index{same-type requirement}same-type requirement of $G$, with \texttt{T}~and~\texttt{U} type parameters. We can \IndexStepDefinition{Same}derive this requirement: +\begin{gather*} +\SameStepDef \end{gather*} -\paragraph{Requirement signatures} -\IndexStepDefinition{ReqSig} For every protocol \texttt{P}, every minimal requirement of the requirement signature of \texttt{P} can be immediately derived; they are annotated with their protocol, making them distinct from other requirements of other protocols: +\paragraph{Requirement signatures.} +We now assume we have derived a specific conformance requirement $\ConfReq{T}{P}$ for some type parameter~\texttt{T} and protocol~\texttt{P}. 
Each element of the requirement signature of \texttt{P} defines an inference rule that generates a new statement: +\begin{center} +$\ConfReq{T}{P} + \text{requirement signature of \texttt{P}} = \text{more statements}$ +\end{center} +The three kinds of inference rules are \textsc{AssocName}, \textsc{AssocConf}, and \textsc{AssocSame}. They correspond to requirement signature elements in the same way that the \textsc{Generic}, \textsc{Conf} and \textsc{Same} elementary statements correspond to generic signature elements. + +From each \index{associated type declaration}associated type declaration~\texttt{A} of~\texttt{P}, the \IndexStepDefinition{AssocName}\textsc{AssocName} inference rule derives the \index{unbound dependent member type}unbound \index{dependent member type}dependent member type \texttt{T.A}, with base type~\texttt{T} and identifier~\texttt{A}. Notice how this step is a consequence of its assumption, the assumption being the original conformance requirement: \begin{gather*} -\vdash\ConfReq{Self.U}{P}_\texttt{P}\tag{\textsc{ReqSig}}\\ -\vdash\ConfReq{Self.U}{C}_\texttt{P}\\ -\vdash\ConfReq{Self.U}{AnyObject}_\texttt{P}\\ -\vdash\FormalReq{Self.U == Self.V}_\texttt{P} +\AssocNameStepDef \end{gather*} -These requirements, all rooted in the \Index{protocol Self type@protocol \texttt{Self} type}protocol \texttt{Self} type, come into play if we are also able to derive a conformance requirement $\ConfReq{T}{P}$. In fact, while any requirement signature requirement can be immediately derived as above, there is no way to make use of one unless one can prove a conformance requirement first. The idea of which protocols we can ``reach'' comes up again in Section~\ref{recursive conformances} and \ref{protocol component}. -\IndexStepDefinition{Conf} Suppose then we have our conformance requirement. 
We can combine it with a requirement signature requirement, ``rebasing'' the requirement signature requirement from the \texttt{Self} type to the conformance requirement's subject type \texttt{T}: +From each \index{associated conformance requirement}associated conformance requirement of~\texttt{P}, the \IndexStepDefinition{AssocConf}\textsc{AssocConf} inference rule derives the conformance requirement obtained by substituting~\texttt{T} in place of \Index{protocol Self type@protocol \texttt{Self} type}\texttt{Self}. An arbitrary \index{associated conformance requirement}associated conformance requirement takes the form $\ConfReq{Self.U}{Q}_\texttt{P}$ for some type parameter \texttt{Self.U} and protocol~\texttt{Q}. The substituted type parameter we described above is denoted by ``\texttt{T.U}'': \begin{gather*} -\ConfReq{T}{P},\,\ConfReq{Self.U}{Q}\vdash\FormalReq{T.U:~Q}\tag{\textsc{Conf}}\\ -\ConfReq{T}{P},\,\ConfReq{Self.U}{C}\vdash\FormalReq{T.U:~C}\\ -\ConfReq{T}{P},\,\ConfReq{Self.U}{AnyObject}\vdash\FormalReq{T.U:~AnyObject}\\ -\ConfReq{T}{P},\,\FormalReq{Self.U == Self.V}\vdash\FormalReq{T.U == T.V} +\AssocConfStepDef \end{gather*} -The \texttt{Self.U} can be a dependent member type nested to any depth, so really \texttt{Self.A.B}, and so on. The construction of \texttt{T.U} from \texttt{Self.U} and \texttt{T} is not completely trivial; we will show in Section~\ref{contextsubstmap} that this is understood as applying a \index{protocol substitution map}\emph{protocol substitution map} to \texttt{Self.U}. -\paragraph{Same-type requirements} -From every valid type parameter \texttt{T}, we \IndexStepDefinition{Equiv}derive a vacuous \index{same-type requirement}same-type requirement $\FormalReq{T == T}$ stating that the type parameter is equivalent to itself. 
While this doesn't give us anything new, it provides the justification for considering such a requirement as redundant if written by the user: -\[\texttt{T}\vdash\FormalReq{T == T}\tag{\textsc{Equiv}}\] -We can derive a new requirement from a same-type requirement by flipping the two types around: -\[\FormalReq{T == U}\vdash\FormalReq{U == T}\] -Same-type requirements combine as follows: if we have two same-type requirements where the first type of the second is equal to the second type of the first, we can derive a third same-type requirement relating the other pair of types: -\[\FormalReq{T == U},\,\FormalReq{U == V}\vdash\FormalReq{T == V}\] -A same-type requirement \FormalReq{T == U} also combines with the other requirement kinds. From a requirement with subject type \texttt{U}, we can \IndexStepDefinition{Same}derive a corresponding requirement with subject type \texttt{T}. This is actually three derivations, one for conformance, superclass and layout requirements, respectively: +From each \index{associated same-type requirement}associated same-type requirement of~\texttt{P}, the \IndexStepDefinition{AssocSame}\textsc{AssocSame} inference rule derives a new same-type requirement by substituting \texttt{T} in place of \texttt{Self}. The most general form of an \index{associated same-type requirement}associated same-type requirement is $\SameReq{Self.U}{Self.V}_\texttt{P}$ for type parameters \texttt{Self.U} and \texttt{Self.V}, so we're forming a same-type requirement from the two substituted type parameters ``\texttt{T.U}'' and ``\texttt{T.V}'': \begin{gather*} -\ConfReq{U}{P},\,\FormalReq{T == U}\vdash\ConfReq{T}{P}\tag{\textsc{Same}}\\ -\ConfReq{U}{C},\,\FormalReq{T == U}\vdash\ConfReq{T}{C}\\ -\ConfReq{U}{AnyObject},\,\FormalReq{T == U}\vdash\ConfReq{T}{AnyObject} +\AssocSameStepDef \end{gather*} -The above derivations only go in one direction, but we don't lose any generality by doing so. Suppose we have $\ConfReq{T}{P}$ and $\FormalReq{T == U}$. 
We cannot immediately conclude that $\ConfReq{U}{P}$, because the first rule does not apply. However, if we first derive $\FormalReq{U == T}$ by symmetry, we can then derive that $\ConfReq{U}{P}$. -Section~\ref{reducedtypes} develops the idea where two type parameters are essentially equivalent if we can derive a same-type requirement between them. +We will omit the associated requirement from the list of assumptions when listing a derivation in a fixed generic signature, because there is no ambiguity in doing so. + +A few words about the meta-syntactic variables used above, such as ``\texttt{T}'' and ``\texttt{U}'' and so on. In the conclusion of an elementary derivation step, they specify the most general form of an explicit requirement of~$G$; we're saying that we have a fixed set of elementary derivation steps, and each one can be obtained from the schema by suitable substitution of concrete entities in place of ``\texttt{T}'', ``\texttt{U}'' and ``\texttt{P}''. When meta-syntactic variables appear in the assumptions of a derivation step as above, they mean something else---we're defining the inference rule by pattern matching, and we can make use of the rule as long as a suitable substitution of the meta-syntactic variables matches each assumption to the conclusion of a previous step. -\paragraph{Member types} -\IndexStepDefinition{Member} Conformance requirements and same-type requirements have a further interaction, where conformance to a protocol with an associated type \texttt{A} implies a same-type requirement between the corresponding dependent member types: +The general form of a type parameter inside an associated requirement is denoted by ``\texttt{Self.U}'' because it's a type parameter for the protocol generic signature~$G_\texttt{P}$, so it must be \texttt{Self} recursively wrapped in dependent member types, \emph{zero} or more times. This means that \texttt{Self} itself is a valid substitution for ``\texttt{Self.U}''. 
This comes up with \index{protocol inheritance}protocol inheritance. If \texttt{Derived} inherits from \texttt{Base}, we have the associated requirement $\ConfReq{Self}{Base}_\texttt{Derived}$. Given a derivation of $\ConfReq{T}{Derived}$, we get a derivation of $\ConfReq{T}{Base}$ by adding an \textsc{AssocConf} derivation step for our associated requirement.
+
+\begin{example}\label{motivating derived equiv}
+The inference rules presented so far can already prove some interesting statements, but they're not sufficient to explain everything in the informal overview from \ExRef{motivating derived reqs}. Let's revisit the generic signature from that example:
+\begin{quote}
+\begin{verbatim}
+<τ_0_0, τ_0_1 where τ_0_0: Sequence, τ_0_1: Sequence,
+ τ_0_0.Element == τ_0_1.Element,
+ τ_0_0.Element: Equatable>
+\end{verbatim}
+\end{quote}
+The \texttt{Sequence} protocol declares two associated types \texttt{Iterator} and \texttt{Element}, and since $\rT$ conforms to \texttt{Sequence}, we can derive \texttt{\rT.Element}, for example:
+\begin{gather*}
+\ConfStep{\rT}{Sequence}{1}\\
+\AssocNameStep{1}{\rT.Element}{2}
+\end{gather*}
+We can derive \texttt{\rT.Iterator} similarly. Now consider \texttt{\rT.Iterator.Element}.
We recall that \texttt{Sequence} declares two associated requirements: +\begin{gather*} +\ConfReq{Self.Iterator}{IteratorProtocol}_\texttt{Sequence}\\ +\SameReq{Self.Element}{Self.Iterator.Element}_\texttt{Sequence} +\end{gather*} +The first associated requirement allows us to derive that \texttt{\rT.Iterator} conforms to \texttt{IteratorProtocol}, and from there we can derive \texttt{\rT.Iterator.Element}: +\begin{gather*} +\ConfStep{\rT}{Sequence}{1}\\ +\AssocConfStep{1}{\rT.Iterator}{IteratorProtocol}{2}\\ +\AssocNameStep{2}{\rT.Iterator.Element}{3} +\end{gather*} +We can also derive \texttt{\rU.Iterator.Element} if we start from $\ConfReq{\rU}{Sequence}$ instead: +\begin{gather*} +\ConfStep{\rU}{Sequence}{1}\\ +\AssocConfStep{1}{\rU.Iterator}{IteratorProtocol}{2}\\ +\AssocNameStep{2}{\rU.Iterator.Element}{3} +\end{gather*} +Remember our original goal in \ExRef{motivating derived reqs} was to establish a same-type requirement relating \texttt{\rT.Iterator.Element} with \texttt{\rU.Iterator.Element}, and not just to show that these two type parameters exist. We're going to attempt to write down a derivation. 
The associated same-type requirement of \texttt{Sequence} gives us a same-type requirement between \texttt{\rT.Element} and \texttt{\rT.Iterator.Element}:
+\begin{gather*}
+\ConfStep{\rT}{Sequence}{1}\\
+\AssocSameStep{1}{\rT.Element}{\rT.Iterator.Element}{2}
+\end{gather*}
+We can also derive a similar statement about \rU:
+\begin{gather*}
+\ConfStep{\rU}{Sequence}{3}\\
+\AssocSameStep{3}{\rU.Element}{\rU.Iterator.Element}{4}
+\end{gather*}
+Remember that we also have an explicit same-type requirement:
+\begin{gather*}
+\SameStep{\rT.Element}{\rU.Element}{5}
+\end{gather*}
+At this point, we have derived the three same-type requirements (2), (4) and (5), but we have no apparent way to conclude anything else:
\begin{gather*}
-\ConfReq{U}{P},\,\FormalReq{T == U}\vdash\FormalReq{T.A == U.A}\tag{\textsc{Member}}\\
-\ConfReq{U}{P},\,\FormalReq{T == U}\vdash\FormalReq{T.[P]A == U.[P]A}
-\end{gather*}
-This has the intuitive interpretation that collapsing two type parameters into a single equivalence class also collapses the corresponding member types. Once we develop type substitution in Chapter~\ref{substmaps}, this derivation will also make sense when one of the two sides of the original same-type requirement is a concrete type.
-
-\paragraph{Concrete decomposition}
-\IndexStepDefinition{Concrete} The theory of requirements involving concrete types is more complex. If we derive a same-type requirement where both sides are concrete, we can break down the requirement into smaller requirements.
+\SameReq{\rT.Element}{\rT.Iterator.Element}\\
+\SameReq{\rU.Element}{\rU.Iterator.Element}\\
+\SameReq{\rT.Element}{\rU.Element}
+\end{gather*}
+We put this example aside until the next section, when we introduce more inference rules for working with same-type requirements between type parameters.
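+
+As an aside, here is a hypothetical Swift declaration whose generic signature is the one above, with \texttt{S1} and \texttt{S2} corresponding to \rT~and~\rU. The comparison \texttt{x == y} in its body only type checks because of the derived requirements we are trying to establish:
+\begin{Verbatim}
+func haveSameElements<S1: Sequence, S2: Sequence>(_ s1: S1, _ s2: S2) -> Bool
+    where S1.Element == S2.Element, S1.Element: Equatable {
+  var i1 = s1.makeIterator()  // i1: S1.Iterator
+  var i2 = s2.makeIterator()  // i2: S2.Iterator
+  while true {
+    // next() returns S1.Iterator.Element? and S2.Iterator.Element?
+    switch (i1.next(), i2.next()) {
+    case (nil, nil): return true
+    case let (x?, y?) where x == y: continue
+    default: return false
+    }
+  }
+}
+\end{Verbatim}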
+\end{example} -For example, from $\FormalReq{T == Array}$ and $\FormalReq{T == Array}$, we can derive the funny requirement $\FormalReq{Array == Array}$ using the same-type requirement derivations above. This requirement has concrete types with identical structure on both sides. We would like to derive $\FormalReq{U == Int}$ by collapsing the parallel structure: +\paragraph{Other requirements.} +We will now extend our formal system to cover concrete same-type requirements, superclass requirements and layout requirements, with the caveat that this extended theory is \index{limitation!derived requirements}incomplete. All inferences made by these rules are correct, but they do not describe all implemented behaviors of these other requirement kinds. We will see examples in \SecRef{genericsigqueries} and \SecRef{minimal requirements}. + +We add some more elementary statements. Let $G$ be a generic signature. \IndexStepDefinition{Concrete}If $\SameReq{T}{X}$ is an explicit \index{same-type requirement}same-type requirement of~$G$, where~\texttt{T} is a type parameter and~\texttt{X} is a concrete type (so \texttt{X} is not a type parameter but it may contain type parameters): +\begin{gather*} +\ConcreteStepDef +\end{gather*} +\IndexStepDefinition{Super}If $\ConfReq{T}{C}$ is an explicit \index{superclass requirement}superclass requirement of~$G$, so \texttt{C} is a \index{class type}class type: +\begin{gather*} +\SuperStepDef +\end{gather*} +\IndexStepDefinition{Layout}If $\ConfReq{T}{AnyObject}$ is an explicit \index{layout requirement}layout requirement of~$G$: \begin{gather*} -\FormalReq{Array == Array}\vdash\FormalReq{T == U}\tag{\textsc{Concrete}} +\LayoutStepDef \end{gather*} -In full generality, any structural or nominal type to be decomposed in this manner; an algorithm is presented in Section~\ref{requirement desugaring}. There might also be more than one way to decompose parallel structure, so this rule potentially derives more than one requirement. 
For example, -\begin{gather} -\ldots\vdash\FormalReq{((U) -> V) == ((String) -> Int)}\tag{1}\\ -(1)\vdash\FormalReq{U == String}\tag{2}\\ -(1)\vdash\FormalReq{V == Int}\tag{3} -\end{gather} +We also have inference rules for the \index{associated requirement}associated requirement forms of the above, so in what follows, we assume we can derive $\ConfReq{T}{P}$ for some \texttt{T}~and~\texttt{P}. -\paragraph{Concrete embedding} -If we start with $\FormalReq{T == Array}$ and $\FormalReq{U == Int}$, we want to conclude that $\FormalReq{T == Array}$, but we missing one more rule. We need to go in the other direction too, building up structure around both sides of a simpler same-type requirement: +Suppose that \texttt{P} declares an associated same-type requirement $\SameReq{Self.U}{X}_\texttt{P}$, for some \texttt{Self.U} and concrete type~\texttt{X}. We substitute \texttt{T} for \texttt{Self} inside \texttt{Self.U} to get a type parameter \texttt{T.U}, and inside \texttt{X} to get a concrete type~$\texttt{X}^\prime$. We can then \IndexStepDefinition{AssocConcrete}derive: \begin{gather*} -\FormalReq{T == U}\vdash\FormalReq{G == G} +\AssocConcreteStepDef \end{gather*} -With this rule, we could derive $\FormalReq{Array == Array}$ from $\FormalReq{U == Int}$, and finally combine it with the other same-type requirement to get $\FormalReq{T == Array}$. 
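+
+For example, a constrained protocol gives rise to an associated concrete same-type requirement. Consider this hypothetical declaration:
+\begin{Verbatim}
+protocol IntElements: Sequence where Element == Int {}
+\end{Verbatim}
+The requirement signature of \texttt{IntElements} contains the associated same-type requirement $\SameReq{Self.Element}{Int}_\texttt{IntElements}$, so if we can derive $\ConfReq{T}{IntElements}$ for some type parameter \texttt{T}, the \textsc{AssocConcrete} rule lets us derive $\SameReq{T.Element}{Int}$.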
+For an associated \IndexStepDefinition{AssocSuper}superclass requirement $\ConfReq{Self.U}{C}_\texttt{P}$, we substitute \texttt{T} for \texttt{Self} in \texttt{C} to get $\texttt{C}^\prime$: +\begin{gather*} +\AssocSuperStepDef +\end{gather*} +Finally, an \IndexStepDefinition{AssocLayout}associated layout requirement does not have anything to substitute on the right-hand side, so we derive the same statement about \texttt{T.U}: +\begin{gather*} +\AssocLayoutStepDef +\end{gather*} + +\pagebreak + +At this point, we've seen all of the kinds of elementary statements that exist, but a few more inference rules remain to be defined. Here is a quick summary of all the elementary statements and inference rules we've seen so far: +\begin{center} +\begin{tabular}{ccc} +\toprule +\multicolumn{3}{c}{\textbf{Basic model:}}\\ +\textsc{Generic}&\textsc{Conf}&\textsc{Same}\\ +\textsc{AssocName}&\textsc{AssocConf}&\textsc{AssocSame}\\ +\midrule +\multicolumn{3}{c}{\textbf{Other requirements:}}\\ +\textsc{Concrete}&\textsc{Super}&\textsc{Layout}\\ +\textsc{AssocConcrete}&\textsc{AssocSuper}&\textsc{AssocLayout}\\ +\bottomrule +\end{tabular} +\end{center} -\paragraph{There's more} -Requirements with concrete subject type are discussed in Section~\ref{checking generic arguments}. Superclass requirements need several derivation rules we won't talk about until Chapter~\ref{classinheritance}. +\section{Valid Type Parameters}\label{type params} -\section{The Type Parameter Order}\label{typeparams} +In \ChapRef{types}, we discussed how \index{type}types containing \index{type parameter}type parameters have two kinds of equality. \index{canonical type equality}Canonical type equality tells us if the type parameters are spelled in the same way, and \index{reduced type equality}\emph{reduced type equality}, relative to a generic signature, also takes \index{same-type requirement}same-type requirements into account. We are now in a position to define reduced type equality on type parameters. 
\SecRef{genericsigqueries} will generalize it to all interface types. -\index{type parameter order}% -\index{conformance requirement}% -\index{associated conformance requirement}% -\index{witness table}% -\index{mangling}% -\IndexDefinition{type parameter order}% -The type parameters of a generic signature are linearly ordered with respect to each other. This linear order defines reduced types and reduced type equality, and plays an important role in requirement minimization. It also surfaces directly in the Swift \index{ABI}ABI: -\begin{enumerate} -\item The calling convention of a generic function passes a witness table for each protocol conformance requirement in the function's generic signature. The conformance requirements are ordered by comparing their subject type. -\item The in-memory layout of a witness table is determined by the requirement signature of the protocol, with each associated conformance requirement corresponding to an entry that points at some other witness table. The associated conformance requirements are again ordered by comparing their subject type. -\item The mangled symbol names of generic functions encode their parameter and return types. If those types contain type parameters, the type parameters are reduced. -\end{enumerate} -\index{total order|see{linear order}} -Let's begin by first defining partial orders, and then linear orders as a special kind of partial order. Swift programmers will recognize the \texttt{Comparable} protocol as abstracting over types that have a linear order. For a more thorough treatment of relations and orders, consult a discrete mathematics textbook like \cite{grimaldi}. +Reduced type equality is an \emph{equivalence relation}, so we begin by reviewing this idea in an abstract setting. Given a fixed set that we call the \index{domain}\emph{domain}, a relation models some property that a pair of elements might have. 
In programming languages, ``relational operators'' are typically functions taking a pair of values to a true or false result. In mathematics, we instead imagine that a relation is the set of those \index{ordered pair}ordered pairs $(x,y)$ such that the property is true of $x$ and $y$. \begin{definition}\label{def relation} -A \IndexDefinition{relation}\emph{relation} on a \index{set}set $S$ is a \index{subset}subset of the \index{Cartesian product}Cartesian product $R\subseteq S\times S$ (so the elements of $R$ are \index{ordered pair}ordered pairs). A special kind of relation is a \IndexDefinition{partial order}\emph{partial order}. We say that $R$ is a partial order if it is anti-reflexive and transitive: +Let $S$ be a \index{set}set. A \IndexDefinition{relation}\emph{relation} with domain $S$ is a \index{subset}subset of the \index{Cartesian product}Cartesian product $S\times S$. +\end{definition} + +\begin{definition} +An \IndexDefinition{equivalence relation}\emph{equivalence relation} $R\subseteq S\times S$ is reflexive, symmetric and transitive: \begin{itemize} -\item $R$ is \emph{anti-reflexive} if $(x,x)\not\in R$ for any $x\in S$. -\item $R$ is \IndexDefinition{transitive relation}\emph{transitive} if whenever $(x,y)\in R$ and $(y,z)\in R$, then $(x,z)\in R$. +\item $R$ is \IndexDefinition{reflexive relation}\emph{reflexive} if $(x,x)\in R$ for all $x\in S$. +\item $R$ is \IndexDefinition{symmetric relation}\emph{symmetric} if $(x,y)\in R$ implies that $(y,x)\in R$ for all $x$, $y\in S$. +\item $R$ is \IndexDefinition{transitive relation}\emph{transitive} if $(x,y)$, $(y,z)\in R$ implies that $(x,z)\in R$ for all $x$, $y$, $z\in S$. \end{itemize} -If the partial order relation $R$ is understood from context, we can write $xx$ or $x=y$ simultaneously. 
-\begin{algorithm}[Generic parameter order]\label{generic parameter order} \IndexDefinition{generic parameter order}Takes \index{generic parameter type}generic parameter types \ttgp{d}{i} and \ttgp{D}{I} as input, where that all four of \texttt{d}, \texttt{i}, \texttt{D} and \texttt{I} are non-negative intgers. Returns one of ``$<$'', ``$>$'' or ``$=$'' as output. +\begin{proposition} +Let $R$ be an equivalence relation with some domain $S$. Every element of $S$ belongs to exactly one equivalence class of $R$. +\end{proposition} +\begin{proof} +The phrase ``exactly one'' is actually a shorthand for these two statements: \begin{enumerate} -\item If $\texttt{d}<\texttt{D}$, return ``$<$''. -\item If $\texttt{d}>\texttt{D}$, return ``$>$''. -\item If $\texttt{d}=\texttt{D}$ and $\texttt{i}<\texttt{I}$, return ``$<$''. -\item If $\texttt{d}=\texttt{D}$ and $\texttt{i}>\texttt{I}$, return ``$>$''. -\item If $\texttt{d}=\texttt{D}$ and $\texttt{i}=\texttt{I}$, return ``$=$''. +\item Every $x\in S$ belongs to \emph{at least} one equivalence class. +\item Every $x\in S$ belongs to \emph{at most} one equivalence class, so if $x\in\EquivClass{y}$ and $x\in\EquivClass{z}$ for some $y$, $z\in S$, then in fact $\EquivClass{y}=\EquivClass{z}$ (this is equality of sets, meaning they have the same elements). \end{enumerate} -\end{algorithm} -\IndexDefinition{protocol order}% -\begin{algorithm}[Protocol order] \label{linear protocol order} Takes protocols \texttt{P} and \texttt{Q} as input, and returns one of ``$<$'', ``$>$'' or ``$=$'' as output. + +For the first part, the assumption that $R$ is reflexive says that $(x,x)\in R$ for all $x\in S$. By definition of $\EquivClass{x}$, this means that $x\in\EquivClass{x}$, so every $x$ is, at the very least, an element of its \emph{own} equivalence class $\EquivClass{x}$. + +For the second part, assume that for some $y$, $z\in S$, there exists an $x\in\EquivClass{y}\cap\EquivClass{z}$. 
To show that $\EquivClass{y}=\EquivClass{z}$, we prove that $\EquivClass{y}\subseteq\EquivClass{z}$ and $\EquivClass{z}\subseteq\EquivClass{y}$ separately. For the ``$\subseteq$'' direction, we see that if we take an arbitrary element $t\in\EquivClass{y}$, we can chase a series of equivalences to derive that $t\in\EquivClass{z}$: \begin{enumerate} -\item Compare the names of the modules of \texttt{P} and \texttt{Q} lexicographically. Return the result if it is ``$<$'' or ``$>$''. Otherwise, both are defined in the same module, so keep going. -\item Compare the names of \texttt{P} and \texttt{Q} lexicographically and return the result. +\item $t\in\EquivClass{y}$ and $x\in\EquivClass{y}$, so $(t,y)$, $(x,y)\in R$. +\item $(x,y)\in R$, but $R$ is symmetric, so $(y,x)\in R$. +\item $(t,y)$, $(y,x)\in R$, but $R$ is transitive, so $(t,x)\in R$. +\item $(t,x)$, $(x,z)\in R$, but $R$ is transitive, so $(t,z)\in R$; that is, $t\in\EquivClass{z}$. \end{enumerate} -\end{algorithm} +This gives us $\EquivClass{y}\subseteq\EquivClass{z}$. To prove $\EquivClass{z}\subseteq\EquivClass{y}$, we pretend to copy and paste the above, and swap $y$~and~$z$ everywhere that they appear. (We tend to avoid writing out the other direction of the proof in a situation like this where it is completely mechanical.) + +We've shown that if two equivalence classes of $S$ have at least one element in common, they must in fact coincide. The only other possibility is that the two equivalence classes are disjoint, that is, their \index{intersection}intersection is the empty set. Note that we made use of all three defining properties of an equivalence relation; the result no longer holds if we relax either assumption of reflexivity, symmetry or transitivity. +\end{proof} + +\paragraph{Reduced type equality.} We now define an equivalence relation on type parameters. 
We want to say that two type parameters are equivalent if we can derive a same-type requirement between them: + +\begin{definition} +Let $G$ be a generic signature. The \IndexDefinition{reduced type equality}\emph{reduced type equality} relation on the valid type parameters of $G$ is the set of all pairs \texttt{T} and \texttt{U} such that $G\vdash\SameReq{T}{U}$. +\end{definition} + +For this to work, we must extend our formal system with three new inference rules, one for each defining axiom of an equivalence relation. The first rule says that if we can derive a valid type parameter~\texttt{T}, we can \IndexStepDefinition{Reflex}derive the trivial \index{same-type requirement}same-type requirement $\SameReq{T}{T}$. Note that we're deriving a \emph{requirement} from a \emph{type parameter}; this is the only time we can do that: +\begin{gather*} +\ReflexStepDef +\end{gather*} +Next, if we can derive a same-type requirement $\SameReq{T}{U}$ for type parameters \texttt{T} and \texttt{U}, we can \IndexStepDefinition{Sym}derive the opposite same-type requirement: +\begin{gather*} +\SymStepDef +\end{gather*} +Finally, if we can derive two same-type requirements $\SameReq{T}{U}$ and $\SameReq{U}{V}$ where the right-hand side of the first is identical to the left-hand side of the second, we can \IndexStepDefinition{Trans}derive a same-type requirement that gets us from one end to the other in one jump: +\begin{gather*} +\TransStepDef +\end{gather*} +Given these new rules, we immediately see: +\begin{proposition} +Reduced type equality is an equivalence relation. +\end{proposition} +\begin{proof} +Each axiom follows from the corresponding inference rule: +\begin{itemize} +\item +(Reflexivity) Suppose that \texttt{T} is a valid type parameter of $G$. Given a derivation $G\vdash\texttt{T}$, we derive $G\vdash\SameReq{T}{T}$ via \IndexStep{Reflex}\textsc{Reflex}. Thus, \texttt{T} is equivalent to itself. +\item +(Symmetry) Suppose that \texttt{T} is equivalent to \texttt{U}. 
Given a derivation $G\vdash\SameReq{T}{U}$, we derive $G\vdash\SameReq{U}{T}$ via \IndexStep{Sym}\textsc{Sym}. Thus, \texttt{U} is equivalent to \texttt{T}. +\item +(Transitivity) Suppose that \texttt{T} is equivalent to \texttt{U}, and \texttt{U} is equivalent to \texttt{V}. Given the two derivations $G\vdash\SameReq{T}{U}$ and $G\vdash\SameReq{U}{V}$, we concatenate them together, and derive $G\vdash\SameReq{T}{V}$ via \IndexStep{Trans}\textsc{Trans}. Thus, \texttt{T} is equivalent to \texttt{V}. +\end{itemize} + +We are careful to limit our relation's domain to the valid type parameters of~$G$, instead of considering all type parameters that can be formed syntactically. If \texttt{T} is some invalid type parameter that cannot be derived from $G$, we cannot use \textsc{Reflex} to derive $\SameReq{T}{T}$, so we would no longer have an equivalence relation. +\end{proof} + +So far, nothing prevents us from deriving a requirement $\SameReq{T}{U}$ where \texttt{T}~or~\texttt{U} is not itself derivable. If this happens, we exclude the ordered pair from our relation. We will not worry about this for now, because \SecRef{generic signature validity} will show that such situations can be ruled out by diagnosing invalid requirements written by the user. + +\begin{example}\label{derived equiv example} +The proof that reduced type equality is an equivalence relation may seem useless or circular, but the point is that we can use these inference rules to derive more requirements.
Returning to \ExRef{motivating derived equiv}, we had derived (2), (4) and (5) below: +\begin{gather*} +\ConfStep{\rT}{Sequence}{1}\\ +\AssocSameStep{1}{\rT.Element}{\rT.Iterator.Element}{2}\\ +\ConfStep{\rU}{Sequence}{3}\\ +\AssocSameStep{3}{\rU.Element}{\rU.Iterator.Element}{4}\\ +\SameStep{\rT.Element}{\rU.Element}{5} +\end{gather*} +We use \textsc{Sym} to flip the first requirement, and observe that the three requirements now form a chain where the right-hand side of each one is identical to the left-hand side of the next. Two applications of \textsc{Trans} give us what we're looking for: +\begin{gather*} +\SymStep{2}{\rT.Iterator.Element}{\rT.Element}{6}\\ +\TransStep{6}{5}{\rT.Iterator.Element}{\rU.Element}{7}\\ +\TransStep{7}{4}{\rT.Iterator.Element}{\rU.Iterator.Element}{8} +\end{gather*} +We managed to derive the first requirement from \ExRef{motivating derived reqs}. Along the way, we've shown that our generic signature has four ``singleton'' equivalence classes, +\begin{gather*} +\{\texttt{\rT}\},\, +\{\texttt{\rU}\},\, +\{\texttt{\rT.Iterator}\},\, +\{\texttt{\rU.Iterator}\}, +\end{gather*} +and one final equivalence class formed from the other four type parameters, +\begin{align*} +\{&\texttt{\rT.Element},\,\texttt{\rU.Element},\,\\ +&\texttt{\rT.Iterator.Element},\,\texttt{\rU.Iterator.Element}\}. +\end{align*} +To prove that \texttt{\rT.Iterator.Element} conforms to \texttt{Equatable}, we need another rule. +\end{example} + +\paragraph{Compatibility.} +Our equivalence relation on fractions had an interesting property: if we pick any two fractions from a pair of equivalence classes and add them together (or multiply, etc), the equivalence class of the result does not depend on our choice of representatives. 
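+For instance, take the equivalence classes of $\frac{1}{2}$ and $\frac{1}{3}$, and pick the alternate representatives $\frac{2}{4}$ and $\frac{3}{9}$ of the same two classes; a quick calculation confirms that both choices of representatives give equivalent sums: +\begin{gather*} +\frac{1}{2}+\frac{1}{3}=\frac{5}{6},\qquad \frac{2}{4}+\frac{3}{9}=\frac{18+12}{36}=\frac{30}{36}, +\end{gather*} +and $\frac{30}{36}$ is equivalent to $\frac{5}{6}$, because $30\cdot 6=36\cdot 5$.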
In Swift generics, we have an analogous goal: any observable behavior of a type parameter, be it conforming to a protocol, having a superclass bound, being fixed to a concrete type, or having a dependent member type with a given name, should only depend on its equivalence class, and not on the ``spelling.'' + +Once again, we wave our magic wand, and decree that reduced type equality is to have this property, by adding new \index{inference rule}inference rules. These rules relate \index{same-type requirement}same-type requirements with the other requirement kinds by replacing the other requirement's subject type. That is, if we can derive a same-type requirement $\SameReq{T}{U}$ and a conformance requirement $\ConfReq{U}{P}$, we can \IndexStepDefinition{SameConf}derive $\ConfReq{T}{P}$: +\begin{gather*} +\SameConfStepDef +\end{gather*} +A same-type requirement $\SameReq{T}{U}$ also composes with the \IndexStepDefinition{SameConcrete}\IndexStepDefinition{SameSuper}\IndexStepDefinition{SameLayout}other requirement kinds: +\begin{gather*} +\SameConcreteStepDef\\ +\SameSuperStepDef\\ +\SameLayoutStepDef +\end{gather*} +There is a second compatibility condition. We want equivalent type parameters to have equivalent member types, as follows. Suppose we can derive a conformance requirement $\ConfReq{U}{P}$ and a same-type requirement $\SameReq{T}{U}$. For every associated type~\texttt{A} of~\texttt{P}, we can \IndexStepDefinition{SameName}derive a same-type requirement $\SameReq{T.A}{U.A}$: +\begin{gather*} +\SameNameStepDef +\end{gather*} +We have yet to describe the inference rules for bound dependent member types, so that's coming up next. Apart from that, our formal system is complete, so let's look at a few more examples to justify these rules we just added. + \begin{example} -Say the \texttt{Barn} module defines a \index{horse}\texttt{Horse} protocol, and the \texttt{Swift} module defines \texttt{Collection}. 
We have $\mathtt{Barn.Horse}<\mathtt{Swift.Collection}$, since $\mathtt{Barn}<\mathtt{Swift}$. +We can derive the second requirement of \ExRef{motivating derived reqs} by applying \textsc{SameConf} to $\ConfReq{\rT.Element}{Equatable}$, because as we already saw, \texttt{\rT.Element} and \texttt{\rT.Iterator.Element} are equivalent: +\begin{gather*} +\ConfStep{\rT}{Sequence}{1}\\ +\AssocSameStep{1}{\rT.Element}{\rT.Iterator.Element}{2}\\ +\SymStep{2}{\rT.Iterator.Element}{\rT.Element}{3}\\ +\ConfStep{\rT.Element}{Equatable}{4}\\ +\SameConfStep{4}{3}{\rT.Iterator.Element}{Equatable}{5} +\end{gather*} +In our generic signature, this \emph{entire} equivalence class conforms to \texttt{Equatable}: +\begin{align*} +\{&\texttt{\rT.Element},\,\texttt{\rU.Element},\,\\ +&\texttt{\rT.Iterator.Element},\,\texttt{\rU.Iterator.Element}\}. +\end{align*} +\end{example} -If the \texttt{Barn} module also defines a \texttt{Saddle} protocol, then $\mathtt{Barn.Horse}<\mathtt{Barn.Saddle}$; both are from the same module, so we compare protocol names, $\mathtt{Horse}<\mathtt{Saddle}$. +\begin{example}\label{same name rule example} +To see \textsc{SameName} in action, we now recall the generic signature of \texttt{sameIter()} from the beginning of the chapter: +\begin{quote} +\begin{verbatim} +<τ_0_0, τ_0_1 where τ_0_0: Sequence, τ_0_1: Sequence, +    τ_0_0.[Sequence]Iterator == τ_0_1.[Sequence]Iterator> +\end{verbatim} +\end{quote} +We also saw another declaration, \texttt{sameIterAndElt()}, which states both requirements: +\begin{gather*} +\SameReq{\rT.Iterator}{\rU.Iterator}\\ +\SameReq{\rT.Element}{\rU.Element} +\end{gather*} +We claimed that \index{requirement minimization}requirement minimization will drop the second requirement because it is \index{redundant requirement}redundant, giving \texttt{sameIterAndElt()} the same generic signature as \texttt{sameIter()}. This means we can derive the second requirement without invoking \emph{itself} as an elementary statement.
We do this by first deriving the conformance requirement $\ConfReq{\rU.Iterator}{IteratorProtocol}$, and then applying \textsc{SameName} to the elementary same-type requirement $\SameReq{\rT.Iterator}{\rU.Iterator}$ to derive (4): +\begin{gather*} +\ConfStep{\rU}{Sequence}{1}\\ +\AssocConfStep{1}{\rU.Iterator}{IteratorProtocol}{2}\\ +\SameStep{\rT.Iterator}{\rU.Iterator}{3}\\ +\SameNameStep{2}{3}{\rT.Iterator.Element}{\rU.Iterator.Element}{4} +\end{gather*} +We then rewrite both sides of (4) into their short form by making use of the associated same-type requirement of \texttt{Sequence}: +\begin{gather*} +\ConfStep{\rT}{Sequence}{5}\\ +\AssocSameStep{5}{\rT.Element}{\rT.Iterator.Element}{6}\\ +\AssocSameStep{1}{\rU.Element}{\rU.Iterator.Element}{7}\\ +\SymStep{7}{\rU.Iterator.Element}{\rU.Element}{8}\\ +\TransStep{6}{4}{\rT.Element}{\rU.Iterator.Element}{9}\\ +\TransStep{9}{8}{\rT.Element}{\rU.Element}{10} +\end{gather*} +We could not have derived this requirement without \textsc{SameName}. \end{example} -Adding or removing an associated type with the same name as an associated type of an inherited protocol should have no effect on the binary interface of a shared library. For this reason, the linear order essentially ignores associated type declarations which re-state an associated type from an inherited protocol. -\IndexDefinition{root associated type}% -\index{inherited protocol}% -\begin{definition}\label{root associated type} A \emph{root associated type} is an associated type defined in a protocol such that no inherited protocol has an associated type with the same name. -\end{definition} -\begin{example} In the following, \texttt{Q.A} is \emph{not} a root associated type, because \texttt{Q} inherits \texttt{P} and \texttt{P} also declares an associated type named \texttt{A}, but \texttt{Q.B} is a root: + +\begin{example}\label{protocol n example} +We can conjure up a generic signature with infinitely many equivalence classes.
We start with this protocol: \begin{Verbatim} -protocol P { - associatedtype A // root +protocol N { + associatedtype A: N } +\end{Verbatim} +Now, consider the protocol generic signature $G_\texttt{N}$. We can derive an infinite sequence of conformance requirements from $\ConfReq{\rT}{N}$, by repeated application of the \textsc{AssocConf} inference rule with the associated conformance requirement $\ConfReq{Self.A}{N}_\texttt{N}$: +\begin{gather*} +\ConfStep{\rT}{N}{1}\\ +\AssocConfStep{1}{\rT.A}{N}{2}\\ +\AssocConfStep{2}{\rT.A.A}{N}{3}\\ +4.\ \ldots +\end{gather*} +We can also derive infinitely many valid type parameters: +\begin{gather*} +\GenericStep{\rT}{4}\\ +\AssocNameStep{1}{\rT.A}{5}\\ +\AssocNameStep{2}{\rT.A.A}{6}\\ +7.\ \ldots +\end{gather*} +Each one of these type parameters is in its own equivalence class, because we cannot derive any non-trivial same-type requirements. We've shown that $G_\texttt{N}$ defines an infinite set of equivalence classes. +\end{example} -protocol Q: P { - associatedtype A // not a root - associatedtype B // root +\begin{example}\label{protocol collection example} +Here is a simplified form of the standard library \texttt{Collection} protocol: +\begin{Verbatim} +protocol Collection: Sequence { + associatedtype SubSequence: Collection + where Element == SubSequence.Element, + SubSequence == SubSequence.SubSequence +} \end{Verbatim}
-\item If $\texttt{A}_2$ is a root associated type and $\texttt{A}_1$ is not, return ``$>$''. -\item Compare the protocols of $\texttt{A}_1$ and $\texttt{A}_2$ using Algorithm~\ref{linear protocol order} and return the result. -\end{enumerate} -\end{algorithm} -Finally, we can use the generic parameter order and associated type order to define the type parameter order. We can summarize the type parameter order as follows: a type parameter of shorter length always precedes one of longer length, and when two type parameters have the same length we walk them in parallel and compare their structure. We can define the \IndexDefinition{type parameter length}\emph{length} of a type parameter recursively. A generic parameter type has length one, and the length of a dependent member type is one more than the length of its base type. -\begin{algorithm}[Type parameter order]\label{type parameter order} -Takes type parameters \texttt{T} and \texttt{U} as input, and returns one of ``$<$'', ``$>$'' or ``$=$'' as output. +Our protocol inherits the \texttt{Element} and \texttt{Iterator} associated types from \texttt{Sequence}, then declares a new associated type and states these four associated requirements: +\begin{gather*} +\ConfReq{Self}{Sequence}_\texttt{Collection}\\ +\ConfReq{Self.SubSequence}{Collection}_\texttt{Collection}\\ +\SameReq{Self.Element}{Self.SubSequence.Element}_\texttt{Collection}\\ +\SameReq{Self.SubSequence}{Self.SubSequence.SubSequence}_\texttt{Collection} +\end{gather*} +We can read the associated requirements as follows: \begin{enumerate} -\item If \texttt{T} and \texttt{U} are both generic parameter types, compare them using Algorithm~\ref{generic parameter order} and return the result. -\item If \texttt{T} is a generic parameter type and \texttt{U} is a \index{dependent member type}dependent member type, return ``$<$''. -\item If \texttt{T} is a dependent member type and \texttt{U} is a generic parameter type, return ``$>$''. 
-\item Otherwise, both are dependent member types. -\item Recursively invoke this algorithm to compare the base type of \texttt{T} with the base type of \texttt{U}, and return the result it is ``$<$'' or ``$>$''. Otherwise, both have the same base type, so keep going. -\item If \texttt{T} is \index{bound dependent member type}bound and \texttt{U} is \index{unbound dependent member type}unbound, return ``$<$''. -\item If \texttt{T} is unbound and \texttt{U} is bound, return ``$>$''. -\item If \texttt{T} is unbound and \texttt{U} is unbound, compare their names lexicographically and return the result. -\item If \texttt{T} is bound and \texttt{U} is bound, compare their associated types using Algorithm~\ref{associated type order} and return the result. +\item The first conformance requirement states that all collections are sequences. +\item The second conformance requirement states that a subsequence of a collection is another collection, of a possibly different concrete type. +\item The first same-type requirement states that a subsequence of an arbitrary collection always has the same element type as the original. +\item The second same-type requirement states that a subsequence of a subsequence is just a subsequence of the original collection, so subsequences don't stack more than once. \end{enumerate} -\end{algorithm} -The type parameter order is actually a special case of a \index{shortlex order}\emph{shortlex order}; we will see another shortlex order in Section~\ref{finding conformance paths}, and finally generalize the concept in Section~\ref{rewritesystemintro}. +For example, the \texttt{SubSequence} of \texttt{Array} is \texttt{ArraySlice}, and the \texttt{SubSequence} of \texttt{ArraySlice} is again \texttt{ArraySlice}. (Type witnesses of concrete conformances are discussed in \SecRef{type witnesses}).
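+We can check this behavior directly in Swift; the following is a small sketch using the standard library, where \texttt{dropFirst()} is a \texttt{Collection} method that returns the conforming type's \texttt{SubSequence}: +\begin{Verbatim} +let array = [1, 2, 3, 4] +let slice = array.dropFirst()    // ArraySlice<Int> +let deeper = slice.dropFirst()   // still ArraySlice<Int>, not a nested slice type +\end{Verbatim}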
-\begin{example}\label{typeparameterorderexample} Table~\ref{typeparameterordertable} shows the type parameters of the following generic signature in type parameter order: -\begin{quote} -\begin{verbatim} -<τ_0_0, τ_0_1 where τ_0_1: Sequence, - τ_0_0 == τ_0_1.[Sequence]Element> -\end{verbatim} -\end{quote} -A few unbound type parameters are also thrown in the mix to show how they are ordered with respect to the bound type parameters. -\end{example} -\begin{table}\captionabove{Type parameters from Example~\ref{typeparameterorderexample}, ordered and grouped by length}\label{typeparameterordertable} +We're going to look at the protocol generic signature $G_\texttt{Collection}$ and attempt to understand its equivalence classes. We can use the protocol inheritance relationship to derive \texttt{\rT.Iterator}: +\begin{gather*} +\ConfStep{\rT}{Collection}{1}\\ +\AssocConfStep{1}{\rT}{Sequence}{2}\\ +\AssocNameStep{2}{\rT.Iterator}{3} +\end{gather*} +The first thing we notice is that \rT\ and \texttt{\rT.Iterator} are not equivalent to each other or to any other type parameter. Each one is in an equivalence class by itself, so we have our first two equivalence classes. 
+ +To understand how the remaining equivalence classes are formed, we notice that $\ConfReq{Self.SubSequence}{Collection}_\texttt{Collection}$ generates an infinite family of conformance requirements, like $\ConfReq{Self.A}{N}_\texttt{N}$ did in the previous example: +\begin{gather*} +\ConfStep{\rT}{Collection}{1}\\ +\AssocConfStep{1}{\rT.SubSequence}{Collection}{2}\\ +\AssocConfStep{2}{\rT.SubSequence.SubSequence}{Collection}{3}\\ +4.\ \ldots +\end{gather*} +To talk about this phenomenon, we introduce the following notation: +\begin{gather*} +\texttt{\rT.$\texttt{SubSequence}^n$} = \begin{cases} +\rT&n=0\\ +\texttt{\rT.SubSequence}&n=1\\ +\texttt{\rT.SubSequence}^{n-1}\texttt{.SubSequence}&n>1 +\end{cases} +\end{gather*} +So the theory of $G_\texttt{Collection}$ contains $\ConfReq{\rT.$\texttt{SubSequence}^n$}{Collection}$ for all $n\geq 0$. Furthermore, for each $n\geq 0$, we can derive the following from (1) below: +\begin{gather*} +\AnyStep{\ConfReq{\rT.$\texttt{SubSequence}^n$}{Collection}}{1}\\ +\AssocConfStep{1}{\rT.$\texttt{SubSequence}^n$}{Sequence}{2}\\ +\AssocSameStep{1}{\rT.$\texttt{SubSequence}^n$.Element}{\rT.$\texttt{SubSequence}^{n+1}$.Element}{3}\\ +\AssocSameStep{2}{\rT.$\texttt{SubSequence}^n$.Element}{\rT.$\texttt{SubSequence}^n$.Iterator.Element}{4} +\end{gather*} +The infinite families (3) and (4) define a single equivalence class. This equivalence class contains \texttt{\rT.$\texttt{SubSequence}^n$.Element} and \texttt{\rT.$\texttt{SubSequence}^n$.Iterator.Element}, for all $n\geq 0$. (In particular, it contains \texttt{\rT.Element}.) + +We can also generate another infinite family of same-type requirements: +\begin{gather*} +\AssocSameStep{1}{\rT.$\texttt{SubSequence}^{n+1}$}{\rT.$\texttt{SubSequence}^{n+2}$}{5} +\end{gather*} +These form an equivalence class from all \texttt{\rT.$\texttt{SubSequence}^n$.SubSequence} for $n\geq 0$. 
Finally, applying \textsc{SameName} gives us one more infinite family: +\begin{gather*} +\SameNameStep{2}{5}{\rT.$\texttt{SubSequence}^{n+1}$.Iterator}{\rT.$\texttt{SubSequence}^{n+2}$.Iterator}{6} +\end{gather*} +These form the last equivalence class, because the \texttt{SubSequence} may have a distinct \texttt{Iterator} from the original sequence. + +We see that $G_\texttt{Collection}$ defines five equivalence classes, and three of those contain infinitely many representative type parameters each: +\begin{align*} +\{&\rT\}\tag{1}\\ +\{&\texttt{\rT.Element},\,\texttt{\rT.SubSequence.Element},\,\ldots\}\tag{2}\\ +&\cup\{\texttt{\rT.Iterator.Element},\, \texttt{\rT.SubSequence.Iterator.Element},\, \ldots\},\\ +\{&\texttt{\rT.Iterator}\}\tag{3}\\ +\{&\texttt{\rT.SubSequence},\,\texttt{\rT.SubSequence.SubSequence},\,\ldots\},\tag{4}\\ +\{&\texttt{\rT.SubSequence.Iterator},\tag{5}\\ +&\texttt{\rT.SubSequence.SubSequence.Iterator},\,\\ +&\ldots\} +\end{align*} +To describe the conformance requirements that apply to each equivalence class, it is sufficient to pick a single representative conformance requirement for each combination of type parameter and protocol, because the \textsc{SameConf} inference rule allows us to derive the rest.
This table summarizes our ``pen and paper'' investigation: \begin{center} -\begin{tabular}{l} +\begin{tabular}{ll} \toprule -\ttgp{0}{0}\\ -\ttgp{0}{1}\\ +\textbf{Representative:}&\textbf{Conforms to:}\\ \midrule -\texttt{\ttgp{0}{1}.[Sequence]Element}\\ -\texttt{\ttgp{0}{1}.[Sequence]Iterator}\\ -\texttt{\ttgp{0}{1}.Element}\\ -\texttt{\ttgp{0}{1}.Iterator}\\ -\midrule -\texttt{\ttgp{0}{1}.[Sequence]Iterator.[IteratorProtocol]Element}\\ -\texttt{\ttgp{0}{1}.[Sequence]Iterator.Element}\\ -\texttt{\ttgp{0}{1}.Iterator.[IteratorProtocol]Element}\\ -\texttt{\ttgp{0}{1}.Iterator.Element}\\ +\texttt{\rT}&\texttt{Collection} and \texttt{Sequence}\\ +\texttt{\rT.Element}&none\\ +\texttt{\rT.Iterator}&\texttt{IteratorProtocol}\\ +\texttt{\rT.SubSequence}&\texttt{Collection} and \texttt{Sequence}\\ +\texttt{\rT.SubSequence.Iterator}&\texttt{IteratorProtocol}\\ \bottomrule \end{tabular} \end{center} -\end{table} +That's pretty much all there is to say about the generic signature $G_\texttt{Collection}$. However, our description above is somewhat lacking, because it doesn't make the member type relationships apparent. We will revisit all of the generic signatures we saw here in \SecRef{type parameter graph}, when we introduce the \emph{type parameter graph}. We will define an edge relation on equivalence classes, which become the vertices of a graph. This will give us a visual aid for understanding member type relationships. +\end{example} -\section{Reduced Types}\label{reducedtypes} -\index{same-type requirement} -\IndexDefinition{reduced type equality} -\IndexDefinition{equivalence class} -\IndexDefinition{reduced type} -Canonical type equality does not take generic signatures into account at all; it only tells us if two type parameters are spelled in the same way. To correctly model same-type requirements, the generics implementation has a second, stronger, notion of equality on type parameters.
With a generic signature on hand, \emph{reduced type equality} determines whether they abstractly represent the same replacement type within this signature. We will begin by reviewing equivalence relations and equivalence classes. +Notice how our formal system can generate an infinite theory! We saw that~$G_\texttt{N}$ has an infinite set of finite equivalence classes, while in~$G_\texttt{Collection}$, the equivalence classes themselves were infinite. Of course both can happen simultaneously. A generic signature with an infinite set of infinite equivalence classes will appear in \SecRef{monoidsasprotocols}. + +\paragraph{Bound dependent member types.} +We saw in \SecRef{fundamental types} that dependent member types come in two varieties: +\begin{itemize} +\item \index{unbound dependent member type}Unbound dependent member types refer to an identifier. Denoted \texttt{T.A} for some base type \texttt{T} and identifier \texttt{A}. +\item \index{bound dependent member type}Bound dependent member types refer to an associated type declaration. Denoted \texttt{T.[P]A} for some base type \texttt{T} and associated type declaration \texttt{A} of some protocol \texttt{P}. +\end{itemize} \begin{definition} -Recall that a relation on $S$ is a subset of $S\times S$. A relation $R$ is an \IndexDefinition{equivalence relation}\emph{equivalence relation} on $S$ if it is reflexive, symmetric and transitive. +We define the following for convenience: \begin{itemize} -\item $R$ is \IndexDefinition{reflexive relation}\emph{reflexive} if $(x,x)\in R$ for all $x\in S$. -\item $R$ is \IndexDefinition{symmetric relation}\emph{symmetric} if whenever $(x,y)\in R$, then $(y,x)\in R$. -\item $R$ is \index{transitive relation}\emph{transitive} if whenever $(x,y)\in R$ and $(y,z)\in R$, then $(x,z)\in R$. 
+\item An \IndexDefinition{unbound type parameter}\emph{unbound type parameter} is a generic parameter type, or an unbound dependent member type whose base type is another unbound type parameter. + +\item A \IndexDefinition{bound type parameter}\emph{bound type parameter} is a generic parameter type, or a bound dependent member type whose base type is another bound type parameter. \end{itemize} +The bound and unbound type parameters do not exhaustively partition the set of all type parameters. A generic parameter type is \emph{both} bound and unbound per the above, while a type parameter like \texttt{\rT.[Sequence]Iterator.Element} is neither bound nor unbound, because it contains a mix of bound and unbound dependent member types. \end{definition} + +The fundamental tension here is the following: +\begin{itemize} +\item Unbound type parameters appear when \index{type resolution}type resolution resolves the requirements in a \texttt{where} clause; they directly represent what was written by the user. + +\item Type substitution only operates on bound dependent member types, because as we will see in \SecRef{abstract conformances}, we need all three pieces of information that describe one as a valid type parameter: the base type, the protocol, and the associated type declaration. +\end{itemize} +We often apply substitution maps to the requirements of \index{generic signature}generic signatures and requirement signatures, and to the \index{interface type}interface types of declarations. For this reason, all of these semantic objects must contain only bound type parameters. We resolve this tension as follows: +\begin{itemize} +\item Requirement minimization (\SecRef{minimal requirements}) converts unbound type parameters in the \texttt{where} clause into bound type parameters when building a generic signature.
+\item Queries against an existing generic signature (\SecRef{genericsigqueries}) allow unbound type parameters. We will see that one can obtain a bound type parameter by asking the generic signature for an unbound type parameter's \index{reduced type}\emph{reduced type}. +\item Type resolution makes use of generic signature queries to form bound dependent member types when resolving the interface type of a declaration (\ChapRef{typeresolution}). +\end{itemize} + +We now extend our formal system to describe these behaviors. As always, let $G$ be a generic signature. We assume that $G\vdash\ConfReq{T}{P}$ for some~\texttt{T} and~\texttt{P}. + +We first add an inference rule that is analogous to \IndexStep{AssocName}\textsc{AssocName} except that it derives a bound dependent member type. From each \index{associated type declaration}associated type declaration~\texttt{A} of~\texttt{P}, the \IndexStepDefinition{AssocDecl}\textsc{AssocDecl} inference rule derives the \index{bound dependent member type}bound \index{dependent member type}dependent member type \texttt{T.[P]A}, with base type~\texttt{T}, referencing the associated type declaration~\texttt{A}: +\begin{gather*} +\AssocDeclStepDef +\end{gather*} +To encode the equivalence of bound and unbound dependent member types, we also add an inference rule that derives a same-type requirement between the two. That is, from each \index{associated type declaration}associated type declaration~\texttt{A} of~\texttt{P}, the \IndexStepDefinition{AssocBind}\textsc{AssocBind} inference rule derives a same-type requirement between the two dependent member types that \textsc{AssocDecl} and \textsc{AssocName} would derive under the same assumptions: +\begin{gather*} +\AssocBindStepDef +\end{gather*} +The \IndexStep{SameName}\textsc{SameName} inference rule, which gave us $\SameReq{T.A}{U.A}$ from $\ConfReq{U}{P}$ and $\SameReq{T}{U}$, also has a bound type parameter equivalent. 
Under the same assumptions, \IndexStepDefinition{SameDecl}\textsc{SameDecl} derives $\SameReq{T.[P]A}{U.[P]A}$: +\begin{gather*} +\SameDeclStepDef +\end{gather*} +The \textsc{SameDecl} inference rule is fundamentally different from all of the others, because its introduction does not contribute any novel statements to our theory: \begin{proposition} -If $R$ is an equivalence relation on $S$, then every element of $S$ belongs to exactly one equivalence class of $R$. +Any derivation containing a \textsc{SameDecl} step can be transformed into one without. \end{proposition} \begin{proof} -We first show that every element $x\in S$ belongs to \emph{at least one} equivalence class, specifically its own equivalence class $\EquivClass{x}$. Indeed, if $x\in S$, then $(x,x)\in R$, since $R$ is reflexive. From the definition of $\EquivClass{x}$, this means that $x\in\EquivClass{x}$. This can also be stated another way. With our \index{turnsile operator}``tursile'' $\vdash$ operator, we can view the reflexivity of $R$ as a ``derivation rule'' of sorts, for constructing elements of $R$ from elements of $S$: -\[x\in S\vdash (x,x)\in R\] -Next, we will show that every element belongs to \emph{exactly one} equivalence class. We will start with the assumption that some $t$ is an element of both $\EquivClass{x}$ and $\EquivClass{y}$, and argue that $\EquivClass{x}=\EquivClass{y}$, that is, $\EquivClass{x}\subseteq\EquivClass{y}$ and $\EquivClass{y}\subseteq\EquivClass{x}$. Let $u\in\EquivClass{x}$ be some other arbitrary element. 
We can write down derivations for the elements of $R$ we know exist so far: +Given $G\vdash\ConfReq{U}{P}$ and $G\vdash\SameReq{T}{U}$, we can derive $\SameReq{T.[P]A}{U.[P]A}$ without \textsc{SameDecl} by adding the below steps, for any suitable choice of \texttt{T}, \texttt{U}, \texttt{P} and \texttt{A}: +\begin{gather*} +\AnyStep{\ConfReq{U}{P}}{1}\\ +\AnyStep{\SameReq{T}{U}}{2}\\ +\SameNameStep{1}{2}{T.A}{U.A}{3}\\ +\AssocBindStep{1}{U.[P]A}{U.A}{4}\\ +\SymStep{4}{U.A}{U.[P]A}{5}\\ +\SameConfStep{1}{2}{T}{P}{6}\\ +\AssocBindStep{6}{T.[P]A}{T.A}{7}\\ +\TransStep{7}{3}{T.[P]A}{U.A}{8}\\ +\TransStep{8}{5}{T.[P]A}{U.[P]A}{9} +\end{gather*} +However, it is certainly more convenient to write 1 derivation step instead of 7 every time we need this equivalence. Thus, \textsc{SameDecl} is syntax sugar for our formal system. +\end{proof} + +\begin{example} +Consider the protocol generic signature $G_\texttt{Sequence}$, and recall that \texttt{Sequence} states two associated requirements: \begin{gather*} -t\in\EquivClass{x}\vdash(x, t)\in R\\ -t\in\EquivClass{y}\vdash(y, t)\in R\\ -u\in\EquivClass{x}\vdash(x, u)\in R +\ConfReq{Self.Iterator}{IteratorProtocol}_\texttt{Sequence}\\ +\SameReq{Self.Element}{Self.Iterator.Element}_\texttt{Sequence} \end{gather*} -The symmetry and transitivity of $R$ also give us two more rules for deriving new elements of $R$ from existing elements of $R$. We can thus derive the fact that $u\in\EquivClass{y}$: +Let's continue to pretend that those associated requirements are written with unbound type parameters at first, but allow use of the new inference rules. 
We can derive the bound type parameter \texttt{\rT.[Sequence]Iterator.[IteratorProtocol]Element}: \begin{gather*} -(x, t)\in R\vdash(t, x)\in R\\ -(y, t),\,(t, x)\in R\vdash (y, x)\in R\\ -(y, x),\,(x, u)\in R\vdash (y, u)\in R\\ -(y, u)\in R\vdash u\in\EquivClass{y} +\ConfStep{\rT}{Sequence}{1}\\ +\AssocConfStep{1}{\rT.Iterator}{IteratorProtocol}{2}\\ +\AssocBindStep{1}{\rT.[Sequence]Iterator}{\rT.Iterator}{3}\\ +\SameConfStep{2}{3}{\rT.[Sequence]Iterator}{IteratorProtocol}{4}\\ +\AssocDeclStep{4}{\rT.[Sequence]Iterator.[IteratorProtocol]Element}{5} \end{gather*} -However, since $u\in\EquivClass{x}$ was arbitrary, we've actually shown that $\EquivClass{x}\subseteq\EquivClass{y}$. But also, the same argument gives $\EquivClass{y}\subseteq\EquivClass{x}$ if you swap $x$ and $y$ throughout. Therefore, $\EquivClass{x}=\EquivClass{y}$, concluding the proof. Note that we made use of all three defining properties of an equivalence relation; the result no longer holds after relaxing any of these conditions. -\end{proof} -Now, we can define reduced type equality using the idea of \index{derived requirement}derived requirements from Section~\ref{derived req}. -\begin{definition} -We say two type parameters \texttt{T} and \texttt{U} are \index{equivalent type parameters|see{reduced type equality}}\IndexDefinition{reduced type equality}equivalent with respect to a generic signature $G$ if the same-type requirement $\FormalReq{T == U}$ can be derived from $G$. Additionally, if we can derive a concrete same-type requirement $\FormalReq{T == C}$ which fixes a type parameter \texttt{T} to a concrete type \texttt{C}, we say that \texttt{T} is equivalent to the concrete type \texttt{C}. That this is an equivalence relation can be seen from the three \IndexStep{Equiv}\textsc{Equiv} derivation steps: +We used the new inference rules in (3) and (5). Notice how (4) looks a lot like our associated conformance requirement, but with a bound type parameter.
+ +Now, we can also define our formal system so that the explicit requirements of our generic signature and the associated requirements of protocols are written in terms of bound type parameters, as they actually would be in the implementation: +\begin{gather*} +\ConfReq{Self.[Sequence]Iterator}{IteratorProtocol}_\texttt{Sequence}\\ +[\texttt{Self.[Sequence]Element ==}\\ +\qquad\qquad\texttt{Self.[Sequence]Iterator.[IteratorProtocol]Element}]_\texttt{Sequence} +\end{gather*} +We see that \texttt{\rT.[Sequence]Iterator.[IteratorProtocol]Element} can be derived, now with fewer steps: +\begin{gather*} +\ConfStep{\rT}{Sequence}{1}\\ +\AssocConfStep{1}{\rT.[Sequence]Iterator}{IteratorProtocol}{2}\\ +\AssocDeclStep{2}{\rT.[Sequence]Iterator.[IteratorProtocol]Element}{3} +\end{gather*} +And there's a symmetry here---we can just as easily derive the unbound type parameter \texttt{\rT.Iterator.Element} from our associated requirements that contain bound type parameters: +\begin{gather*} +\ConfStep{\rT}{Sequence}{1}\\ +\AssocConfStep{1}{\rT.[Sequence]Iterator}{IteratorProtocol}{2}\\ +\AssocBindStep{1}{\rT.[Sequence]Iterator}{\rT.Iterator}{3}\\ +\SymStep{3}{\rT.Iterator}{\rT.[Sequence]Iterator}{4}\\ +\SameConfStep{2}{4}{\rT.Iterator}{IteratorProtocol}{5}\\ +\AssocNameStep{5}{\rT.Iterator.Element}{6} +\end{gather*} +\end{example} + +In fact, we will prove the following in \ChapRef{building generic signatures}: \begin{itemize} -\item (Reflexivity) Given a valid type parameter \texttt{T}, we can derive the vacuous requirement $\FormalReq{T == T}$. Thus, \texttt{T} is equivalent to \texttt{T}. -\item (Symmetry) Given a requirement $\FormalReq{T == U}$, we can derive the requirement $\FormalReq{U == T}$. Thus, if \texttt{T} is equivalent to \texttt{U}, then \texttt{U} is equivalent to \texttt{T}. -\item (Transitivity) Given a pair of requirements $\FormalReq{T == U}$ and $\FormalReq{U == V}$, we can derive the requirement $\FormalReq{T == V}$. 
Thus, if \texttt{T} is equivalent to \texttt{U} and \texttt{U} is equivalent to \texttt{V}, then \texttt{T} is equivalent to \texttt{V}. +\item \ThmRef{bound and unbound equiv} will show that every equivalence class of valid type parameters always contains at least one bound and at least one unbound type parameter. +\item \PropRef{equiv generic signatures} will tell us that we can start with a list of explicit requirements written with bound or unbound type parameters without changing the theory. \end{itemize} + +Suppose a generic parameter \rT\ conforms to two protocols \texttt{P1} and \texttt{P2}, and both declare an associated type named \texttt{A}. The same unbound dependent type \texttt{\rT.A} can be derived from either conformance requirement, and this effectively merges the associated requirements imposed on \texttt{A} in the language semantics. This remains true once we add bound dependent member types, because we can derive: +\begin{gather*} +\ConfStep{\rT}{P1}{1}\\ +\AssocBindStep{1}{\rT.[P1]A}{\rT.A}{2}\\ +\ConfStep{\rT}{P2}{3}\\ +\AssocBindStep{3}{\rT.[P2]A}{\rT.A}{4}\\ +\SymStep{4}{\rT.A}{\rT.[P2]A}{5}\\ +\TransStep{2}{5}{\rT.[P1]A}{\rT.[P2]A}{6} +\end{gather*} +A special case is when the first protocol inherits from the other. Re-stating an associated type declaration with the same name as an associated type of an inherited protocol has no effect. This is explored further in \SecRef{tietze transformations}. + +Bound type parameters don't seem to yield anything new. Why bother then? Type substitution must issue generic signature queries in the general case anyway, and we could instead consult the generic signature every time we decompose a dependent member type. Instead, our representation basically pre-computes certain information and encodes it in the structure of the dependent member type itself. This avoids generic signature queries in simple cases, and our formal system proves this is equivalent. 
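A source-level sketch of this merging behavior, with invented protocol and type names:

```swift
// Two unrelated protocols declare an associated type named A.
protocol P1 { associatedtype A }
protocol P2 { associatedtype A }

// T.A is unambiguous given T: P1 & P2, because T.[P1]A and T.[P2]A
// lie in the same equivalence class; a conforming type must witness
// both with a single type.
func witness<T: P1 & P2>(_: T.Type) -> Any.Type { T.A.self }

struct Both: P1, P2 {
    typealias A = Int  // witnesses both P1.A and P2.A
}
```

Here \texttt{witness(Both.self)} evaluates to \texttt{Int.self}, the single merged type witness.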
+ +\paragraph{Summary.} +A complete summary of all elementary statements and inference rules in the derived requirements formalism appears in \AppendixRef{derived summary}. Bound type parameters are trivial from a theoretical standpoint, and our theory of concrete type requirements is incomplete, which leaves us with this fundamental set of elementary statements and inference rules: +\begin{center} +\begin{tabular}{cccc} +\toprule +\textbf{Elementary statements:}&\textsc{Generic}&\textsc{Conf}&\textsc{Same}\\ +\midrule +\textbf{Inference rules:}&\textsc{AssocName}&\textsc{AssocConf}&\textsc{AssocSame}\\ +&\textsc{Reflex}&\textsc{Sym}&\textsc{Trans}\\ +&&\textsc{SameConf}&\textsc{SameName}\\ +\bottomrule +\end{tabular} +\end{center} + +So far, we've only used our formal system to derive concrete statements \emph{in} a fixed generic signature. In later chapters, we prove results \emph{about} generic signatures: +\begin{itemize} +\item In \ChapRef{building generic signatures}, we describe how we diagnose invalid requirements, and show that if no diagnostics are emitted, our formal system has a particularly nice theory. +\item In \ChapRef{conformance paths}, we take a closer look at derived conformance requirements, to describe substitution of dependent member types; in particular, we prove that a certain algorithm must terminate. +\item In \ChapRef{monoids}, we use derived requirements to show that a Swift protocol can encode an arbitrary finitely-presented monoid, which demonstrates that a generic signature can have an undecidable theory. +\item In \ChapRef{symbols terms rules}, we will translate the explicit requirements of a generic signature into rewrite rules for a string rewrite system, and then show that derived requirements correspond to \emph{rewrite paths} under this correspondence. This provides a correctness proof for the implementation. 
+\end{itemize}
+
+\section{Reduced Type Parameters}\label{reduced types}
+
+An equivalence class of type parameters might be infinite, or finite but large. For this reason, we cannot model an equivalence class as a set in the implementation. Instead, we define a way to consistently select a unique representative from each equivalence class, called the \emph{reduced type parameter} of the equivalence class. The reduced type parameter can ``stand in'' for its entire equivalence class, and the set of all reduced type parameters gives another description of the equivalence class structure of a generic signature.
+
+To continue the analogy with fractions from the previous section, we don't usually think of a single rational number as an infinite set of fractions. Instead, after performing a series of arithmetic operations, we \emph{reduce} the result to lowest terms by eliminating common factors from the numerator and denominator. For example, $2/4$ and $(-3)/(-6)$ reduce to $1/2$, while $1/2$ is already reduced. This gives us a new way to check if two fractions are equivalent: we reduce both, and then check if the reduced fractions are identical. A key fact is that we can find a reduced fraction by \emph{ordering} the elements of its equivalence class: the reduced fraction has the smallest positive denominator.
+
+\begin{definition}\label{def order}
+A \IndexDefinition{partial order}\emph{partial order} $R\subseteq S\times S$ is anti-reflexive and transitive:
+\begin{itemize}
+\item $R$ is \emph{anti-reflexive} if $(x,x)\notin R$ for all $x\in S$.
+\item $R$ is \index{transitive relation}\emph{transitive} if $(x,y)$, $(y,z)\in R$ implies that $(x,z)\in R$.
+\end{itemize}
+If $R$ is clear from context, we write $x<y$ to mean $(x,y)\in R$, and define $>$, $\leq$ and $\geq$ in the usual way in terms of $<$.
\end{definition}
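For a concrete example of a partial order (a standard one, not specific to the compiler), take the proper-subset relation $\subsetneq$ on sets: it is anti-reflexive and transitive, and yet some pairs of sets are related in neither direction:

```latex
\[
\{1\}\subsetneq\{1,2\}, \qquad
\{1\}\not\subsetneq\{2\}
\quad\text{and}\quad
\{2\}\not\subsetneq\{1\}.
\]
```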
-\begin{definition}
-If \texttt{T} is any type parameter, the \IndexDefinition{reduced type parameter}\emph{reduced type} of \texttt{T} is the least element in the equivalence class of \texttt{T}. If \texttt{T} itself is the smallest element in its own equivalence class, we say that \texttt{T} is a reduced type parameter. As a special case, if we can derive a same-type requirement $\FormalReq{T == C}$ with a concrete type \texttt{C} on the right hand side, then \texttt{T} is not considered a reduced type parameter, and its reduced type is the concrete type \texttt{C}.
-\end{definition}
+Note that $x<y$ and $y<x$ cannot both be true, because transitivity would then give $(x,x)\in R$, contradicting anti-reflexivity. So for any pair of elements $x$ and $y$, at most one of the following is true:
+\begin{enumerate}
+\item $x<y$.
+\item $x=y$.
+\item $x>y$.
+\end{enumerate}
+If none of these are true, we say $x$ and $y$ are \emph{incomparable}. We will consider general partial orders later, but for now we're going to restrict our attention to those orders where \emph{exactly} one of the three conditions above is true, for any pair of elements:
\begin{definition}
-An interface type is a \IndexDefinition{reduced type}\emph{reduced type} with respect to a generic signature if all type parameters appearing inside the interface type are reduced type parameters. Reduced type equality generalizes from type parameters to interface types; we say that two interface types are \emph{equivalent} if they become canonically equal once each type parameter is replaced with its reduced type.
+A \index{total order|see{linear order}}\IndexDefinition{linear order}\emph{linear order} is a partial order without incomparable elements. In a linear order, $x\not< y$ is another way of saying $x \geq y$.
\end{definition}
-In the implementation, the reduced type computation is actually a more primitive concept than the reduced equality check. Reduced type equality is implemented to first compute the reduced type of both sides, and then test for type pointer equality. Compare this with how canonical type equality takes the canonical type of both sides and tests type pointer equality.
All reduced types are also canonical, so canonical equality implies reduced equality (but not vice versa). -These definitions characterize reduced types but don't give an algorithm for computing the reduced type of an arbitrary type parameter. In fact, it is not immediately obvious that reduced types \emph{exist}; that is, if each equivalence class even \emph{has} a unique smallest element. For example, consider the set of (positive and negative) \index{integers}integers, $\mathbb{Z}$. The integers can be linearly ordered with the standard ``less-than'' relation, but the \index{subset}subset of negative integers does not have a minimum element, because we can exhibit an \emph{infinite descending chain} where each element is smaller than the one to the right: -\[\cdots < -3 < -2 < -1\] -On the other hand, in the set of \index{natural numbers}natural numbers (non-negative integers) $\mathbb{N}$, every non-empty subset $S\subseteq\mathbb{N}$ has a minimum element. We first check if $0\in S$; if so, we're done. Otherwise, we check if $1\in S$, $2\in S$, and so on. Since $S$ was non-empty this must terminate after a finite number of steps and produce a minimum element. The difference between the ``less-than'' order on $\mathbb{N}$ and $\mathbb{Z}$ is given by the following definition. +We will define a linear order on type parameters, denoted by $<$. This will express the idea that if $G\vdash\SameReq{T}{U}$ and $\texttt{T}<\texttt{U}$, then \texttt{T} is ``more reduced'' than \texttt{U}, within this equivalence class that contains both. The minimum out of all representatives is then the reduced type parameter itself. 
We will start by measuring the ``complexity'' of a type parameter by counting the number of dependent member types involved in its construction:
\begin{definition}
-A \index{partial order}partial order over a set $S$ is \IndexDefinition{well-founded order}\emph{well-founded} if $S$ does not have an infinite descending chain; that is, there does not exist an infinite sequence of elements $x_i\in S$ such that:
-\[\ldots <x_3<x_2<x_1\]
+The \IndexDefinition{type parameter length}\emph{length} of a type parameter \texttt{T}, denoted $|\texttt{T}|$, counts the steps in its construction: a generic parameter type has length 1, and a dependent member type has the length of its base type, plus 1.
\end{definition}
-\begin{proposition}
-The type parameter order is well-founded.
-\end{proposition}
-\begin{proof}
-For any fixed length $n$, there are only finitely many type parameters of that length, because the program declares only finitely many generic parameters and associated types. Thus, the number of type parameters of length $\leq n$, being a finite sum, is itself finite. Assume then, the type parameter order is not well-founded, and we have an infinite descending chain of type parameters:
-\[\ldots <\texttt{T}_n<\ldots <\texttt{T}_3<\texttt{T}_2<\texttt{T}_1\]
-For every $n>1$, we have $\texttt{T}_n<\texttt{T}_1$, and thus $|\texttt{T}_n|\leq|\texttt{T}_1|$, so $\{\texttt{T}_n\}_{n>1}$ is an infinite set of type parameters of length $\leq |\texttt{T}_1|$. But we just showed this set is always finite. This is a contradiction, so our assumption that an infinite descending chain exists must be invalid. Therefore the type parameter order is well-founded.
-\end{proof}
-\begin{table}\captionabove{Equivalence classes defined by the generic signature in Example~\ref{typeparameterorderexample}}\label{equivalenceclassestable}
-\begin{center}
+
+We want the reduced type parameter to be one of minimum possible length, and we also want to be able to apply substitution maps to it, so it ought to be a \index{bound type parameter}bound type parameter. This gives us two conditions our type parameter order must satisfy:
+\begin{enumerate}
+\item If $|\texttt{T}|<|\texttt{U}|$, then $\texttt{T}<\texttt{U}$.
+\item Bound dependent member types precede unbound dependent member types.
+\end{enumerate}
+
+\begin{example}
+In \ExRef{protocol collection example}, we studied a simplified \texttt{Collection} protocol.
We saw that the equivalence class of \texttt{\rT.SubSequence} in +$G_\texttt{Collection}$ contains an infinite sequence of type parameters of increasing length, \[\{\texttt{\rT.SubSequence}^n\}_{n\geq 1}.\] +After adding inference rules for bound dependent member types, our equivalence class will also contain some of those: +\[\{\texttt{\rT.[Collection]SubSequence}^n\}_{n\geq 1}.\] +(It also contains ``mixed'' bound and unbound type parameters.) The reduced type parameter of this equivalence class must be $\texttt{\rT.[Collection]SubSequence}$. + +\pagebreak + +Each equivalence class of $G_\texttt{Collection}$ has a unique representative of minimum length, which isn't true of all generic signatures in general. This condition means that if two equivalent type parameters have the same length, they only differ in being bound or unbound, so our two conditions completely determine the reduced type parameter of each equivalence class: +\begin{quote} \begin{tabular}{l} \toprule -\ttgp{0}{0} $(*)$\\ -\midrule -\ttgp{0}{1} $(*)$\\ -\midrule -\texttt{\ttgp{0}{1}.[Sequence]Element} $(*)$\\ -\texttt{\ttgp{0}{1}.Element}\\ -\texttt{\ttgp{0}{1}.[Sequence]Iterator.[IteratorProtocol]Element}\\ -\texttt{\ttgp{0}{1}.[Sequence]Iterator.Element}\\ -\texttt{\ttgp{0}{1}.Iterator.[IteratorProtocol]Element}\\ -\texttt{\ttgp{0}{1}.Iterator.Element}\\ -\midrule -\texttt{\ttgp{0}{1}.[Sequence]Iterator} $(*)$\\ -\texttt{\ttgp{0}{1}.Iterator}\\ +\texttt{\rT}\\ +\texttt{\rT.[Sequence]Element}\\ +\texttt{\rT.[Sequence]Iterator}\\ +\texttt{\rT.[Collection]SubSequence}\\ +\texttt{\rT.[Collection]SubSequence.[Sequence]Iterator}\\ \bottomrule \end{tabular} -\end{center} -\end{table} - -\begin{example} -Table~\ref{equivalenceclassestable} groups the type parameters from Example~\ref{typeparameterorderexample} into equivalence classes, with the reduced type of each equivalence class marked with $(*)$. 
The equivalence class of \texttt{T.[Sequence]Element} includes a number of elements; we've seen derivations for some of the same-type requirements already. However, notice that each equivalence class also has a bound and unbound form of each dependent member type represented. These are worth discussing in a little more detail.
+\end{quote}
+(The real \texttt{Collection} protocol defines these reduced types together with a few more equivalence classes related to the \texttt{Index} and \texttt{Indices} associated types.)
\end{example}
-\paragraph{Bound and unbound}
-If an equivalence class contains dependent member types, it will contain the \index{bound dependent member type}bound and \index{unbound dependent member type}unbound form of each one. So, in the generic signature of Example~\ref{typeparameterorderexample}, both \texttt{\ttgp{0}{1}.Element} and \texttt{\ttgp{0}{1}.[Sequence]Element} belong to the same equivalence class. This follows from our requirement derivation rules; specifically, a conformance requirement $\ConfReq{T}{P}$ implies for each associated type \texttt{A} of \texttt{P} a pair of valid type parameters, \verb|T.A| and \verb|T.[P]A|, and a same-type requirement between them.
+To complete our description of the type parameter order then, we must show how to compare type parameters of equal length. Note that all of the algorithms below take a pair of values $x$ and $y$ and then compute whether $x<y$, $x>y$ or $x=y$ simultaneously. We start with the generic parameters, which are the type parameters of length~1.
+
+\begin{algorithm}[Generic parameter order]\label{generic parameter order} \IndexDefinition{generic parameter order}Takes two \index{generic parameter type}generic parameter types \ttgp{d}{i} and \ttgp{D}{I} as input. Returns one of ``$<$'', ``$>$'' or ``$=$'' as output.
+\begin{enumerate}
+\item If $\texttt{d}<\texttt{D}$, return ``$<$''.
+\item If $\texttt{d}>\texttt{D}$, return ``$>$''.
+\item If $\texttt{d}=\texttt{D}$ and $\texttt{i}<\texttt{I}$, return ``$<$''. +\item If $\texttt{d}=\texttt{D}$ and $\texttt{i}>\texttt{I}$, return ``$>$''. +\item If $\texttt{d}=\texttt{D}$ and $\texttt{i}=\texttt{I}$, return ``$=$''. +\end{enumerate} +\end{algorithm} -A \index{bound type parameter}type parameter is bound if it does not contain any unbound dependent member types (so generic parameter types are also automatically bound type parameters). The bound form of a dependent member type precedes the unbound form in the type parameter order, so reduced type parameters are always bound type parameters. (The converse is not true, however; when we introduced our running example at the start of Section~\ref{derived req}, we immediately saw derivations of same-type requirements between pairs of bound type parameters.) +To order dependent member types, we must order associated type declarations, and to do that we must order protocol declarations. -The generics implementation itself is happy to operate on unbound type parameters. When building a generic signature, user-written requirements come in written with unbound type parameters, because initially type resolution has no way to resolve names to associated type declarations; we don't have a generic signature yet. The minimal requirements in the final generic signature, on the other hand, only involve reduced type parameters, which in particular are always bound type parameters. This transformation is part of the requirement minimization algorithm. +\begin{algorithm}[Protocol order]\label{linear protocol order} Takes \IndexDefinition{protocol order}protocols \texttt{P} and \texttt{Q} as input, and returns one of ``$<$'', ``$>$'' or ``$=$'' as output. +\begin{enumerate} +\item Compare the parent \index{module declaration}module names of \texttt{P} and \texttt{Q} with the usual lexicographic order on identifiers. Return the result if it is ``$<$'' or ``$>$''. 
Otherwise, \texttt{P} and \texttt{Q} are declared in the same module, so keep going. +\item Compare the names of \texttt{P} and~\texttt{Q} and return the result if it is ``$<$'' or ``$>$''. If \texttt{P}~and~\texttt{Q} are actually the same protocol, return ``$=$''. Otherwise, the program is invalid because it declares two protocols with the same name. Any tie-breaker can be used, such as source location. +\end{enumerate} +\end{algorithm} -After a generic signature is available, further invocations of type resolution can query the generic signature about conformance requirements, and find associated type declarations by performing name lookups into protocols. This ensures that type representations only refer to valid type parameters, which are resolved in their bound form. In particular, this means that all type parameters appearing in the \index{interface type}interface type of a \index{value declaration}value declaration will always be bound. +Suppose the \texttt{Barn} module declares a \index{horse}\texttt{Horse} protocol, and the \texttt{Swift} module declares \texttt{Collection}. Then \texttt{Horse} precedes \texttt{Collection}, because they are declared in different modules, and $\texttt{Barn}<\texttt{Swift}$. If the \texttt{Barn} module also declares a \texttt{Saddle} protocol, then $\texttt{Horse}<\texttt{Saddle}$, because both are from the same module, so we compare their protocol names. -This property of interface types is important for type substitution. While type substitution does not require the type parameters inside of an interface type to be reduced, it does require them bound. The reason being, to look up a type witness in a conformance requires the actual associated type declaration, not just its name. The staged behavior where type resolution can output both unbound and bound type parameters is discussed in Chapter~\ref{typeresolution}. 
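A minimal executable sketch of this comparison, assuming protocols are identified here by their module and protocol names as plain strings (the source-location tie-breaker for invalid programs is omitted):

```swift
// A sketch of the protocol order: compare module names first,
// then protocol names, using lexicographic string comparison.
// (Names stand in for the compiler's protocol declarations.)
enum Comparison { case ascending, equal, descending }

func compareProtocols(module m1: String, name n1: String,
                      module m2: String, name n2: String) -> Comparison {
    if m1 != m2 { return m1 < m2 ? .ascending : .descending }
    if n1 != n2 { return n1 < n2 ? .ascending : .descending }
    return .equal
}
```

This reproduces the example above: \texttt{Horse} in \texttt{Barn} precedes \texttt{Collection} in \texttt{Swift}, and \texttt{Horse} precedes \texttt{Saddle} within \texttt{Barn}.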
+In the previous section, we showed that re-stating an inherited associated type has no effect in our formal system. We also want this to have no effect on reduced types, so our type parameter order must prefer certain associated type declarations over others: -There is one more subtle behavior that follows immediately from the requirement derivation rules. If a type parameter conforms to two unrelated protocols that declare an associated type with the same name, the derivation rules actually make these associated types equivalent. That is, if \texttt{T} conforms to \texttt{P} and \texttt{Q}, both of which define an associated type named \texttt{A}, then we can always derive $\FormalReq{T.[P]A == T.A}$ and $\FormalReq{T.[Q]A == T.A}$ with a \textsc{Member} step, and then $\FormalReq{T.[P]A == T.[Q]A}$ with \textsc{Equiv}. The equivalence class of \texttt{T.A} will contain all three of \verb|T.A|, \verb|T.[P]A|, and \verb|T.[Q]A|. Therefore, we effectively always introduce implicit \index{same-type requirement}same-type requirements between unrelated associated types with the same name. This is explored further in Sections and \ref{tietze transformations}. +\begin{definition}\label{root associated type} A \IndexDefinition{root associated type}\emph{root associated type} is an associated type whose parent protocol does not \index{inherited protocol}inherit from any protocol having an associated type with the same name. +\end{definition} -\paragraph{To infinity} The well-foundedness of the type parameter order is only important if we find ourselves working with infinite sets of type parameters. What does it mean to have infinitely many type parameters though? This is actually quite easy to conjure up. We can declare a protocol \texttt{N} with an associated type conforming to itself: +In the following, \texttt{Derived.Foo} is not a root, because \texttt{Derived} inherits \texttt{Base} and \texttt{Base} also declares an associated type named \texttt{Foo}. 
However, \texttt{Derived.Bar} is a root: \begin{Verbatim} -protocol N { - associatedtype A: N +protocol Base { + associatedtype Foo // root +} + +protocol Derived: Base { + associatedtype Foo // not a root + associatedtype Bar // root } \end{Verbatim} -Now consider \texttt{<\ttgp{0}{0} where \ttgp{0}{0}:~N>}. In this generic signature, we can derive an infinite sequence of type parameters. We start with some initial derivations: -\begin{gather} -\vdash\ConfReq{Self}{A}_\texttt{N}\tag{1}\\ -\vdash\ttgp{0}{0}\tag{2} -\end{gather} -Now, we can derive a dependent member type: -\begin{gather} -\vdash\ConfReq{\ttgp{0}{0}}{N}\tag{3}\\ -(3)\vdash\texttt{\ttgp{0}{0}.[N]A}\tag{4} -\end{gather} -And another: -\begin{gather} -(3),\,(1)\vdash\ConfReq{\ttgp{0}{0}.[N]A}{N}\tag{5}\\ -(5)\vdash\texttt{\ttgp{0}{0}.[N]A.[N]A}\tag{6} -\end{gather} -And another: -\begin{gather} -(5),\,(1)\vdash\ConfReq{\ttgp{0}{0}.[N]A.[N]A}{N}\tag{7}\\ -(7)\vdash\texttt{\ttgp{0}{0}.[N]A.[N]A.[N]A}\tag{8} -\end{gather} -This continues forever, and we can derive an infinite set of type parameters of arbitrary length. In fact, a consequence of the type parameter being well-founded is that \emph{any} infinite set of type parameters necessarily contains elements of arbitrary length. For any fixed length $n$, there are only finitely many type parameters of length $\le n$, thus almost all elements of our infinite set have length $> n$. This is true for any chosen $n\in\mathbb{N}$. - -In general, a generic signature might have infinitely many equivalence classes, or an equivalence class containing infinitely many type parameters, or both. This is explored further in Section~\ref{type parameter graph} and Section~\ref{recursive conformances}. Well-founded orders will again appear when we introduce rewrite systems in Section~\ref{rewritesystemintro}, and use them to compute reduced types. 
-\section{Generic Signature Queries}\label{genericsigqueries} -The generics implementation provides a set of entry points for the rest of the compiler to ask questions about the type parameters of generic signatures. Collectively these are called \IndexDefinition{generic signature query}generic signature queries. An description of their implementation will come together in Chapter~\ref{propertymap}. For now, we will define the semantics of these generic signature queries using \index{derived requirement}derived requirements, and look at some examples. Table~\ref{genericsigquerytable} lists the generic signature queries, with an informal grouping into categories: predicates, properties, reduced types, and a combined query to return all properties of a type parameter. +We define the associated type order in terms of the protocol order. -\begin{table}\captionabove{Generic signature queries}\label{genericsigquerytable} -\begin{center} -\begin{tabular}{ll} -\toprule -\textbf{Predicates}&\texttt{isValidTypeParameter()}\\ -&\texttt{requiresProtocol()}\\ -&\texttt{requiresClass()}\\ -&\texttt{isConcreteType()}\\ -\midrule -\textbf{Properties}&\texttt{getRequiredProtocols()}\\ -&\texttt{getSuperclassBound()}\\ -&\texttt{getConcreteType()}\\ -&\texttt{getLayoutConstraint()}\\ -\midrule -\textbf{Reduced types}&\texttt{areReducedTypeParametersEqual()}\\ -&\texttt{isReducedType()}\\ -&\texttt{getReducedType()}\\ -\midrule -\textbf{Combined}&\texttt{getLocalRequirements()}\\ -\bottomrule -\end{tabular} -\end{center} -\end{table} +\IndexDefinition{associated type order}% +\begin{algorithm}[Associated type order]\label{associated type order}% +Takes associated type declarations $\texttt{A}_1$ and $\texttt{A}_2$ as input, and returns one of ``$<$'', ``$>$'' or ``$=$'' as output. +\begin{enumerate} +\item First, compare their names lexicographically. Return the result if it is ``$<$'' or ``$>$''. Otherwise, both associated types have the same name, so keep going. 
+\item If $\texttt{A}_1$ is a root associated type and $\texttt{A}_2$ is not, return ``$<$''. +\item If $\texttt{A}_2$ is a root associated type and $\texttt{A}_1$ is not, return ``$>$''. +\item Otherwise, $\texttt{A}_1$ and $\texttt{A}_2$ are both root associated types. Compare the parent protocols of $\texttt{A}_1$~and~$\texttt{A}_2$ using \AlgRef{linear protocol order} and return the result if it is ``$<$'' or ``$>$''. +\item Otherwise, we have two associated types with the same name in the same protocol. If $\texttt{A}_1$ and $\texttt{A}_2$ are the same associated type declaration, return ``$=$''. +\item Otherwise, the program is invalid. As with the protocol order, any tie-breaker can be used, such as source location. +\end{enumerate} +\end{algorithm} -\paragraph{Predicate queries} -The simplest of all queries are the binary predicates, which respond with \texttt{true} or \texttt{false}. Each one takes a type parameter, and tries to derive either the validity of the type parameter itself, or a requirement involving this type parameter. -\begin{description} -\item [\texttt{isValidTypeParameter()}] \IndexDefinition{isValidTypeParameter()@\texttt{isValidTypeParameter()}}takes a type parameter \texttt{T}, and answers if \texttt{T} is a \index{valid type parameter}valid type parameter in this generic signature. +We now have enough to order all type parameters. The type parameter order does not distinguish \index{sugared type}type sugar, so it outputs ``$=$'' if and only if \texttt{T} and \texttt{U} are canonically equal. -\item [\texttt{requiresProtocol()}] \IndexDefinition{requiresProtocol()@\texttt{requiresProtocol()}}takes a type parameter \texttt{T} and protocol \texttt{P}, and answers if the \index{conformance requirement}conformance requirement $\ConfReq{T}{P}$ can be derived from this generic signature. 
+\begin{algorithm}[Type parameter order]\label{type parameter order}
+Takes type parameters \texttt{T} and \texttt{U} as input, and returns one of ``$<$'', ``$>$'' or ``$=$'' as output.
+\begin{enumerate}
+\item If \texttt{T} is a generic parameter type and \texttt{U} is a \index{dependent member type}dependent member type, then $|\texttt{T}|<|\texttt{U}|$, so return ``$<$''.
+\item If \texttt{T} is a dependent member type and \texttt{U} is a generic parameter type, then $|\texttt{T}|>|\texttt{U}|$, so return ``$>$''.
+\item If \texttt{T} and \texttt{U} are both generic parameter types, compare them using \AlgRef{generic parameter order} and return the result.
+\item Otherwise, both are dependent member types. Recursively compare the base type of \texttt{T} with the base type of \texttt{U}, and return the result if it is ``$<$'' or ``$>$''.

-\item [\texttt{requiresClass()}] \IndexDefinition{requiresClass()@\texttt{requiresClass()}}takes a type parameter \texttt{T} and answers if the layout requirement $\ConfReq{T}{AnyObject}$ can be derived from this generic signature, meaning that \texttt{T} has a single retainable pointer representation.

+\item Otherwise, \texttt{T} and \texttt{U} have canonically equal base types.

-\item [\texttt{isConcreteType()}] \IndexDefinition{isConcreteType()@\texttt{isConcreteType()}}takes a type parameter \texttt{T} and answers if a same-type requirement between \texttt{T} and a concrete type can be derived from this generic signature.
-\end{description}

+\item If \texttt{T} is \index{bound dependent member type}bound and \texttt{U} is \index{unbound dependent member type}unbound, return ``$<$''.
+\item If \texttt{T} is unbound and \texttt{U} is bound, return ``$>$''.
+\item If \texttt{T} and \texttt{U} are both unbound dependent member types, compare their names lexicographically and return the result.
+\item If \texttt{T} and \texttt{U} are both bound dependent member types, compare their associated type declarations using \AlgRef{associated type order} and return the result. +\end{enumerate} +\end{algorithm} + +\begin{definition} +Let $G$ be a generic signature. A valid type parameter \texttt{T} of $G$ is a \emph{reduced type parameter} if $G\vdash\SameReq{T}{U}$ implies that $\texttt{T}\leq\texttt{U}$ for all~\texttt{U}. +\end{definition} + +The type parameter order is a linear order, so if we can identify a reduced type parameter in some equivalence class, we know that it must be unique. However, we haven't seen how to calculate the reduced type parameter yet. Indeed, we haven't even established that an equivalence class of type parameters \emph{has} a minimum element. For example, the set of \index{integers}integers~$\mathbb{Z}$ certainly does not, under the usual linear order: +\[\cdots < -2 < -1 < 0 < 1 < 2 < \cdots\] +We wish to rule out this possibility: +\begin{definition} +A linear order with domain $S$ is \IndexDefinition{well-founded order}\emph{well-founded} if every non-empty subset of $S$ contains a minimum element. + +In the more general case of a partial order, this is not the right definition, because a set consisting of two incomparable elements does not have a minimum element. For a \index{partial order}partial order, the correct condition is that the domain $S$ must not contain an \emph{infinite descending chain}, which is an infinite sequence of elements $x_i\in S$ such that: +\[x_1 > x_2 > x_3 > \ldots \] +For a linear order the two conditions are equivalent. +\end{definition} + +Any partial order on a finite set is always well-founded. The usual linear order on the \index{natural numbers}natural numbers~$\mathbb{N}$ is also well-founded (essentially by \emph{definition} of $\mathbb{N}$). + +\begin{proposition}\label{well founded type order} The type parameter order of \AlgRef{type parameter order} is well-founded. 
+\end{proposition}
+\begin{proof}
+Let $S$ be any non-empty set of type parameters, possibly infinite. We will demonstrate that $S$ contains a minimum element.
+
+First, define the set $L(S)\subseteq\mathbb{N}$ by taking the length of each type parameter in $S$. While~$L(S)$ might be an infinite set, it must have a minimum element under the linear order on~$\mathbb{N}$, say~$n\in L(S)$. Let~$S_n\subseteq S$ be the subset of $S$ containing only type parameters of length~$n$. If~$S_n$ has a minimum element, it will also be the minimum element of~$S$.
+
+On the other hand, a set of type parameters of fixed length must be finite. If our entire program and all imported modules declare a total of $g$~distinct generic parameters and $a$~associated types, then the number of type parameters of length~$n$, generated by all generic signatures, cannot exceed $g(2a)^n$. (We count each associated type twice, to account for bound and unbound type parameters, but this exact number is irrelevant; the important fact is that it's finite.)
+
+Thus, the non-empty finite set~$S_n$ contains a minimum element, and so does~$S$, and since~$S$ was chosen arbitrarily, we conclude that the type parameter order is well-founded.
+\end{proof}
+
+The type parameter order plays an important role in the Swift \index{ABI}ABI. For example, the \index{mangling}mangled symbol name of a generic function incorporates the function's parameter types and return type, and the mangler takes reduced types to ensure that cosmetic changes to the declaration have no effect on binary compatibility.
+
+To define reduced type parameters, we compared elements in a \emph{single} equivalence class. We also use the type parameter order to establish an order relation \emph{between} equivalence classes, by comparing their reduced types. We will extend this to a partial order on \emph{requirements} in \AlgRef{requirement order}.
This requirement order surfaces in the Swift ABI: +\begin{enumerate} +\item The calling convention for a generic function passes a witness table for each explicit \index{conformance requirement}conformance requirement in the function's \index{generic signature}generic signature, sorted in this order. +\item The in-memory layout of a \index{witness table}witness table stores an associated witness table for each associated conformance requirement in the protocol's \index{requirement signature}requirement signature, sorted in this order. +\end{enumerate} + +In \ChapRef{genericenv}, we introduce \emph{archetypes}, a self-describing representation of a reduced type parameter packaged up with a generic signature, which behaves more like a concrete type inside the compiler. An archetype thus represents an entire equivalence class of type parameters, and for this reason we will also use the equivalence class notation~$\archetype{T}$ to denote the archetype corresponding to the type parameter~\texttt{T}. + +A more complete treatment of equivalence relations and partial orders can be found in a discrete mathematics textbook, such as~\cite{grimaldi}. Something to keep in mind is that some authors say that a partial order is reflexive instead of anti-reflexive, so $\leq$ is the basic operation. This necessitates the additional condition that if both $x\leq y$ and $y\leq x$ are true, then $x=y$; without this assumption, we have a \index{preorder}\emph{preorder}. Sometimes, a linear order is also called a \emph{total order}, and a set with a well-founded order is said to be \emph{well-ordered}. The type parameter order is actually a special case of a \index{shortlex order}\emph{shortlex order}. Under reasonable assumptions, a shortlex order is always well-founded. We will see another instance of a shortlex order in \SecRef{finding conformance paths}, and then generalize the concept when we introduce the theory of string rewriting in \SecRef{rewritesystemintro}. 
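+To make the shortlex flavor of \AlgRef{type parameter order} concrete, here is a small illustration (our own example, not drawn from the compiler sources). Suppose a generic signature declares two generic parameters \rT\ and \rU, each conforming to a protocol that declares associated types \texttt{A} and \texttt{B}, with \texttt{A} preceding \texttt{B} in the associated type order. Comparing by length first and lexicographically within each length, the type parameters enumerate as:
+\[\rT < \rU < \texttt{\rT.A} < \texttt{\rT.B} < \texttt{\rU.A} < \texttt{\rU.B} < \texttt{\rT.A.A} < \cdots\]
+In particular, any given type parameter is preceded by only finitely many others, which is another way to see that the order is well-founded.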
+ +\section{Generic Signature Queries}\label{genericsigqueries} + +In the implementation, we answer questions about the derived requirements and valid type parameters of a generic signature by invoking \IndexDefinition{generic signature query}\emph{generic signature queries}. These are methods on the \texttt{GenericSignature} class, used throughout the compiler. We can formalize the behavior of these queries using the notation we developed earlier in this chapter. Each query defines a mathematical \index{function}function that takes a generic signature and at least one other piece of data, and evaluates to some property of this signature. + +\paragraph{Basic queries.} We saw that the equivalence class structure of a generic signature is generated by conformance requirements, and same-type requirements between type parameters. The first four queries allow us to answer questions about this structure, and their behavior is completely defined by the derived requirements formalism. + +\begin{itemize} +\QueryDef{requiresProtocol} +{G,\,\texttt{T},\,\texttt{P}}{type parameter \texttt{T}, protocol \texttt{P}.} +{true or false: $G\vdash\ConfReq{T}{P}$?} +{Decides if a type parameter \index{conformance requirement!generic signature query}conforms to a given protocol.} + +\QueryDef{areReducedTypeParametersEqual} +{G,\,\texttt{T},\,\texttt{U}} +{type parameter \texttt{T}, type parameter \texttt{U}.} +{true or false: $G\vdash\SameReq{T}{U}$?} +{Decides if two type parameters are in the same \index{reduced type equality!generic signature query}equivalence class.} + +\QueryDef{isValidTypeParameter} +{G,\,\texttt{T}} +{type parameter \texttt{T}.} +{true or false: $G\vdash\texttt{T}$?} +{Decides if a type parameter is \index{valid type parameter!generic signature query}valid.} + +\QueryDef{getRequiredProtocols} +{G,\,\texttt{T}} +{type parameter \texttt{T}.} +{All protocols~\texttt{P} such that $G\vdash\ConfReq{T}{P}$.} +{Produces a list of all protocols this type parameter is 
known to conform to.} +\end{itemize} + +The $\Query{getRequiredProtocols}{}$ query is used when type checking a \index{member reference expression}member reference expression like ``\texttt{foo.bar}'' where the type of ``\texttt{foo}'' is a type parameter, because we resolve ``\texttt{bar}'' by a \index{qualified lookup}qualified name lookup into this list of protocols. Qualified lookup recursively visits inherited protocols, so the list is minimal in the sense that no protocol inherits from any other. The protocols are also sorted using \AlgRef{linear protocol order}. \begin{example} -Consider this pair of generic signatures: +Let $G$ be the generic signature from \ExRef{motivating derived reqs}: \begin{quote} \begin{verbatim} - - +<τ_0_0, τ_0_1 where τ_0_0: Sequence, τ_0_1: Sequence, + τ_0_0.Element: Equatable, + τ_0_0.Element == τ_0_1.Element> \end{verbatim} \end{quote} -\texttt{isValidTypeParameter(E)} is true in both signatures: -\begin{gather*} -\vdash \texttt{E} -\end{gather*} -\texttt{isValidTypeParameter(F)} is only true in the second signature; the first signature does not have a generic parameter \texttt{F}: -\begin{gather*} -\vdash \texttt{F} -\end{gather*} -\texttt{isValidTypeParameter(E.Element)} is true in both signatures. Note that in a multi-step derivation, we can number each step and refer to previous steps by number instead of re-stating a requirement: -\begin{gather} -\vdash \FormalReq{E: Sequence}\tag{1}\\ -(1) \vdash \texttt{E.[Sequence]Element}\tag{2} -\end{gather} -\item \texttt{isValidTypeParameter(E.Element.Element)} is only true in the second signature, as $\FormalReq{E.Element:~Sequence}$ cannot be derived in the first signature. +\begin{enumerate} +\item $\Query{requiresProtocol}{G,\, \texttt{\rT.Element},\,\texttt{Equatable}}$ is true. +\item $\Query{requiresProtocol}{G,\, \texttt{\rT.Iterator.Element},\, \texttt{Equatable}}$ is true. +\item $\Query{requiresProtocol}{G,\, \texttt{\rT.Iterator},\, \texttt{Equatable}}$ is false. 
+\item $\Query{areReducedTypeParametersEqual}{G,\, \texttt{\rT.Element},\, \texttt{\rU.Element}}$ is true. +\item $\Query{areReducedTypeParametersEqual}{G,\, \texttt{\rT.Iterator},\, \texttt{\rU.Iterator}}$ is false. +\item $\Query{isValidTypeParameter}{G,\, \texttt{\rT.Element}}$ is true. +\item $\Query{isValidTypeParameter}{G,\, \texttt{\rT.Iterator.Element}}$ is true. +\item $\Query{isValidTypeParameter}{G,\, \texttt{\rT.Element.Iterator}}$ is false. +\item $\Query{getRequiredProtocols}{G,\, \texttt{\rT.Iterator}}$ is $\{\texttt{IteratorProtocol}\}$. +\end{enumerate} \end{example} -\begin{example} -Consider this generic signature: +From a theoretical point of view, the two fundamental queries are $\Query{requiresProtocol}{}$ and $\Query{areReducedTypeParametersEqual}{}$. The other two, $\Query{isValidTypeParameter}{}$ and $\Query{getRequiredProtocols}{}$, while primitive in the implementation, can be formalized in terms of the others. Here is $\Query{isValidTypeParameter}{G,\,\texttt{T}}$: +\begin{itemize} +\item A \index{generic parameter type}generic parameter type is valid if it appears in~$G$. +\item An \index{unbound dependent member type}unbound dependent member type \texttt{U.A} is valid if there exists a protocol \texttt{P} in $\Query{getRequiredProtocols}{G,\,\texttt{U}}$ +such that \texttt{P} declares an associated type named~\texttt{A}. +\item A \index{bound dependent member type}bound dependent member type \texttt{U.[P]A} is valid if $\Query{requiresProtocol}{G,\,\texttt{U},\,\texttt{P}}$. +\end{itemize} +As for $\Query{getRequiredProtocols}{}$, because the ``for all'' quantifier is selecting from a finite universe of protocols, a correct but inefficient implementation would repeatedly check if $\Query{requiresProtocol}{G,\texttt{T},\,\texttt{P}}$ for each~\texttt{P} in turn. We will describe the data structure that allows this to be implemented efficiently in \ChapRef{propertymap}. 
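+As an illustration of qualified lookup driven by $\Query{getRequiredProtocols}{}$ (a hypothetical example of ours, not taken from the compiler sources), consider type checking this function:
+\begin{Verbatim}
+func firstElement<S: Sequence>(_ x: S) -> S.Element? {
+  var iterator = x.makeIterator()  // qualified lookup into {Sequence}
+  return iterator.next()           // qualified lookup into {IteratorProtocol}
+}
+\end{Verbatim}
+Resolving \texttt{x.makeIterator()} performs a qualified name lookup into the protocol list $\{\texttt{Sequence}\}$ produced for the type parameter \texttt{S}, and resolving \texttt{iterator.next()} similarly consults the protocols of \texttt{S.Iterator}.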
+ +\paragraph{Concrete types.} The next two queries concern concrete \index{same-type requirement}same-type requirements. We can ask if a type parameter is fixed to a concrete type, and then we can ask for this concrete type. (We use ``$\vdash$'' below, but recall from \SecRef{derived req} that we \index{limitation!derived requirements}don't have a complete set of inference rules to describe the implemented behavior of concrete same-type requirements.) + +\begin{itemize} +\QueryDef{isConcreteType} +{G, \texttt{T}} +{type parameter \texttt{T}.} +{true or false: does there exist some concrete type \texttt{X} such that $G\vdash\SameReq{T}{X}$?} +{Decides if a type parameter is fixed to a concrete type.} + +\QueryDef{getConcreteType} +{G, \texttt{T}} +{type parameter \texttt{T}.} +{some concrete type \texttt{X} such that $G\vdash\SameReq{T}{X}$.} +{Outputs the fixed concrete type of a type parameter.} +\end{itemize} + +\begin{example}\label{concrete type query example} +Let $G$ be the generic signature, \begin{quote} \begin{verbatim} -, - U: Executor, V: NSObject> +<τ_0_0 where τ_0_0: Foo, τ_0_0.[Foo]B == Int> \end{verbatim} \end{quote} -\texttt{requiresProtocol(T, Collection)} is true because the requirement is directly stated: -\begin{gather} -\vdash\ConfReq{T}{Collection}\tag{1} -\end{gather} -\texttt{requiresProtocol(T, Sequence)} is true because \texttt{Collection} inherits from \texttt{Sequence}: -\begin{gather*} -\vdash \ConfReq{Self}{Sequence}\tag{2}\\ -(1),\,(2)\vdash \ConfReq{T}{Sequence}\tag{3} -\end{gather*} -\texttt{requiresProtocol(T.Iterator, IteratorProtocol)} is true because \texttt{Sequence} makes \texttt{Iterator} conform to \texttt{IteratorProtocol}: -\begin{gather} -\vdash \ConfReq{Self.Iterator}{IteratorProtocol}_\texttt{Sequence}\tag{4}\\ -(3),\,(4)\vdash \FormalReq{T.Iterator:~IteratorProtocol}\tag{5} -\end{gather} -\texttt{requiresClass(U)} is true because \texttt{Executor} is defined as a \index{class-constrained protocol}class-constrained protocol in 
the standard library: -\begin{gather} -\vdash \ConfReq{U}{Executor}\tag{6}\\ -\vdash \ConfReq{Self}{AnyObject}_\texttt{Executor}\tag{7}\\ -(6),\,(7)\vdash \ConfReq{U}{AnyObject}\tag{8} -\end{gather} -\texttt{requiresClass(V)} is true because \texttt{NSObject} is a class. This proof requires derivation kinds we haven't introduced, because we must say things about concrete types. So keep in mind the below is hand-waving, for now: -\begin{gather} -\vdash\FormalReq{V:~NSObject}\tag{9}\\ -\vdash\FormalReq{NSObject:~AnyObject}\qquad\mbox{(*)}\tag{10}\\ -(1),\,(2)\vdash\FormalReq{V:~AnyObject}\tag{11} -\end{gather} -\texttt{isConcreteType(T.Element)} is true because the requirement is directly stated: -\begin{gather} -\vdash\FormalReq{T.Element == Array}\tag{12} -\end{gather} -\texttt{isConcreteType(T.Iterator.Element)} is also true because it is implied by the same-type requirement in \texttt{Sequence}: -\begin{gather} -\vdash\FormalReq{Self.Element == Self.Iterator.Element}_\texttt{Sequence}\tag{13}\\ -(3),(13)\vdash\FormalReq{T.Element == T.Iterator.Element}\tag{14}\\ -(14)\vdash\FormalReq{T.Iterator.Element == T.Element}\tag{15}\\ -(15),(12)\vdash\FormalReq{T.Iterator.Element == Array}\tag{16} -\end{gather} +with protocol \texttt{Foo} as below: +\begin{Verbatim} +protocol Foo { + associatedtype A where A == Array<B> + associatedtype B +} +\end{Verbatim} +\begin{enumerate} +\item $\Query{isConcreteType}{G,\,\texttt{\rT.[Foo]A}}$ is true. +\item $\Query{getConcreteType}{G,\,\texttt{\rT.[Foo]A}}$ is \texttt{Array<\rT.[Foo]B>}. +\item $\Query{getConcreteType}{G,\,\texttt{\rT.[Foo]B}}$ is \texttt{Int}. 
+\end{enumerate} \end{example} -\IndexDefinition{getRequiredProtocols()@\texttt{getRequiredProtocols()}} -\IndexDefinition{getSuperclassBound()@\texttt{getSuperclassBound()}} -\IndexDefinition{getConcreteType()@\texttt{getConcreteType()}} -\IndexDefinition{getLayoutConstraint()@\texttt{getLayoutConstraint()}} -\Index{AnyObject@\texttt{AnyObject}} -\index{superclass requirement} -\index{layout requirement} -\paragraph{Property queries} -The next set of queries derive more complex properties that are not just true/false predicates. -\begin{description} -\item [\texttt{getRequiredProtocols()}] takes a type parameter \texttt{T}, and returns a list of all protocols \texttt{P} such that a conformance requirement $\ConfReq{T}{P}$ can be derived from this signature. The list is minimal in the sense that no protocol inherits from any other protocol in the list, and the elements are sorted in canonical protocol order (Definition~\ref{linear protocol order}). -\item [\texttt{getSuperclassBound()}] takes a type parameter \texttt{T}. If a superclass requirement $\ConfReq{T}{C}$ can be derived from this signature, returns the class type \texttt{C}; otherwise, returns the empty type. -\item [\texttt{getConcreteType()}] takes a type parameter \texttt{T}. If a concrete type requirement $\FormalReq{T == C}$ can be derived from this signature, returns the type \texttt{C}; otherwise, returns the empty type. -\item [\texttt{getLayoutConstraint()}] takes a type parameter \texttt{T}. If a layout requirement $\ConfReq{T}{L}$ can be derived from this signature, returns the layout constraint \texttt{L}, otherwise returns the empty layout constraint. - -The \texttt{AnyObject} layout constraint is the only one that can be explicitly written in source. A second kind of layout constraint, \texttt{\_NativeClass}, can be derived from a superclass requirement whose superclass is a native Swift class, meaning a class not inheriting from \texttt{NSObject}. 
The \texttt{\_NativeClass} layout constraint implies the \texttt{AnyObject} layout constraint. - -The two differ in how reference counting operations on their instances are lowered in code generation; arbitrary class instances use the \index{Objective-C}Objective-C runtime entry points for retain and release operations, whereas native class instances use a more efficient calling convention. -\end{description} - -\begin{example} -Assume we have a generic class declaration \verb|class G {}|. Then, in the following generic signature, \texttt{getSuperclassBound(T)} is \texttt{G}: +\paragraph{Reduced types.} We will now generalize \index{reduced type parameter}reduced type equality from \SecRef{type params} to an equivalence relation on all interface types. In the previous example, the first invocation of $\Query{getConcreteType}{}$ returned a concrete type containing a type parameter \texttt{\rT.[Foo]B}, which was itself fixed to the concrete type~\texttt{Int}. We can write down three interface types that all abstract over the same concrete replacement type in our generic signature: +\begin{gather*} +\texttt{\rT.A}\\ +\texttt{Array<\rT.[Foo]B>}\\ +\texttt{Array<Int>} +\end{gather*} +This suggests that if one interface type can be obtained from another by replacing any type parameters that are fixed to concrete types, they ought to be considered equivalent. Now, consider this signature, with the same protocol~\texttt{Foo}: \begin{quote} \begin{verbatim} -> +<τ_0_0, τ_0_1 where τ_0_0: Foo, τ_0_1 == τ_0_0.[Foo]B> \end{verbatim} \end{quote} -\end{example} +Here, instead of completely fixing it to \texttt{Int}, we just ask that the concrete replacement type for~\texttt{\rT.[Foo]B} is canonically equal to the concrete replacement type for~\rU. 
In this generic signature, the following three interface types abstract over the same concrete replacement type: +\begin{gather*} +\texttt{\rT.A}\\ +\texttt{Array<\rT.[Foo]B>}\\ +\texttt{Array<\rU>} +\end{gather*} +This suggests a process where we replace each type parameter within an interface type with its reduced type parameter. We assume we have a subroutine to compute the reduced type parameter of an equivalence class already, so we can iterate our simplifying transformations until fixed point: -\begin{example}\label{concrete type query example} -In the following generic signature, \texttt{getConcreteType(T.Index)} is \texttt{Int}: +\begin{algorithm}[Reduced type of interface type]\label{reduced type algorithm} +Takes a generic signature~$G$, and an interface type~\texttt{T}, as input. Outputs the reduced type of~\texttt{T}. +\begin{enumerate} +\item If \texttt{T} is a type parameter: +\begin{enumerate} +\item If $\Query{isConcreteType}{G,\,\texttt{T}}$: call $\Query{getConcreteType}{G,\,\texttt{T}}$, apply the algorithm recursively, and return the result. +\item Otherwise, find the reduced type parameter~$\texttt{T}^\prime$ in the equivalence class of~\texttt{T} and return $\texttt{T}^\prime$. +\end{enumerate} +\item Otherwise, \texttt{T} is a concrete type. If \texttt{T} does not contain any child types, return \texttt{T}. +\item Otherwise, recursively reduce each child type of~\texttt{T}, and construct a new type from these reduced child types together with any non-type attributes of~\texttt{T}. +\end{enumerate} +\end{algorithm} + +As its name suggests, the prior algorithm outputs a special kind of interface type: + +\begin{definition} +An interface type is a \IndexDefinition{reduced type}\emph{reduced type} in case the following two conditions hold for each type parameter contained in the interface type: +\begin{enumerate} +\item Every such type parameter is a reduced type parameter. +\item No such type parameter is fixed to a concrete type. 
+\end{enumerate} +\end{definition} + +To guarantee termination, we must rule out self-referential same-type requirements like $\SameReq{\rT}{Array<\rT>}$; otherwise we'll get stuck reducing \rT, \texttt{Array<\rT>}, \texttt{Array<Array<\rT>>}, and so on. We'll describe how in \SecRef{subst simplification}, but for now we just assume they don't appear. + +A \index{fully-concrete type!reduced type}fully-concrete type (one without type parameters) is trivially a reduced type. In the implementation, we say that a reduced type must also be canonical, so it cannot contain \index{sugared type!not reduced}sugared types. This gives us the \index{reduced type equality!on interface types}reduced type equality relation on interface types: two interface types are equivalent under this relation if they have canonically equal reduced types. The following two generic signature queries pertain to reduced types: + +\begin{itemize} +\QueryDef{isReducedType} +{G,\,\texttt{T}} +{interface type \texttt{T}.} +{true or false: is \texttt{T} canonically equal to its reduced type?} +{Decides if an interface type is already a reduced type.} + +\QueryDef{getReducedType} +{G,\,\texttt{T}} +{interface type \texttt{T}.} +{the reduced type of \texttt{T}.} +{Computes the reduced type of an interface type using \AlgRef{reduced type algorithm}. This will output a canonical type, so it will not contain type sugar.} +\end{itemize} + +\begin{example} +Equivalence of interface types is a \index{coarser relation}\emph{coarser} relation than equivalence of type parameters, because if two type parameters are equivalent as type parameters, they must also be equivalent as interface types. The converse does not hold. Let~$G$ be the generic signature: \begin{quote} \begin{verbatim} -> +<τ_0_0, τ_0_1 where τ_0_0 == Int, τ_0_1 == Int> \end{verbatim} \end{quote} -Proving this by hand is tricky, and demonstrates the expressivity of generic signature queries. 
First, we derive $\FormalReq{T.Indices:~Collection}$: -\begin{gather} -\vdash \ConfReq{T}{Collection}\tag{1}\\ -\vdash \FormalReq{Self.Indices:~Collection}_\texttt{Collection}\tag{2}\\ -(1),\,(2)\vdash\FormalReq{T.Indices:~Collection}\tag{3} -\end{gather} -Next, we need a same-type requirement between \texttt{T.Index} and \texttt{T.Indices.Element}: -\begin{gather} -\vdash \FormalReq{Self.Index == Self.Indices.Element}_\texttt{Collection}\tag{4}\\ -(1),\,(4)\vdash \FormalReq{T.Index == T.Indices.Element}\tag{5} -\end{gather} -Now, we're going to use the \textsc{Member} derivation step, by which same-type requirements recursively apply to dependent member types: -\[\ConfReq{T}{P},\,\FormalReq{T == U}\vdash\FormalReq{T.[P]A == U.[P]A}\] -Consider these two requirements, the first minimal, and the second derived: -\begin{gather*} -\FormalReq{T.Indices == Range}\\ -\FormalReq{T.Indices:~Collection} -\end{gather*} -According to the above, we can derive a new same-type requirement with left hand-side \texttt{T.[P]A}, and the right hand side a certain substitution of the concrete type \texttt{Range}. Specifically, it is the result of replacing \texttt{Self} with \texttt{Range} in the dependent member type \texttt{Self.[Sequence]Element}. - -What we actually want to do is apply a substitution map to a dependent member type; this will be formalized in Section~\ref{abstract conformances}. For now, it is enough to know that \texttt{Range} conforms to \texttt{Sequence}. We're interested in the \emph{type witness} for the \texttt{Element} associated type in this conformance. The conformance is defined in the standard library. 
It has a conditional requirement: -\begin{Verbatim} -extension Range: Collection where Element: Strideable {...} -\end{Verbatim} -It happens that \texttt{Int} conforms to \texttt{Strideable}, thus \texttt{Range} satisfies the conditional requirements of this conformance: -\begin{Verbatim} -extension Int: Strideable {...} -\end{Verbatim} -The \texttt{Element} associated type is witnessed by the \texttt{Element} generic parameter of \texttt{Range}, which in the case of \texttt{Range}, is \texttt{Int}. Thus, we complete our derivation: -\begin{gather} -\vdash \FormalReq{T.Indices == Range}\tag{6}\\ -(1),\,(5)\vdash \FormalReq{T.Indices.Element == Int}\tag{7}\\ -(6),\,(7)\vdash \FormalReq{T.Index == Int}\tag{8} -\end{gather} -The final result then, is that \texttt{T.Index} is fixed to the concrete type \texttt{Int}. +\begin{enumerate} +\item $\Query{areReducedTypeParametersEqual}{G,\,\rT,\,\rU}$ is false. +\item $\Query{getReducedType}{G,\,\rT}$ is \texttt{Int}. +\item $\Query{getReducedType}{G,\,\rU}$ is \texttt{Int}. +\end{enumerate} +Thus, \rT\ and \rU\ are equivalent as interface types, but not as type parameters. \end{example} -\paragraph{Reduced type queries} -The next three generic signature queries compute \IndexSource{reduced type}reduced types. To test two arbitrary types for reduced type equality, apply \texttt{getReducedType()} to each and compare the results for canonical type equality. -\begin{description} -\item [\texttt{areReducedTypeParametersEqual()}] \IndexDefinition{areReducedTypeParametersEqual()@\texttt{areReducedTypeParametersEqual()}}takes two type parameters \texttt{T} and \texttt{U} and answers if the same-type requirement $\FormalReq{T == U}$ can be derived from this signature. Unlike the next two queries, it only operates on type parameters and does not produce a useful result if one or the other type parameter is fixed to a concrete type. 
- -\item [\texttt{isReducedType()}] \IndexDefinition{isReducedType()@\texttt{isReducedType()}}answers if an arbitrary type is a reduced type, by checking if any type parameters it contains are reduced types. Non-canonical types are never considered reduced. Applying \texttt{getReducedType()} to a type for which \texttt{isReducedType()} returns true will return the type unchanged. +\paragraph{Other requirements.} The final set of queries concerns superclass and layout requirements. Once again, we note that ``$\vdash$'' is aspirational, because not all implemented behaviors of these requirement kinds are described by our formal system. -\item [\texttt{getReducedType()}] \IndexDefinition{getReducedType()@\texttt{getReducedType()}}computes the reduced type of an arbitrary interface type, replacing any type parameters it contains with their reduced type. Passing the result of \texttt{getReducedType()} to \texttt{isReducedType()} will always return \texttt{true}. -\end{description} +\begin{itemize} +\QueryDef{getSuperclassBound} +{G, \texttt{T}} +{type parameter \texttt{T}.} +{some concrete class type \texttt{C} such that $G\vdash\ConfReq{T}{C}$.} +{Outputs the \index{superclass requirement!generic signature query}superclass bound, if the type parameter has one.} + +\QueryDef{requiresClass} +{G, \texttt{T}} +{type parameter \texttt{T}.} +{true or false: $G\vdash\ConfReq{T}{AnyObject}$?} +{Decides if \texttt{T} has a \index{AnyObject}\index{layout requirement!generic signature query}single retainable pointer representation.} + +\QueryDef{getLayoutConstraint} +{G, \texttt{T}} +{type parameter \texttt{T}.} +{some layout constraint \texttt{L} such that $G\vdash\ConfReq{T}{L}$.} +{Outputs the \index{layout requirement!generic signature query}layout constraint, if the type parameter has one.} + +The \Index{AnyObject@\texttt{AnyObject}}\texttt{AnyObject} layout constraint is the only one that can be explicitly written in source. 
A second kind of layout constraint, \texttt{\_NativeClass}, is implied by a superclass bound being a native Swift class, meaning a class not inheriting from \texttt{NSObject}. The \texttt{\_NativeClass} layout constraint in turn implies the \texttt{AnyObject} layout constraint. + +The two differ in how reference counting operations are lowered in \index{IRGen}IRGen. Classes of unknown ancestry use the \index{Objective-C}Objective-C runtime entry points, whereas native class instances use different entry points in the Swift runtime. +\end{itemize} \begin{example} -In the generic signature of Example~\ref{concrete type query example}, \texttt{getReducedType(T.Index)} returns \texttt{Int}, just like \texttt{getConcreteType(T.Index)}. In fact, \texttt{getConcreteType()} is just a ``weaker'' form of \texttt{getReducedType()}; it does not guarantee that every type parameter appearing inside the concrete type is reduced. - -Now take the generic signature \verb|| with this protocol: +Let $G$ be the generic signature, +\begin{quote} +\begin{verbatim} +<τ_0_0, τ_0_1, τ_0_2 where τ_0_0: Form, τ_0_1: Shape, τ_0_2: Entity> +\end{verbatim} +\end{quote} +together with the three declarations: \begin{Verbatim} -protocol P { - associatedtype A - associatedtype B where A == Array -} +class Shape {} +protocol Form: AnyObject {} +protocol Entity: Shape, Form {} \end{Verbatim} -The reduced type of \texttt{T.[P]A} is \texttt{Array}: -\begin{gather} -\vdash \ConfReq{T}{P}\tag{1}\\ -\vdash \FormalReq{Self.[P]A == Array}_\texttt{P}\tag{2}\\ -(1),\,(2)\vdash \FormalReq{T.[P]A == Array}\tag{3}\\ -\vdash \FormalReq{T.[P]B == Int}\tag{4}\\ -(4)\vdash \FormalReq{Array == Array}\tag{5}\\ -(3),\,(4)\vdash\FormalReq{T.[P]A == Array}\tag{6} -\end{gather} -Note that we first derive a same-type requirement between \texttt{T.[P]A} and \texttt{Array}, but the latter type is not reduced, because \texttt{T.[P]B} is also fixed to a concrete type. 
+ +\begin{enumerate} +\item $\Query{getSuperclassBound}{G,\,\texttt{\rT}}$ is null. +\item $\Query{getSuperclassBound}{G,\,\texttt{\rU}}$ is \texttt{Shape}. +\item $\Query{getSuperclassBound}{G,\,\texttt{\rV}}$ is \texttt{Shape}. +\item $\Query{requiresClass}{G,\,\texttt{\rT}}$ is true. +\item $\Query{requiresClass}{G,\,\texttt{\rU}}$ is true. +\item $\Query{requiresClass}{G,\,\texttt{\rV}}$ is true. +\end{enumerate} +We can write down a derivation of $\Query{requiresClass}{G,\,\texttt{\rT}}$: +\begin{gather*} +\ConfStep{\rT}{Form}{1}\\ +\AssocLayoutStep{1}{\rT}{2} +\end{gather*} +However, we cannot write down a derivation of $\Query{requiresClass}{G,\,\texttt{\rU}}$, even though we \emph{should} be able to. The implementation says this derived requirement holds, because any concrete replacement type for \rU\ that satisfies the superclass requirement must also satisfy the layout requirement. We are unable to make this inference within our formal system, so it is \emph{incomplete}. In this case, the missing rule is one that allows us to derive $\ConfReq{T}{AnyObject}$ from a superclass requirement $\ConfReq{T}{C}$. A formal description of the missing rules remains a work in progress. \end{example} -\paragraph{Combined queries} +\paragraph{Sugared types.} To avoid printing canonical generic parameter types like \ttgp{d}{i} in \index{diagnostic!printing generic parameter type}diagnostics, a special query can be used to transform a canonical type into a sugared type, using the generic parameter names of a given generic signature. 
+ +\begin{itemize} +\QueryDef{getSugaredType} +{G, \texttt{T}} +{interface type \texttt{T}.} +{a canonically equal interface type, written using the sugared types of $G$.} +{} +\end{itemize} -\index{local requirements} -\IndexDefinition{getLocalRequirements()@\texttt{getLocalRequirements()}} +\paragraph{Combined queries.} To simplify archetype construction inside a generic environment (\ChapRef{genericenv}), we use a special entry point to perform multiple lookups at once. Its behavior is otherwise completely determined by the other queries. -The \texttt{getLocalRequirements()} query builds a single structure from the result of several queries against the same type parameter, to simplify archetype construction inside a generic environment (Chapter~\ref{genericenv}): -\begin{verbatim} -getReducedType() -getRequiredProtocols() -getSuperclassBound() -getLayoutConstraint() -\end{verbatim} +\begin{itemize} +\QueryDef{getLocalRequirements} +{G, \texttt{T}} +{type parameter \texttt{T}.} +{a \index{local requirements}single structure with the result of several queries.} +{This outputs all known data points about the equivalence class of~\texttt{T}: +\begin{gather*} +\Query{getReducedType}{G,\,\texttt{T}}\\ +\Query{getRequiredProtocols}{G,\,\texttt{T}}\\ +\Query{getSuperclassBound}{G,\,\texttt{T}}\\ +\Query{getLayoutConstraint}{G,\,\texttt{T}} +\end{gather*}} +\end{itemize} \section{Source Code Reference}\label{genericsigsourceref} @@ -791,14 +1269,14 @@ \section{Source Code Reference}\label{genericsigsourceref} \index{declaration context} \apiref{DeclContext}{class} -See also Section~\ref{declarationssourceref} and Section~\ref{genericdeclsourceref}. +See also \SecRef{declarationssourceref}. \begin{itemize} \item \texttt{getGenericSignatureOfContext()} returns the generic signature of the innermost generic context, or the empty generic signature if there isn't one. 
\end{itemize} \index{generic context} \apiref{GenericContext}{class} -See also Section~\ref{genericdeclsourceref}. +See also \SecRef{declarationssourceref}. \begin{itemize} \item \texttt{getGenericSignature()} returns the declaration's generic signature, computing it first if necessary. If the declaration does not have a generic parameter list or trailing \texttt{where} clause, returns the generic signature of the parent context. \end{itemize} @@ -847,7 +1325,7 @@ \section{Source Code Reference}\label{genericsigsourceref} \item \texttt{print()} prints the generic signature, with various options to control the output. \item \texttt{dump()} prints the generic signature, meant for use from the debugger or ad-hoc print debug statements. \end{itemize} -Also see Section~\ref{buildinggensigsourceref}. +Also see \SecRef{buildinggensigsourceref}. \IndexSource{generic signature query} \apiref{GenericSignatureImpl}{class} @@ -864,13 +1342,14 @@ \section{Source Code Reference}\label{genericsigsourceref} \IndexSource{areReducedTypeParametersEqual()@\texttt{areReducedTypeParametersEqual()}} \IndexSource{isReducedType()@\texttt{isReducedType()}} \IndexSource{getReducedType()@\texttt{getReducedType()}} +\IndexSource{getSugaredType()@\texttt{getSugaredType()}} \begin{itemize} \item \texttt{isEqual()} checks if two generic signatures are canonically equal. \item \texttt{getSugaredType()} given a type containing canonical type parameters that is understood to be written with respect to this generic signature, replaces the generic parameter types with their ``sugared'' forms, so that the name is preserved when the type is printed out to a string. \item \texttt{forEachParam()} invokes a callback on each generic parameter of the signature; the callback also receives a boolean indicating if the generic parameter type is reduced or not---a generic parameter on the left hand side of a same-type requirement is not reduced. 
\item \texttt{areAllParamsConcrete()} answers if all generic parameters are fixed to concrete types via same-type requirements, which makes the generic signature somewhat like an empty generic signature. Fully-concrete generic signatures are lowered away at the SIL level. \end{itemize} -The generic signature queries from Section~\ref{genericsigqueries} are methods on \texttt{GenericSignatureImpl}: +The generic signature queries from \SecRef{genericsigqueries} are methods on \texttt{GenericSignatureImpl}: \begin{itemize} \item Predicate queries: \begin{itemize} @@ -913,7 +1392,7 @@ \section{Source Code Reference}\label{genericsigsourceref} \IndexSource{layout requirement} \IndexSource{same-type requirement} \apiref{Requirement}{class} -A generic requirement. See also Section \ref{type resolution source ref}~and~\ref{buildinggensigsourceref}. +A generic requirement. See also \SecRef{type resolution source ref} and \SecRef{buildinggensigsourceref}. \begin{itemize} \item \texttt{getKind()} returns the \texttt{RequirementKind}. \item \texttt{getSubjectType()} returns the subject type. @@ -935,7 +1414,7 @@ \section{Source Code Reference}\label{genericsigsourceref} \IndexSource{protocol declaration} \IndexSource{class-constrained protocol} \apiref{ProtocolDecl}{class} -See also Section~\ref{genericdeclsourceref}. +See also \SecRef{declarationssourceref}. \begin{itemize} \item \texttt{getRequirementSignature()} returns the protocol's requirement signature, first computing it, if necessary. \item \texttt{requiresClass()} answers if the protocol is a class-constrained protocol. @@ -948,7 +1427,7 @@ \section{Source Code Reference}\label{genericsigsourceref} \item \texttt{getRequirements()} returns an array of \texttt{Requirement}. \item \texttt{getTypeAliases()} returns an array of \texttt{ProtocolTypeAlias}. \end{itemize} -Also see Section~\ref{buildinggensigsourceref}. +Also see \SecRef{buildinggensigsourceref}. 
\IndexSource{protocol type alias} \index{underlying type} @@ -962,7 +1441,7 @@ \section{Source Code Reference}\label{genericsigsourceref} \IndexSource{type parameter} \IndexSource{interface type} \apiref{TypeBase}{class} -See also Section~\ref{typesourceref}. +See also \SecRef{typesourceref}. \begin{itemize} \item \texttt{isTypeParameter()} answers if this type is a type parameter; that is, a generic parameter type, or a \texttt{DependentMemberType} whose base is another type parameter. \item \texttt{hasTypeParameter()} answers if this type is itself a type parameter, or if it contains a type parameter in structural position. For example, \texttt{Array<\ttgp{0}{0}>} will answer \texttt{false} to \texttt{isTypeParameter()}, but \texttt{true} to \texttt{hasTypeParameter()}. @@ -981,9 +1460,9 @@ \section{Source Code Reference}\label{genericsigsourceref} \IndexSource{type declaration} \IndexSource{protocol order} \apiref{TypeDecl}{class} -See also Section~\ref{declarationssourceref}. +See also \SecRef{declarationssourceref}. 
\begin{itemize} -\item \texttt{compare()} compares two protocols by the protocol order (Definition~\ref{linear protocol order}), returning one of the following: +\item \texttt{compare()} compares two protocols by the protocol order (\DefRef{linear protocol order}), returning one of the following: \begin{itemize} \item $-1$ if this protocol precedes the given protocol, \item 0 if both protocol declarations are equal, @@ -994,7 +1473,7 @@ \section{Source Code Reference}\label{genericsigsourceref} \IndexSource{type parameter order} \IndexSource{generic parameter order} \apiref{compareDependentTypes()}{function} -Implements the type parameter order (Algorithm~\ref{type parameter order}), returning one of the following: +Implements the type parameter order (\AlgRef{type parameter order}), returning one of the following: \begin{itemize} \item $-1$ if the left hand side precedes the right hand side, \item 0 if the two type parameters are equal as canonical types, diff --git a/docs/Generics/chapters/introduction.tex b/docs/Generics/chapters/introduction.tex index cfe9362c3e99c..656103604155f 100644 --- a/docs/Generics/chapters/introduction.tex +++ b/docs/Generics/chapters/introduction.tex @@ -12,24 +12,24 @@ \chapter{Introduction}\label{roadmap} \item Abstraction over concrete types with generic parameters should only impose a cost across module boundaries, or in other situations where type information is not available at compile time. \end{enumerate} -\noindent Our high-level approach can be summarized as follows: +\noindent The high-level design can be summarized as follows: \begin{enumerate} \item The interface between a generic function and its caller is mediated by \textbf{generic requirements}. The generic requirements describe the behavior of the generic parameter types inside the function body, and the generic arguments at the call site are checked against the function's generic requirements at compile time. 
\item Generic functions receive \textbf{runtime type metadata} for each generic argument from the caller. Type metadata defines operations to abstractly manipulate values of their type without knowledge of their concrete layout. -\item Runtime type metadata is constructed for each type in the language. The \textbf{runtime type layout} of a generic type is computed recursively from the type metadata of the generic arguments. Generic types always store their contents without \index{boxing}boxing or indirection. +\item Runtime type metadata is constructed for each type in the language. The \textbf{runtime type layout} of a generic type is computed recursively from the type metadata of the generic arguments. Generic types always store their contents directly, without indirection through a heap-allocated \index{boxing}box. \item The optimizer can generate a \textbf{specialization} of a generic function in the case where the definition is visible at the call site. This eliminates the overhead of runtime type metadata and abstract value manipulation. \end{enumerate} -A way to approach to compiler design is that a compiler is \emph{a library for implementing the target language}. A well-designed set of domain objects facilitates the introduction of language features that compose existing functionality in new ways. The generics implementation has four core domain objects: \emph{generic signatures}, \emph{substitution maps}, \emph{requirement signatures}, and \emph{conformances}. As you will see, they are defined as much by their inherent structure, as their relationships with each other. Subsequent chapters will dive into all the details, but first, we're going to look at a series of worked examples to help you understand the big picture. +The author's philosophy is that \textsl{a compiler is a library for modeling the concepts of the target language}. From this vantage point, the design of the compiler's domain objects becomes the primary concern. 
The implementation of generics in Swift defines four fundamental domain objects: \emph{generic signatures}, \emph{substitution maps}, \emph{requirement signatures}, and \emph{conformances}. As we will see, they are understood as much by their inherent structure as by their relationships with each other. Subsequent chapters will dive into all the details, but first, we're going to present a series of worked examples. -\section{Generic Functions} +\section{Functions} Consider these two rather contrived function declarations: \begin{Verbatim} func identity(_ x: Int) -> Int { return x } func identity(_ x: String) -> String { return x } \end{Verbatim} -Apart from the parameter and return type, both have the same exact definition, and indeed you can write the same function for any concrete type. Our aesthetic sense might lead us to replace both with a single generic function: +Both have exactly the same definition apart from the parameter and return type, and indeed you can write the same function for any concrete type. Our aesthetic sense might lead us to replace both with a single generic function: \begin{Verbatim} func identity<T>(_ x: T) -> T { return x } \end{Verbatim} @@ -67,11 +67,11 @@ \section{Generic Functions} \index{name lookup} \index{expression} \index{parser} -\paragraph{Parsing} Figure~\ref{identity ast} shows the abstract syntax tree produced by the parser before type checking. The key elements: +\paragraph{Parsing.} \FigRef{identity ast} shows the abstract syntax tree produced by the parser before type checking. The key elements: \begin{enumerate} \item The \emph{generic parameter list} \texttt{<T>} introduces a single \emph{generic parameter declaration} named \texttt{T}. As its name suggests, this declares the generic parameter type \texttt{T}, scoped to the entire source range of this function. -\item The \emph{type representation} \texttt{T} appears twice, first in the parameter declaration \verb|_ x: T| and then as return type of \verb|identity(_:)|.
A type representation is the purely syntactic form of a type. The parser does not perform name lookup, so the type representation stores the identifier \texttt{T} and does not refer to the generic parameter declaration of \texttt{T} in any way. -\item The function body contains an expression referencing \texttt{x}. Again, the parser does not perform name lookup, so this is just the identifier \texttt{x} and is not associated with the parameter declaration \verb|_ x: T|. +\item The \emph{type representation} \texttt{T} appears twice, first in the declaration of the parameter ``\verb|_ x: T|'' and then again, as the return type. A type representation is the purely syntactic form of a type. The parser does not perform name lookup, so the type representation stores the identifier \texttt{T} and does not refer to the generic parameter declaration of \texttt{T} in any way. +\item The function body contains an expression referencing \texttt{x}. Again, the parser does not perform name lookup, so this is just the identifier \texttt{x} and is not associated with the parameter declaration ``\verb|_ x: T|''. \end{enumerate} \index{generic parameter type} @@ -80,30 +80,25 @@ \section{Generic Functions} \index{type} \index{interface type} \index{generic function type} -\paragraph{Type checking} Some additional structure is formed during type checking: +\paragraph{Type checking.} The type checker constructs the \emph{interface type} of the function declaration from the following: \begin{enumerate} -\item The generic parameter declaration \texttt{T} declares the generic parameter type \texttt{T}. Types are distinct from type declarations in Swift; some types denote a \emph{reference} to a type declaration, and some are \emph{structural} (such as function types or tuple types). -\item The type checker constructs a \emph{generic signature} for our function declaration. 
The generic signature has the printed representation \texttt{<T>} and contains the single generic parameter type \texttt{T}. This is the simplest possible generic signature, apart from the empty generic signature of a non-generic declaration. +\item A \emph{generic signature}, to introduce the generic parameter type \texttt{T}. In our case, the generic signature has the printed representation \texttt{<T>}. This is the simplest generic signature, apart from the empty generic signature of a non-generic declaration. We'll see more interesting generic signatures soon. -\item The type checker performs \emph{type resolution} to transform the type representation \texttt{T} appearing in our parameter declaration and return type into a semantic \emph{type}. Type resolution queries name lookup for the identifier \texttt{T} at the source location of each type representation, which finds the generic parameter declaration \texttt{T} in both cases. This type declaration declares the generic parameter type \texttt{T}, which becomes the resolved type. +\item The \emph{interface type} of the parameter ``\verb|_ x: T|'', which is declared to be the type representation \texttt{T} in source. The compiler component responsible for resolving this to a semantic type is called \emph{type resolution}. In this case, we perform name lookup inside the lexical scope defined by the function, which finds the generic parameter type~\texttt{T}. -\index{expression} -\item There is now enough information to form the function's \emph{interface type}, which is the type of a reference to this function from expression context. The interface type of a generic function declaration is a \emph{generic function type}, composed from the function's generic signature, parameter types, and return type: +\item The function's return type, which also resolves to the generic parameter type~\texttt{T}.
+\end{enumerate} +All of this is packaged up into a \emph{generic function type} which completely describes the type checking behavior of a reference to this function: \begin{quote} \begin{verbatim} <T> (T) -> T \end{verbatim} \end{quote} -\end{enumerate} -The final step is the type checking of the function's body. The expression type checker queries name lookup for the identifier \texttt{x}, which finds the parameter declaration \verb|_ x: T|. +The name ``\texttt{T}'' is of no semantic consequence beyond name lookup. We will learn that the \emph{canonical type} for the above erases the generic parameter type \texttt{T} to the notation \ttgp{0}{0}. More generally, each generic parameter type is uniquely identified within its lexical scope by its \emph{depth} and \emph{index}. -\index{archetype type} -\index{primary archetype type} -While the type of our function parameter is the generic parameter type \texttt{T}, inside the body of a generic function it becomes a different kind of type, called a \emph{primary archetype}. The distinction isn't terribly important right now, and it will be covered in Chapter~\ref{genericenv}. It suffices to say that we'll use the notation $\archetype{T}$ for the primary archetype corresponding to the generic parameter type \texttt{T}. +Having computed the interface type of the function, the type checker moves on to the function's body. The type of the return statement's expression must match the return type of the function declaration. When we type check the return expression \texttt{x}, we use name lookup to find the parameter declaration ``\verb|_ x: T|''. The interface type of this parameter declaration is the generic parameter type \texttt{T}, which is understood relative to the function's generic signature. The type assigned to the expression, however, is the \index{archetype type}\index{primary archetype type}\emph{primary archetype} corresponding to \texttt{T}, denoted $\archetype{T}$.
We will learn later that an archetype is a self-describing form of a type parameter which behaves like a concrete type. -With that out of the way, the expression type checker assigns the type $\archetype{T}$ to the expression \texttt{x} appearing in the return statement. As expected, this matches the declared return type of the function. - -\paragraph{Code generation} +\paragraph{Code generation.} We've now successfully type checked our function declaration. How might we generate code for it? Recall the two concrete implementations that we folded into our single generic function: \begin{Verbatim} func identity(_ x: Int) -> Int { return x } @@ -113,63 +108,63 @@ \section{Generic Functions} \begin{enumerate} \item The first function receives and returns the \texttt{Int} value in a machine register. The \texttt{Int} type is \emph{trivial},\footnote{Or POD, for you C++ folks.} meaning it can be copied and moved at will. \item The second function is trickier. A \texttt{String} is stored as a 16-byte value in memory, and contains a pointer to a reference-counted buffer. When manipulating values of a non-trivial type like \texttt{String}, memory ownership comes into play. - -The standard ownership semantics for a Swift function call are defined such that the caller retains ownership over the parameter values passed into the callee, while the callee transfers ownership of the return value to the caller. This means that the \verb|identity(_:)| function cannot just return the value \texttt{x}; instead, it must first create a logical copy of \texttt{x} that it owns, and then return this owned copy. This is achieved by incrementing the string value's buffer reference count via a call to a runtime function. \end{enumerate} -More generally, every Swift type has a size and alignment, and defines three fundamental operations that can be performed on all values of that type: moving the value, copying the value, and destroying the value. 
A move is semantically equivalent to, but more efficient than, copying a value followed by destroying the old copy. (Non-copyable types were recently introduced in \cite{se0390}; the runtime metadata of a non-copyable type does not define a copy function. We won't talk about non-copyable types in this book; it suffices to say that the compiler statically checks that copy operations on them are not performed, and at least at the time of writing, the generics implementation does not yet allow abstracting over non-copyable types.) -With a trivial type, moving or copying a value simply copies the value's bytes from one memory location to another, and destroying a value does nothing. With a reference type, these operations update the reference count. Copying a reference increments the reference count on its heap-allocated backing storage, and destroying a reference decrements the reference count, deallocating the backing storage when the reference count reaches zero. Even more complex behaviors are also possible; a struct might contain a mix of trivial types and references, for example. Weak references and existential types also have non-trivial value operations. +The standard ownership semantics for a Swift function call are defined such that the caller retains ownership over the parameter values passed into the callee, while the callee transfers ownership of the return value to the caller. Thus the implementation of \verb|identity(_:)| must create a logical copy of \texttt{x} and then move this copy back to the caller, and do this in a manner that abstracts over all possible concrete types. -\index{runtime type metadata} -As the joke goes, every problem in computer science can be solved with an extra level of indirection. The calling convention for a generic function takes \emph{runtime type metadata} for every generic parameter in the function's generic signature. 
Every type in the language has a reified representation as runtime type metadata, storing the type's size and alignment together with function pointers implementing the move, copy and destroy operations. The generated code for a generic function abstractly manipulates values of generic parameter type using the runtime type metadata provided by the caller. An important property of runtime type metadata is \emph{identity}; two pointers to runtime type metadata are equal if and only if they represent the same type in the language. +The calling convention for a generic function passes \index{runtime type metadata}\emph{runtime type metadata} for each generic parameter in the function's generic signature. Runtime type metadata describes the size and alignment of a concrete type, and provides implementations of the \emph{move}, \emph{copy} and \emph{destroy} operations. + +We move and copy values of trivial type by copying bytes; destroy does nothing in this case. With a reference type, the value is a reference-counted pointer; copy and destroy operations update the reference count, while a move leaves the reference count unchanged. The value operations for structs and enums are defined recursively from their members. Finally, weak references and existential types also have non-trivial value operations. + +For copyable types, a move is semantically equivalent to a copy followed by a destroy, only more efficient. Traditionally, all types in the language were copyable. \IndexSwift{5.9}Swift 5.9 introduced \index{noncopyable type}\emph{noncopyable types} \cite{se0390}, and \IndexSwift{6.0}Swift 6 extended generics to work with noncopyable types \cite{se0427}. We will not discuss noncopyable types in this book. 
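These value operations can be observed from ordinary Swift code. The following snippet is illustrative only (the \texttt{Point} and \texttt{Counter} types are hypothetical, not taken from this chapter's examples); it contrasts a trivial struct, whose copy is a plain byte copy, with a reference type, whose copy updates a reference count and therefore shares storage:

```swift
// A trivial type: copying is a byte copy, destruction is a no-op.
struct Point {
    var x: Double
    var y: Double
}

// A reference type: the value is a reference-counted pointer, so a copy
// retains the object and a destroy releases it.
final class Counter {
    var count = 0
}

let p = Point(x: 1, y: 2)
var q = p          // byte copy; p and q are independent values
q.x = 10
print(p.x)         // 1.0 -- the original is unaffected

let a = Counter()
let b = a          // copy retains: both references denote one heap object
b.count = 10
print(a.count)     // 10 -- the mutation is visible through both references
```

A struct mixing the two kinds of member inherits its value operations recursively, as described above.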
\begin{MoreDetails} -\item Types: Chapter~\ref{types} -\item Function declarations: Section~\ref{func decls} -\item Generic parameter lists: Chapter~\ref{generic declarations} -\item Type resolution: Chapter~\ref{typeresolution} +\item Types: \ChapRef{types} +\item Function declarations: \SecRef{function decls} +\item Generic parameter lists: \SecRef{generic params} +\item Archetypes: \ChapRef{genericenv} +\item Type resolution: \ChapRef{typeresolution} \end{MoreDetails} \index{call expression} \index{expression} -\paragraph{Substitution maps} Let us now turn our attention to the callers of generic functions. A \emph{call expression} references a \emph{callee} together with a list of arguments. The callee is some other expression with a function type. Some possible callees include references to named function declarations, type expressions (which invokes a constructor), function parameters and local variables of function type, and results of other calls which return functions. In our example, we might call the \verb|identity(_:)| function as follows: +\paragraph{Substitution maps.} Let us now turn our attention to the callers of generic functions. A \emph{call expression} brings together a \emph{callee} and a list of argument expressions. A callee is just an expression of function type. This function type's parameters must match the argument expressions, and its return type is then the type of the call expression. Some possible callees include an expression that names an existing function declaration, type expressions (which are sugar for invoking a constructor member of the type), function parameters and local variables of function type, and even other calls whose result has function type. In our example, we might call the \verb|identity(_:)| function as follows:
In Swift, calls to generic functions never specify their generic arguments explicitly; instead, the type checker infers them from the types of call argument expressions. A reference to a named generic function stores a \emph{substitution map} mapping each generic parameter type of the callee's generic signature to the inferred generic argument, also called the \emph{replacement type}. +In Swift, calls to generic functions do not specify their generic arguments explicitly; the type checker infers them from the types of call argument expressions. Generic arguments are encoded in a \index{substitution map}\emph{substitution map}, which assigns a \emph{replacement type} to each generic parameter type of the callee's generic signature. -The generic signature of \verb|identity(_:)| has a single generic parameter type. The two references to \verb|identity(_:)| have different substitution maps; the first substitution map has the replacement type \texttt{Int}, and the second \texttt{String}. We will use the following notation for these substitution maps: +The generic signature of \verb|identity(_:)| has a single generic parameter type. Thus each of its substitution maps holds a single concrete type. We now introduce some notation. Here are two possible substitution maps, corresponding to the two calls shown above: \[ -\SubstMap{\SubstType{T}{Int}} +\Sigma_1 := \SubstMap{\SubstType{T}{Int}} \qquad -\SubstMap{\SubstType{T}{String}} +\Sigma_2 := \SubstMap{\SubstType{T}{String}} +\] +We can form the return type of a call by taking the function declaration's return type, and applying the corresponding substitution map. We denote this application using the ``$\otimes$'' operator. 
For now, this simply extracts the concrete type stored therein: +\[\texttt{T} \otimes \Sigma_1 = \texttt{Int} +\qquad +\texttt{T} \otimes \Sigma_2 = \texttt{String} \] -We can apply a substitution map to the interface type of our function declaration to get the \emph{substituted type} of the callee: -\[\texttt{<T> (T) -> T} \otimes \SubstMap{\SubstType{T}{Int}} = \texttt{(Int) -> Int}\] - -Substitution maps also play a role in code generation. When generating a call to a generic function, the compiler emits code to realize the runtime type metadata for each replacement type in the substitution map. The types \texttt{Int} and \texttt{String} are \emph{nominal types} defined in the standard library. These types are non-generic and have a fixed layout, so their runtime type metadata can be recovered by taking the address of a constant symbol exported by the standard library. -\index{structural type} -\index{metadata access function} -Structural types are slightly more complicated. Suppose we were instead compiling a call to \verb|identity(_:)| where the replacement type for \texttt{T} was some function type, say \verb|(Int, String) -> Float|. Function types can have arbitrary parameter and return types. Therefore, function type metadata is \emph{instantiated} by calling a \emph{metadata access function} implemented in the Swift runtime. The metadata access function takes metadata for the parameter types and return type, constructs metadata representing the function type, and caches the result for future accesses. Runtime type metadata for other structural types, like tuples and metatypes, is constructed in a similar manner. +Substitution maps play a role in code generation. When calling a generic function, the compiler must realize the runtime type metadata for each replacement type in the substitution map of the call. In our example, the types \texttt{Int} and \texttt{String} are \emph{nominal types} defined in the standard library.
They are non-generic and have a fixed layout, so their runtime type metadata is accessed by calling a function, exported by the standard library, that returns the address of a constant symbol. \begin{MoreDetails} -\item Substitution maps: Chapter~\ref{substmaps} +\item Substitution maps: \ChapRef{substmaps} \end{MoreDetails} \index{inlinable function} -\paragraph{Specialization} Reification of runtime type metadata and the indirect manipulation of values incurs a performance penalty. As an alternative, if the definition of a generic function is visible at the call site, the optimizer can generate a \emph{specialization} of the generic function by cloning the definition and applying the substitution map to all types appearing in the function's body. Definitions of generic functions are always visible to the specializer within their defining module. Shared library developers can also opt-in to exporting the body of a function across module boundaries with the \texttt{@inlinable} attribute. +\paragraph{Specialization.} Reification of runtime type metadata and the subsequent indirect manipulation of values incur a performance penalty. As an alternative, if the definition of a generic function is visible at the call site, the optimizer can generate a \emph{specialization} of the generic function by cloning the definition and applying the substitution map to all types appearing in the function's body. Definitions of generic functions are always visible to the specializer within their defining module. Shared library developers can also opt in to exporting the body of a function across module boundaries with the \texttt{@inlinable} attribute.
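Continuing the running example, a shared library author might opt in as follows. This is a sketch: \texttt{@inlinable} is only permitted on sufficiently visible declarations, hence the \texttt{public} modifier.

```swift
// The body below is exported across the module boundary, so callers in
// other modules can be specialized for their concrete generic arguments.
@inlinable
public func identity<T>(_ x: T) -> T {
    return x
}
```

Callers in other modules see the body and may inline or specialize it; without the attribute, they must go through the unspecialized entry point with runtime type metadata.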
\begin{MoreDetails} -\item \texttt{@inlinable} attribute: Section~\ref{module system} +\item \texttt{@inlinable} attribute: \SecRef{module system} \end{MoreDetails} -\section{Generic Types} +\section{Nominal Types} \index{struct declaration} \index{stored property declaration} -For our next example, consider this simple generic struct storing two values of the same type: +For our next example we consider a simple generic struct declaration: \begin{Verbatim} struct Pair<T> { let first: T @@ -181,66 +176,48 @@ \section{Generic Types} } } \end{Verbatim} -This struct declaration contains three members: two stored property declarations, and a constructor declaration. Recall that declarations have an \emph{interface type}, which is the type of a reference to the declaration from expression context. The interface type of \texttt{first} and \texttt{second} is the generic parameter type \texttt{T}. +A struct declaration is an example of a \emph{nominal type declaration}. A \emph{generic nominal type}, like \texttt{Pair<T>} or \texttt{Pair<Int>}, is a reference to a generic nominal type declaration together with a list of \index{generic argument}generic argument types. -\index{metatype type} -\index{expression} -When a type declaration is referenced from expression context the result is a value representing the type, and the type of this value is a metatype type, so the interface type of \texttt{Pair} is the metatype type \texttt{Pair.Type}. +The in-memory layout of a struct value is determined by the interface types of its stored properties. Our \texttt{Pair} struct declares two stored properties, \texttt{first} and \texttt{second}, both with interface type \texttt{T}. Thus the layout of a \texttt{Pair} depends on the layout of the generic parameter type~\texttt{T}.
-\index{declared interface type} -\index{generic nominal type} -Type declarations also have a more primitive notion of a \emph{declared interface type}, which is the type assigned to a reference to the declaration from type context. The declared interface type of \texttt{Pair} is the \emph{generic nominal type} \texttt{Pair<T>}. The interface type of a type declaration is the metatype of its declared interface type. +The generic nominal type \texttt{Pair<T>}, formed by taking the generic parameter type as the argument, is called the \emph{declared interface type} of \texttt{Pair}. The type \texttt{Pair<Int>} is a \emph{specialized type} of the declared interface type, \texttt{Pair<T>}. We can obtain \texttt{Pair<Int>} from \texttt{Pair<T>} by applying a substitution map: +\[\texttt{Pair<T>} \otimes \SubstMap{\SubstType{T}{Int}} = \texttt{Pair<Int>}\] -\index{context substitution map} -Instances of \texttt{Pair} store their fields inline without \index{boxing}boxing, and the layout of \texttt{Pair} depends on the generic parameter \texttt{T}. If you declare a local variable whose type is the generic nominal type \texttt{Pair<Int>}, the compiler can directly compute the type's layout to determine the size of the stack allocation: +This ``factorization'' is our first example of what is in fact an algebraic identity. The \index{context substitution map}\emph{context substitution map} of a generic nominal type is the substitution map formed from its generic arguments. This has the property that applying the context substitution map to the declared interface type recovers the original generic nominal type. Suppose we declare a local variable of type \texttt{Pair<Int>}: \begin{Verbatim} let twoIntegers: Pair<Int> = ...
\end{Verbatim} -To compute the layout, the compiler first factors the type \texttt{Pair<Int>} into the application of a substitution map to the declared interface type: -\[\Type{Pair<Int>} = \Type{Pair<T>} \otimes \SubstMap{\SubstType{T}{Int}}\] -The compiler then computes the substituted type of each stored property by applying this substitution map to each stored property's interface type, which is \texttt{T}: -\[\Type{T} \otimes \SubstMap{\SubstType{T}{Int}} = \Type{Int}\] -Therefore both fields of \texttt{Pair<Int>} have a substituted type of \texttt{Int}. The \texttt{Int} type has a size of 8 bytes and an alignment of 8 bytes, from which we derive that \texttt{Pair<Int>} has a size of 16 bytes and alignment of 8 bytes. +The compiler must allocate storage on the stack for this value. We take the context substitution map, and apply it to the interface type of each stored property: +\[\texttt{T} \otimes \SubstMap{\SubstType{T}{Int}} = \texttt{Int}\] +We see that a value of type \texttt{Pair<Int>} stores two consecutive values of type \texttt{Int}, which gives \texttt{Pair<Int>} a size of 16 bytes and alignment of 8 bytes. Since \texttt{Pair<Int>} is trivial, the stack allocation does not require any special cleanup once we exit its scope. -\index{metadata access function} -However, the layout is not always known at compile time, in which case we need the runtime type metadata for \texttt{Pair<Int>}. When compiling the declaration of \texttt{Pair}, the compiler emits a \emph{metadata access function} which takes the type metadata for \texttt{T} as an argument. The metadata access function calculates the layout of \texttt{Pair} for this \texttt{T} with the same algorithm as the compiler, but at runtime, and caches the result.
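As an aside, this layout computation can be confirmed from user code on a 64-bit platform using the standard library's \texttt{MemoryLayout} type (an illustrative check, not part of the compiler):

```swift
struct Pair<T> {
    let first: T
    let second: T
}

// Two consecutive 8-byte Int fields, each aligned to 8 bytes.
print(MemoryLayout<Pair<Int>>.size)       // 16
print(MemoryLayout<Pair<Int>>.alignment)  // 8
```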
+Now, we complete our local variable declaration by writing down an \index{expression}\index{initial value expression}initial value expression which calls the constructor: +\begin{Verbatim} +let twoIntegers: Pair<Int> = Pair(first: 1, second: 2) +\end{Verbatim} +While the value has the type \texttt{Pair<Int>} at the call site, inside the constructor the value being initialized is of type \texttt{Pair<T>}. We construct the runtime type metadata for \texttt{Pair<Int>} by calling the \index{metadata access function}\emph{metadata access function} for \texttt{Pair} with the runtime type metadata for \texttt{Int} as an argument. The metadata for \texttt{Pair<Int>} has two parts: -Note that the runtime type metadata for \texttt{Pair<Int>} has two parts: \begin{enumerate} -\item A common prefix present in all runtime type metadata, which includes the total size and alignment of a value. -\item A private area specific to specializations of \texttt{Pair}, which contains the \emph{field offset vector} storing the offset of each stored property of this specialization of \texttt{Pair}. +\item A common prefix present in all runtime type metadata, which includes the total size and alignment of a value, and implementations of the move, copy and destroy operations. +\item A private area specific to the declaration of \texttt{Pair} itself, which stores the runtime type metadata for \texttt{T}, followed by the \emph{field offset vector}, storing the offset of each stored property of this specialization of \texttt{Pair}. \end{enumerate} -The first part comes into play if we call our \verb|identity(_:)| function with a value of type \texttt{Pair<Int>}. The generated code for the call invokes a metadata access function for \texttt{Pair} with the metadata for \texttt{Int} as an argument, and passes the resulting metadata for \texttt{Pair<Int>} to \verb|identity(_:)|.
The implementation of \verb|identity(_:)| doesn't know that it is dealing with a \texttt{Pair}, but it uses the provided metadata to abstractly manipulate the value. +The metadata access function for a generic type takes the metadata for each generic argument, and calculates the offset of each stored property, also obtaining the size and alignment of the entire value. The move, copy and destroy operations of the aggregate type delegate to the corresponding operations in the generic argument metadata. The constructor of \texttt{Pair} then uses the runtime type metadata for both \texttt{Pair} and \texttt{T} to correctly initialize the aggregate value from its two constituent parts. -\index{initial value expression} -\index{expression} -The second part is used by the implementation of \verb|Pair.init(first:second:)|. The constructor does not have a generic parameter list of its own, but it is nested inside of a generic type, so it inherits the generic signature of the type, which is \texttt{}. The interface type of this constructor is the generic function type: -\begin{quote} -\begin{verbatim} - (T, T) -> Pair -\end{verbatim} -\end{quote} -Recall our declaration of the \texttt{twoIntegers} variable. Let's complete the declaration by writing down an initial value expression which calls the constructor: -\begin{Verbatim} -let twoIntegers: Pair = Pair(first: 1, second: 2) -\end{Verbatim} -At the call site, we have full knowledge of the layout of \texttt{twoIntegers}. However, the implementation of \texttt{Pair.init} only knows that it is working with a \texttt{Pair}, and not a \texttt{Pair}. The generated code for the constructor calls the metadata access function for \texttt{Pair} with the provided metadata for \texttt{T}. 
Since it knows it is working with a \texttt{Pair}, it can look inside the private area to get the field offset of \texttt{first} and \texttt{second}, and assign the two parameters into the \texttt{first} and \texttt{second} stored properties of \texttt{self}.
+\index{structural type}
+Structural types, such as function types, tuple types and metatypes, are similar to generic nominal types in that we call a metadata access function to obtain runtime type metadata for them, but this time, the metadata access function is part of the Swift runtime. For example, to construct metadata for the tuple type \texttt{(Int, Pair<Int>)}, we first call the metadata access function for \texttt{Pair} to get \texttt{Pair<Int>}, then call an entry point in the runtime to obtain \texttt{(Int, Pair<Int>)}.

\begin{MoreDetails}
-\item Type declarations: Section~\ref{type declarations}
-\item Context substitution map: Section~\ref{contextsubstmap}
+\item Declarations: \ChapRef{decls}
+\item Context substitution map: \SecRef{contextsubstmap}
+\item Structural types: \SecRef{more types}
\end{MoreDetails}

\section{Protocols}

-Neither the \verb|identity(_:)| function nor the \texttt{Pair} type state any generic requirements, so in reality they can't do anything with their generic values except pass them around, which the compiler expresses in terms of the fundamental value operations---move, copy and destroy.
+Our \verb|identity(_:)| and \texttt{Pair} declarations both abstract over arbitrary concrete types, but in turn, this limits their generic parameter \texttt{T} to the common capabilities shared by all types---the move, copy and destroy operations. By stating \emph{generic requirements}, a generic declaration can impose various restrictions on the concrete types used as generic arguments, which endows its generic parameter types with new capabilities provided by those concrete types.
-\index{requirement}
-\index{conformance requirement}
-\index{opaque parameter}
-\Index{where clause@\texttt{where} clause}
-We can do more with our generic parameter types by imposing generic requirements on them. One of the most commonly-used and fundamental requirement kinds is the \emph{protocol conformance requirement}, which states that the replacement type for a generic requirement must conform to the given protocol.
+A \emph{protocol} abstracts over the capabilities of a concrete type. By stating a \index{conformance requirement}\emph{conformance requirement} between a generic parameter type and a protocol, a generic declaration can require that its generic argument is a concrete type that \emph{conforms} to this protocol:
\begin{Verbatim}
protocol Shape {
  func draw()
@@ -252,212 +229,225 @@ \section{Protocols}
  }
}
\end{Verbatim}
-The \verb|drawShapes(_:)| function takes an array of values whose type conforms to \texttt{Shape}. You can also write the declaration of \verb|drawShapes(_:)| using a trailing \texttt{where} clause, or avoid the explicit generic parameter list altogether and declare an \emph{opaque parameter type} instead:
-\begin{Verbatim}
-func drawShapes(_ shapes: Array) where S: Shape
-func drawShapes(_ shapes: Array)
-\end{Verbatim}
-\index{generic signature}
-The generic signatures we've seen previously were rather trivial, only storing a single generic parameter type. More generally, a generic signature actually consists of a list of generic parameter types together with a list of requirements. Irrespective of the surface syntax, the generic signature of \verb|drawShapes(_:)| will have a single requirement. We will use the following notation for generic signatures with requirements:
+The \verb|drawShapes(_:)| function takes an array of values, all of the same type, which must conform to \texttt{Shape}. So far, we have only encountered the generic signature \texttt{<T>}.
More generally, a generic signature lists one or more generic parameter types, together with their requirements. The generic signature of \verb|drawShapes(_:)| has the single requirement $\ConfReq{S}{Shape}$. We will use the following notation for generic signatures with requirements:
\begin{quote}
\begin{verbatim}
<S where S: Shape>
\end{verbatim}
\end{quote}
-The interface type of \verb|drawShapes(_:)| is a generic function type incorporating this generic signature:
+The interface type of \verb|drawShapes(_:)| incorporates this generic signature into a generic function type:
\begin{quote}
\begin{verbatim}
<S where S: Shape> (Array<S>) -> ()
\end{verbatim}
\end{quote}
-Inside the body of \verb|drawShapes(_:)|, the \texttt{shape} local variable bound by the \texttt{for}~loop is a value of type $\archetype{S}$ (remember, generic parameter types become archetype types inside the function body; but as before, the distinction doesn't matter right now):
+We can also write the \verb|drawShapes(_:)| function to state the conformance requirement with a trailing \texttt{where} clause, or we can avoid naming the generic parameter \texttt{S} by using an \emph{opaque parameter type} instead:
+\begin{Verbatim}
+func drawShapes<S>(_ shapes: Array<S>) where S: Shape
+func drawShapes(_ shapes: Array<some Shape>)
+\end{Verbatim}
+All three forms of \verb|drawShapes(_:)| are ultimately equivalent, because they define the same generic signature, up to the choice of generic parameter name. In general, when there is more than one way to spell the same underlying language construct due to syntax sugar, the semantic objects ``desugar'' these differences into the same uniform representation.
+\begin{MoreDetails}
+\item Protocols: \SecRef{protocols}
+\item Requirements: \SecRef{requirements}
+\item Generic signatures: \ChapRef{genericsig}
+\end{MoreDetails}
+
+\paragraph{Qualified lookup.}
+Once we have a generic signature, we can type check the body of \verb|drawShapes(_:)|.
The \texttt{for}~loop introduces a local variable ``\texttt{shape}'' of type $\archetype{S}$ (we re-iterate that the generic parameter type \texttt{S} is represented as the archetype $\archetype{S}$ inside a function body, but the distinction doesn't matter right now). This variable is referenced inside the \texttt{for} loop by the \index{member expression}\emph{member expression} ``\texttt{shape.draw}'': \begin{Verbatim}[firstnumber=6] for shape in shapes { shape.draw() } \end{Verbatim} - Since \texttt{S} is subject to the conformance requirement $\FormalReq{S:~Shape}$, the caller must provide a concrete replacement type for \texttt{S} conforming to \texttt{Shape}. So while we don't know the concrete type of our \texttt{shape} local variable at compile time, we do know that it implements this \texttt{draw()} method of the \texttt{Shape} protocol. Thus, a \index{qualified lookup}\emph{qualified lookup} of the identifier \texttt{draw} with a base type of $\archetype{S}$ will find the declaration of the \texttt{draw()} method of \texttt{Shape}, as a consequence of the conformance requirement. -\index{witness table} -How does the compiler generate code for the call \verb|shape.draw()|? Once again, we need to introduce some indirection. For each conformance requirement in the generic signature of a generic function, the generic function receives a \emph{witness table} from the caller. The layout of a witness table is determined by the protocol's requirements; a method becomes an entry storing a function pointer. To call our protocol method, the compiler loads the function pointer from the witness table, and invokes it with the argument value of \texttt{shape}. +Our generic signature has the conformance requirement $\ConfReq{S}{Shape}$, so the caller must provide a replacement type for \texttt{S} conforming to \texttt{Shape}. 
We'll return to the caller's side of the equation shortly, but inside the callee, the requirement also tells us that the archetype $\archetype{S}$ conforms to \texttt{Shape}. To resolve the member expression, the type checker performs a \index{qualified lookup}\emph{qualified lookup} of the identifier \texttt{draw} with a base type of $\archetype{S}$. A qualified lookup into an archetype checks each protocol the archetype conforms to, so we find and return the \texttt{draw()} method of the \texttt{Shape} protocol. -Note that \verb|drawShapes(_:)| operates on a homogeneous array of shapes. While the array contains an arbitrary number of elements, \verb|drawShapes(_:)| only receives a single runtime type metadata for \texttt{S}, and one witness table for the conformance requirement \verb|S: Shape|, which together describe all elements of the array. +How does the compiler generate code for the call \verb|shape.draw()|? Together with the runtime type metadata for \texttt{S}, the calling convention for \verb|drawShapes(_:)| passes an additional argument, corresponding to the conformance requirement $\ConfReq{S}{Shape}$. This argument is the \index{witness table}\emph{witness table} for the conformance. The layout of a witness table is determined by the protocol's members; a witness table for a conformance to \texttt{Shape} has a single entry, the implementation of the \texttt{draw()} method. To call \texttt{shape.draw()}, we load the function pointer from the witness table, and invoke it with the value of \texttt{shape}. 
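As a rough sketch of this dispatch scheme, we can model a witness table by hand as a struct of function pointers. The names here are hypothetical, the real tables are emitted by IRGen rather than written in Swift, and we let \texttt{draw()} return a string so its effect is observable:
\begin{Verbatim}
protocol Shape { func draw() -> String }
struct Circle: Shape {
  let radius: Double
  func draw() -> String { "circle of radius \(radius)" }
}

// One entry per protocol requirement: a function pointer for draw().
struct ShapeWitnessTable<S> {
  let draw: (S) -> String
}
let circleWitnessTable = ShapeWitnessTable<Circle>(draw: { $0.draw() })

// The compiler passes the table as an extra, hidden argument.
func drawShapes<S>(_ shapes: [S],
                   _ witnessTable: ShapeWitnessTable<S>) -> [String] {
  // Load the function pointer from the table, invoke it with each value.
  return shapes.map { witnessTable.draw($0) }
}

print(drawShapes([Circle(radius: 1), Circle(radius: 2)],
                 circleWitnessTable))
// ["circle of radius 1.0", "circle of radius 2.0"]
\end{Verbatim}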
\begin{MoreDetails} -\item Protocols: Section~\ref{protocols} -\item Constraint types: Section~\ref{constraints} -\item Trailing \texttt{where} clauses: Section~\ref{trailing where clauses} -\item Opaque parameters: Section~\ref{opaque parameters} -\item Name lookup: Section~\ref{name lookup} +\item Name lookup: \SecRef{name lookup} \end{MoreDetails} - -\index{conformance} -\index{normal conformance} -\index{conforming type} -\paragraph{Conformances} We can write a struct declaration conforming to \texttt{Shape}: +\paragraph{Conformances.} This \texttt{Circle} type states a \index{conformance}\emph{conformance} to the \texttt{Shape} protocol: \begin{Verbatim} struct Circle: Shape { let radius: Double func draw() {...} } \end{Verbatim} -The declaration of \texttt{Circle} states a \emph{conformance} to the \texttt{Shape} protocol in its inheritance clause. The type checker constructs an object called a \emph{normal conformance}, which records the mapping from the protocol's requirements to the members of the conforming type which \emph{witness} those requirements. - -When the compiler generates the code for the declaration of \texttt{Circle}, it emits a witness table for each normal conformance defined on the type declaration. In our case, there is just a single requirement \texttt{Shape.draw()}, witnessed by the method \texttt{Circle.draw()}. The witness table for this conformance references the witness (indirectly, because the witness is always wrapped in a \emph{thunk}, which is a small function which shuffles some registers around and then calls the actual witness. This must be the case because protocol requirements use a slightly different calling convention than ordinary generic functions). +The \index{conformance checker}\emph{conformance checker} ensures that the conforming type \texttt{Circle} declares a \emph{witness} for the \texttt{draw()} method of \texttt{Shape}, and records this fact in a \index{normal conformance}\emph{normal conformance}. 
We denote this normal conformance by $\ConfReq{Circle}{Shape}$. When generating code for the declaration of \texttt{Circle}, we also emit the witness table for the normal conformance $\ConfReq{Circle}{Shape}$. This witness table contains a pointer to the implementation of \texttt{Circle.draw()}. -Now, let's look at a call to \verb|drawShapes(_:)| with an array of circles: +Now, let's call \verb|drawShapes(_:)| with an array of circles and look at the substitution map for the call: \begin{Verbatim} drawShapes([Circle(radius: 1), Circle(radius: 2)]) \end{Verbatim} -Recall that a reference to a generic function declaration comes with a substitution map. Substitution maps store a replacement type for each generic parameter of a generic signature, so our substitution map maps \texttt{S} to the replacement type \texttt{Circle}. When the generic signature has conformance requirements, the substitution map also stores a conformance for each conformance requirement. This is the ``proof'' that the concrete replacement type actually conforms to the protocol. +When the callee's generic signature has conformance requirements, the substitution map must store a conformance for each conformance requirement. This is the ``proof'' that the \index{conforming type}concrete replacement type actually conforms to the protocol, as required. We denote a substitution map with conformances as follows: +\[\SubstMapC{\SubstType{S}{Circle}}{\SubstConf{S}{Circle}{Shape}}\] +To find the normal conformance, the type checker performs a \index{global conformance lookup}\emph{global conformance lookup} with the concrete type and protocol: +\[\protosym{Shape}\otimes\texttt{Circle}=\ConfReq{Circle}{Shape}\] -\index{global conformance lookup} -The type checker finds conformances by \emph{global conformance lookup}. 
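A loose runtime analogue of global conformance lookup is the dynamic cast, which consults the conformance records that the compiler emits; this sketch is only an analogy, since the type checker's lookup happens at compile time:
\begin{Verbatim}
protocol Shape { func draw() }
struct Circle: Shape {
  let radius: Double
  func draw() {}
}

// `is` succeeds exactly when a conformance to Shape can be found
// for the value's dynamic type.
let values: [Any] = [Circle(radius: 1), 42]
print(values.map { $0 is any Shape })  // [true, false]
\end{Verbatim}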
The call to \verb|drawShapes(_:)| will only type check if the replacement type conforms to \texttt{Shape}; the type checker rejects a call that provides an array of integers for example, because there is no conformance of \texttt{Int} to \texttt{Shape}.\footnote{Of course, you could define this conformance with an extension.}
+When generating code for the call to \verb|drawShapes(_:)|, we visit each entry in the substitution map, emitting a reference to runtime type metadata for each replacement type, and a reference to the witness table for each conformance. In our case, we pass the runtime type metadata for \texttt{Circle} and the witness table for $\ConfReq{Circle}{Shape}$.
-We will use the following notation for substitution maps storing a conformance:
-\[\SubstMapC{\SubstType{S}{Circle}}{\SubstConf{S}{Circle}{Shape}}\]
+\begin{MoreDetails}
+\item Conformances: \ChapRef{conformances}
+\item Conformance lookup: \SecRef{conformance lookup}
+\end{MoreDetails}
+
+\paragraph{Existential types.}
+Note that \verb|drawShapes(_:)| operates on a \emph{homogeneous} array of shapes. While the array contains an arbitrary number of elements, \verb|drawShapes(_:)| only receives a single runtime type metadata for \texttt{S}, and one witness table for the conformance requirement $\ConfReq{S}{Shape}$, which together describe the behavior of each element in the array. If we instead want a function taking a \emph{heterogeneous} array of shapes, we can use an \emph{existential type} as the element type of our array:
+\begin{Verbatim}
+func drawShapes(_ shapes: Array<any Shape>) {
+  for shape in shapes {
+    shape.draw()
+  }
+}
+\end{Verbatim}
-When emitting code to call to a generic function, the compiler looks at the substitution map and emits a reference to runtime type metadata for each replacement type, and a reference to the witness table for each conformance. In our case, \verb|drawShapes(_:)| takes a single runtime type metadata and a single witness table for the conformance.
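For instance, assuming a second conforming type \texttt{Square} (hypothetical, not part of the running example), a heterogeneous array boxes each element separately:
\begin{Verbatim}
protocol Shape { func draw() }
struct Circle: Shape { let radius: Double; func draw() {} }
struct Square: Shape { let side: Double; func draw() {} }

// Each element carries its own runtime type metadata and witness table,
// so different concrete shape types can be mixed in one array.
let shapes: [any Shape] = [Circle(radius: 1), Square(side: 2)]
print(shapes.count)  // 2
\end{Verbatim}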
(The contents of the witness table were emitted when compiling the declaration of \texttt{Circle}; compiling the substitution map references this existing witness table.)
+This function uses the \texttt{Shape} protocol in a new way. The \emph{existential type} \texttt{any Shape} is a box for storing a value of some concrete type, together with the concrete type's runtime type metadata and a witness table describing the conformance. Observe the difference between the type \texttt{Array<S>} from the previous variant of \verb|drawShapes(_:)|, and \texttt{Array<any Shape>} here. Every element of the latter has its own runtime type metadata and witness table, allowing us to mix multiple kinds of shapes in one array. Existential types are built up from the core primitives of the generics system; we will learn about them in \PartRef{part features} of this book.

\begin{MoreDetails}
-\item Conformances: Chapter~\ref{conformances}
-\item Conformance lookup: Section~\ref{conformance lookup}
+\item Existential types: \ChapRef{existentialtypes}
\end{MoreDetails}

-\index{identifier type representation}
-\index{associated type declaration}
-\paragraph{Associated types} Perhaps the simplest example of a protocol with an associated type is the \texttt{Iterator} protocol in the standard library. This protocol abstracts over an iterator which produces elements of a type that depends on the conformance:
+\section{Associated Types}
+
+The standard library's \texttt{IteratorProtocol} declares an \index{associated type declaration}associated type. This allows us to abstract over iterators whose element type depends on the conformance:
\begin{Verbatim}
protocol IteratorProtocol {
  associatedtype Element
  mutating func next() -> Element?
}
\end{Verbatim}
-One of the protocol requirements is now an associated type, so every conforming type must fulfill this requirement with its own member type named \texttt{Element}.
This member type is the called the \emph{type witness} for the associated type \texttt{Element}, and it is recorded as part of a conformance to \texttt{IteratorProtocol}. +A conforming type must declare a member type named \texttt{Element}, and a \texttt{next()} method returning an optional value of this type. This member type, which can be a type alias or nominal type, is the \emph{type witness} for the associated type \texttt{Element}. -Consider a generic function which returns the first element produced by an iterator: +We declare a \texttt{Nat} type conforming to \texttt{IteratorProtocol} with an \texttt{Element} type of \texttt{Int}, for generating an infinite stream of consecutive \index{natural numbers}natural numbers: \begin{Verbatim} -func firstElement(_ iter: inout I) -> I.Element { - return iter.next()! +struct Nat: IteratorProtocol { + typealias Element = Int + var x = 0 + + mutating func next() -> Int? { + defer { x += 1 } + return x + } } \end{Verbatim} -A caller must provide a concrete replacement type for \texttt{I}, together with a conformance to satisfy the requirement $\FormalReq{I:~IteratorProtocol}$. The return type, \texttt{I.Element}, abstractly represents this type witness in the conformance. - -The return type of our function is written with an \emph{identifier type representation} \texttt{I.Element} having two components, ``\texttt{I}'' and ``\texttt{Element}''. Type resolution resolves this type representation by performing a qualified lookup of \texttt{Element} on the base type \texttt{I}. Qualified lookup looks inside the protocol as a consequence of the conformance requirement, and finds the associated type declaration \texttt{Element}. - -A reference to an associated type declaration resolves to a \index{dependent member type}\emph{dependent member type}. Dependent member types are built from a base type, in our case \texttt{I}, and an associated type declaration, in our case \texttt{Element}. 
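Since the declaration above matches the standard library's \texttt{IteratorProtocol}, we can exercise \texttt{Nat} directly; the declaration is repeated so the sketch is self-contained:
\begin{Verbatim}
struct Nat: IteratorProtocol {
  typealias Element = Int
  var x = 0

  mutating func next() -> Int? {
    defer { x += 1 }
    return x
  }
}

var iter = Nat()
print(iter.next()!, iter.next()!, iter.next()!)  // 0 1 2
\end{Verbatim}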
We will denote this dependent member type as \verb|I.[IteratorProtocol]Element| to make explicit the fact that a name lookup has resolved the \index{identifier}identifier \texttt{Element} to a specific associated type in a protocol, despite this not being valid syntax in the language.
-
-The interface type of \verb|firstElement(_:)| is therefore this generic function type:
-\begin{quote}
-\begin{verbatim}
-
-(inout I) -> I.[IteratorProtocol]Element
-\end{verbatim}
-\end{quote}
+We say that \texttt{Int} is the \emph{type witness} for \texttt{[IteratorProtocol]Element} in the conformance $\ConfReq{Nat}{IteratorProtocol}$. We can express this in our type substitution algebra, using the type witness \emph{projection} operation on normal conformances:
+\[\AssocType{[IteratorProtocol]Element}\otimes \ConfReq{Nat}{IteratorProtocol} = \texttt{Int}\]
+In this case, we could have omitted the declaration of the type alias, relying on \index{associated type inference}\emph{associated type inference} to deduce the type witness. More precisely, the type witness here is the \texttt{Element} \index{type alias type}\emph{type alias type}, whose \index{underlying type}underlying type is \texttt{Int}. A type alias type is an example of a \index{sugared type}\emph{sugared type}, equivalent semantically to its underlying type. The \index{optional sugared type}\emph{optional type} \texttt{Int?} is another sugared type, equivalent to \texttt{Optional<Int>}.

\begin{MoreDetails}
-\item Associated types: Section~\ref{protocols}
-\item Identifier type representations: Section \ref{identtyperepr}
+\item Type witnesses: \SecRef{type witnesses}
\end{MoreDetails}

-\index{type parameter}
-\paragraph{Type parameters}
-Just as a generic parameter type represents a replacement type directly provided by the caller, a dependent member type represents a type witness which can be recovered from a conformance.
The two are very much alike, so say \emph{type parameter} to mean a generic parameter type or a dependent member type. The generic signature of \verb|firstElement(_:)| has two valid type parameters:
-\begin{quote}
-\begin{verbatim}
-I
-I.[IteratorProtocol]Element
-\end{verbatim}
-\end{quote}
-Note that the base type of a dependent member type can be any type parameter, not just a generic parameter type; that is, dependent member types can ``nest'' arbitrarily. To be valid, the base type must conform to the protocol declaring the associated type, either via an explicit conformance requirement, or as a consequence of other requirements.
-
-As with generic parameter types, dependent member types become \index{primary archetype type}primary archetypes in the body of a generic function; we can reveal a little more about the structure of primary archetypes now, and say that a primary archetype packages a type parameter together with a generic signature. While a type parameter is just a name which can only be understood in relation to a generic signature, an archetype inherently ``knows'' what requirements it is subject to.
+\paragraph{Dependent member types.} This function reads a pair of elements from an iterator:
+\begin{Verbatim}
+func readTwo<I: IteratorProtocol>(_ iter: inout I) -> Pair<I.Element> {
+  return Pair(first: iter.next()!, second: iter.next()!)
+}
+\end{Verbatim}
+The return type is the generic nominal type \texttt{Pair<I.Element>}, obtained by applying the declaration of \texttt{Pair} to the generic argument \texttt{I.Element}. The generic argument is a \index{dependent member type}\emph{dependent member type}, built from the base type \texttt{I} together with the associated type declaration \texttt{[IteratorProtocol]Element}. This dependent member type represents the type witness in the conformance $\ConfReq{I}{IteratorProtocol}$.
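We can observe a type witness from the language itself: the member type \texttt{Nat.Element} resolves to \texttt{Int} even when the type alias is omitted and associated type inference fills it in. A self-contained sketch:
\begin{Verbatim}
struct Nat: IteratorProtocol {
  var x = 0
  mutating func next() -> Int? {
    defer { x += 1 }
    return x
  }
}

// Associated type inference deduces Element == Int from next().
print(Nat.Element.self == Int.self)     // true
// Int? is sugar for Optional<Int>; both spell the same type.
print(Int?.self == Optional<Int>.self)  // true
\end{Verbatim}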
-\begin{figure}\captionabove{Three levels}
-\begin{tikzpicture}
-\tikzstyle{arrow} = [->,>=stealth]
+Suppose we call \verb|readTwo(_:)| with a value of type \texttt{Nat}:
+\begin{Verbatim}
+var iter = Nat()
+print(readTwo(&iter))
+\end{Verbatim}
+The substitution map for the call stores the replacement type \texttt{Nat} and the conformance of \texttt{Nat} to \texttt{IteratorProtocol}. We'll call this substitution map~$\Sigma$:
+\[\Sigma := \SubstMapLongC{\SubstType{I}{Nat}}{\SubstConf{I}{Nat}{IteratorProtocol}}\]
+The type of the call expression is then $\texttt{Pair<I.Element>}\otimes\Sigma$. Applying a substitution map to a generic nominal type recursively substitutes the generic arguments. Since the dependent member type \texttt{I.Element} abstracts over the type witness in the conformance, we could guess $\texttt{I.Element}\otimes\Sigma=\texttt{Int}$. We will eventually understand the next equation, which relates dependent member type substitution to type witness projection:
+\begin{gather*}
+\texttt{I.Element}\otimes\Sigma\\
+\qquad {} = \AssocType{[IteratorProtocol]Element}\otimes \ConfReq{I}{IteratorProtocol}\otimes\Sigma\\
+\qquad {} = \AssocType{[IteratorProtocol]Element}\otimes \ConfReq{Nat}{IteratorProtocol}\\
+\qquad {} = \texttt{Int}
+\end{gather*}
+We can finally say that $\texttt{Pair<I.Element>}\otimes\Sigma=\texttt{Pair<Int>}$ is the return type of our call to \verb|readTwo(_:)|.
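The conclusion can be checked directly; in this self-contained sketch, with the declarations repeated, the call's substituted return type is \texttt{Pair<Int>}:
\begin{Verbatim}
struct Pair<T> {
  var first: T
  var second: T
}
struct Nat: IteratorProtocol {
  var x = 0
  mutating func next() -> Int? {
    defer { x += 1 }
    return x
  }
}
func readTwo<I: IteratorProtocol>(_ iter: inout I) -> Pair<I.Element> {
  return Pair(first: iter.next()!, second: iter.next()!)
}

var iter = Nat()
let pair = readTwo(&iter)
// The substituted return type of the call is Pair<Int>.
print(type(of: pair) == Pair<Int>.self)  // true
print(pair.first, pair.second)           // 0 1
\end{Verbatim}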
-\node (IdentTypeRepr) [draw=black] {type representation}; -\node (DependentMemberType) [draw=black,below=of IdentTypeRepr] {interface type}; -\node (Archetype) [draw=black,below=of DependentMemberType] {contextual type}; +\begin{MoreDetails} +\item Dependent member type substitution: \SecRef{abstract conformances}, \ChapRef{conformance paths} +\end{MoreDetails} -\draw [arrow, label="type resolution"] (IdentTypeRepr) -- (DependentMemberType); -\draw [arrow, label="mapping into environment"] (DependentMemberType) -- (Archetype); +\paragraph{Bound and unbound.} +We now briefly introduce a concept that will later become important in our study of type substitution. If we need to make the associated type declaration explicit, we use the notation \verb|I.[IteratorProtocol]Element|, despite this not being valid language syntax. This is the \index{bound dependent member type}\emph{bound} form of a dependent member type. To transform the syntactic type representation \texttt{I.Element} into a bound dependent member type, type resolution queries qualified lookup on the base type \texttt{I}, which is known to conform to \texttt{IteratorProtocol}; thus, the lookup finds the associated type declaration \texttt{Element}. For our current purposes, it is more convenient to use the \index{unbound dependent member type}\emph{unbound} notation for a dependent member type, written in the source language style \texttt{I.Element}. 
-\node (Text1) [right=of IdentTypeRepr] {appears in parsed syntax tree before type checking}; -\node (Text2) [right=of DependentMemberType] {describes types of declarations}; -\node (Text3) [right=of Archetype] {describes types of expressions inside a function}; -\end{tikzpicture} -\end{figure} +\begin{MoreDetails} +\item Member type representations: \SecRef{identtyperepr} +\end{MoreDetails} -\index{call expression} -\index{expression} -Inside the body of \verb|firstElement(_:)|, the result of the call expression \verb|iter.next()!| is the optional type \texttt{$\archetype{I.Element}$?}, which is force-unwrapped to yield the archetype type $\archetype{I.Element}$. To manipulate a value of the element type abstractly, the compiler must be able to recover its runtime type metadata. +\paragraph{Type parameters.} Generic parameter types and dependent member types are the two kinds of \index{type parameter}\emph{type parameters}. The generic signature of \verb|readTwo(_:)| defines two type parameters, \texttt{I} and \texttt{I.Element}. -While metadata for generic parameters is passed in directly, for dependent member types the metadata is recovered from one or more witness tables provided by the caller. A witness table for a conformance to \texttt{IteratorProtocol} stores two entries, one for each of the protocol's requirements: -\begin{itemize} -\item A metadata access function to witness the \texttt{Element} associated type. -\item A function pointer to witness the \texttt{next()} protocol requirement. -\end{itemize} +As with generic parameter types, dependent member types become \index{primary archetype type}primary archetypes in the body of a generic function. We can reveal a little more about the structure of primary archetypes now, and say that a primary archetype packages a type parameter together with a generic signature. 
While a type parameter is just a \emph{name} which can only be understood in relation to some generic signature, an archetype inherently ``knows'' what requirements it is subject to. \begin{MoreDetails} -\item Type parameters: Section~\ref{derived req} -\item Archetypes: Chapter~\ref{genericenv} +\item Type parameters: \SecRef{fundamental types} +\item Primary archetypes: \SecRef{archetypesubst} \end{MoreDetails} -\paragraph{Type witnesses} When a concrete type conforms to a protocol, the normal conformance stores a \index{type witness}\emph{type witness} for each of the protocol's associated types; this information is populated by the type checker during conformance checking. +\paragraph{Code generation.} +Inside the body of \verb|readTwo(_:)|, the \index{expression}\index{call expression}call expression \verb|iter.next()!| has the type \texttt{$\archetype{I.Element}$?}, which is force-unwrapped to yield the type $\archetype{I.Element}$. To manipulate a value of this type abstractly, we need its runtime type metadata. -\begin{listing}\captionabove{Iterator producing the natural numbers}\label{natural numbers listing} -\begin{Verbatim} -struct Nat: IteratorProtocol { - typealias Element = Int - var x = 0 - - mutating func next() -> Int? { - defer { x += 1 } - return x - } -} -\end{Verbatim} -\end{listing} -\index{underlying type} -\index{associated type inference} -Listing~\ref{natural numbers listing} shows a type that conforms to \texttt{IteratorProtocol} by producing an infinite stream of \index{natural numbers}incrementing integers. -Here, the associated type \texttt{Element} is witnessed by a type alias declaration with an underlying type of \texttt{Int}. This matches the return type of \texttt{NaturalNumbers.next()}. Indeed, we can omit the type alias entirely in this case, and instead rely on \emph{associated type inference} to derive it from the interface type of the witness. 
+We recover the runtime type metadata for an associated type from the witness table at run time, in the same way that dependent member type substitution projects a type witness from a conformance at compile time.
-Suppose we call \verb|firstElement(_:)| with a value of type \texttt{NaturalNumbers}:
+A witness table for a conformance to \texttt{IteratorProtocol} consists of a metadata access function to witness the \texttt{Element} associated type, and a function pointer to witness the \texttt{next()} method. The witness table for the conformance $\ConfReq{Nat}{IteratorProtocol}$ therefore references the runtime type metadata for \texttt{Int}, defined by the standard library.
+
+\paragraph{Same-type requirements.} To introduce another fundamental requirement kind, we compose \texttt{Pair} and \texttt{IteratorProtocol} in a new way, writing a function that takes two iterators and reads an element from each one:
\begin{Verbatim}
-var iter = Nat()
-print(firstElement(&iter))
+func readTwoParallel<I, J>(_ i: inout I, _ j: inout J) -> Pair<I.Element>
+    where I: IteratorProtocol, J: IteratorProtocol,
+    I.Element == J.Element {
+    return Pair(first: i.next()!, second: j.next()!)
+}
\end{Verbatim}
-The substitution map for the call stores the replacement type \texttt{NaturalNumbers} and the conformance of \texttt{NaturalNumbers} to \texttt{IteratorProtocol}.
We'll call this substitution map $S$:
-\[S := \SubstMapLongC{\SubstType{I}{Nat}}{\SubstConf{I}{Nat}{IteratorProtocol}}\]
-To compute the substituted type of the call, we apply $S$ to the interface type of our function, \Type{(I) -> I.[IteratorProtocol]Element}:
-\begin{itemize}
-\item Substitution of the function parameter type \texttt{I} is straightforward; the replacement type \texttt{Nat} is stored directly in the substitution map:
-\[\Type{I}\otimes S = \Type{Nat}\]
-\item Substitution of the return type \verb|I.[IteratorProtocol]Element| proceeds in a manner that is analogous to how the generated code for our function is able to recover the type metadata for an associated type from a witness table at run time. The normal conformance $\ConfReq{Nat}{IteratorProtocol}$ corresponds to the conformance requirement $\ConfReq{I}{IteratorProtocol}$, and we can get it out of the substitution map:
-\[\ConfReq{I}{IteratorProtocol}\otimes S = \ConfReq{Nat}{IteratorProtocol}\]
-In this conformance, the type witness for the \verb|Element| associated type is \verb|Int|:
-\[\pi_{\texttt{IteratorProtocol}}\Type{Element}\otimes \ConfReq{Nat}{IteratorProtocol} = \Type{Int}\]
-\end{itemize}
-The substituted function type for the call is therefore:
+The generic signature of our \verb|readTwoParallel(_:_:)| function states the \emph{same-type requirement} $\SameReq{I.Element}{J.Element}$:
\begin{quote}
\begin{verbatim}
-(inout Nat) -> Int
+<I, J where I : IteratorProtocol, J : IteratorProtocol,
+            I.Element == J.Element>
\end{verbatim}
\end{quote}
+This generic signature defines four type parameters, \texttt{I}, \texttt{J}, \texttt{I.Element}, and \texttt{J.Element}, where the final two abstract over the same concrete type, forming an \index{equivalence class}\emph{equivalence class}. It is often convenient to refer to an entire equivalence class by a representative type parameter. To make this choice in a deterministic fashion, we \emph{order} type parameters and say that the smallest type parameter inside an equivalence class is a \index{reduced type}\emph{reduced type}.
In our example, \texttt{I.Element} is the reduced type of \texttt{J.Element}.
+
+In the presence of same-type requirements, an archetype represents a reduced type parameter (and thus an entire equivalence class of type parameters). In the body of \verb|readTwoParallel(_:_:)|, the expressions \verb|i.next()!| and \verb|j.next()!| both receive the type $\archetype{I.Element}$, and the call to the \texttt{Pair} constructor is made with this substitution map:
+\[\SubstMap{\SubstType{T}{$\archetype{I.Element}$}}\]
\begin{MoreDetails}
-\item Type witnesses: Section~\ref{type witnesses}
-\item Dependent member type substitution: Section~\ref{abstract conformances} and Chapter~\ref{conformance paths}
+\item Reduced types: \SecRef{reduced types}
+\item The type parameter graph: \SecRef{type parameter graph}
\end{MoreDetails}
-\index{associated conformance}%
-\index{requirement signature}%
-\Index{protocol Self type@protocol \texttt{Self} type}%
-\paragraph{Associated conformances} Protocols can also impose requirements on their associated types. The \texttt{Sequence} protocol in the standard library is one such example:
+\paragraph{Checking generic arguments.}
+When type checking a call to \verb|readTwoParallel(_:_:)|, we must ensure the same-type requirement is satisfied.
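For a concrete call, we need iterator types to use as generic arguments. Here is one minimal sketch of two such types (hypothetical definitions; all the text below requires is that both witness \texttt{Element} with \texttt{String}):

```swift
// Hypothetical definitions of the two iterators used below. In both,
// the Element type witness is derived by associated type inference
// from the return type of next().
struct BabyNames: IteratorProtocol {
    var names = ["Ada", "Grace", "Alan"]
    mutating func next() -> String? {
        return names.isEmpty ? nil : names.removeFirst()
    }
}

struct CatNames: IteratorProtocol {
    var names = ["Mochi", "Biscuit"]
    mutating func next() -> String? {
        return names.isEmpty ? nil : names.removeFirst()
    }
}
```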
Suppose we define two new iterator types, \texttt{BabyNames} and \texttt{CatNames}, both witnessing the \texttt{Element} associated type with \texttt{String}, and we call \verb|readTwoParallel(_:_:)| with these types:
+\begin{Verbatim}
+var i = BabyNames()
+var j = CatNames()
+print(readTwoParallel(&i, &j))
+\end{Verbatim}
+We're making the call with this substitution map:
+\[
+\Sigma := \SubstMapLongC{\SubstType{I}{BabyNames}\\
+\SubstType{J}{CatNames}}{
+\SubstConf{I}{BabyNames}{IteratorProtocol}\\
+\SubstConf{J}{CatNames}{IteratorProtocol}}
+\]
+The type checker applies $\Sigma$ to both sides of the same-type requirement, which gives us $\texttt{I.Element}\otimes\Sigma=\texttt{String}$ and $\texttt{J.Element}\otimes\Sigma=\texttt{String}$. Both sides substitute to the same concrete type, so we get the following \index{substituted requirement}\emph{substituted requirement}:
+\[\SameReq{I.Element}{J.Element}\otimes\Sigma=\SameReq{String}{String}\]
+The substituted requirement is satisfied, so the code is well-typed, and we see that the call returns \texttt{Pair<String>}. Suppose instead we had substituted \texttt{I} with \texttt{Nat}:
+\begin{Verbatim}
+var i = Nat()
+var j = CatNames()
+print(readTwoParallel(&i, &j)) // error
+\end{Verbatim}
+In this case, we get the substituted requirement $\SameReq{Int}{String}$, which is unsatisfied, so the call is malformed and the type checker must \index{diagnostic!unsatisfied requirement}diagnose an error.
+
+\begin{MoreDetails}
+\item Checking generic arguments: \SecRef{checking generic arguments}
+\end{MoreDetails}
+
+\section{Associated Requirements}
+
+Protocols can impose \index{associated requirement}\emph{associated requirements} on their associated types. A conforming type must then satisfy these requirements. This capability is what gives Swift generics much of their distinctive flavor.
Perhaps the simplest example is the \texttt{Sequence} protocol in the standard library, which abstracts over types that can produce a fresh iterator on demand: \begin{Verbatim} protocol Sequence { associatedtype Element @@ -467,85 +457,102 @@ \section{Protocols} func makeIterator() -> Iterator } \end{Verbatim} -There are two requirements here: -\begin{enumerate} -\item The conformance requirement $\ConfReq{Iterator}{IteratorProtocol}$, which is written as a constraint type in the inheritance clause of the \texttt{Iterator} associated type. -\item The same-type requirement $\FormalReq|Element == Iterator.Element|$, written in a trailing \texttt{where} clause. -\end{enumerate} -Requirements on the generic parameters of a generic function or generic type are collected in the declaration's generic signature. A protocol analogously has a \emph{requirement signature} which collects the requirements imposed on its associated types. A protocol always declares a single generic parameter named \texttt{Self}, and our notation for a requirement signature looks like a generic signature over the protocol \texttt{Self} type: +This protocol states two associated requirements: +\begin{itemize} +\item The \index{associated conformance requirement}conformance requirement $\ConfReq{Self.Iterator}{IteratorProtocol}$, stated using the sugared form, as a constraint type in the inheritance clause of the \texttt{Iterator} associated type. +\item The \index{associated same-type requirement}same-type requirement $\SameReq{Self.Element}{Self.Iterator.Element}$, which we state in a trailing \texttt{where} clause attached to the associated type. +\end{itemize} +Associated requirements are like the requirements in a generic signature, except they are rooted in the \Index{protocol Self type@protocol \texttt{Self} type}protocol \texttt{Self} type. Once again, there are multiple equivalent syntactic forms for stating them. 
For example, we could write out the conformance requirement explicitly, and the above \texttt{where} clause could be attached to the protocol itself for the same semantic effect.
+
+The associated requirements of a protocol are recorded by the protocol's \index{requirement signature}\emph{requirement signature}. The \texttt{Sequence} protocol has the following requirement signature:
\begin{quote}
\begin{verbatim}
-
+<Self where Self.Element == Self.Iterator.Element,
+            Self.Iterator : IteratorProtocol>
\end{verbatim}
\end{quote}
-The conformance requirement $\ConfReq{Self.[Sequence]Iterator}{IteratorProtocol}$ is an \emph{associated conformance requirement}, and associated conformance requirements appear in protocol witness tables. Therefore a witness table for a conformance to \texttt{Sequence} has \emph{four} entries:
+Now consider this generic signature, and call it $G$:
+\begin{quote}
+\begin{verbatim}
+<T, U where T : Sequence, U : Sequence, T.Element == U.Element>
+\end{verbatim}
+\end{quote}
+We can informally describe $G$ by looking at its equivalence classes:
+\begin{itemize}
+\item \texttt{T}, which conforms to \texttt{Sequence}.
+\item \texttt{U}, which conforms to \texttt{Sequence}.
+\item \texttt{T.Element}, \texttt{U.Element}, \texttt{T.Iterator.Element}, and \texttt{U.Iterator.Element} are all in one equivalence class.
+\item \texttt{T.Iterator}, which conforms to \texttt{IteratorProtocol}.
+\item \texttt{U.Iterator}, which conforms to \texttt{IteratorProtocol}.
+\end{itemize}
+To make this kind of analysis precise, we will develop a theory of \index{derived requirement}\emph{derived requirements} to reason about requirements that are not explicitly stated, but are logical consequences of other requirements.
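To preview why derived requirements matter in practice, here is a sketch of a function over a signature like $G$ (the function name and the \texttt{Equatable} requirement are our own additions, the latter needed only to permit the comparison):

```swift
// A function over two Sequences with equal Element types. The
// comparison below type-checks only because the same-type requirement
// [T.Iterator.Element == U.Iterator.Element] is *derived*: it is never
// stated, but follows from T.Element == U.Element together with the
// associated requirements of Sequence.
func haveEqualFirstElements<T, U>(_ t: T, _ u: U) -> Bool
    where T: Sequence, U: Sequence,
          T.Element == U.Element, T.Element: Equatable {
    var ti = t.makeIterator()  // type: T.Iterator
    var ui = u.makeIterator()  // type: U.Iterator
    return ti.next() == ui.next()
}
```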
The following are some interesting derived requirements of $G$:
+\begin{gather*}
+\ConfReq{T.Iterator}{IteratorProtocol}\\
+\ConfReq{U.Iterator}{IteratorProtocol}\\
+\SameReq{T.Element}{T.Iterator.Element}\\
+\SameReq{U.Element}{U.Iterator.Element}\\
+\SameReq{T.Iterator.Element}{U.Iterator.Element}
+\end{gather*}
+
+At this point, it's worth clarifying that type parameters have a recursive structure; the base type of \texttt{U.Iterator.Element} is another dependent member type, \texttt{U.Iterator}. The theory of derived requirements will also describe the \emph{valid type parameters} of a generic signature.
+
+\paragraph{Conformances.}
+A normal conformance stores the \index{associated conformance}\emph{associated conformance} for each associated conformance requirement of its protocol. In the runtime representation, there is a corresponding entry in the witness table for each associated conformance requirement. A witness table for \texttt{Sequence} has four entries:
\begin{enumerate}
\item A metadata access function to witness the \texttt{Element} associated type.
\item A metadata access function to witness the \texttt{Iterator} associated type.
-\item A witness table access function to witness the associated conformance requirement \verb|Iterator: IteratorProtocol|.
-\item A function pointer to the witness the \texttt{makeIterator()} protocol requirement.
+\item A \index{witness table}\emph{witness table access function} to witness the associated conformance requirement $\ConfReq{Self.Iterator}{IteratorProtocol}$.
+\item A function pointer to witness the \texttt{makeIterator()} protocol method.
\end{enumerate}
+Protocol inheritance is the special case of an associated conformance requirement with a subject type of \texttt{Self}. The standard library \texttt{Collection} protocol inherits from \texttt{Sequence}, so the associated conformance requirement $\ConfReq{Self}{Sequence}$ appears in the requirement signature of \texttt{Collection}.
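At the language level, this associated conformance requirement is what lets a value constrained to \texttt{Collection} be passed where a \texttt{Sequence} is required. A small sketch (the function names here are ours, invented for illustration):

```swift
// Requires only a Sequence witness table for S.
func firstOrNil<S: Sequence>(_ s: S) -> S.Element? {
    var iter = s.makeIterator()
    return iter.next()
}

// C is only known to conform to Collection. The call below is valid
// because Collection's requirement signature contains [Self: Sequence];
// the generated code projects the Sequence witness table out of the
// Collection witness table it was handed.
func firstOfCollection<C: Collection>(_ c: C) -> C.Element? {
    return firstOrNil(c)
}
```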
Starting from a conformance $\ConfReq{Array}{Collection}$, we can get at the conformance to \texttt{Sequence} via \emph{associated conformance projection}:
+\[\AssocConf{Self}{Sequence}\otimes\ConfReq{Array}{Collection}=\ConfReq{Array}{Sequence}\]
+Similarly, at run time, starting from a witness table for a conformance to \texttt{Collection}, we can recover a witness table for a conformance to \texttt{Sequence}, and thus all of the metadata described above. We will take a closer look at the other associated requirements of the \texttt{Collection} protocol in due time. This will lead us into the topic of \emph{recursive} associated conformance requirements, which, as we show, enable the type substitution algebra to encode any computable function.
+
+\begin{MoreDetails}
+\item Requirement signatures: \SecRef{requirement sig}
+\item Derived requirements: \SecRef{derived req}
+\item Associated conformances: \SecRef{associated conformances}
+\item Recursive conformances: \SecRef{recursive conformances}
+\end{MoreDetails}
+
+\section{Related Work}

-\index{abstract conformance}
-\paragraph{Abstract conformances}
-Let's define a \verb|firstElementSeq(_:)| function which operates on a sequence.\footnote{We could give both functions the same name and take advantage of function overloading, but for clarity we're not going to do that.} We can call the \verb|makeIterator()| protocol requirement to create an iterator for our sequence, and then hand off this iterator to the \verb|firstElement(_:)| function we defined previously:
+A calling convention centered on a runtime representation of types was explored in a 1996 paper \cite{intensional}. Swift protocols are similar in spirit to \index{Haskell}Haskell type classes, described in \cite{typeclass}, and subsequently \cite{typeclasshaskell} and \cite{peytonjones1997type}. Swift witness tables follow the ``dictionary passing'' implementation strategy for type classes.
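The ``dictionary passing'' strategy can be imitated by hand in Swift, which makes the analogy concrete. In this hypothetical sketch (all names are ours), the dictionary that the compiler would pass implicitly for an \texttt{Equatable} constraint becomes an ordinary argument:

```swift
// A hand-written "dictionary" for an Equatable-like type class.
struct EqDict<T> {
    let eq: (T, T) -> Bool
}

// Roughly what the compiler does implicitly for
// `func contains<T: Equatable>(_:_:)`: the dictionary is threaded
// through the call as an explicit parameter.
func contains<T>(_ xs: [T], _ x: T, _ dict: EqDict<T>) -> Bool {
    for y in xs where dict.eq(x, y) {
        return true
    }
    return false
}

let intEq = EqDict<Int>(eq: { $0 == $1 })
```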
Two papers subsequently introduced type classes with associated types, but without associated requirements. Using our Swift lingo, the first paper defined a formal model for associated types witnessed by \index{nested type declaration}nested nominal types \cite{assoctype}. In this model, every \index{type witness}type witness is a distinct concrete type. To translate their example into Swift: \begin{Verbatim} -func firstElementSeq(_ sequence: S) -> S.Element { - var iter = sequence.makeIterator() - return firstElement(&iter) +protocol ArrayElem { + associatedtype Array + func index(_: Array, _: Int) -> Self } \end{Verbatim} -The substitution map for the call to \verb|firstElement(_:)| is interesting. The argument \texttt{iter} has the type $\archetype{S.Element}$, which becomes the replacement type for the generic parameter $\Type{I}$ of \verb|firstElement(_:)|. Recall that this substitution map also needs to store a conformance. Since the conforming type is an archetype and not a concrete type, global conformance lookup returns an \emph{abstract conformance}. So our substitution map looks like this: -\[\SubstMapLongC{\SubstType{I}{$\archetype{S.Iterator}$}}{\SubstConf{I}{$\archetype{S.Iterator}$}{IteratorProtocol}}\] +The second paper described \emph{associated type synonyms} \cite{synonyms}. This is the special case of an associated type witnessed by a type alias in Swift. Again, we translate their example: +\begin{Verbatim} +protocol Collects { + associatedtype Elem + static var empty: Self { get } + func insert(_: Elem) -> Self + func toList() -> [Elem] +} +\end{Verbatim} +Other relevant papers from the Haskell world include \cite{schrijvers2008type} and \cite{kiselyov2009fun}. -When generating code for the call, we need to emit runtime type metadata for \texttt{I} as well as a witness table for $\ConfReq{I}{IteratorProtocol}$. 
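Returning to the \texttt{Collects} example: in Swift, a conforming type witnesses \texttt{Elem} with a type alias, which is exactly the associated type synonym pattern. The conforming type below is hypothetical, invented for illustration:

```swift
// The Collects protocol from the example above, repeated here so the
// example is self-contained.
protocol Collects {
    associatedtype Elem
    static var empty: Self { get }
    func insert(_: Elem) -> Self
    func toList() -> [Elem]
}

// A hypothetical conforming type; `Elem` is witnessed by an explicit
// type alias (associated type inference could also derive it).
struct IntBag: Collects {
    typealias Elem = Int
    var storage: [Int] = []
    static var empty: IntBag { IntBag() }
    func insert(_ x: Int) -> IntBag {
        var copy = self
        copy.storage.append(x)
        return copy
    }
    func toList() -> [Int] { storage }
}
```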
Both of these are recovered from the witness table for the conformance $\ConfReq{S}{Sequence}$ that was passed by the caller of \verb|firstElementSeq(_:)|:
-\begin{enumerate}
-\item The replacement type for \texttt{I} is $\archetype{S.Iterator}$. Runtime type metadata for this type is recovered by calling the metadata access function for the \texttt{Iterator} associated type stored in the $\ConfReq{S}{Sequence}$ witness table.
-\item The conformance for $\ConfReq{I}{IteratorProtocol}$ is an abstract conformance. We know $\archetype{S.Iterator}$ conforms to \verb|IteratorProtocol| because of a requirement in the \texttt{Sequence} protocol. The witness table for this conformance can be recovered by calling the witness table access function for the $\ConfReq{Iterator}{IteratorProtocol}$ associated conformance that is stored in our $\ConfReq{S}{Sequence}$ witness table.
-\end{enumerate}
+\paragraph{C++.} While \index{C++}C++ templates are synonymous with ``generic programming'' to many programmers, C++ is somewhat unusual compared to most languages with parametric polymorphism, because templates are fundamentally syntactic in nature. Compiling a template declaration only does some minimal amount of semantic analysis, with most type checking deferred until \emph{after} template expansion. There is no formal notion of requirements on template parameters, so whether template expansion succeeds or fails at a given expansion point depends on how the template's body uses the substituted template parameters.

-Recall that the shape of the substitution map is determined by the generic signature of the callee. In our earlier examples, the replacement types and conformances were fully concrete, which allowed us to emit runtime type metadata and witness tables for a call by referencing global symbols.
+The inherent flexibility of C++ templates enables some advanced metaprogramming techniques \cite{gregor}.
On the other hand, a template declaration's body must be visible from each expansion point, so this model is fundamentally at odds with separate compilation. Undesirable consequences include libraries where large parts must be implemented in header files, and cryptic error messages on template expansion failure. -More generally, the replacement types and conformances are defined in terms of the type parameters of the caller's generic signature. This makes sense, because we start with the runtime type metadata and witness tables received by the caller, from which we recover the runtime metadata and witness tables required by the callee. Here, the caller is \verb|firstElementSeq(_:)| and the callee is \verb|firstElement(_:)|. +Swift's generics model was in many ways inspired by ``C++0x concepts,'' a proposal to extend the C++ template metaprogramming model with \emph{concepts}, a form of type classes with associated types (\cite{concepts}, \cite{essential}). Concepts could declare their own associated requirements, but the full ramifications were perhaps not yet apparent to the authors when they wrote this remark: +\begin{quote} +\textsl{``Concepts often include requirements on associated types. For example, a container's associated iterator \texttt{A} would be required to model the \texttt{Iterator} concept. This form of concept composition is slightly different from refinement but close enough that we do not wish to clutter our presentation [\ldots]''} +\end{quote} -\section{Language Comparison} +\paragraph{Rust.} +\index{Rust}Rust generics are separately type checked, but Rust does not define a calling convention for unspecialized generic code, so there is no separate compilation. Instead, the implementation of a generic function is \emph{specialized}, or \emph{monomorphized}, for every unique set of generic arguments. 
-\cite{intensional} -\cite{typeclass} -\cite{typeclasshaskell} -\cite{assoctype} -\cite{synonyms} -\cite{Milewski_2015} -\cite{peytonjones1997type} -\cite{concepts} -\cite{essential} -\cite{schrijvers2008type} -\cite{kiselyov2009fun} +Rust's \emph{traits} are similar to Swift's protocols; traits can declare associated types and associated conformance requirements. Rust generics also allow some kinds of abstraction not supported by Swift, such as lifetime variables, generic associated types \cite{rust_gat}, and const generics \cite{rust_const}. On the flip side, Rust does not allow same-type requirements to be stated in full generality~\cite{rust_same}. Instead, trait bounds can constrain associated types with a syntactic form resembling Swift's \index{parameterized protocol type}parameterized protocol types (\SecRef{protocols}), but we will show in \ExRef{proto assoc rule} that Swift's same-type requirements are more general. -\index{C++} -\index{Java} -\index{Rust} -Swift generics occupy a unique point in the design space, which avoids some of the tradeoffs inherent in the design of other popular languages: -\begin{itemize} -\item C++ templates do not allow for separate compilation and type checking. When a template declaration is compiled, only minimal semantic checks are performed and no code is actually generated. The body of a template declaration must be visible at each expansion point, and full semantic checks are performed after template expansion. There is no formal notion of requirements on template parameters; at a given expansion point, template expansion either succeeds or fails depending on how the substituted template parameters are used in the body of the template. -\item Rust generics are separately type checked with the use of generic requirements. Unlike C++, specialization is not part of the semantic model of the language, but it is mandated by the implementation because Rust does not define a calling convention for unspecialized generic code. 
After type checking, the compiler completely specializes all usages of generic definitions for every set of provided generic arguments. -\item Java generics are separately type checked and compiled. Only reference types can be used as generic arguments; primitive value types must be boxed on the heap. The implementation strategy uses a uniform runtime layout for all generic types, and generic argument types are not reified at runtime. This avoids the complexity of generic type layout at the virtual machine level, but it comes at the cost of runtime type checks and heap allocation. -\end{itemize} -We can summarize this with a table. -\begin{quote} -\begin{tabular}{|l|>{\centering}p{1.3cm}|>{\centering}p{1.3cm}|>{\centering}p{1.3cm}|>{\centering\arraybackslash}p{1.3cm}|} -\hline -&C++&Rust&Java&Swift\\ -\hline -Separate compilation&$\times$&$\times$&\checkmark&\checkmark\\ -Specialization&\checkmark&$\checkmark$&$\times$&\checkmark\\ -Generic requirements&$\times$&$\checkmark$&$\checkmark$&\checkmark\\ -Unboxed values&\checkmark&\checkmark&$\times$&\checkmark\\ -\hline -\end{tabular} -\end{quote} -This should not to be interpreted as a slight on these languages; each one is interesting in its own way. The inherent flexibility of C++ templates allows for a certain style of metaprogramming which can be difficult to express in other languages; if you want to learn more, a comprehensive reference book about C++ templates was co-authored by one of the primary developers of the Swift compiler \cite{gregor}. Rust generics are the closest to Swift's at the language level, but the core algorithms are rather different than what you will see in Part~\ref{part rqm} of this book; Rust uses a Prolog-like solver which is described in~\cite{rust_chalk}. 
Finally, Java generics make heavy use of existential types and are somewhat unique in that they support \emph{variance}, or in other words, explicitly-defined subtyping relationships between different instantiations of the same generic type \cite{java_faq}.
+Rust's ``\texttt{where} clause elaboration'' is more limited than Swift's derived requirements formalism, and associated requirements sometimes need to be re-stated in the \texttt{where} clause of a generic declaration \cite{rust_bug}. An early attempt at formalizing Rust traits appears in a PhD dissertation from 2015 \cite{Milewski_2015}. A more recent effort is ``Chalk,'' an implementation of a \index{Prolog}Prolog-like solver based on Horn clauses~\cite{rust_chalk}.
+
+\paragraph{Java.}
+\index{Java}Java generics are separately type checked and compiled. In Java, only reference types can be used as generic arguments; primitive value types must be boxed first. Generic argument types are not reified at runtime, and values of generic parameter type are always represented as object pointers. This avoids the complexity of dependent layout, but comes at the cost of more runtime type checks and heap allocation. Java generics are based on existential types and also support \emph{variance}, a sort of user-defined subtyping relation between different instantiations of the same generic type \cite{java_faq}.

-\end{document}
\ No newline at end of file
+\end{document}
diff --git a/docs/Generics/chapters/math-summary.tex b/docs/Generics/chapters/math-summary.tex
new file mode 100644
index 0000000000000..2eb5b0db1d1cf
--- /dev/null
+++ b/docs/Generics/chapters/math-summary.tex
@@ -0,0 +1,61 @@
+\documentclass[../generics]{subfiles}
+
+\begin{document}
+
+\chapter{Mathematics Summary}\label{math summary}
+
+The Greek alphabet is sometimes used for variable names and other notation.
This book only needs a handful of Greek letters: lowercase $\alpha$~(alpha), $\beta$~(beta), $\gamma$~(gamma), $\varepsilon$~(epsilon), $\pi$~(pi), $\sigma$~(sigma), $\uptau$~(tau), and $\varphi$~(phi); and uppercase $\Sigma$ (sigma).
+
+The ``='' operator means that two existing things are equal. The ``:='' operator defines the new thing on the left in terms of the existing thing on the right.
+
+\IndexDefinition{set}
+\IndexDefinition{natural numbers}
+\IndexDefinition{integers}
+\IndexDefinition{empty set}
+A \emph{set} is a collection of elements without regard to order or duplicates. Sets can be finite or infinite. A finite set can be specified by listing its elements in any order, for example $\{a,\,b,\,c\}$. The empty set \index{$\varnothing$}\index{$\varnothing$!z@\igobble|seealso{empty set}}$\varnothing$ is the unique set with no elements. The set of \emph{natural numbers} \index{$\mathbb{N}$}\index{$\mathbb{N}$!z@\igobble|seealso{natural numbers}}$\mathbb{N}:=\{0,\,1,\,2,\,\ldots\}$ is the infinite set with zero and its \index{successor}successors. The set of \emph{integers} \index{$\mathbb{Z}$}\index{$\mathbb{Z}$!z@\igobble|seealso{integers}}$\mathbb{Z}$ also contains the negative whole numbers.
+
+The notation \index{$\in$}\index{$\in$!z@\igobble|seealso{set}}$x\in S$ means ``$x$ is an element of a set $S$,'' and \index{$\notin$}$x\notin S$ is its negation, which is ``$x$ is \emph{not} an element of $S$.'' Properties of sets can be stated using \emph{existential} quantification (``there exists (at least one) $x\in S$ such that $x$ has this property\ldots'') or \emph{universal} quantification (``for all $x\in S$, the following property is true of $x$\ldots'').
+
+\IndexDefinition{subset}%
+\IndexDefinition{proper subset}%
+A set $X$ is a \emph{subset} of another set $Y$, written as \index{$\subseteq$}\index{$\subseteq$!z@\igobble|seealso{subset}}$X\subseteq Y$, if for all $x\in X$, it is also true that $x\in Y$.
If $X\subseteq Y$ and $Y\subseteq X$, then $X=Y$. If $X\subseteq Y$ and $X\neq Y$, then $X$ is a \emph{proper} subset of $Y$, written as \index{$\subsetneq$}\index{$\subsetneq$!z@\igobble|seealso{proper subset}}$X\subsetneq Y$. The \IndexDefinition{union}\emph{union} \index{$\cup$}\index{$\cup$!z@\igobble|seealso{union}}$X\cup Y$ is the set of all elements belonging to either $X$ or $Y$. The \IndexDefinition{intersection}\emph{intersection} \index{$\cap$}\index{$\cap$!z@\igobble|seealso{intersection}}$X\cap Y$ is the set of all elements belonging to both $X$ and $Y$. The \IndexDefinition{set difference}\emph{difference} \index{$\setminus$}\index{$\setminus$!z@\igobble|seealso{set difference}}$X\setminus Y$ is the set of all elements of $X$ not also in $Y$. + +\IndexDefinition{Cartesian product} +\IndexDefinition{ordered pair} +\IndexDefinition{ordered tuple} +The \emph{Cartesian product} of two sets $X$ and $Y$, denoted \index{$\times$}\index{$\times$!z@\igobble|seealso{Cartesian product}}$X\times Y$, is the set of all \emph{ordered pairs} $(x,y)$ where $x\in X$, $y\in Y$. Note that the ordered pair $(x,y)$ is not the same as the set $\{x,y\}$, because $(x,y)\neq(y,x)$. The Cartesian product construction generalizes to any finite number of sets, to give \emph{ordered tuples} or \emph{sequences}. + +\IndexDefinition{binary operation} +\IndexDefinition{homomorphism} +\index{mapping|see{function}} +\IndexDefinition{function} +A \emph{function} (or \emph{mapping}) $f\colon X\rightarrow Y$ assigns to each $x\in X$ a unique element $f(x)\in Y$. If the sets $X$ and $Y$ are equipped with some kind of additional structure (which will be explicitly defined), then $f$ is a \emph{homomorphism} if it preserves this structure. + +A function $f\colon X\times Y\rightarrow Z$ defined on the Cartesian product can be thought of as taking a pair of values $x\in X$, $y\in Y$ to an element $f(x,y)\in Z$. 
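For example, addition on the natural numbers is a function $+\colon\mathbb{N}\times\mathbb{N}\rightarrow\mathbb{N}$, taking the ordered pair $(2,\,3)$ to the element $2+3=5$.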
A \emph{binary operation} is a function named by a symbol like $\otimes$, $\star$, or $\cdot$ defined on the Cartesian product of two sets. The application of a binary operation is denoted by writing the symbol in between the two elements, like $x\otimes y$.
+
+The \emph{cardinality} of a finite set $S$, denoted $|S|$, is the number of elements in $S$. The notation $|x|$ sometimes denotes certain other functions taking values in $\mathbb{N}$, such as the length of a sequence; this will always be explicitly defined when needed.
+
+The more formal sections in this book are written in ``Euclidean style'', where the text is further subdivided into numbered statements of the following kinds:
+\begin{itemize}
+\item A \textbf{Definition} introduces terminology or notation.
+\item An \textbf{Example} shows how this terminology and notation arises in practice.
+\item A \textbf{Proposition} states a logical consequence of one or more definitions.
+\item A \textbf{Lemma} is an intermediate proposition in service of proving another theorem.
+\item A \textbf{Theorem} is a ``deeper'' proposition which is more profound in some sense.
+\item A \textbf{Corollary} is an immediate consequence of an earlier theorem.
+\item An \textbf{Algorithm} is a step-by-step description of a computable function, defined in terms of inputs and outputs.
+\end{itemize}
+
+The mathematical concepts we introduce along the way:
+\begin{itemize}
+\item Formal systems (\SecRef{derived req}).
+\item Equivalence relations (\SecRef{type params}, \SecRef{rewrite graph}).
+\item Partial and linear orders (\SecRef{reduced types}, \SecRef{rewritesystemintro}, \SecRef{reduction order}).
+\item Proof by induction (\SecRef{generic signature validity}).
+\item Directed graphs (\SecRef{type parameter graph}, \SecRef{finding conformance paths}, \SecRef{recursive conformances}, \SecRef{protocol component}).
+\item Computability theory (\SecRef{tag systems}, \SecRef{word problem}).
+\item Finitely-presented monoids and string rewriting (\ChapRef{monoids}, \ChapRef{completion}). +\item Category theory is briefly mentioned (\SecRef{submapcomposition}). +\end{itemize} + +\end{document} diff --git a/docs/Generics/chapters/monoids.tex b/docs/Generics/chapters/monoids.tex index 041cc7771c6e4..af2ca1f7c1012 100644 --- a/docs/Generics/chapters/monoids.tex +++ b/docs/Generics/chapters/monoids.tex @@ -69,7 +69,7 @@ \chapter{Monoids}\label{monoids} \begin{proof} \index{induction} \index{proof by induction|see{induction}} -We show this to be true using induction, a fundamental property of the arithmetic of $\mathbb{N}$ we encountered in Section~\ref{generic signature validity}. First, we prove the base case, that $x^m\cdot x^0=x^{m+0}$. Then, we assume the inductive hypothesis, $x^m\cdot x^{n-1}=x^{m+n-1}$, and show that $x^m\cdot x^n=x^{m+n}$. By induction over $\mathbb{N}$, this proves the result for all $n\in\mathbb{N}$. +We show this to be true using induction, a fundamental property of the arithmetic of $\mathbb{N}$ we encountered in \SecRef{generic signature validity}. First, we prove the base case, that $x^m\cdot x^0=x^{m+0}$. Then, we assume the inductive hypothesis, $x^m\cdot x^{n-1}=x^{m+n-1}$, and show that $x^m\cdot x^n=x^{m+n}$. By induction over $\mathbb{N}$, this proves the result for all $n\in\mathbb{N}$. The base case follows from the definition of $x^0$, the identity element axiom of $M$, and the fact that $m=m+0$ in $\mathbb{N}$: \[x^m\cdot x^0=x^m\cdot\varepsilon=x^m=x^{m+0}\] @@ -122,7 +122,7 @@ \chapter{Monoids}\label{monoids} \end{example} \begin{example} \index{commutative operation}% -The free monoid over two elements $\{a,b\}^*$ consists of all finite strings made up of $a$ and $b$. Two typical elements of $\{a,b\}^*$ are $abba$ and $ba$, and $abba\cdot ba=abbaba$. Note that $abba\cdot ba\neq ba\cdot abba$, so unlike $\{a\}^*$, the free monoid over two elements is no longer commutative: in general, $x\cdot y \neq y \cdot x$. 
However, Proposition~\ref{monoid exp law} still holds; for example, if we let $x:=abb$, a quick computation shows that $x^2\cdot x^3=x^3\cdot x^2=x^5$. +The free monoid over two elements $\{a,b\}^*$ consists of all finite strings made up of $a$ and $b$. Two typical elements of $\{a,b\}^*$ are $abba$ and $ba$, and $abba\cdot ba=abbaba$. Note that $abba\cdot ba\neq ba\cdot abba$, so unlike $\{a\}^*$, the free monoid over two elements is no longer commutative: in general, $x\cdot y \neq y \cdot x$. However, \PropRef{monoid exp law} still holds; for example, if we let $x:=abb$, a quick computation shows that $x^2\cdot x^3=x^3\cdot x^2=x^5$. \end{example} \section{Finitely-Presented Monoids}\label{finitely presented monoids} @@ -136,7 +136,7 @@ \section{Finitely-Presented Monoids}\label{finitely presented monoids} A few words on notation: \begin{itemize} \item If we have two finite sets $A := \{a_1,\ldots, a_m\}$ and $R := \{(u_1,v_1),\,\ldots,\,(u_n,v_n)\}$ where $u_i$, $v_i\in A^*$, we can also write the monoid as $\AR$. -\item The relations here are not to be confused with the concept of a relation over a set from Section~\ref{typeparams}. Indeed, the full set of relations $R$, being a finite subset of $A^*\times A^*$, can be thought of as a relation over $A^*$, but we won't use this interpretation here and leave the two concepts wholly separate. +\item The relations here are not to be confused with the concept of a relation over a set from \SecRef{type params}. Indeed, the full set of relations $R$, being a finite subset of $A^*\times A^*$, can be thought of as a relation over $A^*$, but we won't use this interpretation here and leave the two concepts wholly separate. \item We only use $x=y$ to mean that $x$ and $y$ are equal in $A^*$; that is, they have the same exact spelling as terms. To denote equivalence with respect to the set of relations $R$, we always write $x\sim y$. 
\end{itemize} @@ -200,15 +200,15 @@ \section{Finitely-Presented Monoids}\label{finitely presented monoids} In fact, no ``smaller'' presentation exists. It is perhaps surprising to contrast this with $(\mathbb{N},\,+,\,0)$, which is just the free monoid with one generator. \end{example} -\section{The Rewrite Graph}\label{rewrite graph} -Having informally sketched out finitely-presented monoids, we now wish to make precise what is meant when we say that a relation like $ab\sim\varepsilon$ allows us to transform $baaba$ into $baa$ and then $baa$ into $babaa$. We follow \cite{SQUIER1994271} in associating a directed graph with the description of a finitely-presented monoid $\AR$. While the aforesaid paper calls this graph the \index{derivation graph|see{rewrite graph}}derivation graph, we prefer to call it the \emph{rewrite graph}, to avoid confusion with derivations in the sense of Section~\ref{derived req}. +\section{The Term Equivalence Relation}\label{rewrite graph} +Having informally sketched out finitely-presented monoids, we now wish to make precise what is meant when we say that a relation like $ab\sim\varepsilon$ allows us to transform $baaba$ into $baa$ and then $baa$ into $babaa$. We follow \cite{SQUIER1994271} in associating a directed graph with the description of a finitely-presented monoid $\AR$. While the aforesaid paper calls this graph the \index{derivation graph|see{rewrite graph}}derivation graph, we prefer to call it the \emph{rewrite graph}, to avoid confusion with derivations in the sense of \SecRef{derived req}. -The \IndexDefinition{rewrite graph}rewrite graph explicitly encodes transformations of terms as paths in this graph. The vertices are the terms of the free monoid $A^*$, and the edge set is defined in such a way that two vertices $x$ and $y$ are joined by a path if $x\sim y$ under the set of relations $R$.
This provides a more constructive definition than the classical approach of starting from the intersection of all equivalence relations that contain $R$. The rewrite graph also forms the theoretical basis of the rewrite system minimization algorithm in Chapter~\ref{rqm minimization}. We begin by defining the structure of the edge set. +The \IndexDefinition{rewrite graph}rewrite graph explicitly encodes transformations of terms as paths in this graph. The vertices are the terms of the free monoid $A^*$, and the edge set is defined in such a way that two vertices $x$ and $y$ are joined by a path if $x\sim y$ under the set of relations $R$. This provides a more constructive definition than the classical approach of starting from the intersection of all equivalence relations that contain $R$. The rewrite graph also forms the theoretical basis of the rewrite system minimization algorithm in \ChapRef{rqm minimization}. We begin by defining the structure of the edge set. \begin{definition} Given an alphabet $A$ and a set of relations $R$, a \IndexDefinition{rewrite step}\emph{rewrite step} is an \index{ordered tuple}ordered 4-tuple $(x,\,u,\,v,\,y)$ of terms such that $x$, $y\in A^*$ and either $(u,\,v)$ or $(v,\,u)\in R$. We write this rewrite step as \index{$\Rightarrow$}\index{$\Rightarrow$!z@\igobble|seealso{rewrite path, rewrite step}}$x(u\Rightarrow v)y$. The terms $x$ and $y$ are called the left and right \emph{whiskers}, respectively. If $x=\varepsilon$, we write the rewrite step as $(u\Rightarrow v)y$, and if $y=\varepsilon$, we similarly write it as $x(u\Rightarrow v)$. Of course, it may be the case that $x=y=\varepsilon$, in which case the rewrite step is simply denoted $(u\Rightarrow v)$.
\end{definition} -Intuitively, the rewrite step $x(u\Rightarrow v)y$ represents the transformation of $xuy$ into $xvy$ via the relation $u\sim v$; it leaves the whiskers $x$ and $y$ unchanged, while rewriting $u$ to $v$ ``in the middle.'' We can draw a picture: +The rewrite step $x(u\Rightarrow v)y$ represents the transformation of $xuy$ into $xvy$ via the relation $u\sim v$; it leaves the whiskers $x$ and $y$ unchanged, while rewriting $u$ to $v$ ``in the middle.'' We can draw a picture: \[ \begin{array}{|c|c|c|} \hline @@ -317,7 +317,7 @@ \section{The Rewrite Graph}\label{rewrite graph} \item If the rewrite path contains at least one rewrite step, the source of the first rewrite step must equal the initial term: $\Src(s_1)=t$. \item If the rewrite path contains at least two steps, the source of each subsequent step must equal the destination of the preceding step: $\Src(s_{i+1})=\Dst(s_i)$ for $0<i<n$. \[b>ab>aab>aaab>aaaab>\cdots\] -Algorithm~\ref{term reduction algo} does not directly involve a reduction order, but if the initial rewrite rules are oriented, it always outputs a special kind of rewrite path. +\AlgRef{term reduction algo} does not directly involve a reduction order, but if the initial rewrite rules are oriented, it always outputs a special kind of rewrite path. \begin{definition} A \IndexDefinition{positive rewrite step}rewrite step $s := x(u\Rightarrow v)y$ is \emph{positive} if $v<u$. \end{definition} \begin{Verbatim} func f() -> some P {...} \end{Verbatim} -Opaque return types were first introduced in Swift 5.1 \cite{se0244}. The feature was generalized to allow occurrences of \texttt{some} structurally nested in other types, as well as multiple occurrences of \texttt{some}, in Swift 5.7 \cite{se0328}. +Opaque return types were first introduced in \IndexSwift{5.1}Swift 5.1 \cite{se0244}. The feature was generalized to allow occurrences of \texttt{some} structurally nested in other types, as well as multiple occurrences of \texttt{some}, in \IndexSwift{5.7}Swift 5.7 \cite{se0328}.
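To make the opaque-return-type feature concrete, here is a minimal hedged sketch (the protocol, struct, and function names are hypothetical, not taken from the text): the caller of \texttt{makeShape()} sees only \texttt{some Shape}, while the underlying type is inferred from the \texttt{return} statement in the body.

```swift
// Hedged sketch of an opaque return type (hypothetical names).
protocol Shape {
    func area() -> Double
}

struct Square: Shape {
    var side: Double
    func area() -> Double { side * side }
}

// Callers only see 'some Shape'; the underlying type Square is
// inferred from the single 'return' statement and stays hidden.
func makeShape() -> some Shape {
    return Square(side: 2.0)
}

print(makeShape().area())
```

Because the underlying type is abstracted, the author of \texttt{makeShape()} can later return a different conforming type without breaking callers.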
-The type that follows \texttt{some} is a constraint type, as defined in Section~\ref{constraints}. The underlying type is inferred from \texttt{return} statements in the function body. There must be at least one return statement; if there is more than one, all must return the same concrete type. +The type that follows \texttt{some} is a constraint type, as defined in \SecRef{requirements}. The underlying type is inferred from \texttt{return} statements in the function body. There must be at least one return statement; if there is more than one, all must return the same concrete type. At the implementation level, a declaration has an associated \emph{opaque return type declaration} if \texttt{some} appears at least once in the declaration's return type. An opaque return type declaration stores three pieces of information: \begin{enumerate} @@ -100,7 +100,7 @@ \chapter{Opaque Return Types}\label{opaqueresult} \fi -\section{Opaque Archetypes}\label{opaquearchetype} +\section[]{Opaque Archetypes}\label{opaquearchetype} \ifWIP @@ -173,7 +173,7 @@ \section{Opaque Archetypes}\label{opaquearchetype} \SubstConf{\ttgp{0}{0}}{String}{Equatable} } \] -Per Algorithm~\ref{opaquearchetypesubst}, two new opaque generic environments are constructed from the opaque return type declaration of \texttt{underlyingType()} with each of the above two substitution maps. The substituted opaque archetypes are constructed by mapping the interface type \texttt{\ttgp{1}{0}} into each of the two opaque generic environments. +Per \AlgRef{opaquearchetypesubst}, two new opaque generic environments are constructed from the opaque return type declaration of \texttt{underlyingType()} with each of the above two substitution maps. The substituted opaque archetypes are constructed by mapping the interface type \texttt{\ttgp{1}{0}} into each of the two opaque generic environments. 
Indeed, even though the generic parameter \texttt{T} and the value \texttt{t} are completely unused in the body of the \texttt{underlyingType()} function, each call of \texttt{underlyingType()} with a different specialization produces a different type. This can be observed by noting the behavior of the \texttt{Equatable} protocol's \texttt{==} operator; it expects both operands to have the same type: \begin{Verbatim} @@ -231,7 +231,7 @@ \section{Opaque Archetypes}\label{opaquearchetype} \fi -\section{Referencing Opaque Archetypes}\label{reference opaque archetype} +\section[]{Referencing Opaque Archetypes}\label{reference opaque archetype} \ifWIP @@ -241,7 +241,7 @@ \section{Referencing Opaque Archetypes}\label{reference opaque archetype} \index{associated type inference} Opaque return types are different from other type declarations in that the \texttt{some P} syntax serves to both declare an opaque return type, and immediately reference the declared type. It is however possible to reference an opaque return type of an existing declaration from a different context. The trick is to use associated type inference to synthesize a type alias whose underlying type is the opaque return type, and then reference this type alias. This can be useful when writing tests to exercise opaque return types showing up in compiler code paths that might not expect them. -\begin{example} The normal conformance \texttt{ConcreteP:\ P} in Listing~\ref{reference opaque return type} shows how an opaque archetype can witness an associated type requirement. The method \texttt{ConcreteP.f()} witnesses the protocol requirement \texttt{P.f()}. The return type of \texttt{ConcreteP.f()} is a tuple type of two opaque archetypes, and the type witnesses for the \texttt{X} and \texttt{Y} associated types are inferred to be the first and second of these opaque archetypes, respectively.
Associated type inference synthesizes two type aliases, \texttt{ConcreteP.X} and \texttt{ConcreteP.Y}, which are then referenced further down in the program: +\begin{example} The normal conformance \texttt{ConcreteP:\ P} in \ListingRef{reference opaque return type} shows how an opaque archetype can witness an associated type requirement. The method \texttt{ConcreteP.f()} witnesses the protocol requirement \texttt{P.f()}. The return type of \texttt{ConcreteP.f()} is a tuple type of two opaque archetypes, and the type witnesses for the \texttt{X} and \texttt{Y} associated types are inferred to be the first and second of these opaque archetypes, respectively. Associated type inference synthesizes two type aliases, \texttt{ConcreteP.X} and \texttt{ConcreteP.Y}, which are then referenced further down in the program: \begin{enumerate} \item The global variable \texttt{mince} has an explicit type \texttt{(ConcreteP.X,~ConcreteP.Y)}. \item The function \texttt{pie()} declares a same-type requirement whose right hand side is the type alias \texttt{ConcreteP.X}. @@ -286,7 +286,7 @@ \section{Referencing Opaque Archetypes}\label{reference opaque archetype} The mangled name unambiguously identifies the owner declaration. The index identifies a specific opaque archetype among several when the owner declaration's return type contains multiple occurrences of \texttt{some}. The identifier is ignored; in the Swift language grammar, a type attribute must apply to some underlying type representation, so by convention module interface printing outputs ``\texttt{\_\_}'' as the underlying type representation. \begin{example} -Listing~\ref{reference opaque return type from interface} shows the generated module interface for Listing~\ref{reference opaque return type}, with some line breaks inserted for readability. 
+\ListingRef{reference opaque return type from interface} shows the generated module interface for \ListingRef{reference opaque return type}, with some line breaks inserted for readability. \begin{listing}\captionabove{References to opaque return types in a module interface}\label{reference opaque return type from interface} \begin{Verbatim} public protocol P { @@ -319,14 +319,14 @@ \section{Referencing Opaque Archetypes}\label{reference opaque archetype} A direct reference to a substituted opaque archetype is written like a generic argument list following the identifier. The generic arguments correspond to the flattened list of generic parameters in the generic signature of the opaque archetype's owner declaration. \begin{example} -In Listing~\ref{substituted opaque archetype reference}, the conformance is declared on the \texttt{Derived} class, but the type witness for \texttt{X} is an opaque archetype from a method on \texttt{Outer.Inner}. The superclass type of \texttt{Derived} is \texttt{Outer.Inner}, so a substitution map is applied to the opaque archetype: +In \ListingRef{substituted opaque archetype reference}, the conformance is declared on the \texttt{Derived} class, but the type witness for \texttt{X} is an opaque archetype from a method on \texttt{Outer.Inner}. The superclass type of \texttt{Derived} is \texttt{Outer.Inner}, so a substitution map is applied to the opaque archetype: \[ \SubstMap{ \SubstType{T}{Int}\\ \SubstType{U}{String} } \] -In the module interface file, this prints as the generic argument list \texttt{<Int, String>}, as shown in Listing~\ref{substituted opaque archetype reference interface}.
\end{example} \begin{listing}\captionabove{Source code with a substituted opaque archetype as a type witness}\label{substituted opaque archetype reference} \begin{Verbatim} @@ -373,7 +373,7 @@ \section{Referencing Opaque Archetypes}\label{reference opaque archetype} \end{Verbatim} \end{listing} -\section{Runtime Representation} +\section[]{Runtime Representation} At runtime, an instance of an opaque archetype must be manipulated abstractly, similar to a generic parameter. This mechanism allows the underlying type of an opaque return type to change without breaking callers in other modules. @@ -401,7 +401,7 @@ \section{Runtime Representation} \fi -\section{Source Code Reference} +\section[]{Source Code Reference} \iffalse @@ -433,4 +433,4 @@ \section{Source Code Reference} \fi -\end{document} \ No newline at end of file +\end{document} diff --git a/docs/Generics/chapters/preface.tex b/docs/Generics/chapters/preface.tex index b9021f985438b..35f12e3a375fc 100644 --- a/docs/Generics/chapters/preface.tex +++ b/docs/Generics/chapters/preface.tex @@ -7,133 +7,71 @@ \chapter*{Preface} % Emit this before the first citation to customize bibliography \bstctlcite{IEEEexample:BSTcontrol} -\lettrine{T}{his is a book} about the implementation of generic programming, also known as parametric polymorphism, in the \index{Swift}Swift compiler. You won't learn how to \emph{write} generic code in Swift here; the best reference for that is, of course, the official language guide \cite{tspl}. +\lettrine{T}{his is a book} about the implementation of generic programming---also known as parametric polymorphism---in the \index{Swift}Swift compiler. You won't learn how to \emph{write} generic code in Swift here; the best reference for that is, of course, the official language guide \cite{tspl}. 
This book is intended mainly for Swift compiler developers who interact with the generics implementation, other language designers who want to understand how Swift evolved, Swift programmers curious to peek under the hood, and finally, mathematicians interested in a practical application of string rewriting and the Knuth-Bendix completion procedure. -This work began as a paper about the ``Requirement Machine,'' a redesign of the core algorithms in Swift generics which shipped with Swift~5.6. After completing the first draft of the paper, I realized that a comprehensive reference guide for the entire generics implementation would be more broadly useful to the community. I worked backwards, filling in the gaps and revising subsequent sections until reaching a fixed point, hopefully converging on something approximating a coherent and self-contained treatment of this cross-section of the compiler. +From the compiler developer's point of view, the \emph{user} is the developer writing the code being compiled. The declarations, types, statements and expressions appearing in the user's program are \emph{values} that the compiler must analyze and manipulate. I assume some basic familiarity with these concepts, and compiler construction in general. For background reading, I recommend \cite{muchnick1997advanced}, \cite{cooper2004engineering}, \cite{craftinginterpreter}, and \cite{incrementalracket}. -I wrote this book with several overlapping but distinct audiences in mind: +This book is divided into five parts. \PartRef{part fundamentals} gives a high-level overview of the Swift compiler architecture, and describes how types and declarations, and specifically, generic types and declarations, are modeled in the compiler. \begin{itemize} -\item Swift compiler developers who must interact with the generics implementation while working on other language features in the compiler. -\item Swift compiler developers wishing to improve the generics implementation itself.
-\item Language designers curious to understand how Swift generics are implemented and how the language evolved to have the feature set it does today. -\item Swift programmers who would simply like to peek under the hood. +\item \ChapRef{roadmap} summarizes every key concept in the generics implementation with a series of worked examples, and surveys capabilities for generic programming found in other programming languages. +\item \ChapRef{compilation model} covers Swift's compilation model and module system as well as the \emph{request evaluator}, which adds an element of lazy evaluation to the typical ``compilation pipeline'' of parsing, type checking and code generation. +\item \ChapRef{types} describes how the compiler models the \emph{types} of values declared by the source program. Types form a miniature language of their own, and we often find ourselves taking them apart and re-assembling them in new ways. Generic parameter types, dependent member types, and generic nominal types are the three fundamental kinds; others are also summarized. +\item \ChapRef{decls} is about \emph{declarations}, the building blocks of Swift code. Functions, structs, and protocols are examples of declarations. There is a common syntax for stating generic parameters and \emph{requirements}, shared by generic declarations of all kinds. Protocols can declare associated types and impose \emph{associated requirements} on their associated types in a similar manner. \end{itemize} +\PartRef{part blocks} focuses on the core \emph{semantic} objects in the generics implementation. To grasp the mathematical asides, it helps to have had some practice working with definitions and proofs, at the level of an introductory course in calculus, linear algebra or combinatorics. A summary of basic math appears in \AppendixRef{math summary}. 
+\begin{itemize} +\item \ChapRef{genericsig} defines the \emph{generic signature}, which collects the generic parameters and explicit requirements of a generic declaration. The explicit requirements of a generic signature generate a set of \emph{derived requirements} and \emph{valid type parameters}, which explains how we type check code \emph{inside} a generic declaration. This formalism is realized in the implementation via \emph{generic signature queries}. -%You should ideally have some familiarity with compiler design, the Swift language itself, and some abstract algebra. However, being an expert in all three is certainly not required and will not preclude you from making the most of this book. I try to give enough context and citations so you at least know what to look for to close any gaps in your understanding. +\item \ChapRef{substmaps} defines the \emph{substitution map}, a mapping from generic parameter types to replacement types. The \emph{type substitution algebra} will explain the operations of type substitution and substitution map composition. This algebra determines how we type check a \emph{reference} to (or a \emph{specialization} of) a generic declaration. -\paragraph{History} \index{history}In my opinion, to truly understand a piece of code you need historical context. Instead of just explaining how the compiler works today, occasional asides also give brief history lessons about how things came to be. Starting with Swift~2.2, the design of the Swift language has been guided by the Swift evolution process, where language changes are pitched, debated, and formalized in the open. I will cite the relevant Swift evolution proposals where possible: following the URLs linked from the bibliography will allow you to read the proposal and associated discussion. +\item \ChapRef{conformances} defines the \emph{conformance}, a description of how a concrete type fulfills the requirements of a protocol, in particular its associated types.
In the type substitution algebra, conformances are to protocols what substitution maps are to generic signatures. -\paragraph{Limitations} -\IndexDefinition{limitation} -Every complex system has imperfections. If you look up ``limitation'' in the index, you will see where various unimplemented corners and theoretical problems in the Swift generics system are explained. Some could be resolved with a little work, others are more fundamental, and a few are open research problems. +\item \ChapRef{genericenv} defines \emph{generic environments} and \emph{archetypes}, two abstractions used throughout the compiler. Also describes the \emph{type parameter graph} that gives us an intuitive visualization of a generic signature. +\end{itemize} +\PartRef{part odds and ends} further develops the theory of derived requirements and type substitution: +\begin{itemize} +\item \ChapRef{typeresolution} details how \emph{type resolution} uses name lookup and type substitution to resolve syntactic type representations to semantic types. -\paragraph{C++} -The Swift compiler is written in \index{C++}C++. To maintain distance between essential and incidental complexity, concepts are described without direct reference to the source code. Instead, each chapter ends with a ``Source Code Reference'' section, structured somewhat like an API reference, which translates what was previously explained into code. You can skip this material if you're only interested in a high-level overview. No knowledge of C++ is assumed outside of these sections. +\item \ChapRef{extensions} discusses extension declarations, which add members and conformances to existing types. Extensions can also declare \emph{conditional conformances}, which have some interesting behaviors. -\section*{Chapter Overview} +\item \ChapRef{building generic signatures} details the process for building a generic signature from syntactic representations. 
This collects a list of user-written requirements and transforms them into an equivalent list of \emph{minimal} requirements. +\item \ChapRef{conformance paths} completes the type substitution algebra with \emph{conformance paths}. The concept of a \emph{recursive conformance} is explored, and finally, the type substitution algebra is shown to be Turing-complete. +\end{itemize} +\PartRef{part features} is currently unfinished. It will describe opaque return types (\ChapRef{opaqueresult}), existential types (\ChapRef{existentialtypes}), and subclassing (\ChapRef{classinheritance}). -Chapter~\ref{roadmap} gives a high-level overview of the generics implementation, serving as a roadmap for what follows. The remaining chapters are organized by topic, but also by ``iterative deepening.'' There is some inherent circularity, so it is difficult to completely present the material in a linear fashion. Often a detail is glossed over, introduced as a black box with a well-defined interface, while a full accounting of the machinery inside the box is left for a future chapter. The below chapter overview is thus intentionally non-linear, weaving a web of connections instead of following the outward spiral of the main narrative. +\PartRef{part rqm} describes the Requirement Machine, a \emph{decision procedure} for the derived requirements formalism.
The original contribution here is that generic signature queries and requirement minimization are problems in the theory of \emph{string rewrite systems}: +\begin{itemize} +\item \ChapRef{rqm basic operation} describes the entry points for generic signature queries and requirement minimization, and how they recursively build a \emph{requirement machine} for a generic signature from the requirement machines of its \emph{protocol components}. -Chapter~\ref{genericsig} defines the generic signature, which combines a list of generic parameter types and generic requirements. This is perhaps the most central concept in the generics implementation. +\item \ChapRef{monoids} introduces finitely-presented monoids and string rewrite systems, and the \emph{word problem}, which asks if two terms are equivalent in this system. The connection with Swift is that every finitely-presented monoid can be encoded in the form of a generic signature, and thus, the derived requirements formalism can encode the word problem. -One of the major themes in this book is the formalism for type substitution. Chapter~\ref{substmaps} introduces substitution maps, which are constructed from generic signatures. You will learn that the replacement for a generic parameter type is stored directly in a substitution map, while the replacement for a dependent member type is derived from a conformance. Chapter~\ref{conformances} discusses how concrete types conform to protocols in detail, focusing on associated types and the role played by conformances in type substitution. +\item \ChapRef{symbols terms rules} describes the construction of a finitely-presented monoid from a list of generic requirements. This demonstrates that the word problem can encode derived requirements; we've reduced each problem to the other. 
-You will see that conformances can point to other conformances, and in the most general case, type substitution must first compute something known as a conformance path, which begins with one of the conformances stored by the substitution map and follows a series of steps to find another conformance. In Part~\ref{part odds and ends}, Chapter~\ref{conformance paths} explains the theory behind conformance paths, and presents the problem of finding a conformance path as a search problem in an infinite lazily-constructed conformance path graph. This chapter ends with a complete summary of the type substitution algebra thus far, which at this point has been fully developed. +\item \ChapRef{completion} describes the Knuth-Bendix algorithm, which attempts to solve the word problem by constructing a \emph{convergent} rewrite system. This is the core of the decision procedure. Many reasonable generic signatures can be presented by convergent rewrite systems, but you'll also see an example that cannot. -The construction of the conformance path graph asks certain questions about the type parameters of a generic signature. Section~\ref{genericsigqueries} introduces these generic signature queries, but the process of describing their implementation begins in Part~\ref{part rqm} with Chapter~\ref{rqm basic operation}, which builds a requirement machine from a list of generic requirements and protocols. Chapter~\ref{monoids} explains the theory of finitely-presented monoids and rewrite systems, on which the requirement machine is based. Chapter~\ref{symbols terms rules} defines the requirement machine rewrite system, showing how type parameters and generic requirements map to terms and rules. Chapter~\ref{completion} describes the Knuth-Bendix algorithm, which attempts to construct a ``well-behaved'' rewrite system from an arbitrary set of rules. A series of worked examples reveal how rewrite rules for different generic requirements relate. 
Finally, Chapter~\ref{propertymap} describes the construction of a property map from a rewrite system, and how the property map can answer generic signature queries. +\item \ChapRef{propertymap} is unfinished. It will describe the construction of a \emph{property map} from a rewrite system; the property map answers generic signature queries. -How the compiler actually builds generic signatures in the first place is interwoven with the above. Chapter~\ref{generic declarations} covers the syntactic blocks: generic parameter lists, \texttt{where} clauses, protocols, and associated types. There's more in Part~\ref{part odds and ends}. To construct generic requirements from syntactic representations, the compiler must first construct types from syntactic representations. Chapter~\ref{typeresolution} focuses on this type resolution procedure. Chapter~\ref{building generic signatures} explains how the various entry points for building generic signatures collect generic requirements to feed into the requirement minimization algorithm. As with generic signature queries, this algorithm is initially introduced as a black box with a well-defined interface. This connects to Part~\ref{part rqm}, since requirement minimization is implemented by the very same rewrite system as generic signature queries. Chapter~\ref{rqm minimization} presents the requirement minimization algorithm. +\item \ChapRef{concrete conformances} is unfinished. It will describe the handling of concrete types in the requirement machine rewrite system. -The remaining chapters fill in various gaps: -\begin{itemize} -\item Chapter~\ref{compilation model} covers Swift's compilation model and how it differs from the typical ``compilation pipeline'' of parsing, type checking and code generation.
+\end{itemize} -\item Chapter \ref{types}~and~\ref{decls} define how the compiler models types and declarations, which are central to the language. Swift programmers might want to compare their mental model of the language with that of the compiler by reading through the enumeration of the different kinds of types and declarations here. -\item Chapter~\ref{genericenv} explains generic environments and archetypes, two abstractions used throughout the compiler. +The Swift compiler is written in \index{C++}C++. To avoid incidental complexity, concepts are described without direct reference to the source code. Instead, some chapters end with a \textbf{Source Code Reference} section, structured like an API reference. You can skip this material if you're only interested in the theory. No knowledge of C++ is assumed outside of these sections. -\item Chapter~\ref{extensions} describes extensions, and in particular how constrained extensions and conditional conformances intersect with generics. +Occasional \IndexDefinition{history}historical asides talk about how things came to be. Starting with Swift~2.2, the design of the Swift language has been guided by the Swift evolution process, where language changes are pitched, debated, and formalized in the open \cite{evolution}. I will cite Swift evolution proposals when describing various language features. You will find a lot of other interesting material in the bibliography as well, not just evolution proposals. -\item Chapter~\ref{opaqueresult}, \ref{existentialtypes} and \ref{classinheritance} are about opaque return types, existential types, and class inheritance. These are largely independent of the rest. +This book does not say much about the runtime side of the separate compilation of generics, except for a brief overview of the model in relation to the type checker in \ChapRef{roadmap}. 
To learn more, I recommend watching a pair of LLVM Developer's Conference talks: \cite{llvmtalk} which gives a summary of the whole design, and \cite{cvwtalk} which describes some recent optimizations. -\item Chapter~\ref{concrete conformances} is perhaps the trickiest of all, describing the handling of concrete types in the requirement machine rewrite system. -\end{itemize} - -\section*{Mathematical Preliminaries} - -While mathematical notation can be quite intimidating to the uninitiated, static type systems are difficult to discuss at any level of detail without introducing at least a little bit of math. An introductory course in calculus, linear algebra or combinatorics provides sufficient background to understand the material in this book. If you lack this level of knowledge, that's okay; you can still follow along without missing too much. - -The Greek alphabet is used in mathematics for variable names and other notation. This book only needs a handful of letters: lowercase $\varepsilon$ (``epsilon''), $\pi$ (``pi''), $\sigma$ (``sigma''), $\uptau$ (``tau'') and $\varphi$ (``phi''); and uppercase $\Sigma$ (``sigma''). - -The equals sign ``='' means two things are already known to be equivalent in some sense. The colon-equals ``:='' means the thing on the left hand side is being \emph{defined} to be the same as the thing on the right. - -\IndexDefinition{set} -\IndexDefinition{natural numbers} -\IndexDefinition{empty set} -A \emph{set} is a collection of elements without regard to order or duplicates. Sets can be finite or infinite. A finite set can be specified by listing its elements in any order, for example $\{a,\,b,\,c\}$. The empty set \index{$\varnothing$}\index{$\varnothing$!z@\igobble|seealso{empty set}}$\varnothing$ is the unique set with no elements. 
The set of \emph{natural numbers} \index{$\mathbb{N}$}\index{$\mathbb{N}$!z@\igobble|seealso{natural numbers}}$\mathbb{N}$ is the infinite set of all non-negative integers, including zero: $\mathbb{N}:=\{0,\,1,\,2,\,\ldots\}$. - -The notation \index{$\in$}\index{$\in$!z@\igobble|seealso{set}}$x\in S$ means ``$x$ is an element of a set $S$,'' and \index{$\not\in$}$x\not\in S$ is its negation, which is ``$x$ is \emph{not} an element of $S$.'' Properties of sets can be stated using \emph{existential} quantification (``there exists (at least one) $x\in S$ such that $x$ has this property\ldots'') or \emph{universal} quantification (``for all $x\in S$, the following property is true of $x$\ldots''). - -\IndexDefinition{subset}% -\IndexDefinition{proper subset}% -A set $X$ is a \emph{subset} of another set $Y$, written as \index{$\subseteq$}\index{$\subseteq$!z@\igobble|seealso{subset}}$X\subseteq Y$, if for all $x\in X$, it is also true that $x\in Y$. Furthermore if there is at least one element $y\in Y$ such that $y\not\in X$, then $X\neq Y$, and $X$ is a \emph{proper} subset of $Y$, written as \index{$\subsetneq$}\index{$\subsetneq$!z@\igobble|seealso{proper subset}}$X\subsetneq Y$. The \IndexDefinition{union}\emph{union} \index{$\cup$}\index{$\cup$!z@\igobble|seealso{union}}$X\cup Y$ is the set of all elements belonging to either $X$ or $Y$. The \IndexDefinition{intersection}\emph{intersection} \index{$\cap$}\index{$\cap$!z@\igobble|seealso{intersection}}$X\cap Y$ is the set of all elements belonging to both $X$ and $Y$. - -\IndexDefinition{Cartesian product} -\IndexDefinition{ordered pair} -\IndexDefinition{ordered tuple} -The \emph{Cartesian product} of two sets $X$ and $Y$, denoted \index{$\times$}\index{$\times$!z@\igobble|seealso{Cartesian product}}$X\times Y$, is the set of all \emph{ordered pairs} $(x,y)$ where $x\in X$, $y\in Y$. Note that the ordered pair $(x,y)$ is not the same as the set $\{x,y\}$. 
The Cartesian product construction generalizes to any finite number of sets, to give \emph{ordered tuples} or \emph{sequences}. - -\IndexDefinition{binary operation} -\IndexDefinition{homomorphism} -\index{mapping|see{function}} -\IndexDefinition{function} -A \emph{function} (or \emph{mapping}) $f\colon X\rightarrow Y$ assigns to each $x\in X$ a unique element $f(x)\in Y$. If the sets $X$ and $Y$ are equipped with some kind of additional structure (which will be explicitly defined), then $f$ is a \emph{homomorphism} if it preserves this structure. - -A function $f\colon X\times Y\rightarrow Z$ defined on the Cartesian product can be thought of as taking a pair of values $x\in X$, $y\in Y$ to an element $f(x,y)\in Z$. A \emph{binary operation} is a function named by a symbol like $\otimes$, $\star$ or $\cdot$ defined on the Cartesian product of two sets. The application of a binary operation is denoted by writing the symbol in between the two elements, like $x\otimes y$. - -The \emph{cardinality} of a finite set $S$, denoted $|S|$, is the number of elements in $S$. I also use the notation $|x|$ to denote certain other functions taking values in $\mathbb{N}$, such as the length of a sequence; this will always be explicitly defined when needed. - -Most of the writing is informal, but occasionally I use ``Euclidean style'': -\begin{definition} -Introduce terminology or notation. -\end{definition} -\begin{example} -Show how this terminology and notation arises in practice. -\end{example} -\begin{proposition} -State a logical consequence of one or more definitions. -\end{proposition} -\begin{proof} -I don't even attempt to formally prove the correctness of most things, but sometimes a proof is written out if it is informative in some way. -\end{proof} -\begin{lemma} An intermediate proposition in service of proving another theorem. -\end{lemma} -\begin{theorem} A ``deeper'' proposition which is more profound in some sense. 
-\end{theorem} -\begin{algorithmx}[Name of algorithm] A description of the inputs and outputs, followed by the precise specification of some computable function. +Also, while most of the material should be current as of Swift~6, two recent language extensions are not covered. These features are mostly additive and can be understood by reading the evolution proposals: \begin{enumerate} -\item Print \texttt{"Hello, world"}. -\item Go back to Step~1. +\item \index{parameter pack}Parameter packs, also known as \index{variadic generics}variadic generics (\cite{se0393}, \cite{se0398}, \cite{se0399}). +\item \index{noncopyable type}Noncopyable types (\cite{se0390}, \cite{se0427}). \end{enumerate} -\end{algorithmx} -\noindent The key mathematical ideas that underpin the theory of Swift generics: -\begin{itemize} -\item Formal logic (Section~\ref{derived req}, \ref{generic signature validity}, \ref{finding conformance paths}, \ref{monoidsasprotocols}). -\item Partial orders, linear orders, and equivalence relations (Section~\ref{typeparams}, \ref{reducedtypes}, \ref{finitely presented monoids}, \ref{rewritesystemintro}, and \ref{reduction order}). -\item Category theory, but only in passing (Section~\ref{submapcomposition}). -\item Directed graphs (Section~\ref{type parameter graph}, \ref{finding conformance paths}, \ref{recursive conformances}, \ref{protocol component}, and \ref{rewrite graph}). -\item Computability theory (Section~\ref{tag systems}, \ref{word problem}). -\item Finitely-presented monoids and rewrite systems (Chapter~\ref{monoids} and \ref{completion}). -\end{itemize} - -\section*{Miscellaneous} -I'd like to thank everyone who read earlier versions of the text, pointed out typos, and asked clarifying questions. Also, the Swift generics system itself is the result of over a decade of collaborative effort by countless people. 
This includes compiler developers, Swift evolution proposal authors, members of the evolution community, and of course, all the users who repeatedly punched holes in our conceptual model by patiently reducing test cases and reporting mind-bending correctness bugs. This book attempts to give an overview of the sum total of all these contributions; I'm not claiming any of the design ideas or implementation techniques described here as my own. +\section*{Source Code} -It's also worth mentioning what was left out. Most of the book is current as of Swift~5.9, but I don't talk about \index{variadic generics}\index{parameter packs}variadic generics (\cite{se0393}, \cite{se0398}, \cite{se0399}). Also the low-level code generation and runtime support for generics only get a cursory mention in Chapter~\ref{roadmap} and Appendix~\ref{runtime representation}, but it really deserves a complete treatment in a future Part~IV. - -The \TeX{} source for this book is our git repository, under the same license as the rest of the codebase: +The \TeX{} code for this book lives in the Swift source repository: \begin{quote} \url{https://github.com/apple/swift/tree/main/docs/Generics} \end{quote} @@ -142,4 +80,8 @@ \section*{Miscellaneous} \url{https://download.swift.org/docs/assets/generics.pdf} \end{quote} -\end{document} \ No newline at end of file +\section*{Acknowledgments} + +I'd like to thank everyone who read earlier versions of the text, pointed out typos, and asked clarifying questions. Also, the Swift generics system itself is the result of over a decade of collaborative effort by countless people. This includes compiler developers, Swift evolution proposal authors, members of the evolution community, and all the users who reported bugs. This book attempts to give an overview of the sum total of all these contributions. 
+ +\end{document} diff --git a/docs/Generics/chapters/property-map.tex b/docs/Generics/chapters/property-map.tex index 32d6889d651c4..167bc1b276379 100644 --- a/docs/Generics/chapters/property-map.tex +++ b/docs/Generics/chapters/property-map.tex @@ -2,19 +2,7 @@ \begin{document} -\chapter{The Property Map}\label{propertymap} - -If we already have some way to compute reduced type parameters, we can define what it means to compute a reduced type for an arbitrary type containing type parameters, as follows. -\begin{algorithm}[Computing a reduced type]\label{reducedtypealgo} -As input, takes a type \texttt{T}. Outputs the reduced type of \texttt{T}. - -\begin{enumerate} -\item If \texttt{T} does not contain any type parameters, it is already reduced. Return \texttt{T}. -\item If \texttt{T} is a type parameter but fixed to a concrete type, replace \texttt{T} with its concrete type and continue to Step~3. -\item If \texttt{T} is not a type parameter, transform \texttt{T} by recursively replacing any type parameters appearing in \texttt{T} with their reduced types, and return the transformed type. -\item The final possibility is that \texttt{T} is a type parameter, not fixed to a concrete type. The reduced type of \texttt{T} is the smallest type parameter in its equivalence class. Return this type parameter. -\end{enumerate} -\end{algorithm} +\chapter[]{The Property Map}\label{propertymap} \ifWIP Until now, you've seen how to solve the \texttt{requiresProtocol()} generic signature @@ -53,7 +41,7 @@ \chapter{The Property Map}\label{propertymap} that $V=V'$. (TODO: This needs an additional assumption about conformance-valid rules.) 
\end{proof} -By Theorem~\ref{propertymaptheorem}, the properties satisfied by a type term can be discovered by +By \ThmRef{propertymaptheorem}, the properties satisfied by a type term can be discovered by considering all non-empty suffixes of ${T}{\downarrow}$, and collecting rewrite rules of the form $V.\Pi\rightarrow V$ where $\Pi$ is some property-like symbol. @@ -77,7 +65,7 @@ \chapter{The Property Map}\label{propertymap} \end{listing} \begin{example}\label{propmapexample1} -Consider the protocol definitions in Listing~\ref{propmaplisting1}. These definitions are used in a couple of examples below, so let's look at the constructed rewrite system first. Protocols $\proto{P1}$ and $\proto{P2}$ do not define any associated types or requirements, so they do not contribute any initial rewrite rules. Protocol $\proto{P3}$ has two associated types $\namesym{T}$ and $\namesym{U}$ conforming to $\proto{P1}$ and $\proto{P2}$ respectively, so a pair of rules introduce each associated type, and another pair impose conformance requirements: +Consider the protocol definitions in \ListingRef{propmaplisting1}. These definitions are used in a couple of examples below, so let's look at the constructed rewrite system first. Protocols $\proto{P1}$ and $\proto{P2}$ do not define any associated types or requirements, so they do not contribute any initial rewrite rules. Protocol $\proto{P3}$ has two associated types $\namesym{T}$ and $\namesym{U}$ conforming to $\proto{P1}$ and $\proto{P2}$ respectively, so a pair of rules introduce each associated type, and another pair impose conformance requirements: \begin{align} \protosym{P3}.\namesym{T}&\Rightarrow\assocsym{P3}{T}\tag{1}\\ \protosym{P3}.\namesym{U}&\Rightarrow\assocsym{P3}{U}\tag{2}\\ @@ -105,7 +93,7 @@ \chapter{The Property Map}\label{propertymap} \end{align} Consider the type parameter $\genericparam{Self}.\namesym{A}.\namesym{U}$ in the generic signature of $\proto{P4}$. 
This type parameter is equivalent to $\genericparam{Self}.\namesym{A}.\namesym{T}$ via the same-type requirement in $\proto{P4}$. The associated type $\namesym{T}$ of $\proto{P3}$ conforms to $\proto{P1}$, and $\namesym{U}$ conforms to $\proto{P2}$. This means that $\genericparam{Self}.\namesym{A}.\namesym{U}$ conforms to \emph{both} $\proto{P1}$ and $\proto{P2}$. -Let's see how this fact can be derived from the rewrite system. Applying $\Lambda_{\proto{P4}}$ to $\genericparam{Self}.\namesym{A}.\namesym{U}$ produces the type term $\assocsym{P4}{A}.\assocsym{P3}{U}$. This type term can be reduced to the canonical term $\assocsym{P4}{A}.\assocsym{P3}{T}$ with a single application of Rule~9. By the result in Theorem~\ref{propertymaptheorem}, it suffices to look at rules of the form $V.\Pi\Rightarrow V$, where $V$ is some suffix of $\assocsym{P4}{A}.\assocsym{P3}{T}$. There are two such rules: +Let's see how this fact can be derived from the rewrite system. Applying $\Lambda_{\proto{P4}}$ to $\genericparam{Self}.\namesym{A}.\namesym{U}$ produces the type term $\assocsym{P4}{A}.\assocsym{P3}{U}$. This type term can be reduced to the canonical term $\assocsym{P4}{A}.\assocsym{P3}{T}$ with a single application of Rule~9. By the result in \ThmRef{propertymaptheorem}, it suffices to look at rules of the form $V.\Pi\Rightarrow V$, where $V$ is some suffix of $\assocsym{P4}{A}.\assocsym{P3}{T}$. There are two such rules: \begin{enumerate} \item Rule~3, which says that $\assocsym{P3}{T}$ conforms to $\proto{P1}$. \item Rule~14, which says that $\assocsym{P4}{A}.\assocsym{P3}{T}$ conforms to $\proto{P2}$. @@ -114,7 +102,7 @@ \chapter{The Property Map}\label{propertymap} \end{example} The above example might suggest that looking up the set of properties satisfied by a type parameter requires iterating over the list of rewrite rules, but in reality, it is possible to construct a multi-map of pairs $(V, \Pi)$ once, after the completion procedure ends.
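To make the suffix-based lookup concrete, here is a small Python sketch of such a multi-map. This is an illustration only, not the compiler's C++ implementation: the class, its method names, and symbol spellings like "[P3:T]" are invented for the example. Terms are tuples of symbols; each entry inherits the properties of its shorter suffixes, so one longest-suffix lookup returns everything:

```python
# Illustrative sketch only: a suffix-keyed property multi-map.
# Symbol spellings like "[P3:T]" and "[P1]" are stand-ins for the book's
# associated type and protocol symbols.

class PropertyMap:
    def __init__(self):
        self.entries = {}  # term suffix (tuple of symbols) -> set of properties

    def add_rule(self, term, prop):
        # Record a property-like rule term.prop => term. Rules are assumed
        # to be added in order of increasing term length, so any existing
        # entry for a proper suffix of `term` can be inherited here.
        props = self.entries.setdefault(term, set())
        props.add(prop)
        for i in range(1, len(term)):
            if term[i:] in self.entries:
                props |= self.entries[term[i:]]

    def lookup(self, term):
        # Longest-suffix lookup: because entries inherit from shorter
        # suffixes, the first match already carries all properties.
        for i in range(len(term)):
            if term[i:] in self.entries:
                return self.entries[term[i:]]
        return set()

pm = PropertyMap()
pm.add_rule(("[P3:T]",), "[P1]")           # Rule 3: [P3:T] conforms to P1
pm.add_rule(("[P4:A]", "[P3:T]"), "[P2]")  # Rule 14; inherits [P1]
assert pm.lookup(("[P4:A]", "[P3:T]")) == {"[P1]", "[P2]"}
```

With the two entries above, a single lookup of the canonical term for Self.A.U reports conformance to both P1 and P2, matching the example.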
-As you saw in the example, a type term can satisfy multiple properties via different suffixes. For the material presented in Section~\ref{moreconcretetypes}, it is convenient to avoid having to take the union of sets in the lookup path. For this reason, the construction algorithm explicitly +As you saw in the example, a type term can satisfy multiple properties via different suffixes. For this reason, the construction algorithm explicitly ``inherits'' the symbols associated with a term $V$ when adding an entry for a term $UV$ that has $V$ as a suffix. As a result, the lookup algorithm only has to look for the longest suffix that appears in the multi-map to find all properties satisfied by a term. The multi-map construction and lookup can be formalized in a pair of algorithms. @@ -161,19 +149,18 @@ \chapter{The Property Map}\label{propertymap} \end{enumerate} \end{algorithm} Notice how in both algorithms, superclass and concrete type symbols are adjusted by prepending a -prefix to each substitution. This is the same operation as described in -Section~\ref{concretetypeadjust}. +prefix to each substitution. \begin{example}\label{propmapexample2} -Recall Example~\ref{propmapexample1}, where a rewrite system was constructed from Listing~\ref{propmaplisting}. The complete rewrite system contains five rewrite rules of the form $V.\Pi\Rightarrow V$: +Recall \ExRef{propmapexample1}, where a rewrite system was constructed from \ListingRef{propmaplisting}. The complete rewrite system contains five rewrite rules of the form $V.\Pi\Rightarrow V$: \begin{enumerate} \item Rule~3 and Rule~4, which state that the associated types $\namesym{T}$ and $\namesym{U}$ of $\proto{P3}$ conform to $\proto{P1}$ and $\proto{P2}$, respectively. \item Rule~7 and Rule~8, which state that the associated types $\namesym{A}$ and $\namesym{B}$ of $\proto{P4}$ both conform to $\proto{P3}$. 
\item Rule~13, which states that the nested type $\genericparam{A}.\genericparam{T}$ of $\proto{P4}$ also conforms to $\proto{P2}$. \end{enumerate} -The property map constructed by Algorithm~\ref{propmapconsalgo} from the above rules is shown in Table~\ref{propmapexample2table}. +The property map constructed by \AlgRef{propmapconsalgo} from the above rules is shown in Table~\ref{propmapexample2table}. \end{example} -\begin{table}\captionabove{Property map constructed from Example~\ref{propmapexample2}}\label{propmapexample2table} +\begin{table}\captionabove{Property map constructed from \ExRef{propmapexample2}}\label{propmapexample2table} \begin{center} \begin{tabular}{|l|l|} \hline @@ -190,7 +177,7 @@ \chapter{The Property Map}\label{propertymap} \end{center} \end{table} \begin{example}\label{propmapexample3} -The second example explores layout, superclass and concrete type requirements. Consider the protocol definitions in Listing~\ref{propmaplisting} together with the generic signature: +The second example explores layout, superclass and concrete type requirements. Consider the protocol definitions in \ListingRef{propmaplisting} together with the generic signature: \[\gensig{\genericsym{0}{0}}{\genericsym{0}{0}\colon\proto{P}, \genericsym{0}{0}.\namesym{B}\colon\proto{Q}}\] The three protocols $\proto{R}$, $\proto{Q}$ and $\proto{P}$ together with the generic signature generate the following initial rewrite rules: \begin{align*} @@ -241,9 +228,9 @@ \chapter{The Property Map}\label{propertymap} \end{enumerate} When constructing the property map, sorting the rules by the length of their left hand sides guarantees that Rule~6 and Rule~7 are processed before Rule~10 and Rule~14. 
This is important because the subject type of Rule~6 and Rule~7 ($\assocsym{P}{B}$) is a suffix of the subject type of Rule~10 and Rule~14 ($\genericsym{0}{0}.\assocsym{P}{B}$), which means that the property map entries for both Rule~10 and Rule~14 inherit the superclass and layout requirements from Rule~6 and Rule~7. Furthermore, the substitution $\sigma_0:=\assocsym{P}{A}$ in the superclass requirement is adjusted by prepending the prefix $\genericsym{0}{0}$. -The property map constructed by Algorithm~\ref{propmapconsalgo} from the above rules is shown in Table~\ref{propmapexample3table}. +The property map constructed by \AlgRef{propmapconsalgo} from the above rules is shown in Table~\ref{propmapexample3table}. In the next section, you will see how this example property map can solve generic signature queries. -\begin{table}\captionabove{Property map constructed from Example~\ref{propmapexample3}}\label{propmapexample3table} +\begin{table}\captionabove{Property map constructed from \ExRef{propmapexample3}}\label{propmapexample3table} \begin{center} \begin{tabular}{|l|l|} \hline @@ -264,29 +251,29 @@ \chapter{The Property Map}\label{propertymap} \fi -\section{Substitution Simplification}\label{subst simplification} +\section[]{Substitution Simplification}\label{subst simplification} \begin{algorithm}[Substitution simplification]\label{subst simplification algo} \end{algorithm} -\section{Property Unification} +\section[]{Property Unification} \IndexTwoFlag{debug-requirement-machine}{property-map} \IndexTwoFlag{debug-requirement-machine}{conflicting-rules} -\section{Concrete Type Unification} +\section[]{Concrete Type Unification} \IndexTwoFlag{debug-requirement-machine}{concrete-unification} -\section{Generic Signature Queries}\label{implqueries} +\section[]{Generic Signature Queries}\label{implqueries} \ifWIP -Recall the categorization of generic signature queries into predicates, properties and canonical type queries previously shown in Section~\ref{intqueries}.
The predicates can be implemented in a straightforward manner using the property map. Each predicate takes a subject type parameter $T$. +Recall the categorization of generic signature queries into predicates, properties and canonical type queries previously shown in \SecRef{genericsigqueries}. The predicates can be implemented in a straightforward manner using the property map. Each predicate takes a subject type parameter $T$. -Generic signature queries are always posed relative to a generic signature, and not a protocol requirement signature, hence the type parameter $T$ is lowered with the generic signature type lowering map $\Lambda\colon\namesym{Type}\rightarrow\namesym{Term}$ (Definition~\ref{lowertypeinsig}) and not a protocol type lowering map $\Lambda_{\proto{P}}\colon\namesym{Type}\rightarrow\namesym{Term}$ for some protocol $\proto{P}$ (Definition~\ref{lowertypeinproto}). +Generic signature queries are always posed relative to a generic signature, and not a protocol requirement signature, hence the type parameter $T$ is lowered with the generic signature type lowering map $\Lambda\colon\namesym{Type}\rightarrow\namesym{Term}$ (\AlgRef{build term generic}) and not a protocol type lowering map $\Lambda_{\proto{P}}\colon\namesym{Type}\rightarrow\namesym{Term}$ for some protocol $\proto{P}$ (\DefRef{build term protocol}). -The first step is to look up the set of properties satisfied by $T$ using Algorithm~\ref{propmaplookupalgo}. Then, each predicate can be tested as follows: +The first step is to look up the set of properties satisfied by $T$ using \AlgRef{propmaplookupalgo}. Then, each predicate can be tested as follows: \begin{description} \item[\texttt{requiresProtocol()}] A type parameter $T$ conforms to a protocol $\proto{P}$ if the property map entry for some suffix of $T$ stores $\protosym{P}$.
\index{layout constraints} @@ -301,18 +288,18 @@ \section{Generic Signature Queries}\label{implqueries} Layout symbols store a layout constraint as an instance of the \texttt{LayoutConstraint} class. The join operation used in the implementation of the \texttt{requiresClass()} query is defined in the \texttt{merge()} method on \texttt{LayoutConstraint}. -You've already seen the \texttt{requiresProtocol()} query in Chapter~\ref{protocolsasmonoids}, where it was shown that it can be implemented by checking if $\Lambda(T).\protosym{P}\downarrow\Lambda(T)$. The property map implementation is perhaps slightly more efficient, since it only simplifies a single term and not two. The $\texttt{requiresClass()}$ and $\texttt{isConcreteType()}$ queries are new on the other hand, and demonstrate the power of the property map. With the rewrite system alone, they cannot be implemented except by exhaustive enumeration over all known layout and concrete type symbols. +You've already seen the \texttt{requiresProtocol()} query in \ChapRef{symbols terms rules}, where it was shown that it can be implemented by checking if $\Lambda(T).\protosym{P}\downarrow\Lambda(T)$. The property map implementation is perhaps slightly more efficient, since it only simplifies a single term and not two. The $\texttt{requiresClass()}$ and $\texttt{isConcreteType()}$ queries are new on the other hand, and demonstrate the power of the property map. With the rewrite system alone, they cannot be implemented except by exhaustive enumeration over all known layout and concrete type symbols. -All of the subsequent examples reference the protocol definitions from Example~\ref{propmapexample3}, and the resulting property map shown in Table~\ref{propmapexample2table}. +All of the subsequent examples reference the protocol definitions from \ExRef{propmapexample3}, and the resulting property map shown in Table~\ref{propmapexample2table}. 
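As a minimal illustration of how the predicates read off a property map entry, here is a hypothetical Python sketch. The helper names and symbol spellings are invented for the example; the actual queries are implemented in C++ on top of classes like \texttt{LayoutConstraint}:

```python
# Illustrative sketch only: predicate queries over a set of property
# symbols already fetched by longest-suffix lookup. Spellings such as
# "[Q]" or "[layout: _NativeClass]" are stand-ins, not compiler output.

# Layout symbols implying a class layout; _NativeClass refines AnyObject.
CLASS_LAYOUTS = {"[layout: AnyObject]", "[layout: _NativeClass]"}

def requires_protocol(props, proto_symbol):
    # T conforms to P if the entry stores the protocol symbol for P.
    return proto_symbol in props

def requires_class(props):
    # Both AnyObject and the stricter _NativeClass layouts imply a class.
    return any(p in CLASS_LAYOUTS for p in props)

def is_concrete_type(props):
    # T is concrete if the entry stores a concrete type symbol.
    return any(p.startswith("[concrete:") for p in props)

# Entry for the running example's term for tau_0_0.[P:B]: it conforms to
# Q, inherits conformance to R, and has a native-class layout.
props = {"[Q]", "[R]", "[layout: _NativeClass]"}
assert requires_protocol(props, "[R]")
assert requires_class(props)
assert not is_concrete_type(props)
```

The point of the sketch is that each predicate is a constant number of set operations once the entry is in hand; only the initial longest-suffix lookup depends on the term's length.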
\begin{example} Consider the canonical type term $\genericsym{0}{0}.\assocsym{P}{B}$. This type parameter conforms to $\proto{Q}$ via a requirement stated in the generic signature, and also to $\proto{R}$, because $\proto{Q}$ inherits from $\proto{R}$. The $\texttt{requiresProtocol()}$ query will confirm these facts, because the property map entry for $\genericsym{0}{0}.\assocsym{P}{B}$ contains the protocol symbols $\protosym{Q}$ and $\protosym{R}$: \begin{enumerate} -\item The conformance to $\proto{Q}$ is witnessed by the rewrite rule $\genericsym{0}{0}.\assocsym{P}{B}.\protosym{Q}\Rightarrow \genericsym{0}{0}.\assocsym{P}{B}$, which is Rule~10 in Example~\ref{propmapexample2}. This is the initial rule generated by the conformance requirement. -\item The conformance to $\proto{R}$ is witnessed by the rewrite rule $\genericsym{0}{0}.\assocsym{P}{B}.\protosym{R}\Rightarrow \genericsym{0}{0}.\assocsym{P}{B}$, which is Rule~14 in Example~\ref{propmapexample2}. This rule was added by the completion procedure to resolve the overlap between Rule~10 above, which states that $\genericsym{0}{0}.\assocsym{P}{B}$ conforms to $\proto{Q}$, and Rule~1, which states that anything conforming to $\proto{Q}$ also conforms to $\proto{R}$. +\item The conformance to $\proto{Q}$ is witnessed by the rewrite rule $\genericsym{0}{0}.\assocsym{P}{B}.\protosym{Q}\Rightarrow \genericsym{0}{0}.\assocsym{P}{B}$, which is Rule~10 in \ExRef{propmapexample2}. This is the initial rule generated by the conformance requirement. +\item The conformance to $\proto{R}$ is witnessed by the rewrite rule $\genericsym{0}{0}.\assocsym{P}{B}.\protosym{R}\Rightarrow \genericsym{0}{0}.\assocsym{P}{B}$, which is Rule~14 in \ExRef{propmapexample2}. This rule was added by the completion procedure to resolve the overlap between Rule~10 above, which states that $\genericsym{0}{0}.\assocsym{P}{B}$ conforms to $\proto{Q}$, and Rule~1, which states that anything conforming to $\proto{Q}$ also conforms to $\proto{R}$. 
\end{enumerate} \end{example} \begin{example} This example shows the \texttt{requiresClass()} query on two different type terms. -First, consider the canonical type term $\genericsym{0}{0}.\assocsym{P}{A}$. The query returns true, because the longest suffix with an entry in the property map is $\assocsym{P}{A}$, which stores a single symbol, $\layoutsym{AnyObject}$. The corresponding rewrite rule is $\assocsym{P}{A}.\layoutsym{AnyObject}\Rightarrow\assocsym{P}{A}$, or Rule~5 in Example~\ref{propmapexample2}. This is the initial rule generated by the $\namesym{A}\colon\namesym{AnyObject}$ layout requirement in protocol $\proto{P}$. +First, consider the canonical type term $\genericsym{0}{0}.\assocsym{P}{A}$. The query returns true, because the longest suffix with an entry in the property map is $\assocsym{P}{A}$, which stores a single symbol, $\layoutsym{AnyObject}$. The corresponding rewrite rule is $\assocsym{P}{A}.\layoutsym{AnyObject}\Rightarrow\assocsym{P}{A}$, or Rule~5 in \ExRef{propmapexample2}. This is the initial rule generated by the $\namesym{A}\colon\namesym{AnyObject}$ layout requirement in protocol $\proto{P}$. Now, consider the canonical type term $\genericsym{0}{0}.\assocsym{P}{B}$. The query also returns true. Here, the longest suffix is the entire type term, because the property map stores an entry for $\genericsym{0}{0}.\assocsym{P}{B}$ with layout symbol $\layoutsym{\_NativeClass}$. This symbol satisfies \[\layoutsym{\_NativeClass}\leq\layoutsym{AnyObject},\] @@ -345,14 +332,14 @@ \section{Generic Signature Queries}\label{implqueries} \item Construct the subgraph $H\subseteq G$ generated by $P$. \item Compute the set of root nodes of $H$ (that is, the nodes with no incoming edges, or zero in-degree) to obtain the minimal set of protocols of $P$.
-\item Sort the elements of this set using the canonical protocol order (Definition~\ref{canonicalprotocol}) to obtain the +\item Sort the elements of this set using the canonical protocol order (\AlgRef{linear protocol order}) to obtain the final minimal canonical list of protocols from $P$. \end{enumerate} \end{definition} The second step is to define a mapping from type terms to Swift type parameters, for use by the \texttt{getSuperclassBound()} and \texttt{getConcreteType()} queries when mapping substitutions back to Swift types. \begin{algorithm} The type lifting map $\Lambda^{-1}:\namesym{Term}\rightarrow\namesym{Type}$ takes as input a type term $T$ and maps it back to a Swift type parameter. This is the inverse of the type lowering -map $\Lambda\colon\namesym{Type}\rightarrow\namesym{Term}$ from Algorithm~\ref{lowertypeinproto}. +map $\Lambda\colon\namesym{Type}\rightarrow\namesym{Term}$ from \AlgRef{build term protocol}. \begin{enumerate} \item Initialize $S$ to an empty type parameter. \item The first symbol of $T$ must be a generic parameter symbol $\tau_{d,i}$, which is mapped to a @@ -366,16 +353,16 @@ \section{Generic Signature Queries}\label{implqueries} $\namesym{A}$, or $\namesym{A}$ was declared in some protocol $\proto{Q}$ such that $\proto{P}_i$ inherits from $\proto{Q}$. In both cases, collect all associated type declarations in a list. \item If any associated type found above is a non-root associated type declaration, replace it with -its anchor (Definition~\ref{rootassoctypedef}). +its anchor (\DefRef{root associated type}). \item Pick the associated type declaration from the above set that is minimal with respect to the -associated type order (Definition~\ref{canonicaltypeorder}). +associated type order (\AlgRef{associated type order}). \end{enumerate} \end{enumerate} \end{algorithm} The third and final step before the queries themselves can be presented is the algorithm for mapping a superclass or concrete type symbol back to a Swift type. 
This algorithm uses the above type lifting map on type parameters appearing in substitutions. \begin{algorithm}[Constructing a concrete type from a symbol]\label{symboltotype} As input, this algorithm takes a superclass symbol $\supersym{\namesym{T}\colon\sigma_0,\ldots,\sigma_n}$ or -concrete type symbol $\concretesym{\namesym{T}\colon\sigma_0,\ldots,\sigma_n}$. This is the inverse of Algorithm~\ref{concretesymbolcons}. +concrete type symbol $\concretesym{\namesym{T}\colon\sigma_0,\ldots,\sigma_n}$. This is the inverse of \AlgRef{concretesymbolcons}. \begin{enumerate} \item Let $\pi_0,\ldots,\pi_n$ be the set of positions such that $\namesym{T}|_{\pi_i}$ is a @@ -388,7 +375,7 @@ \section{Generic Signature Queries}\label{implqueries} Now, the time has finally come to describe the implementation of the four property queries. \begin{description} -\item[\texttt{getRequiredProtocols()}] The list of protocol requirements satisfied by a type parameter $T$ is recorded in the form of protocol symbols in the property map. This list is transformed into a minimal canonical list of protocols using Definition~\ref{minimalproto}. +\item[\texttt{getRequiredProtocols()}] The list of protocol requirements satisfied by a type parameter $T$ is recorded in the form of protocol symbols in the property map. This list is transformed into a minimal canonical list of protocols using \DefRef{minimalproto}. \index{layout constraints} \index{join of layout constraints} \item[\texttt{getLayoutConstraint()}] A type parameter $T$ might be subject to multiple layout constraints, in which case the property map entry will store a list of layout constraints $L_1,\ldots,L_n$. 
This query needs to compute their join, which is the largest layout constraint that simultaneously satisfies all of them: @@ -400,10 +387,10 @@ \section{Generic Signature Queries}\label{implqueries} The first step is to adjust the symbol by prepending $U$ to each substitution $\sigma_i$, to produce the superclass symbol \[\supersym{\namesym{C};\,\sigma_0,\ldots,U\sigma_n}.\] -Then, Algorithm~\ref{symboltotype} can be applied to convert the symbol to a Swift type. +Then, \AlgRef{symboltotype} can be applied to convert the symbol to a Swift type. \item\texttt{getConcreteType()}: This query is almost identical to \texttt{getSuperclassBound()}; you can replace ``superclass symbol'' with ``concrete type symbol'' above. \end{description} -Note how the \texttt{getLayoutConstraint()} query deals with a multiplicity of layout symbols by computing their join, whereas the \texttt{getSuperclassBound()} and \texttt{getConcreteType()} queries just arbitrarily pick one superclass or concrete type symbol. Indeed in Section~\ref{moreconcretetypes}, you will see that picking one symbol is not always sufficient, and a complete implementation must perform joins on superclass and concrete type symbols as well, and furthermore, a situation analogous to the uninhabited layout constraint can arise, where a type parameter can be subject to conflicting superclass or concrete type requirements. For now though, the current formulation is sufficient. +Note how the \texttt{getLayoutConstraint()} query deals with a multiplicity of layout symbols by computing their join, whereas the \texttt{getSuperclassBound()} and \texttt{getConcreteType()} queries just arbitrarily pick one superclass or concrete type symbol. 
Indeed, we will see that picking one symbol is not always sufficient; a complete implementation must perform joins on superclass and concrete type symbols as well. Furthermore, a situation analogous to the uninhabited layout constraint can arise, where a type parameter is subject to conflicting superclass or concrete type requirements. For now though, the current formulation is sufficient. Now, let's look at some examples of the four property queries. Once again, these examples use the property map shown in Table~\ref{propmapexample2table}. \begin{example} @@ -413,20 +400,20 @@ \section{Generic Signature Queries}\label{implqueries} Consider the computation of the \texttt{getSuperclassBound()} query on the canonical type term $\genericsym{0}{0}.\assocsym{P}{B}$. The superclass symbol $\supersym{\namesym{Cache}\langle\sigma_0\rangle;\,\sigma_0:=\assocsym{P}{A}}$ does not need to be adjusted by prepending a prefix to each substitution term, because the property map entry is associated with the entire term $\genericsym{0}{0}.\assocsym{P}{B}$. -Applying Algorithm~\ref{symboltotype} to the superclass symbol produces the Swift type: +Applying \AlgRef{symboltotype} to the superclass symbol produces the Swift type: \[\namesym{Cache}\langle\genericsym{0}{0}.\namesym{A}\rangle.\] \end{example} \begin{example} Consider the computation of the \texttt{getConcreteType()} query on the canonical type term $\genericsym{0}{0}.\assocsym{P}{C}$. Here, the property map entry is associated with the suffix $\assocsym{P}{C}$, which means an adjustment must be performed on the concrete type symbol $\concretesym{\namesym{Array}\langle\sigma_0\rangle;\,\sigma_0:=\assocsym{P}{A}}$.
The adjusted symbol is \[\concretesym{\namesym{Array}\langle\sigma_0\rangle;\,\sigma_0:=\genericsym{0}{0}.\assocsym{P}{A}}.\] -Applying Algorithm~\ref{symboltotype} to the adjusted concrete type symbol produces the Swift type: +Applying \AlgRef{symboltotype} to the adjusted concrete type symbol produces the Swift type: \[\namesym{Array}\langle\genericsym{0}{0}.\namesym{A}\rangle.\] \end{example} \fi -\section{Reduced Types} +\section[]{Reduced Types} \ifWIP @@ -500,7 +487,7 @@ \section{Reduced Types} \genericsym{0}{0}.\assocsym{Q}{A}&\Rightarrow\genericsym{0}{0}.\assocsym{P}{A}\tag{7}\\ \genericsym{0}{0}.\namesym{A}&\Rightarrow\genericsym{0}{0}.\assocsym{P}{A}\tag{8} \end{align} -Now consider the type parameter $T:=\genericsym{0}{0}.\namesym{A}$. This type parameter is a canonical anchor by Definition~\ref{canonicalanchor}. Since Swift type parameters always point to an actual associated type declaration, the type term $\Lambda(T)$ is $\genericsym{0}{0}.\assocsym{Q}{A}$, and not $\genericsym{0}{0}.\assocsym{P}{A}$. However, $\genericsym{0}{0}.\assocsym{Q}{A}$ is not canonical as a term, and reduces to $\genericsym{0}{0}.\assocsym{P}{A}$ via Rule~7, therefore $T$ is a canonical anchor and yet $\Lambda(T)$ is not a canonical term. +Now consider the type parameter $T:=\genericsym{0}{0}.\namesym{A}$. This type parameter is reduced. Since Swift type parameters always point to an actual associated type declaration, the type term $\Lambda(T)$ is $\genericsym{0}{0}.\assocsym{Q}{A}$, and not $\genericsym{0}{0}.\assocsym{P}{A}$. However, $\genericsym{0}{0}.\assocsym{Q}{A}$ is not canonical as a term, and reduces to $\genericsym{0}{0}.\assocsym{P}{A}$ via Rule~7, therefore $T$ is reduced and yet $\Lambda(T)$ is not a canonical term. Essentially, the term $\genericsym{0}{0}.\assocsym{P}{A}$ is ``more canonical'' than any type parameter that can be output by $\Lambda:\namesym{Type}\rightarrow\namesym{Term}$. 
Protocol $\proto{P}$ does not actually define an associated type named $\namesym{A}$, therefore $\Lambda$ can only construct terms containing the symbol $\assocsym{Q}{A}$, and yet $\assocsym{P}{A}<\assocsym{Q}{A}$. @@ -519,7 +506,7 @@ \section{Reduced Types} associatedtype B } \end{Verbatim} -\begin{table}\captionabove{Property map from Example~\ref{concretecanonicalpropertymapex}}\label{concretecanonicalpropertymap} +\begin{table}\captionabove{Property map from \ExRef{concretecanonicalpropertymapex}}\label{concretecanonicalpropertymap} \begin{center} \begin{tabular}{|l|l|} \hline @@ -573,7 +560,7 @@ \section{Reduced Types} (Incidentally, can we say anything more about $t_1$? Is it the case that $t_0>t_1$? Since $x'\Rightarrow y'$ is a property-like rule, $x'$ is equal to $y'$ with a concrete type symbol appended, or in other words, $y'=vw$, so $t_1=uvw$. But $x=uv$, so $t_1=uvw$ reduces to $yw$. So indeed, the above critical pair either becomes trivial if $t_0$ can be reduced by some other rule, or it introduces the rewrite rule $t_0\Rightarrow yw$.) \begin{example} -Consider the generic signature of class $\namesym{C}$ from Listing~\ref{overlapconcreteex}: +Consider the generic signature of class $\namesym{C}$ from \ListingRef{overlapconcreteex}: \begin{listing}\captionabove{Example with overlap involving concrete type term}\label{overlapconcreteex} \begin{Verbatim} struct G {} @@ -619,7 +606,7 @@ \section{Reduced Types} This means that resolving the critical pair introduces the new rewrite rule: \[\genericsym{0}{0}.\assocsym{S}{E}.\assocsym{P}{T}.\concretesym{\namesym{G}\langle\sigma_0\rangle;\;\sigma_0:=\genericsym{0}{0}.\assocsym{S}{E}.\assocsym{P}{V}}\Rightarrow\genericsym{0}{0}.\assocsym{S}{E}.\assocsym{P}{T}. 
\] -Intuitively, the completion process began with the fact that +The completion process began with the fact that \[\assocsym{P}{U}==\namesym{G}\langle\assocsym{P}{V}\rangle,\] and derived that\ \[\genericsym{0}{0}.\assocsym{S}{E}.\assocsym{P}{T}==\namesym{G}\langle\genericsym{0}{0}.\assocsym{S}{E}.\assocsym{P}{T}\rangle.\] @@ -627,10 +614,10 @@ \section{Reduced Types} \[\genericsym{0}{0}.\assocsym{S}{E}.\assocsym{P}{T}==\namesym{G}\langle\assocsym{P}{T}\rangle,\] which does not make sense. \end{example} -The concrete type adjustment actually comes up again in the next chapter, during property map construction (Algorithm~\ref{propmapconsalgo}) and lookup (Algorithm~\ref{propmaplookupalgo}). +The concrete type adjustment actually comes up again in the next chapter, during property map construction (\AlgRef{propmapconsalgo}) and lookup (\AlgRef{propmaplookupalgo}). \fi -\section{Source Code Reference} +\section[]{Source Code Reference} -\end{document} \ No newline at end of file +\end{document} diff --git a/docs/Generics/chapters/rewrite-system-minimization.tex b/docs/Generics/chapters/rewrite-system-minimization.tex index 1453156502055..73f0aa7daf76a 100644 --- a/docs/Generics/chapters/rewrite-system-minimization.tex +++ b/docs/Generics/chapters/rewrite-system-minimization.tex @@ -2,7 +2,7 @@ \begin{document} -\chapter{Rewrite System Minimization}\label{rqm minimization} +\chapter[]{Rewrite System Minimization}\label{rqm minimization} \ifWIP TODO: @@ -21,7 +21,7 @@ \chapter{Rewrite System Minimization}\label{rqm minimization} \end{itemize} \fi -\section{Homotopy Reduction}\label{homotopy reduction} +\section[]{Homotopy Reduction}\label{homotopy reduction} \IndexTwoFlag{debug-requirement-machine}{homotopy-reduction} \IndexTwoFlag{debug-requirement-machine}{homotopy-reduction-detail} @@ -42,12 +42,12 @@ \section{Homotopy Reduction}\label{homotopy reduction} \end{itemize} \fi -\section{Loop Normalization} +\section[]{Loop Normalization} +\ifWIP \cite{homotopyreduction} 
\IndexFlag{disable-requirement-machine-loop-normalization} -\ifWIP TODO: \begin{itemize} \item collapse inverses together @@ -60,7 +60,7 @@ \section{Loop Normalization} \end{itemize} \fi -\section{The Elimination Order}\label{elimination order} +\section[]{The Elimination Order}\label{elimination order} \ifWIP TODO: @@ -71,12 +71,13 @@ \section{The Elimination Order}\label{elimination order} \end{itemize} \fi -\section{Conformance Minimization}\label{minimal conformances} +\section[]{Conformance Minimization}\label{minimal conformances} + +\ifWIP \IndexTwoFlag{debug-requirement-machine}{minimal-conformances} \IndexTwoFlag{debug-requirement-machine}{minimal-conformances-detail} -\ifWIP TODO: \begin{itemize} \item example where homotopy reduction minimizes too much @@ -90,9 +91,9 @@ \section{Conformance Minimization}\label{minimal conformances} \end{itemize} \fi -\section{Correctness}\label{minimization correctness} +\section[]{Building Requirements}\label{requirement builder} -\section{Building Requirements}\label{requirement builder} +\ifWIP \IndexFlag{requirement-machine-max-split-concrete-equiv-class-attempts} \IndexTwoFlag{debug-requirement-machine}{minimization} @@ -100,15 +101,32 @@ \section{Building Requirements}\label{requirement builder} \IndexTwoFlag{debug-requirement-machine}{redundant-rules-detail} \IndexTwoFlag{debug-requirement-machine}{split-concrete-equiv-class} -\ifWIP TODO: \begin{itemize} \item same-type requirements: connected component thing \item type alias requirements: the concrete equiv class splitting equivalent \item splitting concrete equivalence classes: describe the problem, give example, say which generic signature invariant is violated. alternative formulation would work but breaks \index{ABI}ABI, so for GSB compatibility we split equivalence classes. 
\end{itemize} + +The minimization algorithm outputs the ``circuit,'' +\begin{quote} +\begin{verbatim} +T.A == T.B +T.B == T.C +T.C == T.D +\end{verbatim} +\end{quote} +and not the ``star,'' +\begin{quote} +\begin{verbatim} +T.A == T.B +T.A == T.C +T.A == T.D +\end{verbatim} +\end{quote} + \fi -\section{Source Code Reference}\label{rqm minimization source ref} +\section[]{Source Code Reference}\label{rqm minimization source ref} -\end{document} \ No newline at end of file +\end{document} diff --git a/docs/Generics/chapters/runtime-representation.tex b/docs/Generics/chapters/runtime-representation.tex deleted file mode 100644 index 1f3afc67ea2ca..0000000000000 --- a/docs/Generics/chapters/runtime-representation.tex +++ /dev/null @@ -1,62 +0,0 @@ -\documentclass[../generics]{subfiles} - -\begin{document} - -\chapter{Runtime Representation}\label{runtime representation} - -\index{contextual type} -\index{fully-concrete type} -\index{runtime type metadata} -\index{witness table} -\index{IRGen} -\index{SIL} -\index{LLVM} -Conformance paths are also used when generating code for unspecialized generic functions. Way back in Chapter~\ref{roadmap}, you saw how type metadata and witness tables are reified as runtime values in unspecialized generic code. At the SIL level, operations on fully-concrete and contextual types are represented uniformly; in unspecialized generic code, the types and conformances appearing in SIL instructions can contain archetypes. - -When IRGen lowers SIL to LLVM IR, it introduces IR values to model types and conformances derived from archetypes, and emits code for constructing these values. This generated code has the same effect at runtime as type substitution operations do at compile time. Naturally, IRGen models this by evaluating conformance paths. - -This section just gives a summary of the implementation of runtime generics. There is a lot more to say about this topic, and it really deserves to be Part~IV of this book. 
For now, the most detailed resource is a recording of a talk from the LLVM Developer's Conference \cite{llvmtalk}. - -\paragraph{Generic signatures} -\index{contextual type} -\index{primary archetype type} -\index{conformance requirement} -\index{runtime type metadata} -\index{witness table} -IRGen computes the final calling convention of each generic function. The calling convention computation considers the formal named parameters of the function declaration. It also introduces some additional \emph{lowered parameters} derived from the function's generic signature: -\begin{itemize} -\item For every generic parameter that is not equivalent to some other generic parameter or concrete type, a pointer to runtime type metadata. -\item For every conformance requirement, a pointer to a witness table. -\end{itemize} -SIL instructions are written in terms of contextual types. These contextual types can contain primary archetypes, which are instantiated from the type parameters of the function's generic signature. The lowered parameters directly provide type metadata and witness tables for the generic parameters and root abstract conformances of the function's generic signature. Type metadata and witness tables for other primary archetypes and abstract conformances are constructed using conformance paths. - -\paragraph{Substitution maps} -\index{substitution map} -These lowered parameters look like they define the shape of a substitution map, which is not a coincidence. The caller of the generic function passes in type metadata and witness tables from the substitution map at the call site. When emitting a call to a generic function, IRGen generates code to instantiate runtime type metadata for each generic argument, and witness table for each conformance. The replacement types or conformances might contain archetypes, which represent type parameters in the generic signature of the caller. 
In this way, the callee's generic arguments and witness tables for the call are ultimately derived from the caller's lowered parameters. Each conformance path in a generic function is effectively evaluated against the runtime substitution map provided by the caller. - -\paragraph{Type metadata} -Non-generic nominal types define a \emph{metadata access function} taking no parameters. For generic nominal types, the metadata access function takes type metadata for each generic argument and a witness table for each conformance, much like a generic function. Metadata access functions for instantiating structural types, such as tuple types and function types, are defined in the Swift runtime. - -Runtime type metadata for an arbitrary contextual type can therefore be emitted recursively. At the leaves, we have non-generic nominal types and archetypes, which we can emit metadata for directly. The interior nodes are generic nominal types and structural types, for which we emit the metadata for each child, before passing the generic arguments to a metadata access function. - -\IndexDefinition{mangling} -\index{name mangling|see{mangling}} -\index{symbol mangling|see{mangling}} -If a type does not contain any archetypes, the instantiated metadata does not depend on the lowered parameters of the outer function. Instead of emitting the code for constructing metadata for a fully-concrete type, IRGen calls a runtime entry point with a compact string representation of the type. This \emph{mangling} scheme also determines symbol names for declarations, and is defined in \cite{mangling}. - -\paragraph{Witness tables} -Witness tables for concrete conformances are similarly instantiated by \emph{witness table access functions}. The witness table access function for the normal conformance of a non-generic type does not take any parameters. 
If the conforming type is generic, the witness table access function takes lowered parameters corresponding to the conforming type's generic signature. (If the conformance is conditional, the lowered parameters are actually those of the constrained extension in which the conformance was defined.) - -When emitting a protocol declaration, IRGen defines global symbols, called \emph{associated type descriptors} and \emph{associated conformance descriptors}. These represent associated type declarations and associated conformance requirements at runtime. - -Type witnesses and associated conformances are projected with a pair of runtime entry points: -\begin{itemize} -\item \verb|swift_getAssociatedTypeWitness()| takes a witness table, type metadata for the conforming type, and an associated type descriptor. -\item \verb|swift_getAssociatedConformanceWitness()| takes a witness table, type metadata for the conforming type, type metadata for the conformance requirement's subject type, and an associated conformance descriptor. -\end{itemize} -In our idealized algebraic notation (and in the compiler's in-memory representation), conformances know their conforming type. However, witness tables do not, and it is legal to share witness tables between different instantiations of a generic type. For this reason the projection operations take type metadata in addition to a witness table. - -\paragraph{Conformance paths} -Finally we get to conformance paths. To emit code for instantiating a type parameter or abstract conformance, we begin with the lowered parameter corresponding to the root abstract conformance. Then, we emit a call to the associated conformance access function for each subsequent step in the conformance path. When realizing a witness table, the process stops here; when deriving type metadata for a dependent member type, the last step is to call the associated type access function. 
- -\end{document} \ No newline at end of file diff --git a/docs/Generics/chapters/substitution-maps.tex b/docs/Generics/chapters/substitution-maps.tex index c833ceccd7c14..939061a4f5698 100644 --- a/docs/Generics/chapters/substitution-maps.tex +++ b/docs/Generics/chapters/substitution-maps.tex @@ -4,13 +4,9 @@ \chapter{Substitution Maps}\label{substmaps} -\IndexDefinition{substitution map} -\IndexDefinition{input generic signature} -\IndexDefinition{replacement type} -\index{conformance} -\lettrine{S}{ubstitution maps arise} when the type checker needs to reason about a reference to a generic declaration, specialized with list of generic arguments. Abstractly, a substitution map defines a replacement type corresponding to each type parameter of a generic signature; applying a substitution map to the interface type of a generic declaration recursively replaces the type parameters therein, producing the type of the specialized reference. +\lettrine{S}{ubstitution maps arise} when the type checker needs to reason about a reference to a generic declaration, specialized with a list of generic arguments. Abstractly, a \IndexDefinition{substitution map}substitution map defines a \IndexDefinition{replacement type}replacement type corresponding to each type parameter of a generic signature; applying a substitution map to the interface type of a generic declaration recursively replaces the type parameters therein, producing the type of the specialized reference. -The generic signature of a substitution map is called the \emph{input generic signature}. A substitution map stores its input generic signature, and the generic signature's list of generic parameters and conformance requirements determine the substitution map's shape: +The generic signature of a substitution map is called the \IndexDefinition{input generic signature}\emph{input generic signature}. 
A substitution map stores its input generic signature, and the generic signature's list of generic parameters and \index{conformance}conformance requirements determine the substitution map's shape: \begin{quote} \texttt{<\ttbox{A}, \ttbox{B} where \ttbox{B:\ Sequence}, B.[Sequence]Element == Int>} \end{quote} @@ -75,10 +71,10 @@ \chapter{Substitution Maps}\label{substmaps} \SubstConf{B}{Array}{Sequence} } \] -Our substitution map appears while type checking the program shown in Listing~\ref{substmaptypecheck}. Here, all three of \texttt{genericFunction()}, \texttt{GenericType} and \texttt{nonGenericMethod()} have the same generic signature, \texttt{}. When type checking a generic function call, the expression type checker infers the generic arguments from the types of the argument expressions. When referencing a generic type, the generic arguments can be written explicitly. In fact, all three declarations are also referenced with the same substitution map. (In the case of a generic type, this substitution map is called the \emph{context substitution map}, as you will see in Section~\ref{contextsubstmap}.) +Our substitution map appears while type checking the program shown in \ListingRef{substmaptypecheck}. Here, all three of \texttt{genericFunction()}, \texttt{GenericType} and \texttt{nonGenericMethod()} have the same generic signature, \texttt{}. When type checking a generic function call, the expression type checker infers the generic arguments from the types of the argument expressions. When referencing a generic type, the generic arguments can be written explicitly. In fact, all three declarations are also referenced with the same substitution map. (In the case of a generic type, this substitution map is called the \emph{context substitution map}, as you will see in \SecRef{contextsubstmap}.) \end{example} -\paragraph{Type substitution} Substitution maps operate on interface types. 
Recall that an \index{interface type}interface type is a type \emph{containing} valid type parameters for some generic signature, which may itself not be a type parameter; for example, one possible interface type is \texttt{Array}, if \texttt{T} is a generic parameter type. Let's introduce the formal notation \IndexSetDefinition{type}{\TypeObj{G}}$\TypeObj{G}$ to mean the set of interface types for a generic signature $G$. Then, if $\texttt{T}\in\TypeObj{G}$ and $\Sigma$ is a substitution map with input generic signature $G$, we can \emph{apply} $\Sigma$ to \texttt{T} to get a new type. This operation is called \IndexDefinition{type substitution}\emph{type substitution}. The interface type here is called the \IndexDefinition{original type}\emph{original type}, and the result of the substitution is the \IndexDefinition{substituted type}\emph{substituted type}. We will think of applying a substitution map to an interface type as an binary operation: \[\texttt{T}\otimes\Sigma\] +\paragraph{Type substitution.} Substitution maps operate on interface types. Recall that an \index{interface type}interface type is a type \emph{containing} valid type parameters for some generic signature, which may itself not be a type parameter; for example, one possible interface type is \texttt{Array}, if \texttt{T} is a generic parameter type. Let's introduce the formal notation \IndexSetDefinition{type}{\TypeObj{G}}$\TypeObj{G}$ to mean the set of interface types for a generic signature $G$. Then, if $\texttt{T}\in\TypeObj{G}$ and $\Sigma$ is a substitution map with input generic signature $G$, we can \emph{apply} $\Sigma$ to \texttt{T} to get a new type. This operation is called \IndexDefinition{type substitution}\emph{type substitution}. The interface type here is called the \IndexDefinition{original type}\emph{original type}, and the result of the substitution is the \IndexDefinition{substituted type}\emph{substituted type}. 
We will think of applying a substitution map to an interface type as a binary operation: \[\texttt{T}\otimes\Sigma\] The \index{$\otimes$}\index{$\otimes$!z@\igobble|seealso{type substitution}}\index{binary operation}$\otimes$ binary operation is a \emph{right action} of substitution maps on types. (We could have instead defined a left action, but later we will see the right action formulation is more natural for expressing certain identities. Indeed, we'll develop this notation further throughout this book.) Type substitution recursively replaces any type parameters appearing in the original type with new types derived from the substitution map, while preserving the ``concrete structure'' of the original type. Thus the behavior of type substitution is ultimately defined by how substitution maps act on the two kinds of type parameters: generic parameters and dependent member types: @@ -89,7 +85,7 @@ \chapter{Substitution Maps}\label{substmaps} \item Applying a substitution map to a dependent member type derives the replacement type from one of the substitution map's conformances. -Now, we haven't talked about conformances yet. There is a circularity between substitution maps and conformances---substitution maps can store conformances, and conformances can store substitution maps. We will look at conformances in great detail in Chapter~\ref{conformances}. The derivation of replacement types for dependent member types is discussed in Section~\ref{abstract conformances}. +Now, we haven't talked about conformances yet. There is a circularity between substitution maps and conformances---substitution maps can store conformances, and conformances can store substitution maps. We will look at conformances in great detail in \ChapRef{conformances}. The derivation of replacement types for dependent member types is discussed in \SecRef{abstract conformances}. 
\end{itemize} \begin{example} @@ -130,12 +126,12 @@ \chapter{Substitution Maps}\label{substmaps} \end{Verbatim} \end{listing} -\begin{example} Listing~\ref{typealiassubstlisting} shows a generic type with four member type alias declarations. There are four global variables, and the type of each global variable is written as a member type alias reference with the same type base type, \texttt{GenericType>}. +\begin{example}\label{type alias subst example} \ListingRef{typealiassubstlisting} shows a generic type with four member type alias declarations. There are four global variables, and the type of each global variable is written as a member type alias reference with the same base type, \texttt{GenericType>}. \index{underlying type} \index{type alias declaration} \index{substitution map} -Type resolution resolves a member type alias reference by applying a substitution map to the underlying type of the type alias declaration. Here, the underlying type of each type alias declaration is an interface type for the generic signature of \texttt{GenericType}, and the substitution map is the substitution map $\Sigma$ of Example~\ref{substmaptypecheck}. +Type resolution resolves a member type alias reference by applying a substitution map to the underlying type of the type alias declaration. Here, the underlying type of each type alias declaration is an interface type for the generic signature of \texttt{GenericType}, and the substitution map is the substitution map $\Sigma$ of \ExRef{substmaptypecheck}. 
The type of each global variable \texttt{t1}, \texttt{t2}, \texttt{t3} and \texttt{t4} is determined by applying $\Sigma$ to the underlying type of each type alias declaration: \begin{quote} @@ -153,18 +149,13 @@ \chapter{Substitution Maps}\label{substmaps} The first two original types are generic parameters, and substitution directly projects the corresponding replacement type from the substitution map; the second two original types are substituted by recursively replacing generic parameters they contain. \end{example} -References to generic type alias declarations are more complex because in addition to the generic parameters of the base type, the generic type alias will have generic parameters of its own. Section~\ref{identtyperepr} describes how the substitution map is computed in this case. +References to generic type alias declarations are more complex because in addition to the generic parameters of the base type, the generic type alias will have generic parameters of its own. \SecRef{identtyperepr} describes how the substitution map is computed in this case. -\paragraph{Substitution failure} -\IndexDefinition{substitution failure}% -\index{error type}% -\index{SILGen}% -\index{abstract syntax tree}% -Substitution of an interface type containing dependent member types can \emph{fail} if any of the conformances in the substitution map are invalid. In this case, an error type is returned instead of signaling an assertion. Invalid conformances can appear in substitution maps when the user's own code is invalid; it is not an invariant violation as long as other errors are diagnosed elsewhere and the compiler does not proceed to SILGen with error types in the abstract syntax tree. +\paragraph{Substitution failure.} +Substitution of an interface type containing dependent member types can \IndexDefinition{substitution failure}\emph{fail} if any of the conformances in the substitution map are invalid. 
In this case, an \index{error type}error type is returned instead of signaling an assertion. Invalid conformances can appear in substitution maps when the user's own code is invalid; it is not an invariant violation as long as other errors are diagnosed elsewhere and the compiler does not proceed to \index{SILGen}SILGen with error types in the \index{abstract syntax tree}abstract syntax tree. -\paragraph{Output generic signature} -\IndexDefinition{output generic signature}% -If the replacement types in the substitution map are \index{fully-concrete type}fully concrete---that is, they do not contain any type parameters---then all possible substituted types produced by this substitution map will also be fully concrete. If the replacement types are interface types for some \emph{output} generic signature, the substitution map will produce interface types for this generic signature. The output generic signature might be a different from the \emph{input} generic signature of the substitution map. +\paragraph{Output generic signature.} +If the replacement types in the substitution map are \index{fully-concrete type}fully concrete---that is, they do not contain any type parameters---then all possible substituted types produced by this substitution map will also be fully concrete. If the replacement types are interface types for some \IndexDefinition{output generic signature}\emph{output} generic signature, the substitution map will produce interface types for this generic signature. The output generic signature might be different from the \emph{input} generic signature of the substitution map.
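+For instance, consider a call to a generic function from inside another generic function. (This is a hypothetical illustration; the names \texttt{takesSequence()} and \texttt{caller()} do not appear elsewhere in this book.)
+\begin{Verbatim}
+func takesSequence<A: Sequence>(_: A) {}
+
+func caller<T>(_ elements: Array<T>) {
+  takesSequence(elements)  // A := Array<T>
+}
+\end{Verbatim}
+The call to \texttt{takesSequence()} is type checked with a substitution map whose input generic signature is the generic signature of \texttt{takesSequence()}. The single replacement type \texttt{Array<T>} contains the generic parameter \texttt{T} of \texttt{caller()}, so the output generic signature is the generic signature of \texttt{caller()}. Applying this substitution map to the interface type \texttt{A.[Sequence]Element} produces the substituted type \texttt{T}, again an interface type for the output generic signature.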
Keeping these caveats in mind, we have this essential principle: \begin{quote} @@ -173,7 +164,7 @@ \chapter{Substitution Maps}\label{substmaps} Recall our notation $\TypeObj{G}$ for the set of interface types of $G$. We also use the notation \IndexSetDefinition{sub}{\SubMapObj{G}{H}}$\SubMapObj{G}{H}$ for the set of substitution maps with input generic signature $G$ and output generic signature $H$. We make use of this notation to formalize our principle. If $\texttt{T}\in\TypeObj{G}$ and $\Sigma\in\SubMapObj{G}{H}$, then $\texttt{T}\otimes\Sigma\in\TypeObj{H}$, and thus the $\otimes$ binary operation is a function between the following sets: \[\TypeObj{G}\otimes\SubMapObj{G}{H}\longrightarrow\TypeObj{H}\] -\paragraph{Canonical substitution maps} +\paragraph{Canonical substitution maps.} \IndexDefinition{canonical substitution map}% \index{canonical type}% \index{canonical conformance}% @@ -188,7 +179,7 @@ \chapter{Substitution Maps}\label{substmaps} \item Given an original type and two canonically equal substitution maps, applying the two substitution maps to this type will also produce two canonically equal substituted types. \end{enumerate} -\section{Context Substitution Maps}\label{contextsubstmap} +\section{Generic Arguments}\label{contextsubstmap} \IndexDefinition{context substitution map} \index{declared interface type} @@ -217,7 +208,7 @@ \section{Context Substitution Maps}\label{contextsubstmap} }\\ = \texttt{Dictionary} \end{multline*} -\paragraph{The identity substitution map} +\paragraph{The identity substitution map.} What is the context substitution map of a type declaration's declared interface type? By definition, if $\Sigma$ is the context substitution map of $\texttt{T}_d$, then $\texttt{T}_d\otimes\Sigma=\texttt{T}_d$; it leaves the declared interface type unchanged. That is, this substitution map maps every generic parameter of the type declaration's generic signature to itself. 
If we look at the \texttt{Dictionary} type again, we can write down this substitution map: \begin{multline*} \texttt{Dictionary<\ttgp{0}{0}, \ttgp{0}{1}>}\otimes @@ -235,139 +226,25 @@ \section{Context Substitution Maps}\label{contextsubstmap} \begin{enumerate} \item The interface type must only contain type parameters which are valid in the input generic signature $G$ of this identity substitution map $1_G$. \item Substitution might change type sugar, because generic parameters appearing in the original interface type might be sugared differently than the input generic signature of this identity substitution map. Therefore, canonical equality of types is preserved, not necessarily pointer equality. -\item We won't talk about archetypes until Chapter~\ref{genericenv}, but you may have met them already. Applying the identity substitution map to a contextual type containing archetypes replaces the archetypes with equivalent type parameters. There is a corresponding \emph{forwarding substitution map} which maps all generic parameters to archetypes; the forwarding substitution map acts as the identity in the world of contextual types. +\item We won't talk about archetypes until \ChapRef{genericenv}, but you may have met them already. Applying the identity substitution map to a contextual type containing archetypes replaces the archetypes with equivalent type parameters. There is a corresponding \emph{forwarding substitution map} which maps all generic parameters to archetypes; the forwarding substitution map acts as the identity in the world of contextual types. \end{enumerate} -\paragraph{The empty substitution map} +\paragraph{The empty substitution map.} The \index{empty generic signature}empty generic signature only has a single unique substitution map, the \IndexDefinition{empty substitution map}\emph{empty substitution map}, so the context substitution map of a non-specialized nominal type is the empty substitution map. 
In our notation, the empty substitution map is denoted $\SubstMap{}$. The only valid interface types of the empty generic signature are the \index{fully-concrete type}fully-concrete types. The action of the empty substitution map leaves fully-concrete types unchanged, so for example, $\texttt{Int}\otimes\SubstMap{} = \texttt{Int}$. The empty substitution map $\SubstMap{}$ is almost never the same as the identity substitution map $1_G$. In fact, they only coincide if $G$ is the empty generic signature. Applying the empty substitution map to an interface type containing type parameters is a substitution failure and returns an error type. \[\texttt{\ttgp{0}{0}.[Sequence]Element} \otimes \SubstMap{} = \texttt{<>}\] -\paragraph{Other declaration contexts} A more general notion is the context substitution map of a type \emph{with respect to a declaration context}. This is where the ``context'' comes from in ``context substitution map.'' Recall that a \index{qualified lookup}qualified \index{name lookup}name lookup \texttt{foo.bar} looks for a member named \texttt{foo} on some base type, here the type of \texttt{foo}. The context substitution map for the member's declaration context describes the substitutions for computing the type of the \index{member reference expression}member reference expression. When the \index{declaration context}declaration context is the type declaration itself, ``context substitution map with respect to its own declaration context'' coincides with the earlier notion of ``the'' context substitution map of a base type. - -\index{direct lookup} -To understand the relationship between the type and the declaration context here, recall from Section~\ref{name lookup} that qualified name lookup performs a series of \emph{direct lookups}, first into the type declaration itself, then its superclass if any, and finally any protocols it conforms to. A direct lookup in turn searches the immediate members of the type declaration and any of its extensions. 
Thus, we can talk about the set of declaration contexts \emph{reachable} from a base type: -\begin{enumerate} -\item The type declaration itself and its extensions. -\item The superclass declaration and its extensions, and everything reachable recursively via the superclass declaration. -\item All protocol conformances of the type declaration, and their protocol extensions. -\end{enumerate} -The declaration context for computing a context substitution map must be reachable from the base type by the above definition. - -\index{constrained extension} -\index{conformance requirement} -\begin{definition}\label{context substitution map for decl context} The context substitution map with respect to a declaration context is defined as follows for the three kinds of reachable declaration contexts: -\begin{enumerate} -\item When the declaration context is the generic type or an extension, the replacement types of the substitution map are the corresponding generic arguments of the base type. If the context is a constrained extension, the substitution map will store additional conformances for the conformance requirements of the extension. -\item When the declaration context is a protocol or a protocol extension, the generic signature is the protocol generic signature, possibly with additional requirements if the context is a constrained protocol extension. The substitution map's single replacement type is the entire base type. -\item When the declaration context is a superclass of the generic type (which must be a class type or an archetype with a superclass requirement), the context substitution map is constructed recursively from the type declaration's superclass type. This case will be described in Chapter~\ref{classinheritance}. -\end{enumerate} -The context substitution map's input generic signature is the generic signature of the declaration context; thus it can be applied to the interface type of a member of this context. 
-\end{definition} - -\begin{listing}\captionabove{Context substitution map with respect to an extension context}\label{context substitution map of constrained extension listing} -\begin{Verbatim} -struct Outer<T> { - struct Inner<U> {} -} - -extension Outer.Inner where U: Sequence { - typealias A = (U.Element) -> () -} - -// What is the type of `x'? -let x: Outer<Int>.Inner<String>.A = ... -\end{Verbatim} -\end{listing} -\begin{example} -Case~1 determines the type of \texttt{x} in Listing~\ref{context substitution map of constrained extension listing}. The base type is the generic nominal type \texttt{Outer<Int>.Inner<String>} and the type alias \texttt{A} is a member of the constrained extension of \texttt{Outer.Inner}. - -The generic nominal type \texttt{Outer<Int>.Inner<String>} sets \texttt{T} to \texttt{Int} and \texttt{U} to \texttt{String}. The extension defines the additional conformance requirement \texttt{U:~Sequence}. Therefore, we can write down the context substitution map and apply it to the declared interface type of the type alias \texttt{A} to get the final result: -\begin{multline*} -\texttt{(U.[Sequence]Element) -> ()} \otimes -\SubstMapLongC{ -\SubstType{T}{Int}\\ -\SubstType{U}{String} -}{ -\SubstConf{U}{String}{Sequence} -}\\ -= \texttt{(Character) -> ()} -\end{multline*} -\end{example} -\begin{example} -It is instructive to see what happens if we instead compute the context substitution map with respect to the type declaration context itself. We get an almost identical substitution map, except without the conformance requirement.
Applying this substitution map to the declared interface type of the type alias \texttt{A} will produce an error type, because the dependent member type \texttt{U.[Sequence]Element} is not a valid type parameter for this substitution map's input generic signature: -\begin{multline*} -\texttt{(U.[Sequence]Element) -> ()} \otimes -\SubstMapLong{ -\SubstType{T}{Int}\\ -\SubstType{U}{String} -}\\ -= \texttt{<<error type>>} -\end{multline*} -\end{example} - -\begin{example} What if we use the correct declaration context, but the base type does not satisfy the requirements of the constrained extension? For example, consider the type \texttt{Outer<Int>.Inner<Int>}. Computing the context substitution map of our base type for the constrained extension's declaration context will output a substitution map containing an invalid conformance, because \texttt{Int} does not conform to \texttt{Sequence}: -\[ -\SubstMapLongC{ -\SubstType{T}{Int}\\ -\SubstType{U}{Int} -}{ -\ConfReq{U}{Sequence}\mapsto\text{invalid conformance} -} -\] -\end{example} -In the source language, the type alias \texttt{A} cannot be referenced as a member of this base type at all, because name lookup checks whether the generic requirements of a type declaration are satisfied. Checking generic requirements will be first introduced as part of type resolution (Section~\ref{identtyperepr}), and will come up elsewhere as well. - - -\paragraph{Protocol substitution map} The context substitution map of a type with respect to a protocol declaration context is called the \IndexDefinition{protocol substitution map}\emph{protocol substitution map}. The generic signature of a protocol has a single generic parameter with a single conformance requirement, so a substitution map for this generic signature consists of a conformance together with its \index{conforming type}conforming type.
Thus, if \texttt{T} is a concrete type or a type parameter conforming to \texttt{P}, the protocol substitution map is formed from \texttt{T} and $\ConfReq{T}{P}$. We will denote this by $\Sigma_{\ConfReq{T}{P}}$: -\[ -\Sigma_{\ConfReq{T}{P}} := \SubstMapC{ -\SubstType{Self}{T} -}{ -\SubstConf{Self}{T}{P} -} -\] - -\begin{listing}\captionabove{The context substitution map with respect to a protocol context}\label{protocolsubstitutionmaplisting}\index{horse} -\begin{Verbatim} -struct Horse: Animal { - typealias Food = Hay -} - -protocol Animal { - associatedtype Food - typealias Lunch = Array<Food> -} - -// What is the type of `x'? -let x: Horse.Lunch = ... -\end{Verbatim} -\end{listing} -\begin{example} The type of \texttt{x} in Listing~\ref{protocolsubstitutionmaplisting} is constructed from the underlying type of the type alias, together with the context substitution map for the member reference. The type alias is declared in the \texttt{Animal} protocol, and is accessed as a member of the \texttt{Horse} type. The context substitution map for this kind of member access is the protocol substitution map for the conformance $\ConfReq{Horse}{Animal}$: -\[ -\Sigma_{\ConfReq{Horse}{Animal}} := \SubstMapC{ -\SubstType{Self}{Horse} -}{ -\SubstConf{Self}{Horse}{Animal} -} -\] -The underlying interface type of the \texttt{Lunch} type alias declaration is \texttt{Array<Self.Food>}, thus the type of \texttt{x} is obtained by applying $\Sigma_{\ConfReq{Horse}{Animal}}$ to $\texttt{Array<Self.Food>}$, which gives us \texttt{Array<Hay>}. - -We will define dependent member type substitution, and gain a deeper understanding of protocol substitution maps, in Section~\ref{abstract conformances}. For now, to understand the above type substitution, it suffices to know that applying $\Sigma_{\ConfReq{Horse}{Animal}}$ to \texttt{Self.Food} outputs the type \texttt{Hay}, so the final substituted type of \texttt{x} is \texttt{Array<Hay>}.
-\end{example} - \section{Composing Substitution Maps}\label{submapcomposition} Suppose that we have three generic signatures, $F$, $G$ and $H$, and a pair of substitution maps: $\Sigma_1\in\SubMapObj{F}{G}$, and $\Sigma_2\in\SubMapObj{G}{H}$. If we start with an interface type $\texttt{T}\in\TypeObj{F}$, then $\texttt{T}\otimes\Sigma_1\in\TypeObj{G}$. If we then apply $\Sigma_2$ to $\texttt{T}\otimes\Sigma_1$, we get an interface type in $\TypeObj{H}$: \[(\texttt{T}\otimes\Sigma_1)\otimes\Sigma_2\] -The \emph{composition} of the substitution maps $\Sigma_1$ and $\Sigma_2$, denoted by \index{$\otimes$}$\Sigma_1\otimes\Sigma_2$, is the unique substitution map which satisfies the following equation for all $\texttt{T}\in\TypeObj{F}$: +The \IndexDefinition{substitution map composition}\emph{composition} of the substitution maps $\Sigma_1$ and $\Sigma_2$, denoted by \index{$\otimes$}$\Sigma_1\otimes\Sigma_2$, is the unique substitution map which satisfies the following equation for all $\texttt{T}\in\TypeObj{F}$: \[\texttt{T}\otimes(\Sigma_1\otimes\Sigma_2)=(\texttt{T}\otimes\Sigma_1)\otimes\Sigma_2\] -That is, applying the composition of two substitution maps is the same as applying the first substitution map followed by the second. Since $(\texttt{T}\otimes\Sigma_1)\otimes\Sigma_2\in\TypeObj{H}$, we see that $\Sigma_1\otimes\Sigma_2\in\SubMapObj{F}{H}$; the input generic signature of the composition is the input generic signature of the first substitution map, and the output generic signature of the composition is the output generic signature of the second. Substitution map composition can thus be understood as a function between sets: +That is, applying the composition of two substitution maps is the same as applying the first substitution map followed by the second. 
Since $(\texttt{T}\otimes\Sigma_1)\otimes\Sigma_2\in\TypeObj{H}$, we see that $\Sigma_1\otimes\Sigma_2\in\SubMapObj{F}{H}$; the \index{input generic signature}input generic signature of the composition is the input generic signature of the first substitution map, and the output generic signature of the composition is the \index{output generic signature}output generic signature of the second. Substitution map composition can thus be understood as a function between sets: \[\SubMapObj{F}{G}\otimes\SubMapObj{G}{H}\longrightarrow\SubMapObj{F}{H}\] -To understand how the composition $\Sigma_1\otimes\Sigma_2$ is actually constructed from $\Sigma_1$ and $\Sigma_2$ in the implementation, we decompose $\Sigma_1$ by applying it to each generic parameter and conformance requirment of the generic signature $F$: +To understand how the composition $\Sigma_1\otimes\Sigma_2$ is actually constructed from $\Sigma_1$ and $\Sigma_2$ in the implementation, we decompose $\Sigma_1$ by applying it to each \index{generic parameter type}generic parameter and \index{conformance requirement}conformance requirement of the generic signature $F$: \[\Sigma_1 := \SubstMapLongC{ \SubstType{\ttgp{0}{0}}{$\ttgp{0}{0}\otimes\Sigma_1$}\\ \ldots @@ -419,7 +296,7 @@ \section{Composing Substitution Maps}\label{submapcomposition} \end{listing} \begin{example}\label{composesubstmapexample} \index{expression} -Listing~\ref{composesubstmaplisting} shows an example where substitution map composition can help reason about the types of chained member reference expressions. The \texttt{inner} stored property of \texttt{Outer<A>} has type \texttt{Inner<Array<A>, A>}. Here is the context substitution map of this type, which we will refer to as $\Sigma_1$: +\ListingRef{composesubstmaplisting} shows an example where substitution map composition can help reason about the types of chained \index{member reference expression}member reference expressions.
The \texttt{inner} stored property of \texttt{Outer<A>} has type \texttt{Inner<Array<A>, A>}. Here is the context substitution map of this type, which we will refer to as $\Sigma_1$: \[ \Sigma_1 := \FirstMapInExample \] @@ -451,9 +328,9 @@ \section{Composing Substitution Maps}\label{submapcomposition} \end{example} If $\Sigma\in\SubMapObj{F}{G}$, then the identity substitution maps $1_F$ and $1_G$ have a natural behavior under substitution map composition: \[1_F\otimes\Sigma = \Sigma\qquad\Sigma\otimes 1_G = \Sigma\] -The second identity carries the same caveat as the identity $\texttt{T}\otimes 1_G=\texttt{T}$ for types; it is only true if the replacement types of $\Sigma$ are interface types. If they are contextual types, the archetypes will be replaced with equivalent type parameters, as we will explain in Section~\ref{archetypesubst}. +The second identity carries the same caveat as the identity $\texttt{T}\otimes 1_G=\texttt{T}$ for types; it is only true if the replacement types of $\Sigma$ are interface types. If they are contextual types, the archetypes will be replaced with equivalent type parameters, as we will explain in \SecRef{archetypesubst}. \begin{example} -Recall the generic signatures $F$ and $G$, and the substitution map $\Sigma_1 := \FirstMapInExample\in\SubMapObj{F}{G}$ from Example~\ref{composesubstmapexample}. We can write down the identity substitution maps $1_F$ and $1_G$: +Recall the generic signatures $F$ and $G$, and the substitution map $\Sigma_1 := \FirstMapInExample\in\SubMapObj{F}{G}$ from \ExRef{composesubstmapexample}.
We can write down the identity substitution maps $1_F$ and $1_G$: \begin{gather*} 1_F := \SubstMap{\SubstType{T}{T},\,\SubstType{U}{U}}\\ 1_G := \SubstMap{\SubstType{A}{A}} @@ -481,15 +358,10 @@ \section{Composing Substitution Maps}\label{submapcomposition} Thus, our type substitution algebra allows us to omit grouping parentheses without introducing ambiguity: \[\texttt{T}\otimes\Sigma_1\otimes\Sigma_2\otimes\Sigma_3\] -\paragraph{Categorically speaking} -\IndexDefinition{category}% -\IndexDefinition{morphism}% -\IndexDefinition{identity morphism}% -\IndexDefinition{object}% -\index{function}% -A \emph{category} is a collection of \emph{objects} and \emph{morphisms}. (Very often the morphisms are functions of some sort, but they might also be completely abstract.) Each morphism is associated with a pair of objects, the \emph{source} and \emph{destination}. The collection of morphisms with source $A$ and destination $B$ is denoted $\mathrm{Hom}(A,B)$. The morphisms of a category must obey certain properties: +\paragraph{Categorically speaking.} +A \IndexDefinition{category}\emph{category} is a collection of \IndexDefinition{object}\emph{objects} and \IndexDefinition{morphism}\emph{morphisms}. (Very often the morphisms are \index{function}functions of some sort, but they might also be completely abstract.) Each morphism is associated with a pair of objects, the \emph{source} and \emph{destination}. The collection of morphisms with source $A$ and destination $B$ is denoted $\mathrm{Hom}(A,B)$. The morphisms of a category must obey certain properties: \begin{enumerate} -\item For every object $A$, there is an \emph{identity morphism} $1_A\in\mathrm{Hom}(A, A)$. +\item For every object $A$, there is an \IndexDefinition{identity morphism}\emph{identity morphism} $1_A\in\mathrm{Hom}(A, A)$. 
\item If $f\in\mathrm{Hom}(A, B)$ and $g\in\mathrm{Hom}(B, C)$ are a pair of morphisms, there is a third morphism $g\circ f\in\mathrm{Hom}(A,C)$, called the \emph{composition} of $g$ with $f$. \item Composition respects the identity: if $f\in\mathrm{Hom}(A, B)$, then $f\circ 1_A=1_B\circ f=f$. \item Composition is associative: if $f\in\mathrm{Hom}(A, B)$, $g\in\mathrm{Hom}(B, C)$ and $h\in\mathrm{Hom}(C, D)$, then $h\circ(g\circ f)=(h\circ g)\circ f$. @@ -503,7 +375,7 @@ \section{Composing Substitution Maps}\label{submapcomposition} \item The identity morphism is the identity substitution map (you will see later it does not act as the identity on archetypes, which is why we rule them out above). \item The composition of morphisms $g\circ f$ is the composition of substitution maps $f\otimes g$ (note that we must reverse the order here for the definition to work). \end{itemize} -Category theory often comes up in programming when working with data structures and higher-order functions; an excellent introduction to the topic is \cite{catprogrammer}. While we don't need to deal with categories in the abstract here, but we will encounter another idea from category theory, the commutative diagram, in Section~\ref{type witnesses}. +Category theory often comes up in programming when working with data structures and higher-order functions; an excellent introduction to the topic is \cite{catprogrammer}. We don't need to deal with categories in the abstract here, but we will encounter another idea from category theory, the commutative diagram, in \SecRef{type witnesses}. \section{Building Substitution Maps}\label{buildingsubmaps} @@ -551,9 +423,9 @@ \section{Building Substitution Maps}\label{buildingsubmaps} \end{enumerate} For the conformance lookup callback, \begin{enumerate} -\item The \textbf{global conformance lookup} callback performs a global conformance lookup (Section~\ref{conformance lookup}).
-\item The \textbf{local conformance lookup} callback performs a local conformance lookup into another substitution map (Section~\ref{abstract conformances}). -\item The \textbf{make abstract conformance} callback asserts that the substituted type is a type variable, type parameter or archetype, and returns an abstract conformance (also in Section~\ref{abstract conformances}). It is used when it is known that the substitution map can be constructed without performing any conformance lookups, as is the case with the identity substitution map. +\item The \textbf{global conformance lookup} callback performs a global conformance lookup (\SecRef{conformance lookup}). +\item The \textbf{local conformance lookup} callback performs a local conformance lookup into another substitution map (\SecRef{abstract conformances}). +\item The \textbf{make abstract conformance} callback asserts that the substituted type is a type variable, type parameter or archetype, and returns an abstract conformance (also in \SecRef{abstract conformances}). It is used when it is known that the substitution map can be constructed without performing any conformance lookups, as is the case with the identity substitution map. \end{enumerate} \IndexDefinition{conformance lookup callback} \index{abstract conformance} @@ -593,12 +465,10 @@ \section{Building Substitution Maps}\label{buildingsubmaps} \section{Nested Nominal Types}\label{nested nominal types} -\index{limitation} -\index{nested type declaration} -Nominal type declarations can appear inside other declaration contexts, subject to the following restrictions: +Nominal type declarations can appear \index{nested type declaration}inside other declaration contexts, subject to the following \index{limitation!nested type declarations}restrictions: \begin{enumerate} -\item Structs, enums and classes cannot be nested in generic local contexts. -\item Structs, enums and classes cannot be nested in protocols or protocol extensions. 
+\item Structs, enums and classes cannot be nested in generic \index{local declaration context}local contexts. +\item Structs, enums and classes cannot be nested in protocols or \index{protocol extension}protocol extensions. \item Protocols cannot be nested in generic contexts. \end{enumerate} We're going to explore the implementation limitations behind these restrictions, and possible future directions for lifting them. (The rest of the book talks about what the compiler does, but this section is about what the compiler \emph{doesn't} do.) @@ -607,7 +477,7 @@ \section{Nested Nominal Types}\label{nested nominal types} \index{local type declaration} \index{generic context} \index{context substitution map} -\paragraph{Types in generic local contexts} This restriction is a consequence of a shortcoming in the representation of a nominal type. Recall from Chapter~\ref{types} that nominal types and generic nominal types store a parent type, and generic nominal types additionally store a list of generic arguments, corresponding to the generic parameter list of the nominal type declaration. This essentially means there is no place to store the generic arguments from outer local contexts, such as functions. +\paragraph{Types in generic local contexts.} This restriction is a consequence of a shortcoming in the representation of a nominal type. Recall from \ChapRef{types} that nominal types and generic nominal types store a parent type, and generic nominal types additionally store a list of generic arguments, corresponding to the generic parameter list of the nominal type declaration. This essentially means there is no place to store the generic arguments from outer local contexts, such as functions. 
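To make the shape of that representation concrete, here is a rough model in Swift (invented names; a sketch of the data structure described above, not the Swift compiler's actual classes):

```swift
// A rough model of the representation described above: a (generic)
// nominal type stores its parent type plus its own generic
// arguments, and nothing else.
indirect enum ModelType {
    case nominal(name: String, parent: ModelType?)
    case genericNominal(name: String, parent: ModelType?, args: [ModelType])
}

// Outer<Int>.Inner<String>: each nesting level records its own
// generic arguments in the parent chain.
let inner = ModelType.genericNominal(
    name: "Inner",
    parent: .genericNominal(
        name: "Outer", parent: nil,
        args: [.nominal(name: "Int", parent: nil)]),
    args: [.nominal(name: "String", parent: nil)])
_ = inner

// A type declared inside a generic *function* has no parent nominal
// type, so there is nowhere in this shape to record the function's
// generic arguments.
```

The parent chain only has slots for nominal declaration contexts, which is precisely the shortcoming the restriction works around.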
\begin{listing}\captionabove{A nominal type declaration in a generic local context}\label{nominal type in generic local context} \begin{Verbatim} @@ -630,7 +500,7 @@ \section{Nested Nominal Types}\label{nested nominal types} \end{Verbatim} \end{listing} -Listing~\ref{nominal type in generic local context} shows a nominal type nested inside of a generic function. The generic signature of \texttt{Nested} contains the generic parameter \texttt{T} from the outer generic function \texttt{algorithm()}. However, under our rules, the declared interface type of \texttt{Nested} is a singleton nominal type, because \texttt{Nested} does not have its own generic parameter list, and its parent context is not a nominal type declaration. This means there is no way to recover a context substitution map for this type because the generic argument for \texttt{T} is not actually stored anywhere. +\ListingRef{nominal type in generic local context} shows a nominal type nested inside of a generic function. The generic signature of \texttt{Nested} contains the generic parameter \texttt{T} from the outer generic function \texttt{algorithm()}. However, under our rules, the declared interface type of \texttt{Nested} is a singleton nominal type, because \texttt{Nested} does not have its own generic parameter list, and its parent context is not a nominal type declaration. This means there is no way to recover a context substitution map for this type because the generic argument for \texttt{T} is not actually stored anywhere. In the source language, there is no way to specialize \texttt{Nested}; the reference to \texttt{T} inside \texttt{f()} is always understood to be the generic parameter \texttt{T} of the outer function. However, inside the compiler, different generic specializations can still arise. 
If the two calls to \texttt{f()} from inside \texttt{g()} are specialized and inlined by the SIL optimizer for example, the two temporary instances of \texttt{Nested} must have different in-memory layouts, because in one call \texttt{T} is \texttt{Int}, and in the other \texttt{T} is \texttt{String}. @@ -639,7 +509,7 @@ \section{Nested Nominal Types}\label{nested nominal types} \index{runtime type metadata} Luckily, this ``flat'' representation is already implemented in the Swift runtime. The runtime type metadata for a nominal type includes all the generic parameters from the nominal type declaration's generic signature, not just the generic parameters of the nominal type declaration itself. So while lifting this restriction would require some engineering effort on the compiler side, it would be a backward-deployable and \index{ABI}ABI-compatible change. -\paragraph{Types in protocol contexts} Allowing struct, enum and class declarations to appear inside protocols and protocol extensions would come down to deciding if the \Index{protocol Self type@protocol \texttt{Self} type}protocol \texttt{Self} type should be ``captured'' by the nested type. +\paragraph{Types in protocol contexts.} Allowing struct, enum and class declarations to appear inside protocols and protocol extensions would come down to deciding if the \Index{protocol Self type@protocol \texttt{Self} type}protocol \texttt{Self} type should be ``captured'' by the nested type. \begin{listing}\captionabove{A nominal type declaration nested in a protocol context}\label{nominal type in protocol context} \begin{Verbatim} @@ -665,7 +535,7 @@ \section{Nested Nominal Types}\label{nested nominal types} struct S2: P {} // are S1.Nested and S2.Nested distinct? \end{Verbatim} \end{listing} -If the nested type captures \texttt{Self}, the code shown in Listing~\ref{nominal type in generic local context} would become valid. 
With this model, the \texttt{Nested} struct depends on \texttt{Self}, so it would not make sense to reference it as a member of the protocol itself, like \texttt{P.Nested}. Instead, \texttt{Nested} would behave as if it was a member of every \index{conforming type}conforming type, like \texttt{S.Nested} above (or even \texttt{T.Nested}, if \texttt{T} is a generic parameter conforming to \texttt{P}). At the implementation level, the generic signature of a nominal type nested inside of a protocol context would include the protocol \texttt{Self} type, and the \emph{entire} parent type, for example \texttt{S} in \texttt{S.Nested}, would become the replacement type for \texttt{Self} in the context substitution map. +If the nested type captures \texttt{Self}, the code shown in \ListingRef{nominal type in protocol context} would become valid. With this model, the \texttt{Nested} struct depends on \texttt{Self}, so it would not make sense to reference it as a member of the protocol itself, like \texttt{P.Nested}. Instead, \texttt{Nested} would behave as if it was a member of every \index{conforming type}conforming type, like \texttt{S.Nested} above (or even \texttt{T.Nested}, if \texttt{T} is a generic parameter conforming to \texttt{P}). At the implementation level, the generic signature of a nominal type nested inside of a protocol context would include the protocol \texttt{Self} type, and the \emph{entire} parent type, for example \texttt{S} in \texttt{S.Nested}, would become the replacement type for \texttt{Self} in the context substitution map. The alternative is to prohibit the nested type from referencing the protocol \texttt{Self} type. The nested type's generic signature would \emph{not} include the protocol \texttt{Self} type, and \texttt{P.Nested} would be a valid member type reference. The protocol would effectively act as a namespace for the nominal types it contains, with the nested type not depending on the conformance to the protocol in any way.
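To illustrate the capture model in the notation of this chapter (hypothetical, since this nesting is not currently supported), the context substitution map of \texttt{S.Nested} would take the same shape as a protocol substitution map, with the parent type \texttt{S} replacing \texttt{Self}:

```latex
% Hypothetical: the context substitution map of S.Nested if the
% nested type captured the protocol Self type.
\[
\SubstMapC{
\SubstType{Self}{S}
}{
\SubstConf{Self}{S}{P}
}
\]
```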
@@ -688,12 +558,12 @@ \section{Nested Nominal Types}\label{nested nominal types} \Index{protocol Self type@protocol \texttt{Self} type} \index{Haskell} \index{multi-parameter type class} -\paragraph{Protocols in other declaration contexts} The final possibility is the nesting of protocols inside other declaration contexts, such as functions or nominal types. This breaks down into two cases, illustrated in Listing~\ref{protocol nested inside type}: +\paragraph{Protocols in other declaration contexts.} The final possibility is the nesting of protocols inside other declaration contexts, such as functions or nominal types. This breaks down into two cases, illustrated in \ListingRef{protocol nested inside type}: \begin{enumerate} \item Protocols inside non-generic declaration contexts. \item Protocols inside generic declaration contexts. \end{enumerate} -The first case was originally prohibited, but is now permitted as of Swift~5.10 \cite{se0404}; the non-generic declaration context acts as a namespace to which the protocol declaration is scoped, but apart from the interaction with name lookup this has no other semantic consequences. The second case is more subtle. If we were to allow a ``generic protocol'' to be parameterized by its outer generic parameters in addition to just the protocol \texttt{Self} type, we would get what \index{Haskell}Haskell calls a ``multi-parameter type class.'' Multi-parameter type classes introduce some complications, for example undecidable type inference~\cite{mptc}. +The first case was originally prohibited, but is now permitted as of \IndexSwift{5.a@5.10}Swift~5.10 \cite{se0404}; the non-generic declaration context acts as a namespace to which the protocol declaration is scoped, but apart from the interaction with name lookup this has no other semantic consequences. The second case is more subtle. 
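As a concrete sketch of the two cases (invented names, not an example from the text): the first compiles as of Swift 5.10 under SE-0404, while the second is still rejected.

```swift
// Case 1: a protocol nested in a non-generic declaration context.
// Permitted as of Swift 5.10 (SE-0404); `Namespace` only scopes
// the protocol's name.
struct Namespace {
    protocol Delegate {
        func didFinish()
    }
}

// The nested protocol is referenced by its qualified name.
struct Handler: Namespace.Delegate {
    func didFinish() {}
}

// Case 2: a protocol nested in a generic declaration context.
// Still an error; parameterizing the protocol by T would amount to
// a multi-parameter type class.
// struct Box<T> {
//     protocol Observer {}
// }
```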
If we were to allow a ``generic protocol'' to be parameterized by its outer generic parameters in addition to just the protocol \texttt{Self} type, we would get what \index{Haskell}Haskell calls a ``multi-parameter type class.'' Multi-parameter type classes introduce some complications, for example undecidable type inference~\cite{mptc}. \section{Source Code Reference}\label{substmapsourcecoderef} @@ -712,7 +582,7 @@ \section{Source Code Reference}\label{substmapsourcecoderef} \IndexSource{type substitution} \apiref{Type}{class} -See also Section~\ref{typesourceref}. +See also \SecRef{typesourceref}. \begin{itemize} \item \texttt{subst()} applies a substitution map to this type and returns the substituted type. \end{itemize} @@ -720,7 +590,7 @@ \section{Source Code Reference}\label{substmapsourcecoderef} \index{declaration context} \IndexSource{context substitution map} \apiref{TypeBase}{class} -See also Section~\ref{typesourceref} and Section~\ref{genericsigsourceref}. +See also \SecRef{typesourceref} and \SecRef{genericsigsourceref}. \begin{itemize} \item \texttt{getContextSubstitutionMap()} returns this type's context substitution map with respect to the given \texttt{DeclContext}. \end{itemize} @@ -750,7 +620,7 @@ \section{Source Code Reference}\label{substmapsourcecoderef} \item \texttt{empty()} answers if this is the empty substitution map; this is the logical negation of the \texttt{bool} implicit conversion. \item \texttt{getGenericSignature()} returns the substitution map's input generic signature. \item \texttt{getReplacementTypes()} returns an array of \texttt{Type}. -\item \texttt{hasAnySubstitutableParams()} answers if the input generic signature contains at least one generic parameter not fixed to a concrete type; that is, it must be non-empty and not fully concrete (see the \texttt{areAllParamsConcrete()} method of \texttt{GenericSignatureImpl} from Section~\ref{genericsigsourceref}). 
+\item \texttt{hasAnySubstitutableParams()} answers if the input generic signature contains at least one generic parameter not fixed to a concrete type; that is, it must be non-empty and not fully concrete (see the \texttt{areAllParamsConcrete()} method of \texttt{GenericSignatureImpl} from \SecRef{genericsigsourceref}). \end{itemize} Recursive properties computed from replacement types: \begin{itemize} @@ -763,11 +633,11 @@ \section{Source Code Reference}\label{substmapsourcecoderef} \item \texttt{isCanonical()} answers if the replacement types and conformances stored in this substitution map are canonical. \item \texttt{getCanonical()} constructs a new substitution map by canonicalizing the replacement types and conformances of this substitution map. \end{itemize} -Composing substitution maps (Section~\ref{submapcomposition}): +Composing substitution maps (\SecRef{submapcomposition}): \begin{itemize} \item \texttt{subst()} applies another substitution map to this substitution map, producing a new substitution map. \end{itemize} -Two overloads of the \texttt{get()} static method are defined for constructing substitution maps (Section~\ref{buildingsubmaps}). +Two overloads of the \texttt{get()} static method are defined for constructing substitution maps (\SecRef{buildingsubmaps}). \IndexSource{get substitution map} \medskip @@ -855,7 +725,7 @@ \section{Source Code Reference}\label{substmapsourcecoderef} \index{generic signature} \IndexSource{identity substitution map} \apiref{GenericSignature}{class} -See also Section~\ref{genericsigsourceref}. +See also \SecRef{genericsigsourceref}. \begin{itemize} \item \texttt{getIdentitySubstitutionMap()} returns the substitution map that replaces each generic parameter with itself. 
diff --git a/docs/Generics/chapters/symbols-terms-and-rules.tex b/docs/Generics/chapters/symbols-terms-and-rules.tex index c097dd69bd822..0c68974d3fe44 100644 --- a/docs/Generics/chapters/symbols-terms-and-rules.tex +++ b/docs/Generics/chapters/symbols-terms-and-rules.tex @@ -33,9 +33,9 @@ \chapter{Symbols, Terms and Rules}\label{symbols terms rules} \midrule conformance: &$\ConfReq{T}{P}$&$t.\protosym{P}\Rightarrow t$\\ layout: &$\ConfReq{T}{AnyObject}$&$t.\layoutsym{AnyObject}\Rightarrow t$\\ -superclass: &$\FormalReq{T:\ C}$&$t.\supersym{\texttt{C}}\Rightarrow t$\\ -concrete type: &$\FormalReq{T == C}$&$t.\concretesym{\texttt{C}}\Rightarrow t$\\ -same-type: &$\FormalReq{T == U}$&$u\Rightarrow t$ (if $t<u$)\\ \begin{gather*} \concretesym{\texttt{Dictionary<\ttgp{0}{0}, \ttgp{0}{1}>}\colon\texttt{P};\;\sigma_0 := \ldots,\, \sigma_1 := \ldots}\\ \concretesym{\texttt{(\ttgp{0}{0}) -> \ttgp{0}{0}}\colon\texttt{P};\;\sigma_0 := \ldots} \end{gather*} -The substitution terms also satisfy a mutual compatibility condition: either all terms must begin with (possibly different) generic parameter symbols, or all terms must belong to the same protocol domain. These invariants are established by the construction algorithm, and preserved when we manipulate concrete type symbols in Chapter~\ref{concrete conformances}. +The substitution terms also satisfy a mutual compatibility condition: either all terms must begin with (possibly different) generic parameter symbols, or all terms must belong to the same protocol domain. These invariants are established by the construction algorithm, and preserved when we manipulate concrete type symbols in \ChapRef{concrete conformances}. \section{Term Reduction}\label{term reduction} @@ -355,7 +355,7 @@ \section{Term Reduction}\label{term reduction} Hello world. \end{algorithm} -Our implementation explicitly stores the source and destination terms in a critical pair, as well as the rewrite path itself.
As we will learn in Section~\ref{completion sourceref}, we use a compressed representation for \index{rewrite path!representation}rewrite paths in memory, where a \index{rewrite step!representation}rewrite step $x(u\Rightarrow v)y$ only stores the \emph{length} of each \index{whiskering}whisker $x$ and $y$ and not the actual terms $x$ and $y$. Thus, we must store the source and destination terms separately since they cannot be recovered from the rewrite path alone. +Our implementation explicitly stores the source and destination terms in a critical pair, as well as the rewrite path itself. As we will learn in \SecRef{completion sourceref}, we use a compressed representation for \index{rewrite path!representation}rewrite paths in memory, where a \index{rewrite step!representation}rewrite step $x(u\Rightarrow v)y$ only stores the \emph{length} of each \index{whiskering}whisker $x$ and $y$ and not the actual terms $x$ and $y$. Thus, we must store the source and destination terms separately since they cannot be recovered from the rewrite path alone. \section{The Reduction Order}\label{reduction order} @@ -365,9 +365,9 @@ \section{The Reduction Order}\label{reduction order} \IndexDefinition{reduction order!in requirement machine}% -Now that we can build terms from type parameters, we need a way to compare those terms so that we can proceed to define rewrite rules. We wish to use our rewrite system to compute reduced types, which are previously defined with the type parameter order of Section~\ref{typeparams}, so we would expect that if we take two type parameters \texttt{T} and \texttt{U} such that $\texttt{T}<\texttt{U}$ by the type parameter order, then the corresponding terms $t$ and $u$ satisfy $t<u$ in the reduction order. \begin{algorithm}[Protocol reduction order]\label{protocol reduction order} Takes two protocols \texttt{P} and \texttt{Q} as input, and returns one of ``$<$'', ``$>$'' or ``$=$'' as output. \begin{enumerate} \item Compute the protocol inheritance closures of \texttt{P} and \texttt{Q}, and let $n_\texttt{P}$ and $n_\texttt{Q}$ be the number of elements in each set.
\item If $n_\texttt{P}>n_\texttt{Q}$, return ``$<$'' (so \texttt{P} precedes \texttt{Q} if \texttt{P} inherits from \emph{more} protocols than \texttt{Q}). \item If $n_\texttt{P}<n_\texttt{Q}$, return ``$>$''. -\item If $n_\texttt{P}=n_\texttt{Q}$, compare the protocols using Algorithm~\ref{linear protocol order}. +\item If $n_\texttt{P}=n_\texttt{Q}$, compare the protocols using \AlgRef{linear protocol order}. \end{enumerate} \end{algorithm} \begin{example} @@ -420,7 +420,7 @@ \section{The Reduction Order}\label{reduction order} \draw [arrow] (RandomAccessCollection) -- (BidirectionalCollection); \end{tikzpicture} \end{quote} -The protocol symbols order as follows---the rows are shown in decreasing size of protocol inheritance closure, and within each row our algorithm falls back to the original protocol order of Algorithm~\ref{linear protocol order}: +The protocol symbols order as follows---the rows are shown in decreasing size of protocol inheritance closure, and within each row our algorithm falls back to the original protocol order of \AlgRef{linear protocol order}: \begin{gather*} \protosym{RandomAccessCollection}\\ {} < \protosym{BidirectionalCollection}<\protosym{MutableCollection}\\ @@ -446,7 +446,7 @@ \section{The Reduction Order}\label{reduction order} Takes two superclass symbols, two concrete type symbols, or two concrete conformance symbols as input, and returns one of ``$<$'', ``$>$'', ``$=$'' or \index{$\bot$}``$\bot$'' as output. \begin{enumerate} \item (Invariant) We assume the two symbols already have the same kind; the case of comparing a superclass symbol against a concrete type symbol, for example, is handled by the symbol order below. -\item (Concrete conformance) If the two symbols are both concrete conformance symbols, first compare their protocols using Algorithm~\ref{protocol reduction order}. Return the result if it is ``$<$'' or ``$>$''. Otherwise, keep going.
+\item (Concrete conformance) If the two symbols are both concrete conformance symbols, first compare their protocols using \AlgRef{protocol reduction order}. Return the result if it is ``$<$'' or ``$>$''. Otherwise, keep going. \item (Incomparable) If the two symbols store a different pattern type (by canonical type equality), or a different number of substitution terms, return ``$\bot$''. \item (Initialize) Let $\{\sigma_i\}$ and $\{\uptau_i\}$ be the substitution terms of our two symbols, with $0\le i<\texttt{N}$ for some \texttt{N}. Let $i:=0$. \item (Equal) If $i=\texttt{N}$, return ``$=$''. @@ -475,26 +475,26 @@ \section{The Reduction Order}\label{reduction order} \end{quote} Otherwise, both symbols have the same kind, so we handle each kind by comparing structural components. \index{generic parameter symbol}% -\item (Generic parameter) If both symbols are generic parameter symbols, compare them as in Algorithm~\ref{generic parameter order} (which was defined on canonical generic parameter types, but generic parameter symbols have the same representation as a depth/index pair). +\item (Generic parameter) If both symbols are generic parameter symbols, compare them as in \AlgRef{generic parameter order} (which was defined on canonical generic parameter types, but generic parameter symbols have the same representation as a depth/index pair). \index{name symbol}% \index{identifier}% \item (Name) If both symbols are name symbols, compare the stored identifiers lexicographically. \index{protocol symbol}% -\item (Protocol) If both symbols are protocol symbols, compare them by Algorithm~\ref{protocol reduction order}. +\item (Protocol) If both symbols are protocol symbols, compare them by \AlgRef{protocol reduction order}. \index{layout symbol}% \item (Layout) If both symbols are layout symbols, we use a partial order that we won't define here. 
For the purposes of this book, \texttt{AnyObject} is the only layout constraint that can be written in the language, so assume the symbols are equal. Return ``$=$''. -\item (Concrete) If both symbols are superclass symbols, concrete type symbols or concrete conformance symbols, compare them using Algorithm~\ref{concrete reduction order}. +\item (Concrete) If both symbols are superclass symbols, concrete type symbols or concrete conformance symbols, compare them using \AlgRef{concrete reduction order}. \index{associated type symbol}% \item (Associated type) Otherwise, we have two associated type symbols $\assocsym{P}{A}$ and $\assocsym{Q}{B}$. First, compare the identifiers \texttt{A} and \texttt{B} lexicographically. Return the result if it is ``$<$'' or ``$>$''. -\item (Same name) If both associated types have the same name, compare \texttt{P} with \texttt{Q} using Algorithm~\ref{protocol reduction order} and return the result. +\item (Same name) If both associated types have the same name, compare \texttt{P} with \texttt{Q} using \AlgRef{protocol reduction order} and return the result. \end{enumerate} \end{algorithm} The order among different symbol kinds in Step~1 looks arbitrary, but it has a certain significance, even though we cannot fully explain everything yet: \begin{itemize} \item Protocol symbols must precede associated type symbols, so that the term for the protocol \verb|Self| type precedes the term for a dependent member type in the same protocol, \verb|Self.[P]A|. -\item Associated type symbols must precede name symbols, in the same way that bound dependent member types precede unbound dependent member types in the type parameter order. We will see why in Section~\ref{tietze transformations}. +\item Associated type symbols must precede name symbols, in the same way that bound dependent member types precede unbound dependent member types in the type parameter order. We will see why in \SecRef{tietze transformations}.
\item Superclass symbols must precede concrete type symbols, because we must maintain compatibility with the old \texttt{GenericSignatureBuilder} minimization algorithm in a certain edge case (Section~TODO). -\item Concrete conformances must precede protocol symbols to ensure correct minimization when a type parameter is subject to both a same-type and conformance requirement (Section~\ref{minimal conformances}). +\item Concrete conformances must precede protocol symbols to ensure correct minimization when a type parameter is subject to both a same-type and conformance requirement (\SecRef{minimal conformances}). \end{itemize} \index{shortlex order}% @@ -503,17 +503,17 @@ \section{The Reduction Order}\label{reduction order} \IndexDefinition{weight function}% The final step is to extend the partial order on symbols defined above to a partial order on terms, which gives us the reduction order used in our rewrite system to compute reduced terms. -We modify the standard shortlex order from Algorithm~\ref{shortlex} as follows. We compare the number of name symbols appearing in each term first, so a \emph{longer} term may precede a shorter term in the reduction order, as long as the number of name symbols decreases; but if the number of name symbols remains the same, we fall through to the standard shortlex order. This is called a \emph{weighted shortlex order} where our \emph{weight function} counts the number of name symbols in each term. In general, this works for any weight function $w\colon A^*\rightarrow\mathbb{N}$ such that $w(xy)=w(x)+w(y)$ for all $x$, $y\in A^*$. The next section shows how the weighted shortlex order is required for correct modeling of Swift's protocol type aliases. +We modify the standard shortlex order from \AlgRef{shortlex} as follows. 
We compare the number of name symbols appearing in each term first, so a \emph{longer} term may precede a shorter term in the reduction order, as long as the number of name symbols decreases; but if the number of name symbols remains the same, we fall through to the standard shortlex order. This is called a \emph{weighted shortlex order} where our \emph{weight function} counts the number of name symbols in each term. In general, this works for any weight function $w\colon A^*\rightarrow\mathbb{N}$ such that $w(xy)=w(x)+w(y)$ for all $x$, $y\in A^*$. The next section shows how the weighted shortlex order is required for correct modeling of Swift's protocol type aliases. \begin{algorithm}[Term reduction order]\label{rqm reduction order} Takes two terms $t$ and $u$ as input, and returns one of ``$<$'', ``$>$'', ``$=$'' or \index{$\bot$}``$\bot$'' as output. \begin{enumerate} \item (Weight) Let $w(t)$ and $w(u)$ be the number of name symbols appearing in $t$ and $u$, respectively. \item (Less) If $w(t)<w(u)$, return ``$<$''. \item (Greater) If $w(t)>w(u)$, return ``$>$''. -\item (Shortlex) Otherwise $w(t)=w(u)$, so we compare the terms using Algorithm~\ref{shortlex} and return the result. +\item (Shortlex) Otherwise $w(t)=w(u)$, so we compare the terms using \AlgRef{shortlex} and return the result. \end{enumerate} \end{algorithm} -We previously showed that the standard shortlex order satisfies the conditions of a reduction order from Definition~\ref{reduction order def}. Now we claim that the weighted shortlex order is also a reduction order. +We previously showed that the standard shortlex order satisfies the conditions of a reduction order from \DefRef{reduction order def}. Now we claim that the weighted shortlex order is also a reduction order. \begin{proposition} Let \index{natural numbers}$w\colon A^*\rightarrow\mathbb{N}$ be a weight function satisfying $w(xy)=w(x)+w(y)$.
Then the weighted shortlex order induced by $w$ is \index{translation-invariant relation}translation-invariant and \index{well-founded order}well-founded. \end{proposition} \begin{proof} @@ -530,7 +530,7 @@ \section{The Reduction Order}\label{reduction order} \[\cdots\] -The last three methods take the pattern type as a \texttt{CanType} and the substitution terms as an \texttt{ArrayRef<Term>}. The \texttt{RewriteContext::getSubstitutionSchemaFromType()} method implements Algorithm~\ref{concretesymbolcons} to build the pattern type and substitution terms from an arbitrary \texttt{Type}. Note that the pattern type is always a canonical type, so type sugar is not preserved when round-tripped through the Requirement Machine, for example when building a new generic signature. +The last three methods take the pattern type as a \texttt{CanType} and the substitution terms as an \texttt{ArrayRef<Term>}. The \texttt{RewriteContext::getSubstitutionSchemaFromType()} method implements \AlgRef{concretesymbolcons} to build the pattern type and substitution terms from an arbitrary \texttt{Type}. Note that the pattern type is always a canonical type, so type sugar is not preserved when round-tripped through the Requirement Machine, for example when building a new generic signature. Taking symbols apart: \begin{itemize} @@ -929,7 +929,7 @@ \subsection*{Symbols} Comparing symbols: \begin{itemize} \item \texttt{operator==} tests for equality. -\item \texttt{compare()} is the symbol reduction order (Algorithm~\ref{symbol reduction order}). The return type of \texttt{Optional<int>} encodes the result as follows: +\item \texttt{compare()} is the symbol reduction order (\AlgRef{symbol reduction order}). The return type of \texttt{Optional<int>} encodes the result as follows: \begin{itemize} \item \verb|None|: $\bot$ \item \verb|Some(0)|: $=$ @@ -959,7 +959,7 @@ \subsection*{Symbols} Comparing terms: \begin{itemize} \item \texttt{operator==} tests for equality. -\item \texttt{compare()} is the term reduction order (Algorithm~\ref{rqm reduction order}).
The return type of \texttt{Optional<int>} encodes the result in the same manner as \texttt{Symbol::compare()}. +\item \texttt{compare()} is the term reduction order (\AlgRef{rqm reduction order}). The return type of \texttt{Optional<int>} encodes the result in the same manner as \texttt{Symbol::compare()}. \end{itemize} Debugging: \begin{itemize} @@ -981,10 +981,10 @@ \subsection*{Symbols} \apiref{rewriting::RewriteSystem}{class} \IndexSource{rewrite system!in requirement machine} -See also Section~\ref{completion sourceref}. +See also \SecRef{completion sourceref}. \begin{itemize} -\item \texttt{simplify()} reduces a term using Algorithm~\ref{term reduction trie algo}. +\item \texttt{simplify()} reduces a term using \AlgRef{term reduction trie algo}. \end{itemize} \apiref{rewriting::RewriteContext}{class} @@ -996,10 +996,10 @@ \subsection*{Symbols} \apiref{rewriting::Trie}{template class} \IndexSource{trie} -See also Section~\ref{completion sourceref}. +See also \SecRef{completion sourceref}. \begin{itemize} -\item \texttt{insert()} inserts an entry using Algorithm~\ref{trie insert algo}. -\item \texttt{find()} finds an entry using Algorithm~\ref{trie lookup algo}. +\item \texttt{insert()} inserts an entry using \AlgRef{trie insert algo}. +\item \texttt{find()} finds an entry using \AlgRef{trie lookup algo}.
\end{itemize} -\end{document} \ No newline at end of file +\end{document} diff --git a/docs/Generics/chapters/type-resolution.tex b/docs/Generics/chapters/type-resolution.tex index acb11ed65d438..6899ac3e67b78 100644 --- a/docs/Generics/chapters/type-resolution.tex +++ b/docs/Generics/chapters/type-resolution.tex @@ -4,891 +4,800 @@ \chapter{Type Resolution}\label{typeresolution} -\IndexDefinition{type representation} -\IndexDefinition{type resolution} -\index{type} -\IndexDefinition{type resolution context} -\IndexDefinition{type resolution flags} -\IndexDefinition{type resolution options} -\IndexDefinition{type resolution stage} -Recall from Chapter~\ref{types} that \emph{type resolution} transforms a type representation read by the \index{parser}parser into a semantic type understood by the type checker. We're now going to take a closer look at this process. - -\index{tree} -Type representations have a recursive tree structure, and type resolution proceeds recursively, first resolving the child nodes of a type representation into types, before forming a more complex type from its constituent parts. The leaf nodes of a type representation are the non-generic \emph{identifier type representations}, such as \texttt{Int}. These name a type declaration and are resolved by querying name lookup. The interior nodes of a type representation include function type representations and tuple type representations, as well as generic identifier type representations which store a list of generic arguments. 
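The recursive resolution scheme described above (resolve the children of a type representation, then form the compound type) can be sketched as a toy model. This is a hedged illustration in Python, not the compiler's `TypeRepr` classes; the tuple node shapes and the `LOOKUP` table stand in for the real representations and name lookup.

```python
# Toy model: leaves are identifier type representations resolved by a
# name-lookup table; interior nodes (tuple and function type
# representations) are rebuilt from their resolved children.

LOOKUP = {"Int": "Int", "String": "String"}  # stand-in for name lookup

def resolve(repr_node):
    kind = repr_node[0]
    if kind == "ident":                      # leaf: consult name lookup
        name = repr_node[1]
        if name not in LOOKUP:
            raise NameError(f"cannot find type '{name}' in scope")
        return LOOKUP[name]
    if kind == "tuple":                      # interior: children first
        return "(" + ", ".join(resolve(c) for c in repr_node[1]) + ")"
    if kind == "function":
        params, result = repr_node[1], repr_node[2]
        args = ", ".join(resolve(p) for p in params)
        return f"({args}) -> {resolve(result)}"
    raise ValueError(kind)

# The representation of `(Int, (String) -> Int)':
tree = ("tuple", [("ident", "Int"),
                  ("function", [("ident", "String")], ("ident", "Int"))])
print(resolve(tree))  # (Int, (String) -> Int)
```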
- -\index{non-escaping function type} -\index{escaping function type} -\index{source location} -\index{declaration context} -\index{unqualified lookup} -The type resolution procedure receives a type representation as input, together with some additional information packaged into a \emph{type resolution context}: +\IndexDefinition{type resolution}\lettrine{T}{ype resolution} transforms the syntactic \IndexDefinition{type representation}type representations produced by the \index{parser}parser into the semantic \index{type}types of \ChapRef{types}. Type representations have a \index{tree}tree structure. The leaf nodes are \emph{identifier type representations} without generic arguments, such as \texttt{Int}. Nodes with children include \emph{member type representations} which recursively store a base type representation, such as \texttt{T.Element}. There are also type representations for function types, metatypes, tuples, and existentials; they have children and follow the same shape as the corresponding kind of type. Finally, identifier and member type representations may have children in the form of generic arguments, such as \texttt{Array<Int>}. One of our main goals in this chapter is to understand how type resolution forms a generic nominal type from a reference to a type declaration and a list of generic arguments. + +Type resolution builds the \index{resolved type!z@\igobble|see{type resolution}}\emph{resolved type} by consulting the type representation itself, as well as contextual information describing where the type representation appears: \begin{enumerate} -\item The \emph{declaration context} where the type representation appears. Any identifiers appearing in the type representation are resolved by querying unqualified name lookup from this declaration context.
(Recall from Section~\ref{name lookup} that the declaration context alone is actually not enough, because unqualified name lookup also needs a source location, but this is stored in the type representation itself.) -\item A set of \emph{type resolution options} that encode the semantic position where the type representation appears, consisting of a \emph{type resolution context} and \emph{type resolution flags}. -\item A \emph{type resolution stage}, to select between \emph{structural} and \emph{interface} type resolution. +\item Identifier type representations are resolved by unqualified lookup of their identifier; this depends on the type representation's source location. For example, a type representation might name a generic parameter declared in the current scope. +\item Certain type representations are also resolved based on their semantic position. For example, a function type representation appearing in parameter position resolves to a \index{non-escaping function type}non-escaping function type unless annotated with the \texttt{@escaping} attribute; in any other position, a function type representation resolves to an \index{escaping function type}escaping function type. This behavior was introduced in \IndexSwift{3.0}Swift~3~\cite{se0103}. \end{enumerate} -You can see that type resolution depends on context in two different ways: +We encode contextual information in the \IndexDefinition{type resolution context}\emph{type resolution context}, consisting of the below: \begin{enumerate} -\item Name lookup depends on the lexical scope containing the type representation; for example, a type representation might name a generic parameter declaration scoped to the current function. -\item The semantic position of a type representation also influences the outcome. 
One example is that a function type representation appearing in the parameter list of a function declaration or another function type resolves into a non-escaping function type, unless it was annotated with the \texttt{@escaping} attribute. Elsewhere, such as the type of a variable declaration or in the return type of a function, a function type is always \texttt{@escaping}. This behavior was introduced in Swift 3~\cite{se0103}. +\item The \index{declaration context}\emph{declaration context} where the type representation is written. This is passed to unqualified lookup for identifier type representations. Fully resolving \index{dependent member type}dependent member types also requires the \index{generic signature}generic signature of this declaration context. +\item A \IndexDefinition{type resolution context}\emph{type resolution context} and a set of \IndexDefinition{type resolution flags}\emph{type resolution flags} together encode the semantic position of the type representation inside its declaration context. +\item A \IndexDefinition{type resolution stage}\emph{type resolution stage} specifies if type resolution may \index{generic signature query}query the generic signature of this declaration context. We use type resolution to build the generic signature, but type resolution uses the generic signature to resolve dependent member types and check generic arguments. The staged resolution breaks the \index{request evaluator}\index{request cycle}request cycle by delaying certain semantic checks until the generic signature is available. \end{enumerate} -\paragraph{Type resolution stage} -\index{dependent member type}% -\index{bound dependent member type}% -\index{unbound dependent member type}% -\index{identifier}% -The \emph{type resolution stage} specifies the \emph{temporal} context for type resolution, where some type representations are first resolved before a generic signature has been built for their generic context, and then again after. 
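The escaping rule described above is a pure function of the type representation's attributes and its semantic position, which makes it easy to state as a toy model. The following Python sketch encodes only the rule as given in the text; the flag names are invented for illustration and do not correspond to the compiler's actual type resolution options.

```python
# Toy model of the Swift 3 escaping rule: a function type representation
# in parameter position resolves as non-escaping unless annotated with
# @escaping; in any other position it resolves as escaping.

def resolved_escapingness(attrs, in_parameter_position):
    """Return "escaping" or "noescape" for a function type repr."""
    if "@escaping" in attrs or not in_parameter_position:
        return "escaping"
    return "noescape"

print(resolved_escapingness(set(), True))          # noescape
print(resolved_escapingness(set(), False))         # escaping
print(resolved_escapingness({"@escaping"}, True))  # escaping
```

The same written type thus resolves differently depending on where it appears, which is exactly why the semantic position must be threaded through type resolution.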
- -Recall from Section~\ref{fundamental types} that there are two kinds of dependent member types: unbound, and bound. Both kinds store a base type, which is a generic parameter type or another dependent member type. Unbound dependent member types also store an identifier naming an associated type, and bound dependent member types store a reference to an associated type declaration. - -\IndexDefinition{structural resolution stage} -For example, if \texttt{T} is a generic parameter type subject to the conformance requirement \verb|T: Sequence|, the type representation \texttt{T.Element} resolves in one of two ways, depending on the type resolution stage: +The two type resolution stages are \emph{structural} resolution stage and \emph{interface} resolution stage; we say we resolve a type \emph{in} the given stage: \begin{enumerate} -\item In the structural resolution stage, our type representation resolves to an unbound dependent member type, also written as \texttt{T.Element}, which stores the identifier ``\texttt{Element}.'' - -Type resolution does not have any knowledge of the requirements imposed on \texttt{T} in the structural resolution stage, because it is not permitted to ask for the generic signature of the type representation's declaration context. It simply constructs an unbound dependent member type with the identifier from the type representation. -\item In the interface resolution stage, our type representation instead resolves to a bound dependent member type \texttt{T.[Sequence]Element}, which stores a reference to the \texttt{Element} associated type declaration of the \texttt{Sequence} protocol. +\item \index{structural resolution stage}Structural resolution stage does not use the current declaration context's generic signature, and so it doesn't validate type parameters or check \index{generic arguments}generic arguments. 
-\Index{getRequiredProtocols()@\texttt{getRequiredProtocols()}} -\index{qualified lookup} -\IndexDefinition{interface resolution stage} -Type resolution is allowed to look at the generic signature of the current declaration context in the interface resolution stage. With this generic signature, it issues the \verb|getRequiredProtocols()| generic signature query (Section~\ref{genericsigqueries}) to get a list of all protocol conformance requirements for the base type \texttt{T}. This list of protocols is then handed off to qualified name lookup, which finds the associated type declaration \texttt{Element} of the \texttt{Sequence} protocol. +\item \index{interface resolution stage}Interface resolution stage requests the current context's generic signature first, and issues generic signature queries against this signature to perform semantic checks. \end{enumerate} -\index{generic signature request} -\index{interface type request} -\Index{where clause@\texttt{where} clause} -\index{inheritance clause} -\index{interface type} -\index{value declaration} -This design breaks the inherent circularity between type resolution when building a generic signature, and type resolution when computing the interface type of a declaration, which depends on the generic signature to perform semantic checks: -\begin{itemize} -\item The structural resolution stage is used by the \Request{generic signature request} to resolve type representations appearing in generic parameter inheritance clauses, trailing \texttt{where} clauses, as well as function and subscript parameter lists (these feed into requirement inference, which you will meet in Section~\ref{requirementinference}). +The \index{generic signature request}\Request{generic signature request} resolves various type representations so that it can build a generic signature from user-written requirements, as we will see in \ChapRef{building generic signatures}. This must be done in the structural resolution stage. 
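The two-stage resolution of a member type such as \texttt{T.Element} can be sketched as a toy model: the structural stage records only the identifier, while the interface stage consults the generic signature to bind the associated type. This Python sketch is illustrative only; the `REQUIREMENTS` table stands in for a `getRequiredProtocols()`-style generic signature query, and the string forms mimic the book's notation.

```python
# Toy model of staged resolution for `T.Element' under `T: Sequence'.

REQUIREMENTS = {"T": ["Sequence"]}                # conformance requirements
ASSOCIATED_TYPES = {"Sequence": ["Element", "Iterator"]}

def resolve_member(base, name, stage):
    if stage == "structural":
        # No generic signature available: record the identifier only,
        # producing an unbound dependent member type.
        return f"{base}.{name}"
    assert stage == "interface"
    # Generic signature query: which protocols must `base' conform to?
    for proto in REQUIREMENTS.get(base, []):
        if name in ASSOCIATED_TYPES.get(proto, []):
            return f"{base}.[{proto}]{name}"      # bound dependent member type
    raise TypeError(f"'{name}' is not a member type of '{base}'")

print(resolve_member("T", "Element", "structural"))  # T.Element
print(resolve_member("T", "Element", "interface"))   # T.[Sequence]Element
```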
-\item The interface resolution stage is used by the \Request{interface type request} to resolve type representations that form the interface type of a value declaration. In this way, the \Request{interface type request} depends on the \Request{generic signature request}. -\end{itemize} -As the next example shows, there is some overlap between the type representations resolved by the two requests, with some type representations getting resolved twice in the two type resolution stages. +The \index{interface type request}\Request{interface type request} resolves type representations in the interface resolution stage, to form a \index{value declaration}value declaration's \index{interface type}interface type from semantically well-formed types. Thus, when building the interface type of a declaration, we will evaluate the \Request{generic signature request}. -\begin{example} -The return type of this function declaration contains a member type: -\begin{Verbatim} -func union(_: T, _: U) -> Set - where T.Element == U.Element -\end{Verbatim} -The \Request{generic signature request} resolves various type representations appearing above in the structural resolution stage: +Structural resolution stage differs from interface resolution stage in two respects: \begin{itemize} -\item The inheritance clause entry \texttt{Sequence} of \texttt{T}; -\item The inheritance clause entry \texttt{Sequence} of \texttt{U}; -\item The left hand side of the same-type requirement, \texttt{T.Element}; -\item The right hand side of the same-type requirement, \texttt{U.Element}; -\item The types of the function's parameters, \texttt{T} and \texttt{U}; -\item The return type of the function, \texttt{Set}. +\item References to associated types resolve to \index{unbound dependent member type}unbound dependent member types in the structural resolution stage, and references to invalid member types are not \index{diagnostic!invalid member type}diagnosed. We describe this in \SecRef{member type repr}. 
+\item Generic arguments are not checked to satisfy the requirements of a generic nominal type in the structural resolution stage. We'll describe checking generic arguments in \SecRef{checking generic arguments}. \end{itemize} -The type representations \texttt{T.Element} and \texttt{U.Element} resolve to unbound member types during structural resolution. Requirement inference also introduces the conformance requirement \verb|T.Element: Hashable| from the application of \texttt{Set<>} to \texttt{T.Element} in the return type. -All of this information feeds into requirement minimization, which constructs a generic signature with a minimal, reduced list of requirements consisting of bound dependent member types: -\begin{quote} -\begin{verbatim} - -\end{verbatim} -\end{quote} -The \Request{interface type request} resolves the parameter and return types of our function again, this time in the interface resolution stage. The parameter types remain unchanged, but the return type becomes \texttt{Set}. +All invocations of type resolution ``downstream'' of the generics implementation must use the interface resolution stage, to not admit invalid types. To emit the full suite of \index{diagnostic!type resolution}diagnostics for type representations resolved in the structural stage, in particular inheritance clauses and trailing \texttt{where} clauses of generic declarations, the \index{type check source file request}\Request{type check source file request} revisits these type representations and resolves them again in the interface resolution stage. 
-\index{generic function type} -The interface type of \texttt{merge()} is a generic function type, constructed from the generic signature, parameter types and return type: -\begin{quote} -\begin{verbatim} - (S1, S2) -> Set -\end{verbatim} -\end{quote} -\end{example} +While the structural resolution stage skips some semantic checks, it can still produce diagnostics; name lookup can fail to resolve an identifier, and certain simpler semantic invariants are still enforced, such as checking that the \emph{number} of generic arguments is correct. For these reasons, care must be taken to not emit the same diagnostics twice if the same invalid type representation is resolved in both stages. -\paragraph{Practical considerations} -While structural resolution skips many semantic checks, there is some overlap between the work performed by the two stages and care must be taken to not emit the same diagnostics twice if the type representation is invalid. To help deal with this, each type representation has an ``invalid'' flag. After emitting a diagnostic, the invalid flag should be set and an error type returned. Resolving an invalid type representation again will short-circuit and immediately return an error type, skipping the actual type resolution process. Note that a type representation can only transition from a valid state to invalid, and never back again. +The general mechanism for this is to have each type representation store an \index{invalid type representation}``invalid'' flag; after diagnosing an error, type resolution sets the invalid flag and returns an \index{error type}error type as the resolved type. If an invalid type representation is resolved again, type resolution immediately returns another error type without visiting the type representation or emitting any new diagnostics. A type representation can only transition from valid to invalid, and never back again. 
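The ``invalid'' flag mechanism can be sketched as a toy model showing why a type representation resolved in both stages diagnoses only once. This Python sketch is illustrative; the class and field names are invented and do not match the compiler's `TypeRepr`.

```python
# Toy model of the one-way "invalid" flag: the first failed resolution
# diagnoses and marks the representation invalid; later resolutions
# short-circuit to an error type without diagnosing again.

diagnostics = []

class TypeRepr:
    def __init__(self, name, known_types):
        self.name = name
        self.known_types = known_types
        self.invalid = False          # one-way transition: valid -> invalid

    def resolve(self):
        if self.invalid:
            return "<<error type>>"   # short-circuit, no duplicate diagnostic
        if self.name not in self.known_types:
            diagnostics.append(f"cannot find type '{self.name}' in scope")
            self.invalid = True
            return "<<error type>>"
        return self.name

node = TypeRepr("Flaot", known_types={"Float", "Int"})
node.resolve()      # structural stage: diagnoses once
node.resolve()      # interface stage: short-circuits
print(diagnostics)  # exactly one diagnostic
```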
-If you use the structural resolution stage, keep in mind that unbound dependent member types are not universally accepted by the type checker. For example, generic signature queries accept unbound dependent member types as long as they correspond to a valid type parameter for the given generic signature. On the other hand, type substitution will assert if the original type contains an unbound dependent member type. +\section{Identifier Type Representations}\label{identtyperepr} -\Index{getReducedType()@\texttt{getReducedType()}} -\Index{isValidTypeParameter()@\texttt{isValidTypeParameter()}} -\index{type parameter order} -\index{bound dependent member type} -\index{unbound dependent member type} -Recall that the linear order on type parameters was defined such that bound dependent member types precede unbound dependent member types, in Section~\ref{typeparams}. In particular, if you encounter an unbound dependent member type and need to convert it to bound, the \texttt{getReducedType()} generic signature query will do the trick. To determine whether an unbound dependent member type corresponds to a valid type parameter at all, use the \texttt{isValidTypeParameter()} generic signature query. +An \IndexDefinition{identifier type representation}\emph{identifier type representation} is a single identifier that names a type declaration in some outer scope. We find the \emph{resolved type declaration} via \index{unqualified lookup}unqualified lookup, starting from the source location of the type representation (\SecRef{name lookup}). We then form the resolved type, which will be a nominal type, type alias type, generic parameter type, or dependent member type, depending on the type declaration's kind. We show some examples before describing the general principle. 
-\section{Identifier Type Representations}\label{identtyperepr} +\paragraph{Nominal types.} +A top-level non-generic \index{nominal type}nominal type declaration declares a single type, which we referred to as the \index{declared interface type}declared interface type in \ChapRef{decls}. Here, the type representation resolves to the \index{struct type}struct type \texttt{Int} declared by the standard library: +\begin{Verbatim} +var x: Int = ... +\end{Verbatim} -\index{generic arguments} -\IndexDefinition{identifier type representation} -An \emph{identifier type representation} consists of one or more \emph{components}, separated by dot in written syntax. Each component stores an identifier, together with an optional list of one or more generic arguments, where each generic argument is again recursively a type representation. Identifier type representations are very general; by naming different kinds of type declarations, they can resolve to nominal types, type aliases, generic parameters, and dependent member types. Examples of identifier type representations: -\begin{quote} -\begin{verbatim} -T -T.Element -Array -Foo.Bar<(Int) -> ()>.Baz -\end{verbatim} -\end{quote} -Type resolution visits the components of an identifier type representation from left to right, resolving each component to a type and then using the result as the base type for resolving the next component, and so on. If a component carries generic arguments, their type representations are resolved and checked against the generic requirements of the component's type declaration. The type of the final component becomes the resolved type of the entire identifier type representation. +If a nominal type declaration has a \index{generic parameter list}generic parameter list, it instead declares a new type for every possible list of \index{generic arguments}generic arguments. 
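+As a concrete illustration, each choice of generic arguments for the standard library's \texttt{Array} declaration names a distinct type:
+\begin{Verbatim}
+var x: Array<Int> = []     // one type declared by `Array'...
+var y: Array<String> = []  // ...another, from the same declaration
+\end{Verbatim}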
Both identifier and member type representations may contain a list of generic arguments; we discuss how generic arguments are applied and checked in \SecRef{checking generic arguments}. + +We allow the generic arguments to be omitted when the resolved type declaration's generic parameters are visible from the type representation's lexical scope; the generic \emph{arguments} of the resolved type are taken to be the generic \emph{parameters} of the resolved type declaration. In other words, the resolved type becomes the nominal type declaration's \index{declared interface type}declared interface type, without a substitution map applied. For example: +\begin{Verbatim} +struct Outer<T> { + // Interface type of `x' is `Optional<Outer<T>.Inner>' + var x: Inner? -\index{unqualified lookup} -\paragraph{Unqualified lookup} The first component refers to a type declaration in the lexical scope of the type representation's source location. Type resolution queries unqualified name lookup to find this type declaration (Section~\ref{name lookup}). + class Inner { + // Interface type of `y' is `Outer<T>' + var y: Outer<T> + } -\index{module declaration} -Unqualified lookup will return a module declaration if the first component names a module. In this case, the identifier type representation must have at least one additional component naming a type declaration inside this module. Modules can only be used as the base of a lookup, and are not first-class entities which can be referenced on their own (except from an \texttt{import} declaration, but those do not go through type resolution). An example is \texttt{Swift.Int}, which has two components, \texttt{Swift} naming the standard library module, and \texttt{Int} naming the type declaration.
+ struct GenericInner<U> { + // Return type of `f' is `Outer<T>.GenericInner<U>' + func f() -> GenericInner {} + } +} +\end{Verbatim} +\SecRef{unbound generic types} describes another special case where generic arguments can be omitted when referencing a generic nominal type. -\index{generic parameter declaration} -\IndexDefinition{generic parameter list} -Unqualified lookup will find generic parameter declarations if one of the outer declaration contexts has a generic parameter list. In this manner, if the first component names a generic parameter declaration, the type representation will resolve to a generic parameter type (if this is the only component) or a dependent member type thereof (if there is more than one component). +The next series of examples will show that a \index{substitution map}substitution map may need to be \index{type substitution}applied to the declared interface type. Recall the discussion from \SecRef{name lookup} and \ChapRef{decls}. As unqualified lookup visits each outer scope, it considers the scope kind and takes appropriate action to resolve identifiers for that kind of scope. If the scope is a nominal type declaration or extension, unqualified lookup performs a \index{qualified lookup}qualified lookup into this nominal type to search for member types of the nominal type. -\Index{where clause@\texttt{where} clause} -\index{associated type declaration} -\index{protocol declaration} -Generic parameters are always visible to unqualified lookup from inside the \texttt{where} clause of a nominal type or extension. However, member types are only visible from the \texttt{where} clause of a protocol or protocol extension (Section~\ref{protocols}). This allows the associated types of a protocol to be referenced from its \texttt{where} clause, without the explicit ``\texttt{Self.}'' prefix.
In other kinds of nominal type declarations, type representations written in the \texttt{where} clause cannot reference member types from the first component; they must be referenced as member types of the nominal type. Listing~\ref{where clause unqualified} demonstrates this behavior. +Unqualified lookup might find a type declaration that is not a direct member of an outer scope. Instead, the resolved type declaration might be a member of a \index{superclass type}superclass or conformed protocol of some outer type declaration. In this case, the declared interface type is written in terms of type parameters that are not in our current scope, so we must apply a substitution map. The substitution map's \index{input generic signature}input generic signature is that of the resolved type declaration, and its \index{output generic signature}output generic signature is the generic signature of the innermost generic declaration that contains our type representation. -\begin{listing}\captionabove{Unqualified lookup from a \texttt{where} clause}\label{where clause unqualified} +If the resolved type declaration was found in a superclass, this substitution map is the \index{superclass substitution map}\emph{superclass substitution map}, which we will describe completely in \ChapRef{classinheritance}. In the below example, unqualified lookup of \texttt{Inner} inside \texttt{Derived} finds the member of \texttt{Base}. 
The generic parameter \texttt{T} of \texttt{Base} is always \texttt{Int} in \texttt{Derived}, so the superclass substitution map we apply to members of \texttt{Base}, as seen from \texttt{Derived}, is $\SubstMap{\SubstType{T}{Int}}$: \begin{Verbatim} -// This is OK; `A' resolves to `Self.[P]A' -protocol P where A: Hashable { - associatedtype A +class Base<T> { + struct Inner {} } -// This is not OK; typealias `A' is not visible from the `where' clause -struct G where A: Hashable { - typealias A = T +class Derived: Base<Int> { + // Interface type of `x' is `Base<Int>.Inner' + var x: Inner = ... } \end{Verbatim} -\end{listing} - -\index{declared interface type} -\Index{protocol Self type@protocol \texttt{Self} type} -\Index{dynamic Self type@dynamic \texttt{Self} type} -If the first component names the innermost type declaration or one of its parent type declarations without specifying generic arguments, type resolution implicitly applies the type declaration's generic parameter types as generic arguments. In order words, the component resolves to the declared interface type of this type declaration. -The identifier \texttt{Self} serves a similar purpose since Swift 5.1~\cite{se0068}. If the innermost type declaration is a struct or enum, \texttt{Self} stands for its declared interface type. If the innermost type declaration is a class, \texttt{Self} stands for the declared interface type of the class wrapped in a dynamic \texttt{Self} type (Section~\ref{misc types}). Inside a protocol or protocol extension, \texttt{Self} is just the protocol's implicit generic parameter named \texttt{Self}, not a special case (Section~\ref{protocols}). Listing~\ref{type resolution unqualified} demonstrates some of the above behaviors.
- -\begin{listing}\captionabove{Referencing a type declaration from inside its body}\label{type resolution unqualified} +One edge case here is that if a protocol or protocol extension imposes a \index{superclass requirement}superclass requirement on \texttt{Self}, we might find a member of this superclass when resolving an identifier type representation inside our protocol or protocol extension. Swift does not support recursive superclass requirements, so the superclass bound of \texttt{Self} in this case must always be a \index{fully-concrete type}fully-concrete type. To continue our example, we can reference \texttt{Inner} from a protocol extension of \texttt{Proto}, which requires that \texttt{Self} inherit from \texttt{Derived}: \begin{Verbatim} -struct Outer { - struct Inner { - // Return type is `Outer.Inner' - func f1() -> Self {} +protocol Proto: Derived {} - // Return type is `Outer.Inner' - func f2() -> Inner {} +extension Proto { + // Return type of `f' is `Base<Int>.Inner' + func f() -> Inner {} +} +\end{Verbatim} - // Return type is `Outer' - func f3() -> Outer {} - } - - class Class { - // Return type is the dynamic Self type of `Outer.Class' - func f1() -> Self {} - - // Return type is `Outer.Class' - func f2() -> Class {} +\paragraph{Generic parameters.} +Unqualified lookup will find \index{generic parameter declaration}generic parameter declarations if any of the outer declaration contexts have \index{generic parameter list}generic parameter lists. +\begin{Verbatim} +struct G<T> { + func f<U>(...) { + var x: T = ... // Canonical type of `x' is τ_0_0 + var y: U = ... // Canonical type of `y' is τ_1_0 + } } +\end{Verbatim} +The resolved type is the declared interface type of the generic parameter declaration, which is the corresponding \index{generic parameter type}generic parameter type.
+\paragraph{The identifier Self.} +Speaking of \texttt{Self}, inside a protocol or protocol extension this refers to the \Index{protocol Self type@protocol \texttt{Self} type}implicit generic parameter named \texttt{Self}, also known as $\rT$ (\SecRef{protocols}), which we resolve like a reference to any other named generic parameter: +\begin{Verbatim} protocol Proto { - // Return type is the `Self' generic parameter of the protocol - func f1() -> Self + // Return type of `f' is the `Self' generic parameter of `Proto' + func f() -> Self } \end{Verbatim} -\end{listing} - -\index{qualified lookup} -\index{identifier} -\paragraph{Qualified lookup} Every subsequent component after the first names a member type of the previous component. There are two cases to consider; the first is where the type of the previous component is not a type parameter, in which case it must be a struct, enum or class, type, or a dynamic \texttt{Self} type wrapping a class type. The second case is where the type of the previous component is a type parameter. -When the type of the previous component is not a type parameter, member types are resolved via a qualified name lookup with the previous component's type as the base type of the lookup (Section~\ref{name lookup}). +Inside the source range of a \index{struct type}struct or \index{enum type}enum declaration or an extension thereof, \texttt{Self} is shorthand for the \index{declared interface type}declared interface type of this nominal type declaration.
This is not a generic parameter type at all, but rather a nominal type, generic or non-generic: +\begin{Verbatim} +struct Outer { + // Return type of `f' is `Outer' + func f() -> Self {} +} +\end{Verbatim} -When the type of the previous component is a type parameter, the behavior of type resolution depends on the type resolution stage: -\begin{itemize} -\item In the structural resolution stage, type resolution constructs an unbound dependent member type from the base type parameter and the current component's identifier. The lookup cannot fail; if the identifier does not name a valid member type, the error is diagnosed later, when the type representation is resolved again in the interface resolution stage. -\item In the interface resolution stage, type resolution collects a list of declarations with the \texttt{getRequiredProtocols()} and \texttt{getSuperclassBound()} generic signature queries. This list of declarations is then handed off to a qualified lookup. +Inside the source range of a class declaration or an extension of one, \texttt{Self} stands for the \Index{dynamic Self type@dynamic \texttt{Self} type}dynamic \texttt{Self} type, wrapping the declared interface type of the class: +\begin{Verbatim} +extension Outer { + class InnerClass { + // Return type of `f' is the dynamic Self type of + // `Outer.InnerClass' + func f() -> Self {} + } +} +\end{Verbatim} -Conceptually, this means that associated types, protocol type aliases and member types of the superclass bound are all visible as member types of a type parameter. -\end{itemize} +The dynamic \texttt{Self} type was previously described in \SecRef{misc types}. Historically, Swift only had \texttt{Self} in protocols and dynamic \texttt{Self} in classes, and the latter could only appear in the return type of a method. \IndexSwift{5.1}Swift~5.1 introduced the ability to state dynamic \texttt{Self} in more positions, and also refer to ``static'' \texttt{Self} inside struct and enum declarations~\cite{se0068}. 
-\index{type witness} -\index{improper dependent member type} -\index{global conformance lookup} -\index{associated type inference} -If the base type is a struct, enum or class type, qualified name lookup might find an associated type declaration from a protocol that the base type conforms to. Recall from Section~\ref{type witnesses} that the conformance checker inserts synthesized type alias members when a type witness is inferred from another protocol requirement. However, depending on request evaluation order, type resolution might find the associated type declaration before the conformance checker has run. Type resolution will never form a dependent member type with a concrete base type. Instead, we perform a global conformance lookup of the base type to the associated type's protocol, and project the corresponding type witness from this conformance, which will trigger associated type inference and synthesize the corresponding type alias. +\paragraph{Type aliases.} +Identifier type representations can also refer to type alias declarations, generalizing the behavior described for nominal type declarations above. Once again, we take the declared interface type of the type alias declaration, and possibly apply a substitution map. While the declared interface type of a nominal type declaration is a nominal type, the declared interface type of a \index{type alias declaration}type alias declaration is a \index{type alias type}type alias type. This is a sugared type, \index{canonical type}canonically equal to the \index{underlying type}underlying type of the type alias declaration. 
-\begin{listing}\captionabove{Resolving a reference to an associated type member of a concrete type}\label{associated type of concrete type} +If the named type alias declaration is in a local context, the resolved type is the declared interface type: \begin{Verbatim} -struct Egg {} +func f<T>() { + typealias A = (Int, T) -protocol Animal { - associatedtype CommodityType + // Interface type of `a' is canonically equal to `(Int, T)' + let a: A = ... +} +\end{Verbatim} - func produce() -> CommodityType +If the resolved type alias declaration is a member of a superclass of some outer type declaration, we must apply a substitution map, just like we do when resolving a nominal type found inside a superclass: +\begin{Verbatim} +class Base<T> { + typealias InnerAlias = T? } -struct Chicken: Animal { - func produce() -> Egg {...} +class Derived: Base<Int> { + // Return type of `f' is canonically equal to `Optional<Int>' + func f() -> InnerAlias {} } +\end{Verbatim} + +As explained in \SecRef{nested nominal types}, a nominal type declaration cannot be a member of a protocol or protocol extension, but a type alias declaration can. We say it's a \IndexDefinition{protocol type alias}\emph{protocol type alias}. Unqualified lookup will find such a type alias declaration from within the scope of any nominal type declaration that conforms to this protocol. We discuss protocol type aliases in the next section when we talk about member type representations. -func cookOmelette(_ egg: Chicken.CommodityType) {} +\paragraph{Associated types.} +If the identifier type representation is located within the source range of a \index{protocol declaration}protocol or protocol extension, the protocol's \index{associated type declaration}associated type declarations are visible to unqualified lookup.
The resolved type is the declared interface type of the associated type declaration, which is a \index{dependent member type}dependent member type around ``\texttt{Self}'': +\begin{Verbatim} +protocol Pair { + associatedtype A + associatedtype B + + // Interface type of `a' is `Self.[Pair]A' + var a: A { get } + + // Interface type of `b' is `Self.[Pair]B' + var b: B { get } +} \end{Verbatim} -\end{listing} -\begin{example} In Listing~\ref{associated type of concrete type}, the type representation \texttt{Chicken.CommodityType} has two components. Type resolution proceeds as follows: -\begin{enumerate} -\item The first component is resolved by performing an unqualified lookup of \texttt{Chicken}, which finds the struct declaration \texttt{Chicken}. The declared interface type of this struct declaration is the struct type \texttt{Chicken}, which becomes the resolved type of the first component. -\item The second component is resolved via a qualified lookup of \texttt{CommodityType} into the base type \texttt{Chicken}. The qualified lookup will find the associated type declaration \texttt{CommodityType} of the \texttt{Animal} protocol, because \texttt{Chicken} conforms to \texttt{Animal}. +Associated type declarations are also visible from within the protocol's \index{conforming type}conforming types. Recall that associated types can be \index{type witness}witnessed by generic parameters, member type declarations, or \index{associated type inference}inference (\SecRef{type witnesses}). When the type witness is a generic parameter or member type, unqualified lookup will always find the witness \emph{before} the associated type declaration.
However, if the type witness is inferred, unqualified lookup will find the associated type declaration: +\begin{Verbatim} +struct S: Pair { + // Explicit type witness: + typealias A = Int + + // Inferred type witness: + // typealias B = String -\item Type resolution projects the type witness of \texttt{CommodityType} from the conformance \texttt{Chicken:\ Animal}, which triggers associated type inference. The \texttt{Chicken} type declaration does not declare a member type named \texttt{CommodifyType}, so associated type inference derives the type witness from the type of the \texttt{produce()} method of \texttt{Chicken}. The return type of this method is \texttt{Egg}. This synthesizes the type alias member \texttt{CommodityType} of \texttt{Chicken}, with underlying type \texttt{Egg}. -\end{enumerate} -So the type representation \texttt{Chicken.CommodityType} resolves to the struct type \texttt{Egg}. -\end{example} + var a: A // resolved type is canonically equal to `Int' + var b: String // `B == String' is inferred from `b' -Regardless of whether the base type is a type parameter, concrete type or protocol type, qualified name lookup can also find type alias members of protocols. This exciting possibility is discussed in Section~\ref{protocol type alias}. + // Return type of `f' and `g' is canonically equal to `String' + func f() -> B {} + func g() -> B {} +} +\end{Verbatim} -\index{generic arguments} -\paragraph{Applying generic arguments} So far we've seen how type resolution uses name lookup to find type declarations, but we've skipped over the important detail of how we can go from a type declaration to the resolved \emph{type}. +Note that \texttt{f()} and \texttt{g()} have the same return type, but the two type representations are resolved in a slightly different manner. Let's assume that we compute the interface type of \texttt{f()} first (but the other order might also arise, depending on how the remainder of the program is structured). 
-\index{declared interface type} -If the type declaration is not generic, the resolved type is just the declared interface type of the type declaration. If the type declaration is generic, type resolution must apply a substitution map to the type declaration's declared interface type. +Resolving the return type of \texttt{f()}, we find the associated type declaration \texttt{B} of \texttt{Pair}. Type resolution performs a \index{global conformance lookup}global conformance lookup to find the concrete conformance $\ConfReq{S}{Pair}$, then projects the type witness for \texttt{B} from this conformance. This evaluates the \Request{type witness request}, which \emph{synthesizes} the type alias \texttt{B} inside \texttt{S}. The declared interface type of \texttt{S.B}, canonically equal to \texttt{String}, is returned as the type witness. -When a type representation refers to a nested generic type, each component provides a list of generic arguments for each level of nesting. As each component is resolved, type resolution collects the outer generic arguments applied so far together with the new generic arguments of each component into a substitution map, as follows. +Resolving the return type of \texttt{g()}, we now find the type alias \texttt{S.B} we just synthesized. This shows that associated type inference has a \index{side effect}side effect on the member lookup table of~\texttt{S}, but it effectively acts as a form of lazy caching; the first lookup triggers the synthesis of this member. -When resolving the first component, the only possibilities are that the named type declaration does not have any outer generic parameters at all, or that it is was found in an outer generic context of the generic context containing the type representation. In this case, the outer generic parameters, if any, are mapped to themselves. 
+\paragraph{Modules.} +A \index{module declaration}module declaration is a special kind of type declaration that can only be used as the base of a member type representation. Otherwise a bare module is an error: +\begin{Verbatim} +_ = Swift.self // error: expected module member name after module name +\end{Verbatim} +Of course \index{import declaration}\texttt{import} declarations are usually followed by a bare module name, but they are resolved by different means than type resolution. -\index{context substitution map} -When resolving subsequent components, type resolution recovers the outer generic arguments from the resolved type of the previous component. In Section~\ref{contextsubstmap}, you saw the concept of a context substitution map with respect to a declaration context. When type resolution finds a type declaration with a qualified lookup on the base type, the outer generic arguments are derived by taking the context substitution map of the base type with respect to the found type declaration's declaration context. +\paragraph{Summary.} If the resolved type declaration is in \index{local type declaration}local context, or at the \index{top-level type declaration}top level of a source file, the resolved type is just its \index{declared interface type}declared interface type. Otherwise, we found the resolved type declaration by performing a \index{qualified lookup}qualified lookup into some outer nominal type or extension. In this case, the resolved type declaration might be a \emph{direct} member of the outer nominal, or a member of a superclass or protocol. In the direct case, or if we started from a protocol and found a member of another protocol, we again return the member's \index{declared interface type}declared interface type. Otherwise, we build a substitution map whose \index{output generic signature}output generic signature is the generic signature of the outer nominal type or extension. 
We apply it to the member's declared interface type to get the resolved type: +\begin{itemize} +\item +In the \textbf{superclass case}, we take the outer nominal's superclass bound and the member's parent class declaration, and build the \index{superclass substitution map}superclass substitution map. +\item +In the \textbf{protocol case}, we take the outer nominal's declared interface type and the member's parent protocol, and build the \index{protocol substitution map}protocol substitution map from the conformance. (We described this as projecting a type witness from a conformance instead of applying a substitution map; we'll say more about the protocol case in the next section, when we generalize these concepts further.) +\end{itemize} -\begin{listing}\captionabove{Applying generic arguments in type resolution}\label{applying generic arguments} +\paragraph{Source ranges.} +In the \index{scope tree}scope tree, a nominal type or extension declaration actually defines \emph{two} scopes, one nested within the other. The smaller source range contains the \emph{body} only, from ``\verb|{|'' to ``\verb|}|''. The larger source range starts with the declaration's opening keyword, such as ``\texttt{class}'' or ``\texttt{extension}'', and continues until the ``\verb|}|''. In particular, the \Index{where clause@\texttt{where} clause}\texttt{where} clause is inside the larger source range, but outside of the smaller source range.
Within a protocol or protocol extension, the protocol's \index{associated type declaration}associated type members (and type alias members, too) are always visible in both scopes: +\begin{Verbatim} +// This is OK; `Element' resolves to `Self.[Collection]Element' +extension Collection where Element == Int {...} +\end{Verbatim} +Within a nominal type declaration, generic parameter declarations are visible in the larger scope, but member types can only be seen in the body: \begin{Verbatim} -func f1() -> Outer.Inner {} +// `A' cannot be referenced here: +struct G<T> where A: Equatable { + typealias A = T -struct Outer { - struct Inner { - // Return type resolves to `Outer.Inner` - func f2() -> Inner {} - } + // `A' can be referenced here: + var a: A = ... } \end{Verbatim} -\end{listing} -\begin{example} Listing~\ref{applying generic arguments} shows two functions \texttt{f1()} and \texttt{f2()} whose return types are identifier type representations demonstrating some of these behaviors. +Prior discussion of name lookup from protocol contexts appeared in \SecRef{protocols}. -The return type of \texttt{f1()} is resolved as follows: +\section{Member Type Representations}\label{member type repr} + +A \IndexDefinition{member type representation}\emph{member type representation} consists of a \emph{base} type representation together with an identifier, joined by ``\verb|.|'' in the concrete syntax. The base might be an identifier type representation, or recursively, another member type representation. The base may also have generic arguments. The general procedure for resolving a member type representation is the following. We start by recursively resolving the base type representation; then, we issue a \index{qualified lookup}qualified lookup (\SecRef{name lookup}) to look for a member type declaration with the given name, inside the resolved base type. This finds the resolved type declaration, from which we compute the resolved type by applying a substitution map.
+ +The resolved types obtained this way include nominal types, dependent member types, and type alias types. We classify the various behaviors by considering each kind of base type in turn, and describing its member types. + +\paragraph{Module base.} +When the base is an identifier naming a module, the member type representation refers to a top-level type declaration within this module: +\begin{Verbatim} +var x: Swift.Int = ... +\end{Verbatim} +The type declarations referenced this way are those that can appear at the top level of a module, namely nominal types and type aliases. The resolved type is the declared interface type of the declaration. + +\paragraph{Type parameter base.} +As we alluded at the beginning of this chapter, the behavior of member type resolution with a \index{type parameter}type parameter base depends on the \index{type resolution stage}type resolution stage. We will begin with a description of the \index{interface resolution stage}interface resolution stage, even though it happens last, because it implements the ``complete'' behavior. + +\smallskip + +\emph{Interface resolution stage.} In this stage we interpret the type parameter relative to the current \index{declaration context}declaration context's \index{generic signature}generic signature. The \index{derived requirement}requirements imposed on this type parameter give a list of protocols, and possibly a superclass bound, or a concrete type. The member type declarations of these types are the member type declarations of our type parameter. This includes: \begin{enumerate} -\item The first component refers to \texttt{Outer}, whose generic signature has a single innermost generic parameter and no outer generic parameters. 
The resolved type is obtained by forming a substitution map with the component's generic argument and applying it to the declared interface type of \texttt{Outer}: -\[\texttt{Outer}\otimes\SubstMap{\SubstType{T}{Int}}=\texttt{Outer}\] -\item The second component refers to \texttt{Inner}, whose generic signature has a single innermost generic parameter, as well as the outer parameter from the generic signature of \texttt{Outer}. The resolved type is obtained by forming a substitution map from the context substitution map of the previous component's type together with the second component's generic argument, and applying it to the declared interface type of \texttt{Outer.Inner}: -\[\texttt{Outer.Inner}\otimes\SubstMap{\SubstType{T}{Int}\\ -\SubstType{U}{String}}=\texttt{Outer.Inner}\] +\item Associated type declarations from all conformed protocols. +\item Type alias declarations from all conformed protocols and their extensions. +\item Member types of the concrete superclass type, if any. +\item If the type parameter is fixed to a concrete type via a same-type requirement, the members of this concrete type. \end{enumerate} -The return type of \texttt{f2} has a single component, \texttt{Inner}. Here, unqualified lookup finds \texttt{Outer.Inner}, because the source location of the type representation is inside the source range of \texttt{Outer}. 
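+To illustrate the last two cases, here is an example with declarations of our own invention:
+\begin{Verbatim}
+class Widget {
+  struct Style {}
+}
+
+// Case 3: `T.Style' is a member of the superclass bound `Widget'
+func f<T: Widget>(_: T, style: T.Style) {}
+
+// Case 4: `T.Element.Index' is a member of the concrete type
+// `[Int]' fixed by the same-type requirement
+func g<T: Sequence>(_: T) where T.Element == [Int] {
+  var i: T.Element.Index = ...
+}
+\end{Verbatim}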
The outer generic parameter \texttt{T} is mapped to itself, so type resolution applies the following substitution map to the declared interface type of \texttt{Outer.Inner}: -\[\texttt{Outer.Inner}\otimes\SubstMap{\SubstType{T}{T}\\ -\SubstType{U}{String}}=\texttt{Outer.Inner}\] +We get the list of types that qualified lookup must look inside by collecting the results of the \Index{getRequiredProtocols()@\texttt{getRequiredProtocols()}}$\Query{getRequiredProtocols}{}$, \Index{getSuperclassBound()@\texttt{getSuperclassBound()}}$\Query{getSuperclassBound}{}$, and \Index{getConcreteType()@\texttt{getConcreteType()}}$\Query{getConcreteType}{}$ \index{generic signature query}generic signature queries (\SecRef{genericsigqueries}). -\end{example} +The first case, where qualified lookup finds an associated type declaration, is extremely important; this is how we form type parameters recursively in type resolution. If some type parameter \texttt{T} conforms to a protocol, say \texttt{Provider}, and this protocol declares an associated type \texttt{Entity}, then type resolution of ``\texttt{T.Entity}'' will find the associated type declaration \texttt{Entity} by performing a qualified lookup into \texttt{Provider}. +\begin{Verbatim} +protocol Provider { + associatedtype Entity +} -\paragraph{An incorrect simplification} -The type resolution process for a nested generic type might appear unnecessarily convoluted at first sight. At each step, we collect the outer generic arguments by constructing a substitution map from the previous component's resolved type, extend the substitution map with the current component's generic arguments, and apply it to the declared interface type of the type declaration found via name lookup. Instead, could we perform a chain of name lookups to find the final type declaration, then collect all of the generic arguments and apply them in one shot? Unfortunately, the answer is ``no''! 
The next example shows why this appealing simplification does not handle the full generality of type resolution. +struct G<T: Provider> { + var x: T.Entity = ... +} +\end{Verbatim} +To obtain the resolved type \texttt{T.[Provider]Entity} from \texttt{Self.[Provider]Entity} (the declared interface type of the \texttt{Entity} associated type), we must replace \texttt{Self} with \texttt{T}. We do this by applying the \index{protocol substitution map}protocol substitution map, which satisfies the protocol generic signature $G_\texttt{Provider}$ with the \index{abstract conformance}abstract conformance $\ConfReq{T}{Provider}$: \[\Sigma_{\ConfReq{T}{Provider}}:=\SubstMapC{\SubstType{Self}{T}}{\SubstConf{Self}{T}{Provider}}\] +Applying $\Sigma_{\ConfReq{T}{Provider}}$ to \texttt{Self.[Provider]Entity} projects the type witness from this conformance, with an equation we've seen a few times now, for example in \SecRef{abstract conformances}: +\begin{gather*} +\texttt{Self.[Provider]Entity}\otimes\Sigma_{\ConfReq{T}{Provider}}\\ +\qquad\qquad{}=\AssocType{[Provider]Entity}\otimes\ConfReq{Self}{Provider}\otimes\Sigma_{\ConfReq{T}{Provider}}\\ +\qquad\qquad{}=\AssocType{[Provider]Entity}\otimes\ConfReq{T}{Provider}\\ +\qquad\qquad{}=\texttt{T.[Provider]Entity} +\end{gather*} -\begin{listing}\captionabove{The named type declaration of a component can depend on previously-applied generic arguments}\label{type resolution with dependent base}
We're also going to make things slightly more interesting by using a dependent member type \texttt{T.Element} as the base type, rather than the generic parameter~\texttt{T}: \begin{Verbatim} -struct Paul { - struct Pony {} +protocol Subscriber { + associatedtype Parent: Provider + typealias Content = Self.Parent.Entity // or just Parent.Entity } -struct Maureen { - struct Pony {} +func process<T: Sequence>(_: T) where T.Element: Subscriber { + // Interface type of `x' is canonically equal to + // `T.Element.Parent.Entity' + let x: T.Element.Content = ... } +\end{Verbatim} +The above behavior can be explained by applying the protocol substitution map to the declared interface type of the type alias declaration: +\begin{multline*} +\texttt{Self.[Subscriber]Parent.[Provider]Entity}\otimes\Sigma_{\ConfReq{T.Element}{Subscriber}}\\ +=\texttt{T.[Sequence]Element.[Subscriber]Parent.[Provider]Entity} +\end{multline*} +In this example, the underlying type of the protocol type alias was a type parameter, but it can also be a concrete type that recursively contains type parameters. -struct Person { - typealias Rides = T -} +\smallskip -struct Misty {} +References to associated type declarations and protocol type aliases are resolved by the same rule: we apply a protocol substitution map to the declared interface type of the resolved type declaration. We assumed the base type is a type parameter, so we form a protocol substitution map from an abstract conformance. We'll see shortly that when the base type is concrete, we can still reference associated type declarations and protocol type aliases, and we form a protocol substitution map from a \index{concrete conformance}concrete conformance instead.
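To illustrate that last point, here is a small sketch of our own devising (the \texttt{Labeled} protocol and its \texttt{Pair} type alias are invented for this example), in which the underlying type of a protocol type alias is a tuple type that contains a type parameter:
\begin{Verbatim}
protocol Labeled {
  associatedtype Value
  typealias Pair = (Int, Value)  // concrete underlying type
}

func h<T: Labeled>(_: T) {
  // Interface type of `x' is canonically equal to
  // `(Int, T.[Labeled]Value)'
  let x: T.Pair = ...
}
\end{Verbatim}
Applying the protocol substitution map recursively substitutes the type parameter appearing inside the tuple type:
\[\texttt{(Int, Self.[Labeled]Value)}\otimes\Sigma_{\ConfReq{T}{Labeled}}=\texttt{(Int, T.[Labeled]Value)}\]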
-typealias A = Person.Rides.Pony -typealias B = Person.Rides.Pony +\emph{Structural resolution stage.} Before discussing the two remaining possible kinds of member types with a type parameter base, let's take a moment to describe the \index{structural resolution stage}structural resolution stage. In this stage, we don't have a generic signature, so we cannot perform generic signature queries or qualified lookup. Instead, a member type representation with a type parameter base always resolves to an \index{unbound dependent member type}unbound dependent member type formed from the base type and identifier. First, we need an example where structural resolution actually takes place. We can take the \texttt{Provider} protocol declared earlier, and write a function with a trailing \Index{where clause@\texttt{where} clause}\texttt{where} clause: +\begin{Verbatim} +func f<T: Provider>(_: T) where T.Entity: Equatable {...} \end{Verbatim} -\end{listing} -\begin{example} -The two type aliases \texttt{A} and \texttt{B} in Listing~\ref{type resolution with dependent base} demonstrate an interesting phenomenon. The type representations of their underlying types look very similar, only differing in the generic arguments applied. However, they resolve to two different nominal types. - -First, consider how type resolution builds the underlying type of \texttt{A}: -\begin{enumerate} -\item The first component resolves to the generic struct type \texttt{Person}. -\item The second component performs a qualified name lookup into the declaration of \texttt{Person}, which finds the member type alias \texttt{Rides}. Applying the substitution map to the underlying type of \texttt{Person.Rides} gives us the struct type \texttt{Paul}. -\item The third component performs a qualified name lookup into the declaration of \texttt{Paul}, which finds the non-generic struct declaration \texttt{Paul.Pony}.
The declared interface type \texttt{Paul.Pony} becomes the final resolved type. -\end{enumerate} -Now compare the above with the underlying type of \texttt{B}: -\begin{enumerate} -\item The first component resolves to \texttt{Person}. -\item The second component performs a qualified name lookup into the declaration of \texttt{Person}, which again finds the member type alias \texttt{Rides}. Applying the substitution map to the underlying type of \texttt{Person.Rides} gives us the struct type \texttt{Maureen}. The base type for the third component is now a completely different nominal type! -\item The third component performs a qualified name lookup into the declaration of \texttt{Maureen}, which finds the generic struct declaration \texttt{Maureen.Pony}. Applying the generic argument to this type declaration's declared interface type gives us the generic struct type \texttt{Maureen.Pony}. -\end{enumerate} +The \Request{generic signature request} resolves the type representation \texttt{T.Entity} in the structural resolution stage, to obtain the \index{unbound dependent member type}unbound dependent member type \texttt{T.Entity}. We have two user-written requirements, from which we build our generic signature: +\[\{\ConfReq{T}{Provider},\,\ConfReq{T.Entity}{Equatable}\}\] +Requirement minimization rewrites the second requirement into one that contains the \index{bound dependent member type}bound dependent member type \texttt{T.[Provider]Entity}, and we get the following generic signature: +\begin{quote} +\begin{verbatim} +<T where T: Provider, T.[Provider]Entity: Equatable> +\end{verbatim} +\end{quote} +At this point, \texttt{f()} has a generic signature, so type representations appearing inside the function can be resolved in the interface resolution stage. The rewriting of requirements that use unbound dependent member types into requirements that use bound dependent member types will be completely justified in \SecRef{minimal requirements}.
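As a quick sanity check, we can replay the earlier derivation: once \texttt{f()} has this generic signature, re-resolving ``\texttt{T.Entity}'' in the interface resolution stage performs a qualified lookup into \texttt{Provider} and applies the protocol substitution map, yielding the bound dependent member type directly:
\[\texttt{Self.[Provider]Entity}\otimes\Sigma_{\ConfReq{T}{Provider}}=\texttt{T.[Provider]Entity}\]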
-Clearly, \texttt{Paul.Pony} and \texttt{Maureen.Pony} are two unrelated type declarations, and one is even generic while the other is not. If you tried to save the generic arguments and apply them in a single shot at the end, you'd quickly realize there is no way to resolve the type representation \texttt{Person.Rides.Pony} to a single type declaration. The complication here is the type alias \texttt{Person.Rides}, whose underlying type is a type parameter. +We discussed bound and unbound dependent member types in \SecRef{type params}. An unbound dependent member type from the structural resolution stage can be converted into a bound dependent member type by first checking the +\Index{isValidTypeParameter()@\texttt{isValidTypeParameter()}}$\Query{isValidTypeParameter}{}$ generic signature query, followed by \Index{getReducedType()@\texttt{getReducedType()}}$\Query{getReducedType}{}$. The first check is needed since an unbound dependent member type, being a syntactic construct, might not name a valid member type at all. However, the usual situation is that the entire type representation is resolved again in the interface resolution stage, at which point an invalid type parameter is resolved to an error type and a \index{diagnostic!invalid type parameter}diagnostic is emitted. In particular, type representations in the trailing \texttt{where} clause of each declaration are re-visited by the \Request{type-check source file request}, which walks all top-level declarations in source order and emits further diagnostics. -\end{example} +Suppose now we change \texttt{f()} to add the invalid requirement $\ConfReq{T.Foo}{Equatable}$: +\begin{Verbatim} +func f<T: Provider>(_: T) + where T.Entity: Equatable, T.Foo: Equatable {...} +\end{Verbatim} +The invalid requirement will be dropped by requirement minimization, and the \Request{generic signature request} does not emit any diagnostics.
Instead, the invalid member type representation \texttt{T.Foo} will be diagnosed by the \Request{type-check source file request}, because qualified lookup would fail to find a member type named \texttt{Foo} in \texttt{Provider} when we revisit the \texttt{where} clause in the interface resolution stage. We will resume the discussion of how invalid requirements are diagnosed in \SecRef{generic signature validity}. -\paragraph{Bound components} A minor optimization worth understanding, because it slightly complicates the implementation. After resolving the type of a component, the bound (or found, perhaps) type declaration is stored inside the component. If the identifier type representation is resolved again (perhaps the first time was the structural stage, and the second was the interface stage), resolving a bound component skips the name lookup and proceeds directly to computing the type from the bound declaration. - -\index{generic environment} -\index{map type into environment} -The optimization was more profitable in the past, when type resolution actually had \emph{three} stages, with a third stage resolving interface types to archetypes. The third stage was subsumed by the \textbf{map type into environment} operation on generic environments. Parsing textual SIL also ``manually'' binds components to type declarations which name lookup would otherwise not find, in order to parse some of the more esoteric SIL syntax that we're not going to discuss here. - -\section{Checking Generic Arguments}\label{checking generic arguments} - -\index{generic signature} -\index{identifier type representation} -In the previous section, you saw how type resolution applies generic arguments when resolving a generic identifier type representation. Now, we're going to turn our attention to the problem of checking whether those generic arguments satisfy the requirements of the type declaration's generic signature.
Type resolution collects generic arguments into a substitution map whose input generic signature is the generic signature of the type declaration, so our problem reduces to asking if a substitution map satisfies the requirements of its input generic signature. - -\index{original requirement} -\IndexDefinition{substituted requirement} -\index{substitution map} -\index{requirement} -\index{conformance requirement} -\index{superclass requirement} -\index{same-type requirement} -\index{layout requirement} -We've seen that substitution maps can be applied to types, conformances and other substitution maps; it turns out that we can apply a substitution map to a requirement as well: -\[\mathboxed{original requirement}\otimes\mathboxed{substitution map} = \mathboxed{substituted requirement}\] -Requirement substitution is defined by applying the substitution map to each of the types stored in the requirement. All requirement kinds store a \emph{subject type}, which is a type parameter for its generic signature to which the requirement applies. Conformance, superclass and same-type requirements store a second type as well: -\begin{itemize} -\item In a conformance requirement, the second type is a protocol type, which is always non-generic and cannot be substituted. -\item Superclass and same-type requirements store an interface type which can contain type parameters for their generic signature. -\item Layout requirements store a layout constraint, which is not a type and cannot be substituted. -\end{itemize} +\smallskip -\index{type parameter} -\index{generic environment} -\index{declaration context} -\index{primary archetype type} -The type parameters in the original requirement are written for the generic signature of the referenced type declaration. The generic argument substitution map can also contain type parameters; they are written for the generic signature of the declaration context in which the type representation appears. 
To simplify the logic for checking whether a substituted requirement is satisfied, type resolution maps the replacement types of the generic argument substitution map into the generic environment of the current declaration context first. Thus, a substituted requirement no longer contains any type parameters; however, it may contain primary archetypes from the current declaration context's generic environment. +Next, we look at this function \texttt{g()}, which references our previously-seen protocol type alias \texttt{Content} from its trailing \texttt{where} clause: +\begin{Verbatim} +func g<T: Sequence>(_: T) + where T.Element: Subscriber, T.Element.Content: Equatable {...} +\end{Verbatim} +Here, we form the requirement $\ConfReq{T.Element.Content}{Equatable}$ whose subject type is the unbound dependent member type \texttt{T.Element.Content}. Indeed, this is not a type alias type at all. As we will learn in \SecRef{building rules}, protocol type aliases introduce rewrite rules, and the computation of reduced types replaces the unbound dependent member type with the underlying type of the protocol type alias. In this case, we get a generic signature with a requirement having quite the long subject type, +\begin{center} +$\ConfReq{T.[Sequence]Element.[Subscriber]Parent.[Provider]Entity}{Equatable}$. +\end{center} -Thus, requirement substitution replaces the type parameters in a requirement with concrete types from the generic argument substitution map. A substituted requirement becomes a statement about concrete types whose truth is independent of any generic signature, and type resolution can then check if the statement holds, in which case we say the requirement is \emph{satisfied}. +Type aliases can also appear in \index{protocol extension}protocol extensions. However, such aliases cannot be referenced from positions resolved in the structural resolution stage; in particular, from trailing \texttt{where} clauses.
We will see later that this is because type aliases from protocol extensions do not participate in the rewrite system, so the reduction described above cannot be performed. This situation is detected after the fact, when the type representation is revisited in the interface resolution stage: +\begin{Verbatim} +extension Provider { + typealias Object = Entity +} -Since the type parameters in the replacement types of the generic argument substitution map are mapped into the current declaration context's generic environment, checking generic arguments depends on the generic signature of the current declaration having already been built. For this reason, checking generic arguments is done in the interface resolution stage---the structural resolution stage only checks that the number of generic arguments in the component matches the number of generic parameters in the named type declaration. +// error: `Object' was defined in extension of protocol `Provider' +// and cannot be referenced from a `where' clause +struct G<T: Provider> where T.Object: Equatable {} +\end{Verbatim} -\index{conformance checker} -\index{conforming type} -This concept of applying a substitution map to a set of requirements and then checking if they are satisfied is important. It comes up elsewhere in the type checker: -\begin{enumerate} -\item The expression type checker uses similar logic when solving constraints generated from requirements when type checking a call to a generic function. -\item Conformance checking ensures that the conforming type and its type witnesses satisfy the protocol's requirement signature (Section~\ref{requirement sig}). -\item Requirement inference is in some sense solving the ``opposite'' problem: when building a generic signature, we want to \emph{add} requirements to ensure that the substituted requirements derived from a generic type representation are satisfied (Section~\ref{requirementinference}).
-\item The conditional requirements of a conditional normal conformance are computed by taking the requirements of a constrained extension not satisfied by the generic signature of the extended type. +\smallskip -After applying a substitution map to a conditional normal conformance, we get a conditional specialized conformance, whose substituted requirements are checked in the same manner as below (Section~\ref{conditional conformance}). -\item Class method override checking checks if the generic signature of the subclass method satisfies the requirements of the superclass method (Section~\ref{overridechecking}). -\end{enumerate} +\emph{Remaining cases.} We now wrap up the case where we have a type parameter base, the type parameter is subject to a concrete same-type or superclass requirement, and the resolved type declaration is a member of this concrete type. We compute the resolved type from this type declaration by proceeding as if the base type were this concrete type or superclass bound instead of the type parameter spelled by the user. Once again, various limitations surface if this type representation is also resolved in the structural resolution stage. The \index{concrete contraction}``concrete contraction'' pass allows certain cases to work (\SecRef{concrete contraction}). -\IndexDefinition{satisfied requirement} -\begin{algorithm}[``Requirement is satisfied'' check]\label{reqissatisfied} -Takes a substituted requirement as input. The substituted requirement's types must not contain type parameters, but may contain archetypes. Returns true if the requirement is satisfied.
- -\index{global conformance lookup} -\index{abstract conformance} -\index{concrete conformance} -\index{conditional conformance} -\index{conformance requirement} -\index{superclass requirement} -\index{layout requirement} -\index{same-type requirement} -\index{self-conforming protocol} -The algorithm handles each requirement kind as follows: -\begin{itemize} -\item \textbf{Conformance requirements:} Decompose the requirement into its subject type and protocol type, and perform a global conformance lookup. There are three possible outcomes: -\begin{enumerate} -\item If the conformance is abstract, the subject type was an archetype known to satisfy this conformance requirement. Return true. -\item If the conformance is concrete, it might be conditional (Section~\ref{conditional conformance}), and its conditional requirements are checked by recursively applying the algorithm. If all conditional requirements are satisfied (or if there are no conditional requirements), return true. -\item Otherwise, the conformance is invalid. Return false. -\end{enumerate} -\index{superclass type} -\index{class declaration} -\item \textbf{Superclass requirements:} Decompose the requirement into its subject type and constraint type. There are three possible cases: -\begin{enumerate} -\item If the subject type is canonically equal to the constraint type, return true. -\item If the subject type and constraint type are both generic class types with the same declaration but distinct generic arguments, return false. -\item If the subject type does not have a superclass type (Chapter~\ref{classinheritance}), return false. -\item The final case is where the subject type has a superclass type. Construct a new requirement by replacing the given requirement's subject type with its superclass type, and leave the constraint type unchanged. Recursively apply the algorithm to the new requirement. 
-\end{enumerate} -\item \textbf{Layout requirements:} Decompose the requirement into its subject type and layout constraint. The only kind of layout constraint that can be written in source is an \texttt{AnyObject} constraint. If the subject type is a class type, an archetype satisfying the \texttt{AnyObject} layout constraint, or an \texttt{@objc} existential, return true. -\index{canonical type equality} -\item \textbf{Same-type requirements:} Decompose the requirement into its subject type and constraint type. If the two types are canonical-equal, return true. -\end{itemize} -\end{algorithm} +\paragraph{Concrete base.} Now, consider what happens when the base resolves to \index{fully-concrete type}something other than a type parameter. The only interesting case is when the base type is a \index{nominal type}nominal type or \Index{dynamic Self type@dynamic \texttt{Self} type}dynamic \texttt{Self} type, because function types and such do not have member types. (The dynamic \texttt{Self} type is simply unwrapped to its underlying nominal type, allowing member types of the innermost class declaration to be referenced by ``\texttt{Self.Foo}'', which is perhaps slightly more explicit than the almost equivalent ``\texttt{Foo}''.) -\index{error type} -\index{substitution failure} -Using our algorithm, we can check the generic argument substitution map against its input generic signature. -\begin{algorithm}[Substitution map requirement check]\label{check generic arguments algorithm} -Takes two inputs: -\begin{enumerate} -\item A substitution map where the replacement types do not contain type parameters (but they may contain archetypes). -\item A list of requirements, understood to contain type parameters from the substitution map's input generic signature. (In type resolution, these are exactly the requirements of the referenced type declaration's generic signature, but elsewhere the requirements are obtained differently.) 
-\end{enumerate} -As output, returns three lists, which together form a partition of the input requirements: a \emph{satisfied} list, an \emph{unsatisfied} list, and a \emph{failed} list. -\begin{enumerate} -\item Initialize the three output lists, initially empty. -\item If the input list is empty, return. -\item Otherwise, remove the next original requirement from the input list and apply the substitution map to get a substituted requirement. -\item If the substituted requirement contains error types, move the original requirement to the failed list. -\item Otherwise, check if the substituted requirement is satisfied using Algorithm~\ref{reqissatisfied}, and move the original requirement to the satisfied or unsatisfied list based on the outcome of this check. -\item Go back to Step~2. -\end{enumerate} -\end{algorithm} -Type resolution applies the above algorithm to the generic argument substitution map together with the list of requirements in the named type declaration's generic signature. +Unlike the type parameter case, here we immediately perform a \index{qualified lookup}qualified lookup into the base type without consulting the generic signature of the current context. When both the base and the member type declaration are non-generic nominal types, and the member type is a direct member of the base, the resolved type is just the \index{declared interface type}declared interface type of the member. The member might also be found in a superclass, and not as a direct member of the base. 
We saw an almost identical example in the previous section, where the reference to \texttt{Inner} was instead an identifier type representation inside the body of \texttt{Derived}: +\begin{Verbatim} +class Base<T> { + struct Inner {} +} -Any failed requirements are ignored, because a substitution failure indicates that either some other requirement is unsatisfied, or an error was diagnosed elsewhere by the conformance checker (for example, a missing type witness in a conformance). +class Derived: Base<Int> {} -Unsatisfied requirements are diagnosed at the source location of the component's generic arguments, with the appropriate error message showing the substituted subject type and requirement kind. +// Interface type of `x' is Base<Int>.Inner +var x: Derived.Inner = ... +\end{Verbatim} +The superclass is generic, so we apply the \index{superclass substitution map}superclass substitution map to the declared interface type of \texttt{Inner} to get the resolved type for \texttt{Inner} as a member of \texttt{Derived}: +\[\texttt{Base<T>.Inner}\otimes\SubstMap{\SubstType{T}{Int}}=\texttt{Base<Int>.Inner}\] -\begin{listing}\captionabove{Satisfied and unsatisfied requirements with concrete types}\label{unsatisfied requirements} +Next, suppose we wish to reference \texttt{Inner} directly, as a member of \texttt{Base}. The \texttt{Base} class is generic, and we haven't seen how \index{generic argument}generic arguments are applied when resolving a type representation yet, but for now assume we know how to resolve the base type here: \begin{Verbatim} -struct G where T.Element == U.Element {} +var y: Base<String>.Inner = ...
+\end{Verbatim} +Given the generic nominal type \texttt{Base<String>}, we resolve the member type representation written above by applying the \index{context substitution map}context substitution map of \texttt{Base<String>} to the declared interface type of \texttt{Inner}, which we can do because \texttt{Inner} has the same generic signature as \texttt{Base}: +\[\texttt{Base<T>.Inner}\otimes\SubstMap{\SubstType{T}{String}}=\texttt{Base<String>.Inner}\] -The substitutions above are trivial in a sense, because the referenced member is a nominal type declaration, and to compute the resolved type, we simply transfer over the generic arguments from the base type. However, when the member is a \index{type alias declaration}type alias declaration, we can actually encode an arbitrary substitution. Each choice of base type defines a possible substitution map, which is then applied to the underlying type of the type alias, which can be any valid interface type for its generic signature. The reader may recall that when substitution maps were introduced in \ChapRef{substmaps}, one of the motivating examples was \ExRef{type alias subst example}, showing substitutions performed when resolving member type representations with type alias members. We're going to look at a few interesting examples of type alias members now. +\smallskip -The type alias might be declared in a \index{constrained extension}constrained extension of the base. If the constrained extension introduces new conformance requirements, the underlying type of the type alias declaration may reference new type parameters not present in the base type's generic signature.
Recall that the \texttt{Optional} type has a single generic parameter \texttt{Wrapped}: +\begin{Verbatim} +extension Optional where Wrapped: Sequence { + typealias OptionalElement = Optional<Wrapped.Element> +} + +// Interface type of `x' is `Optional<Int>' +var x: Optional<Array<Int>>.OptionalElement = ... \end{Verbatim} -\end{listing} -\begin{example} -Listing~\ref{unsatisfied requirements} shows three examples of checking generic arguments. The underlying type of each type alias is an identifier type representation referencing the declaration of \texttt{G} with different generic arguments. The generic signature of \texttt{G} is: -\begin{quote} -\begin{verbatim} - -\end{verbatim} -\end{quote} -There are four requirements: -\begin{quote} -\begin{tabular}{|l|l|l|} -\hline -Kind&Subject type&Constraint type\\ -\hline -Conformance&\texttt{T}&\texttt{Sequence}\\ -Conformance&\texttt{U}&\texttt{Sequence}\\ -Same type&\texttt{T.[Sequence]Element}&\texttt{U.[Sequence]Element}\\ -\hline -\end{tabular} -\end{quote} +The context substitution map of \texttt{Optional<Array<Int>>} is $\SubstMap{\SubstType{Wrapped}{Array<Int>}}$, which is not a valid substitution map for the constrained extension's \index{input generic signature}generic signature, because there's no conformance for the requirement $\ConfReq{Wrapped}{Sequence}$. Applying this substitution map to the underlying type of our type alias would signal \index{substitution failure}substitution failure and return the \index{error type}error type, because the substituted type for the type parameter \texttt{Wrapped.[Sequence]Element} cannot be resolved without this conformance: +\begin{multline*}\texttt{Optional<Wrapped.[Sequence]Element>}\otimes\SubstMap{\SubstType{Wrapped}{Array<Int>}}\\ +{} =\texttt{<<error type>>} +\end{multline*} +Instead, we build the context substitution map of our base type, but using the generic signature of our constrained extension.
This uses \index{global conformance lookup}global conformance lookup to resolve the conformance: +\[\Sigma := \SubstMapLongC{\SubstType{Wrapped}{Array<Int>}}{\SubstConf{Wrapped}{Array<Int>}{Sequence}}\] +Now, applying $\Sigma$ to the underlying type of our type alias outputs the expected result by projecting the type witness from the conformance: +\[\texttt{Optional<Wrapped.[Sequence]Element>}\otimes\Sigma = \texttt{Optional<Int>}\] -\paragraph{First type alias} The context substitution map of the underlying type of \texttt{A}: -\[ -\SubstMapC{ -\SubstType{T}{Array}\\ -\SubstType{U}{Set} -}{ -\SubstConf{T}{Array}{Sequence}\\ -\SubstConf{U}{Set}{Sequence} -} -\] -We apply this substitution map to each requirement of our generic signature: -\begin{quote} -\begin{tabular}{|l|l|l|c|} -\hline -Kind&Subject type&Constraint type&Satisfied?\\ -\hline -Conformance&\texttt{Array}&\texttt{Sequence}&$\checkmark$\\ -Conformance&\texttt{Set}&\texttt{Sequence}&$\checkmark$\\ -Same type&\texttt{Int}&\texttt{Int}&$\checkmark$\\ -\hline -\end{tabular} -\end{quote} -The two conformance requirements are satisfied, because both substituted subject types conform to \texttt{Sequence}. The same-type requirement is satisfied, because the two substituted types are canonical-equal. +If instead we had a \texttt{Wrapped} type that does not conform to \texttt{Sequence}, for example as in \texttt{Optional<Int>.OptionalElement}, we again get a substitution failure, so the resolved type becomes an error type. We will see in the next section how such type representations are diagnosed. -\paragraph{Second type alias} The context substitution map of the underlying type of \texttt{B}: -\[ -\SubstMapLongC{ -\SubstType{T}{Array}\\ -\SubstType{U}{Set} +\smallskip + +To complete the discussion of a member type representation with a concrete base, we finally consider the possibility that the named member is an associated type or protocol type alias declared in some conformed protocol.
When resolving a protocol member with a type parameter base, we used the protocol substitution map constructed from the abstract conformance of the type parameter to the protocol. With a concrete base, we do the same thing, except we form the protocol substitution map from a \emph{concrete} conformance. + +In the below, \texttt{Tomato} conforms to \texttt{Plant}, so we can access the protocol type alias \texttt{Food} as a member of \texttt{Tomato}: +\begin{Verbatim} +struct Ketchup {} +struct Pasta<Ingredient> {} + +protocol Plant { + associatedtype Sauce + typealias Food = Pasta<Sauce> + consuming func process() -> Sauce +} -\] -We apply this substitution map to each requirement of our generic signature: -\begin{quote} -\begin{tabular}{|l|l|l|c|} -\hline -Kind&Subject type&Constraint type&Satisfied?\\ -\hline -Conformance&\texttt{Array}&\texttt{Sequence}&$\checkmark$\\ -Conformance&\texttt{Set}&\texttt{Sequence}&$\checkmark$\\ -Same type&\texttt{Int}&\texttt{String}&$\times$\\ -\hline -\end{tabular} -\end{quote} -The two conformance requirements are satisfied, because the substituted subject types conform to \texttt{Sequence}. The same-type requirement is unsatisfied, because the two substituted types are not canonical-equal. -\paragraph{Third type alias} The context substitution map of the underlying type of \texttt{C}: -\[ -\SubstMapLongC{ -\SubstType{T}{Float}\\ -\SubstType{U}{Set} -}{ -\mbox{(invalid)}\\ -\SubstConf{U}{Set}{Sequence} +struct Tomato: Plant { + consuming func process() -> Ketchup {...} +} -\] -We apply this substitution map to each requirement of our generic signature: -\begin{quote} -\begin{tabular}{|l|l|l|c|} -\hline -Kind&Subject type&Constraint type&Satisfied?\\ -\hline -Conformance&\texttt{Float}&\texttt{Sequence}&$\times$\\ -Conformance&\texttt{Set}&\texttt{Sequence}&$\checkmark$\\ -Same type&\texttt{<>}&\texttt{Int}&$-$\\ -\hline -\end{tabular} -\end{quote} -The first conformance requirement is unsatisfied and will be diagnosed.
The same-type requirement has a substitution failure, and does not need to be diagnosed; in fact, its failure is a consequence of the first conformance requirement being unsatisfied. -\end{example} -\begin{listing}\captionabove{Satisfied and unsatisfied requirements with archetypes}\label{unsatisfied requirements archetypes} -\begin{Verbatim} -struct G where T.Element == U.Element {} +// Interface type of `x' is canonically equal to `Ketchup' +var x: Tomato.Sauce = ... -struct H { - // (1) all requirements satisfied - typealias A = G> +// Interface type of `y' is canonically equal to `Pasta' +var y: Tomato.Food = ... +\end{Verbatim} +When we resolve the interface type of ``\texttt{x}'' and then ``\texttt{y}'', we see that in both cases the resolved type declaration is a member of \texttt{Plant}, so we build the protocol substitution map $\Sigma_{\ConfReq{Tomato}{Plant}}$ from the normal conformance $\ConfReq{Tomato}{Plant}$: +\[\Sigma_{\ConfReq{Tomato}{Plant}} := \SubstMapC{\SubstType{Self}{Tomato}}{\SubstConf{Self}{Tomato}{Plant}}\] + +To resolve the interface type of ``\texttt{x}'', we apply $\Sigma_{\ConfReq{Tomato}{Plant}}$ to \texttt{Self.[Plant]Sauce}, the declared interface type of \texttt{Sauce}. 
This is equivalent to projecting the type witness for \texttt{Sauce} from this conformance, giving us the resolved type \texttt{Ketchup}: +\begin{gather*} +\texttt{Self.[Plant]Sauce} \otimes \Sigma_{\ConfReq{Tomato}{Plant}}\\ +\qquad\qquad {} = \AssocType{[Plant]Sauce} \otimes \ConfReq{Self}{Plant} \otimes \Sigma_{\ConfReq{Tomato}{Plant}}\\ +\qquad\qquad {} = \AssocType{[Plant]Sauce} \otimes \ConfReq{Tomato}{Plant}\\ +\qquad\qquad {} = \texttt{Ketchup} +\end{gather*} +To resolve the interface type of ``\texttt{y}'', where we have the type representation \texttt{Tomato.Food}, we apply $\Sigma_{\ConfReq{Tomato}{Plant}}$ to \texttt{Pasta}, the underlying type of \texttt{Food}, which recursively transforms the type parameter contained therein: +\begin{gather*} +\texttt{Pasta} \otimes \Sigma_{\ConfReq{Tomato}{Plant}}\\ +\qquad\qquad {} = \texttt{Pasta} +\end{gather*} - // (2) `T.Element == U.Element' unsatisfied - typealias B = G> +\paragraph{Protocol base.} This is really just a funny edge case. A protocol type alias is visible as a member type of the \index{protocol type}protocol type \emph{itself} when its underlying type does not depend on the \Index{protocol Self type@protocol \texttt{Self} type}protocol \texttt{Self} type. The \texttt{Food} protocol type alias above cannot be referenced as a member of the protocol type \texttt{Plant}, because the underlying type of \texttt{Food} contains \texttt{Self}, and \texttt{Plant} is not a valid replacement type for \texttt{Self}; \texttt{Plant} does not conform to \texttt{Plant}! Type resolution must reject the member type representation ``\texttt{Plant.Food}'': +\begin{Verbatim} +// error: cannot access type alias `Food' from `Plant'; use a +// concrete type or generic parameter base instead +var x: Plant.Food = ... +\end{Verbatim} +Similarly, an associated type member can \emph{never} be referenced with a protocol base, like ``\texttt{Plant.Sauce}''; its declared interface type \emph{always} contains \texttt{Self}. 
However, if the underlying type of some protocol type alias does not contain \texttt{Self}, the type alias is just a shortcut for another globally-visible type that does not depend on the conformance. We allow the reference because no substitution needs to be performed:
+\begin{Verbatim}
+protocol Pet {
+  typealias Age = Int
}
+
+// `Pet.Age' is just another spelling for `Int'
+func celebratePetBirthday(_ age: Pet.Age) {}
\end{Verbatim}
-\end{listing}
+
+Due to peculiarities of type substitution, protocol type aliases that are also \index{generic type alias}generic are always considered to depend on \texttt{Self}, even if their underlying type does not reference \texttt{Self}, so they \index{limitation!generic type alias with protocol base}cannot be referenced with a protocol base. (In the structural resolution stage, a generic type alias cannot be referenced with a \index{limitation!generic type alias with type parameter base}type parameter base, either. Perhaps it is best not to stick generic type aliases inside protocols at all.)
+
+\paragraph{General principle.} Let's say that $H$ is the generic signature of the current context, and \texttt{T} is the resolved base type of our member type representation, obtained via a recursive call to type resolution. We perform a \index{qualified lookup}qualified lookup after considering the base type \texttt{T}:
+\begin{itemize}
+\item If \texttt{T} is a type parameter, we look through a list of \index{nominal type declaration}nominal type declarations discovered by \index{generic signature query}generic signature queries against \texttt{T} and $H$.
+\item If \texttt{T} is a nominal type, we look into the nominal type declaration of \texttt{T}.
+\end{itemize}
+If qualified lookup fails or finds more than one type declaration, we \index{diagnostic!type resolution}diagnose an error. Otherwise, let $d$ be the resolved type declaration with declared interface type $\texttt{T}_d$ and generic signature $G$.
We construct a substitution map $\Sigma\in\SubMapObj{G}{H}$, then compute $\texttt{T}_d\otimes\Sigma$ to get the final resolved type. We build $\Sigma$ from the base type \texttt{T} and information about the parent context of $d$. There are three cases to handle: +\begin{itemize} +\item In the \textbf{direct case}, $d$ is a direct member of the nominal type declaration of \texttt{T}, and $\Sigma$ is the \index{context substitution map}context substitution map of \texttt{T}. +\item In the \textbf{superclass case}, $d$ is a member of a superclass of \texttt{T}, and $\Sigma$ is the \index{superclass substitution map}superclass substitution map formed from \texttt{T} and the parent class of $d$. +\item In the \textbf{protocol case}, $d$ is a member of some protocol $\protosym{P}$, either via conformance (if \texttt{T} is concrete) or protocol inheritance (if \texttt{T} is a type parameter), and $\Sigma$ is the \index{protocol substitution map}protocol substitution map $\Sigma_{\ConfReq{T}{P}}$ for the conformance of \texttt{T} to $\protosym{P}$, found via global conformance lookup. +\end{itemize} +If the base type \texttt{T} is a type parameter subject to a concrete \index{same-type requirement}same-type requirement or a \index{superclass requirement}superclass requirement, we replace \texttt{T} with the corresponding concrete type obtained by a generic signature query against $H$ before proceeding to compute $\Sigma$ above. + +In all three cases above, $d$ might be defined in a constrained extension that imposes further conformance requirements. When building $\Sigma$, we resolve any of these additional conformances via \index{global conformance lookup}global conformance lookup (\SecRef{buildingsubmaps}). + +This is called a \IndexDefinition{context substitution map!for a declaration context} \emph{context substitution map for a declaration context}. 
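+
+As a quick illustration, here is one hypothetical declaration for each of the three cases (these are toy declarations of our own, not from the surrounding text); each annotated member reference below is resolved with the corresponding kind of substitution map:
+\begin{Verbatim}
+class Animal<T> {
+  typealias Home = Array<T>
+}
+class Dog: Animal<Int> {}
+
+protocol Named {
+  typealias Label = String
+}
+struct Tag<T>: Named {
+  typealias Value = T
+}
+
+// Direct case: context substitution map of Tag<Bool>
+let v: Tag<Bool>.Value = true
+
+// Superclass case: superclass substitution map fixes T := Int
+let h: Dog.Home = [1, 2, 3]
+
+// Protocol case: protocol substitution map fixes Self := Tag<Bool>
+let l: Tag<Bool>.Label = "hello"
+\end{Verbatim}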
This concept generalizes the context substitution map of a type from \SecRef{contextsubstmap}, which was an inherent property of a type, without reference to a declaration context. If \texttt{T} is a nominal type and $d$ is a direct member of the nominal type declaration of \texttt{T}, the context substitution map of \texttt{T} for the parent context of $d$ is simply the context substitution map of \texttt{T}. + +The context substitution map of a type for a declaration context also arises when type checking \index{member reference expression}member reference \index{expression}expressions, like ``\texttt{foo.bar}'', that appear in a function body. In this case, qualified lookup will find any member named \texttt{bar}, not just a type declaration, and the type of the expression is computed by applying the corresponding context substitution map to the \index{interface type}interface type of \texttt{bar}. + +\paragraph{History.} In pre-evolution Swift, associated type declarations in protocols were declared with the \texttt{typealias} keyword, and protocol type aliases did not exist. \IndexSwift{2.2}Swift 2.2 added the distinct \texttt{associatedtype} keyword for declaring associated types, to make space for protocol type aliases in the future~\cite{se0011}. Protocol type aliases were then introduced as part of \IndexSwift{3.0}Swift 3 \cite{se0092}. + +\paragraph{Caching the type declaration.} Having computed the resolved type of an identifier or member type representation, we stash the resolved type declaration within our type representation, as a sort of cache. If the type representation is resolved again (perhaps once in the \index{type resolution stage}structural stage, and then again in the interface stage), we skip name lookup and proceed directly to computing the resolved type from the stored type declaration. The optimization was more profitable in the past, when type resolution actually had \emph{three} stages. 
The third stage would resolve interface types to archetypes, but it has since been subsumed by the \index{map type into environment}\textbf{map type into environment} operation on \index{generic environment}generic environments. We also pre-populate this cache when parsing textual \index{SIL}SIL, by assigning a type declaration to certain type representations. Name lookup would otherwise not find these declarations, because of SIL syntax oddities that we're not going to discuss here.
+
+\section{Applying Generic Arguments}\label{checking generic arguments}
+
+Identifier and member type representations may be equipped with generic arguments, where each \index{generic argument}generic argument is recursively another type representation:
+\begin{Verbatim}
+Array<Int>
+Dictionary<String, Array<Int>>
+Big<Int>.Small<String, Bool>
+\end{Verbatim}
+To resolve a type representation with generic arguments, we begin by finding the resolved type declaration using name lookup, and then perform a few additional steps not present in the non-generic case. Let's say that $d$ is the resolved type declaration, $G$ is the \index{generic signature}generic signature of $d$, and $G^\prime$ is the generic signature of the parent context of~$d$. If the parent generic signature $G^\prime$ is \index{empty generic signature}empty, then we're referencing a top-level generic declaration, like \texttt{Array} for example. (If $G$ is \emph{also} empty, then~$d$ is not actually generic at all; we will diagnose below.) Notice there are several new semantic checks here, each of which can diagnose errors and return an \index{error type}error type:
+\begin{enumerate}
+\item We check that $d$ has a \index{generic parameter list}generic parameter list, and ensure we have the correct number of generic arguments.
+\item We recursively resolve all generic argument type representations, to form an array of generic argument types.
+\item We form a substitution map $\Sigma$ for $G$ from these generic argument types.
+\item We check that $\Sigma$ satisfies all explicit requirements of $G$. +\item We compute the substituted type $\texttt{T}_d\otimes\Sigma$, where $\texttt{T}_d$ is the declared interface type of~$d$, to get the final resolved type. +\end{enumerate} + +Let's say that $H$ is the generic signature of the declaration context where the type representation appears. The check in Step~1 consults the \index{generic parameter list}generic parameter list of~$d$; this is a syntactic construct describing the innermost generic parameters of $G$ we met in \SecRef{generic params}. This does not require knowledge of either $G$ or $H$, so we perform this check in both \index{interface resolution stage}interface resolution stage and \index{structural resolution stage}structural resolution stage. Checking requirements in Step~4 requires knowledge of both $G$ and $H$, and we only do it in the interface resolution stage. How this checking is done is a topic unto itself; we will turn our attention to it shortly. + +We want our resolved type to be an interface type for $H$, so $\texttt{T}_d\otimes\Sigma\in\TypeObj{H}$. Since $\texttt{T}_d\in\TypeObj{G}$, we wish to construct a $\Sigma\in\SubMapObj{G}{H}$. In the previous section, we dealt with the case where $d$ does not introduce any new generic parameters or requirements, but may possibly be in a generic context. In this simple case, the resolved type was $\texttt{T}_d\otimes\Sigma^\prime$ where $\Sigma^\prime$ is the context substitution map for \texttt{T} with respect to the declaration context of $d$. In the general case, $\Sigma^\prime\in\SubMapObj{G^\prime}{H}$ and not $\SubMapObj{G}{H}$, but the ``difference'' between the two is that a substitution map for $G$ must also fix $d$'s innermost generic parameters and witness any new conformance requirements; so we form $\Sigma\in\SubMapObj{G}{H}$ by adding to $\Sigma^\prime$ the generic argument types from Step~2, and resolving their conformances. 
+
+That is all a fancy way of saying that the recursive nesting of type declarations gives rise to a recursive nesting of type representations. Type resolution forms a substitution map at each nesting level by adding new generic arguments to the previous substitution map.
+
\begin{example}
-Listing~\ref{unsatisfied requirements archetypes} shows a pair of type alias declarations whose underlying types contain type parameters. Using the notation of Chapter~\ref{genericenv}, we write $\archetype{X}$ and $\archetype{Y}$ for the archetype of \texttt{X} and \texttt{Y} in the generic environment of \texttt{H}.
+Suppose we're resolving the interface type of ``\texttt{x}'' below:
+\begin{Verbatim}
+struct Big<T> {
+  struct Small<U, V> {}
+}
-\paragraph{First type alias} The context substitution map for the underlying type of \texttt{A}:
-\[
-\SubstMapC{
-\SubstType{T}{$\archetype{X}$}\\
-\SubstType{U}{Array<$\archetype{X.Element}$>}
-}{
-\SubstConf{T}{$\archetype{X}$}{Sequence}\\
-\SubstConf{U}{Array<$\archetype{X.Element}$>}{Sequence}
+struct From<S: Sequence> {
+  var x: Big<S.Iterator>.Small<S.Element, Int> = ...
}
-\]
-We apply this substitution map to each requirement of our generic signature:
-\begin{quote}
-\begin{tabular}{|l|l|l|c|}
-\hline
-Kind&Subject type&Constraint type&Satisfied?\\
-\hline
-Conformance&$\archetype{X}$&\texttt{Sequence}&$\checkmark$\\
-Conformance&\texttt{Array<$\archetype{X.Element}$>}&\texttt{Sequence}&$\checkmark$\\
-Same type&$\archetype{X.Element}$&$\archetype{X.Element}$&$\checkmark$\\
-\hline
-\end{tabular}
-\end{quote}
-Note that there are two generic signatures in play here; the generic signature of \texttt{G} describes the requirements to be checked, and the generic signature of \texttt{H} describes the type parameters appearing in the underlying type of \texttt{B}.
+\end{Verbatim}
+To illustrate the connection between nested type declarations and generic parameter \index{depth}depth, we're going to work with canonical types.
We are resolving a type representation inside \texttt{From}, and indeed the generic arguments contain type parameters from the generic signature of \texttt{From}, canonically \texttt{<\ttgp{0}{0} where \ttgp{0}{0}:~Sequence>}.

-\index{canonical type equality}
-The same-type requirement merits a little bit of explanation. The replacement type for \texttt{T} is the archetype $\archetype{X}$, which conforms abstractly to \texttt{Sequence}. The replacement type for \texttt{U} is \texttt{Array<$\archetype{X.Element}$>}, which conforms concretely to \texttt{Sequence}. The type witness of \texttt{Element} in both conformances is the archetype $\archetype{X.Element}$, so the same-type requirement is satisfied because both substituted types are canonical-equal.
+We first resolve the base type representation \texttt{Big<S.Iterator>}. The resolved type declaration is \texttt{Big}. Since the parent context of \texttt{Big} has an empty generic signature, the generic parameter of \texttt{Big} has the canonical type \ttgp{0}{0}. We form a substitution map $\SubstMap{\SubstType{\ttgp{0}{0}}{\ttgp{0}{0}.Iterator}}$ from our generic argument. Substitution gives us the resolved base type:
+\[\texttt{Big<\ttgp{0}{0}>}\otimes\SubstMap{\SubstType{\ttgp{0}{0}}{\ttgp{0}{0}.Iterator}}=\texttt{Big<\ttgp{0}{0}.Iterator>}\]

-\paragraph{Second type alias} The context substitution map for the underlying type of \texttt{B}:
-\[
-\SubstMapC{
-\SubstType{T}{$\archetype{X}$}\\
-\SubstType{U}{Array<$\archetype{Y}$>}
-}{
-\SubstConf{T}{$\archetype{X}$}{Sequence}\\
-\SubstConf{U}{Array<$\archetype{Y}$>}{Sequence}
-}
+Next, we resolve the member type declaration \texttt{Small} via qualified lookup into the base type. The generic parameters \texttt{U} and \texttt{V} of \texttt{Small} appear at depth 1, so they declare the generic parameter types \ttgp{1}{0} and \ttgp{1}{1}.
We take the context substitution map of the base type, which we saw is $\SubstMap{\SubstType{\ttgp{0}{0}}{\ttgp{0}{0}.Iterator}}$, and insert our generic arguments as the replacement types for \ttgp{1}{0} and \ttgp{1}{1}: +\[\Sigma := \SubstMapLong{\SubstType{\ttgp{0}{0}}{\ttgp{0}{0}.Iterator}\\ +\SubstType{\ttgp{1}{0}}{\ttgp{0}{0}.Element}\\ +\SubstType{\ttgp{1}{1}}{Int}} \] -We apply this substitution map to each requirement of the generic signature of \texttt{H}: -\begin{quote} -\begin{tabular}{|l|l|l|c|} -\hline -Kind&Subject type&Constraint type&Satisfied?\\ -\hline -Conformance&\texttt{$\archetype{X}$}&\texttt{Sequence}&$\checkmark$\\ -Conformance&\texttt{Array<$\archetype{Y}$>}&\texttt{Sequence}&$\checkmark$\\ -Same type&\archetype{X.Element}&$\archetype{Y}$&$\times$\\ -\hline -\end{tabular} -\end{quote} -The second type alias differs from the first, because the \texttt{Element} type witness in the conformance of \texttt{Array<$\archetype{Y}$>} to \texttt{Sequence} is $\archetype{Y}$, which is not canonical-equal to $\archetype{X.Element}$; they represent two distinct type parameters in the generic signature of \texttt{H}. Thus, the same-type requirement is unsatisfied, and a diagnostic is produced. +Substitution gives us the final resolved type: +\begin{multline*} +\texttt{Big<\ttgp{0}{0}>.Small<\ttgp{1}{0}, \ttgp{1}{1}>}\otimes\Sigma\\ += \texttt{Big<\ttgp{0}{0}.Iterator>.Small<\ttgp{0}{0}.Element, Int>} +\end{multline*} +Note that the \ttgp{0}{0} on the left-hand side is interpreted relative to the generic signature of the resolved type declaration, while \ttgp{0}{0} on the right is relative to the generic signature of \texttt{From}. 
\end{example} -\begin{listing}\captionabove{Satisfied and unsatisfied superclass requirements}\label{unsatisfied requirements superclass} -\begin{Verbatim} -class Base {} -class Derived: Base {} +\paragraph{Checking generic arguments.} +Returning to Step~4 from the beginning of the present section, we're given a substitution map $\Sigma\in\SubMapObj{G}{H}$ and we must decide if $\Sigma$ satisfies the requirements of $G$, the generic signature of the referenced type declaration. We apply $\Sigma$ to each explicit requirement of $G$ to get a series of \index{substituted requirement}\emph{substituted requirements}. A substituted requirement is a statement about concrete types that is either true or false; we proceed to check each one by global conformance lookup, canonical type comparison, and so on, as will be described below. If any substituted requirements are unsatisfied, we \index{diagnostic!unsatisfied requirement}diagnose an error. This checking of substituted requirements is a generally useful operation, used not only by type resolution; we discuss other applications at the end of the section. -struct G, U> {} +Our generic argument types can contain type parameters of $H$ (the generic signature of the current context), and checking a substituted requirement may raise questions about~$H$. To avoid separately passing in~$H$, we require that a substituted requirement's types are expressed in terms of \index{archetype type}\emph{archetypes} instead (see \SecRef{archetypesubst} for a refresher). Thus, we begin by mapping $\Sigma$ into the \index{primary generic environment}primary generic environment of $H$, and assume henceforth that $\Sigma\in\SubMapObj{G}{\EquivClass{H}}$. 
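+
+Viewed operationally, requirement substitution simply pushes the substitution through every type position of a requirement, with the requirement kind deciding which positions those are. As a toy model (a sketch of our own, representing types as plain strings rather than compiler data structures):
+\begin{Verbatim}
+enum Req: Equatable {
+  case conformance(String, proto: String)  // T: P
+  case superclass(String, bound: String)   // T: C
+  case sameType(String, String)            // T == U
+  case layout(String)                      // T: AnyObject
+
+  // Apply a substitution to every type position; the protocol
+  // and AnyObject positions are invariant.
+  func substituted(_ s: (String) -> String) -> Req {
+    switch self {
+    case .conformance(let t, let p): return .conformance(s(t), proto: p)
+    case .superclass(let t, let c):  return .superclass(s(t), bound: s(c))
+    case .sameType(let t, let u):    return .sameType(s(t), s(u))
+    case .layout(let t):             return .layout(s(t))
+    }
+  }
+}
+\end{Verbatim}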
-struct H { - // (1) requirement is satisfied - typealias A = G, Y> +\begin{definition} +We denote by \IndexSetDefinition{req}{\ReqObj{G}}$\ReqObj{G}$ the set of all requirements whose left-hand and right-hand side types contain type parameters of $G$ (all explicit and \index{derived requirement}derived requirements of $G$ are also elements of $\ReqObj{G}$, but there are many more). Similarly, let $\ReqObj{\EquivClass{H}}$ denote the set of requirements written using the primary archetypes of $H$. We define \emph{requirement substitution} as a new ``overload'' of \index{$\otimes$}$\otimes$: +\[\ReqObj{G}\otimes\SubMapObj{G}{\EquivClass{H}}\rightarrow\ReqObj{\EquivClass{H}}\] +Requirement substitution must apply $\Sigma$ to every type parameter appearing in a given requirement $R$ by considering the \index{requirement kind}requirement kind. In all of the below, \texttt{T} is the subject type of the requirement, so $\texttt{T}\in\TypeObj{G}$: +\begin{itemize} +\item For a \index{conformance requirement}\textbf{conformance requirement} $\ConfReq{T}{P}$, we apply $\Sigma$ to \texttt{T}. 
The \index{protocol type}protocol type~\texttt{P} remains unchanged because it does not contain any type parameters: +\[\ConfReq{T}{P}\otimes\Sigma:=\ConfReq{$(\texttt{T}\otimes\Sigma)$}{P}\] - // (1) requirement is satisfied - typealias B = G +\item For a \index{superclass requirement}\textbf{superclass requirement} $\ConfReq{T}{C}$, we apply $\Sigma$ to \texttt{T} as well as the superclass bound \texttt{C}, which might be a generic class type containing type parameters of~$G$: +\[\ConfReq{T}{C}\otimes\Sigma:=\ConfReq{$(\texttt{T}\otimes\Sigma)$}{$(\texttt{C}\otimes\Sigma)$}\] + +\item For a \index{same-type requirement}\textbf{same-type requirement} $\SameReq{T}{U}$, we apply $\Sigma$ to both sides; \texttt{U} is either a type parameter or a concrete type that may contain type parameters of $G$: +\[\SameReq{T}{U}\otimes\Sigma:=\SameReq{$(\texttt{T}\otimes\Sigma)$}{$(\texttt{U}\otimes\Sigma)$}\] + +\item For a \index{layout requirement}\textbf{layout requirement} $\ConfReq{T}{AnyObject}$, we apply $\Sigma$ to \texttt{T}. The right-hand side remains invariant under substitution: +\[\ConfReq{T}{AnyObject}\otimes\Sigma:=\ConfReq{$(\texttt{T}\otimes\Sigma)$}{AnyObject}\] +\end{itemize} +Having obtained a substituted requirement, we can check that it is satisfied. Before describing how this is done, we look at a few examples to help motivate what follows. +\end{definition} - // (2) requirement is unsatisfied - typealias C = G -} -\end{Verbatim} -\end{listing} \begin{example} -Listing~\ref{unsatisfied requirements superclass} shows two examples involving superclass requirements. 
The generic signature of \texttt{G} is:
-\begin{quote}
-\begin{verbatim}
->
-\end{verbatim}
-\end{quote}
-The generic signature has a single requirement:
-\begin{quote}
-\begin{tabular}{|l|l|l|}
-\hline
-Kind&Subject type&Constraint type\\
-\hline
-Superclass&\texttt{T}&\texttt{Base}\\
-\hline
-\end{tabular}
-\end{quote}
+Let's begin by looking at how type resolution proceeds to resolve the underlying type of \texttt{A} and \texttt{B}:
+\begin{Verbatim}
+struct Concat<T: Sequence, U: Sequence> where T.Element == U.Element {}
-\paragraph{First type alias} The context substitution map of the underlying type of \texttt{A}:
+// (1) all requirements satisfied
+typealias A = Concat<String, Substring>
+
+// (2) `T.Element == U.Element' unsatisfied
+typealias B = Concat<Array<Int>, Set<String>>
+\end{Verbatim}
+When resolving the underlying type of type alias~\texttt{A}, we form this substitution map:
\[
-\SubstMap{
-\SubstType{T}{Base<$\archetype{Y}$>}\\
-\SubstType{U}{$\archetype{Y}$}
+\Sigma_a := \SubstMapLongC{
+\SubstType{\ttgp{0}{0}}{String}\\
+\SubstType{\ttgp{0}{1}}{Substring}
+}{
+\SubstConf{\ttgp{0}{0}}{String}{Sequence}\\
+\SubstConf{\ttgp{0}{1}}{Substring}{Sequence}
}
\]
-We apply this substitution map to the requirement of our generic signature:
-\begin{quote}
-\begin{tabular}{|l|l|l|c|}
-\hline
-Kind&Subject type&Constraint type&Satisfied?\\
-\hline
-Superclass&\texttt{Base<$\archetype{Y}$>}&\texttt{Base<$\archetype{Y}$>}&$\checkmark$\\
-\hline
-\end{tabular}
-\end{quote}
-The requirement is satisfied, because the subject type is canonical-equal to the constraint type.
+We apply $\Sigma_a$ to each explicit requirement in the generic signature of \texttt{Concat}:
+\begin{gather*}
+\ConfReq{\ttgp{0}{0}}{Sequence}\otimes\Sigma_a = \ConfReq{String}{Sequence}\\
+\ConfReq{\ttgp{0}{1}}{Sequence}\otimes\Sigma_a = \ConfReq{Substring}{Sequence}\\
+\SameReq{\ttgp{0}{0}.Element}{\ttgp{0}{1}.Element}\otimes\Sigma_a = \SameReq{Character}{Character}
+\end{gather*}
+The first two substituted requirements claim their subject types conform to \texttt{Sequence}. We can check these by performing a global conformance lookup, which returns a valid concrete conformance in both cases, so we conclude both requirements are satisfied. The final substituted requirement is a statement that \texttt{Character} and \texttt{Character} are the same type, which also appears to be true. Thus, $\Sigma_a$ satisfies the generic signature of \texttt{Concat}, and we successfully form the resolved type \texttt{Concat<String, Substring>}.

-\index{superclass type}
-\paragraph{Second type alias} The context substitution map of the underlying type of \texttt{B}:
+When resolving the underlying type of type alias \texttt{B}, we see it is invalid:
\[
-\SubstMap{
-\SubstType{T}{$\archetype{X}$}\\
-\SubstType{U}{Int}
+\Sigma_b := \SubstMapLongC{
+\SubstType{\ttgp{0}{0}}{Array<Int>}\\
+\SubstType{\ttgp{0}{1}}{Set<String>}
+}{
+\SubstConf{\ttgp{0}{0}}{Array<Int>}{Sequence}\\
+\SubstConf{\ttgp{0}{1}}{Set<String>}{Sequence}
}
\]
-We apply this substitution map to the requirement of our generic signature:
-\begin{quote}
-\begin{tabular}{|l|l|l|}
-\hline
-Kind&Subject type&Constraint type\\
-\hline
-Superclass&\archetype{X}&\texttt{Base}\\
-\hline
-\end{tabular}
-\end{quote}
-The requirement hits the recursive case for superclass requirements in Algorithm~\ref{reqissatisfied}.
The archetype \archetype{X} is replaced with its superclass type \texttt{Derived}, via the generic signature of \texttt{H}:
-\begin{quote}
-\begin{tabular}{|l|l|l|}
-\hline
-Kind&Subject type&Constraint type\\
-\hline
-Superclass&\texttt{Derived}&\texttt{Base}\\
-\hline
-\end{tabular}
-\end{quote}
-The algorithm recurses again, after replacing the class type \texttt{Derived} with its superclass type \texttt{Base}:
-\begin{quote}
-\begin{tabular}{|l|l|l|c|}
-\hline
-Kind&Subject type&Constraint type&Satisfied?\\
-\hline
-Superclass&\texttt{Base}&\texttt{Base}&$\checkmark$\\
-\hline
-\end{tabular}
-\end{quote}
-In its final form, the substituted requirement is trivially seen to be satisfied because the subject type is canonical-equal to the constraint type.
+The substitution map $\Sigma_b$ does not satisfy the same-type requirement because the conformances of \texttt{Array<Int>} and \texttt{Set<String>} to \texttt{Sequence} have different type witnesses for \texttt{Element}:
+\[\SameReq{\ttgp{0}{0}.Element}{\ttgp{0}{1}.Element}\otimes\Sigma_b = \SameReq{Int}{String}\]
+We diagnose the failure and reject the declaration of \texttt{B}.
+\end{example}

-\index{superclass type}
-\paragraph{Third type alias} The context substitution map of the underlying type of \texttt{C}:
+\begin{example} Let's take \texttt{Concat} as above, but reference it with generic arguments from a generic context. Let $G$ be the generic signature of \texttt{Concat}, and $H$ the generic signature of \texttt{OuterGeneric}. We are resolving the interface type of ``\texttt{x}'':
+\begin{Verbatim}
+struct OuterGeneric<C: Collection> {
+  var x: Concat<C, C.SubSequence> = ...
+} +\end{Verbatim} +We build $\Sigma\in\SubMapObj{G}{\EquivClass{H}}$ by mapping our generic arguments into the primary generic environment of $H$: \[ -\SubstMap{ -\SubstType{T}{$\archetype{X}$}\\ -\SubstType{U}{$\archetype{Y}$} +\Sigma := \SubstMapLongC{ +\SubstType{\ttgp{0}{0}}{$\archetype{C}$}\\ +\SubstType{\ttgp{0}{1}}{$\archetype{C.SubSequence}$} +}{ +\SubstConf{\ttgp{0}{0}}{$\archetype{C}$}{Sequence}\\ +\SubstConf{\ttgp{0}{1}}{$\archetype{C.SubSequence}$}{Sequence} } \] -We apply this substitution map to the requirement of our generic signature: -\begin{quote} -\begin{tabular}{|l|l|l|} -\hline -Kind&Subject type&Constraint type\\ -\hline -Superclass&\archetype{X}&\texttt{Base<\archetype{Y}>}\\ -\hline -\end{tabular} -\end{quote} -The requirement is seen to be unsatisfied, as follows. As above, the archetype \archetype{X} is replaced with its superclass type \texttt{Derived}, which is replaced with its superclass type \texttt{Base}: -\begin{quote} -\begin{tabular}{|l|l|l|c|} -\hline -Kind&Subject type&Constraint type&Satisfied?\\ -\hline -Superclass&\texttt{Base}&\texttt{Base<$\archetype{Y}$>}&$\times$\\ -\hline -\end{tabular} -\end{quote} -At this point, the substituted requirement is between two different specializations of the same class declaration, \texttt{Base} and \texttt{Base<$\archetype{Y}$>}. They are not canonical-equal, because $\archetype{Y}$ is not \texttt{Int}, so the requirement is unsatisfied. 
-\end{example} -\index{specialized type} -\index{generic class type} -\index{class declaration} +Note that our original requirements contained type parameters of~$G$, but now involve the primary archetypes of $H$ after substitution: +\begin{gather*} +\ConfReq{\ttgp{0}{0}}{Sequence}\otimes\Sigma = \ConfReq{$\archetype{C}$}{Sequence}\\ +\ConfReq{\ttgp{0}{1}}{Sequence}\otimes\Sigma = \ConfReq{$\archetype{C.SubSequence}$}{Sequence}\\ +\SameReq{\ttgp{0}{0}.Element}{\ttgp{0}{1}.Element}\otimes\Sigma = \SameReq{$\archetype{C.Element}$}{$\archetype{C.Element}$} +\end{gather*} -The previous examples show that Algorithm~\ref{reqissatisfied} works when the substituted requirements contain archetypes as well as concrete types. As outlined in Section~\ref{archetypesubst}, archetypes can be used with global conformance lookup, getting the superclass type, and testing canonical equality. +We check the first two requirements by performing a global conformance lookup on each archetype. Recalling \SecRef{local requirements}, this is implemented as a generic signature query against the archetype's generic signature. In fact, from the associated requirements $\ConfReq{Self}{Sequence}$ and $\ConfReq{Self.SubSequence}{Collection}$ of \texttt{Collection}, we can derive $H\vdash\ConfReq{C}{Sequence}$ and $H\vdash\ConfReq{C.SubSequence}{Sequence}$. Global conformance lookup succeeds and outputs a pair of abstract conformances: +\begin{gather*} +\protosym{Sequence}\otimes\archetype{C}=\ConfReq{$\archetype{C}$}{Sequence}\\ +\protosym{Sequence}\otimes\archetype{C.SubSequence}=\ConfReq{$\archetype{C.SubSequence}$}{Sequence} +\end{gather*} +(In fact, we could have also checked that $\Sigma$ satisfies a conformance requirement with a \index{local conformance lookup}local conformance lookup into $\Sigma$ itself.) -There is one more case to cover. Recall from Section~\ref{trailing where clauses} that a \texttt{where} clause on a type declaration can constrain outer generic parameters. 
A similar phenomenon occurs when declarations appear inside constrained extensions, which will be introduced in Section~\ref{constrained extensions}. Type resolution needs to check these requirements even when resolving a non-generic identifier type representation, as shown in the next example. +The third requirement is a same-type requirement, so we apply $\Sigma$ to both sides; being dependent member types, we project a type witness from each abstract conformance: +\begin{gather*} +\texttt{\ttgp{0}{0}.Element}\otimes\Sigma=\AssocType{[Sequence]Element}\otimes\ConfReq{$\archetype{C}$}{Sequence}\\ +\texttt{\ttgp{0}{1}.Element}\otimes\Sigma=\AssocType{[Sequence]Element}\otimes\ConfReq{$\archetype{C.SubSequence}$}{Sequence} +\end{gather*} +The two type witness projections construct the \index{dependent member type}dependent member types \texttt{\ttgp{0}{0}.Element} and \texttt{\ttgp{0}{0}.SubSequence.Element}, and map them into the generic environment of~$H$. Since $H\vDash\SameReq{\ttgp{0}{0}.Element}{\ttgp{0}{0}.SubSequence.Element}$, both map to the same \index{reduced type}reduced type in $H$, so they also define the same archetype: +\begin{gather*} +\AssocType{[Sequence]Element}\otimes\ConfReq{$\archetype{C}$}{Sequence}=\archetype{C.Element}\\ +\AssocType{[Sequence]Element}\otimes\ConfReq{$\archetype{C.SubSequence}$}{Sequence}=\archetype{C.Element} +\end{gather*} +Both sides of our substituted same-type requirement are canonically equal, so we can conclude that all requirements of $G$ are satisfied, and our type representation is valid. 
+\end{example} -\begin{listing}\captionabove{Checking references to a nested type with a requirement on an outer generic parameter}\label{unsatisfied requirements nested} +\begin{algorithm}[``Requirement is satisfied'' check]\label{reqissatisfied} +Takes a substituted requirement $R\in\ReqObj{\EquivClass{H}}$ as input, where the generic signature $H$ is not given explicitly; $R$ may contain primary archetypes of $H$, but not type parameters. Returns true if $R$ is \IndexDefinition{satisfied requirement}satisfied, false otherwise. In the below, \texttt{T} is the concrete subject type of $R$, so $\texttt{T}\in\TypeObj{\EquivClass{H}}$. We handle each \index{requirement kind}requirement kind as follows: +\begin{itemize} +\item For a \index{conformance requirement}\textbf{conformance requirement} $\ConfReq{T}{P}$, we perform the \index{global conformance lookup}global conformance lookup $\protosym{P}\otimes\texttt{T}$. There are three possible outcomes: +\begin{enumerate} +\item If we get an \index{abstract conformance}abstract conformance, it must be that \texttt{T} is an archetype of $H$ whose type parameter conforms to $\protosym{P}$. Return true. +\item If we get a \index{concrete conformance}concrete conformance, it might be \index{conditional conformance}conditional (\SecRef{conditional conformance}). These conditional requirements are also substituted requirements of $\ReqObj{\EquivClass{H}}$, and we check them by recursively invoking this algorithm. If all conditional requirements are satisfied (or if there aren't any), return true. +\item If we get an invalid conformance, or if the conditional requirement check failed above, return false. +\end{enumerate} +\item For a \index{superclass requirement}\textbf{superclass requirement} $\ConfReq{T}{C}$, we proceed as follows: +\begin{enumerate} +\item If \texttt{T} is a class type \index{canonical type equality}canonically equal to \texttt{C}, return true. 
+\item If \texttt{T} and \texttt{C} are two distinct generic class types for the same \index{class declaration}class declaration, return false. +\item If \texttt{T} does not have a \index{superclass type}superclass type (\ChapRef{classinheritance}), return false. +\item Otherwise, let $\texttt{T}^\prime$ be the superclass type of \texttt{T}. Recursively apply the algorithm to the superclass requirement $\ConfReq{$\texttt{T}^\prime$}{C}$. +\end{enumerate} +\item For a \index{layout requirement}\textbf{layout requirement} $\ConfReq{T}{AnyObject}$, we check if \texttt{T} is a class type, an archetype satisfying the \Index{AnyObject@\texttt{AnyObject}}\texttt{AnyObject} \index{layout constraint}layout constraint, or an \index{Objective-C existential}\texttt{@objc} existential, and if so, we return true. Otherwise, we return false. (We'll discuss representation of existentials in \ChapRef{existentialtypes}.) +\index{canonical type equality} +\item For a \index{same-type requirement}\textbf{same-type requirement} $\SameReq{T}{U}$, we check if \texttt{T} and \texttt{U} are canonically equal, returning true if so, and false otherwise. +\end{itemize} +\end{algorithm} +\paragraph{Contextually-generic declarations.} +A type declaration with a trailing \Index{where clause@\texttt{where} clause}\texttt{where} clause but no generic parameter list was called a \index{contextually-generic declaration}contextually-generic declaration in \SecRef{requirements}. A non-generic declaration inside a constrained extension is conceptually similar; we'll meet constrained extensions in \SecRef{constrained extensions}. In both cases, the generic signature~$G$ of the referenced declaration and the generic signature of the parent context~$G^\prime$ share the same generic parameters, but $G$ has additional requirements not present in $G^\prime$. While there are no generic arguments to apply, we still proceed to check that~$\Sigma$ satisfies the requirements of~$G$.
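As an aside, the case analysis of \AlgRef{reqissatisfied} can be sketched in miniature in Swift itself. The toy type model, the lookup tables, and all names below are invented for illustration; the real check operates on the compiler's internal type representations, and conditional conformances are omitted here.

```swift
// A toy model of substituted types; deliberately much simpler than the
// compiler's representation.
indirect enum Ty: Hashable {
    case nominal(String, [Ty])                  // concrete type, e.g. Array<Int>
    case archetype(String, conformsTo: Set<String>)
}

enum Req {
    case conforms(Ty, String)   // T: P
    case superclass(Ty, Ty)     // T: C
    case sameType(Ty, Ty)       // T == U
}

// Stand-ins for global conformance lookup and superclass lookup.
let conformanceTable: Set<String> = ["Array: Sequence", "Set: Sequence"]
let superclassTable: [String: Ty] = ["NSMutableString": .nominal("NSString", [])]

func isSatisfied(_ r: Req) -> Bool {
    switch r {
    case .conforms(let t, let p):
        switch t {
        case .archetype(_, let protos):     // abstract conformance
            return protos.contains(p)
        case .nominal(let name, _):         // concrete conformance; conditional
            return conformanceTable.contains("\(name): \(p)")  // requirements omitted
        }
    case .superclass(let t, let c):
        if t == c { return true }           // canonical equality
        guard case .nominal(let name, _) = t,
              let sup = superclassTable[name] else { return false }
        return isSatisfied(.superclass(sup, c))  // walk up the superclass chain
    case .sameType(let t, let u):
        return t == u                       // canonical equality
    }
}
```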
+\begin{example} +The \texttt{Inner} type below demonstrates the first case: \begin{Verbatim} struct Outer<T: Sequence> { struct Inner where T.Element == Int {} } -// (1) requirements are satisfied +// all requirements are satisfied typealias A = Outer<Array<Int>>.Inner -// (2) requirement `T.Element == Int' is unsatisfied +// `T.Element == Int' is unsatisfied typealias B = Outer<Array<String>>.Inner \end{Verbatim} -\end{listing} - +The second type alias \texttt{B} is ultimately invalid. While the base \texttt{Outer<Array<String>>} resolves without error, the member type representation fails because this substituted requirement is unsatisfied, with $\Sigma$ as the \index{context substitution map}context substitution map of the base type: +\[\SameReq{\ttgp{0}{0}.Element}{Int}\otimes\Sigma=\SameReq{String}{Int}\] +\end{example} \begin{example} -Listing~\ref{unsatisfied requirements nested} shows two examples involving a generic requirement on an outer generic parameter. The generic signature of \texttt{Outer.Inner} is: -\begin{quote} -\begin{verbatim} - -\end{verbatim} -\end{quote} -The generic signature has two requirements: -\begin{quote} -\begin{tabular}{|l|l|l|} -\hline -Kind&Subject type&Constraint type\\ -\hline -Conformance&\texttt{T}&\texttt{Sequence}\\ -Same-type&\texttt{T.[Sequence]Element}&\texttt{Int}\\ -\hline -\end{tabular} -\end{quote} - -\paragraph{First type alias} The context substitution map of the underlying type of \texttt{A}: -\[ -\SubstMapC{\SubstType{T}{Array}}{\SubstConf{T}{Array}{Sequence}} -\] -We apply this substitution map to each requirement of our generic signature: -\begin{quote} -\begin{tabular}{|l|l|l|c|} -\hline -Kind&Subject type&Constraint type&Satisfied?\\ -\hline -Conformance&\texttt{Array}&\texttt{Sequence}&$\checkmark$\\ -Same-type&\texttt{Int}&\texttt{Int}&$\checkmark$\\ -\hline -\end{tabular} -\end{quote} - -\paragraph{Second type alias} The context substitution map of the underlying type of \texttt{B}: +There is one more detail left to discuss.
Again using our \texttt{Concat} type, we now consider the following type alias: +\begin{Verbatim} +// `T: Sequence' is unsatisfied; +// `T.Element == U.Element' is a substitution failure +typealias C = Concat<Float, Set<Int>> +\end{Verbatim} +We get the following substitution map; note that it contains an invalid conformance: \[ -\SubstMapC{\SubstType{T}{Array}}{\SubstConf{T}{Array}{Sequence}} +\Sigma_c := \SubstMapLongC{ +\SubstType{\rT}{Float}\\ +\SubstType{\rU}{Set<Int>} +}{ +\mbox{(invalid)}\\ +\SubstConf{\rU}{Set<Int>}{Sequence} +} \] We apply this substitution map to each requirement of our generic signature: -\begin{quote} -\begin{tabular}{|l|l|l|c|} -\hline -Kind&Subject type&Constraint type&Satisfied?\\ -\hline -Conformance&\texttt{Array}&\texttt{Sequence}&$\checkmark$\\ -Same-type&\texttt{String}&\texttt{Int}&$\times$\\ -\hline -\end{tabular} -\end{quote} -The second requirement is unsatisfied, and type resolution emits a diagnostic. +\begin{gather*} +\ConfReq{\rT}{Sequence}\otimes\Sigma_c = \ConfReq{Float}{Sequence}\\ +\ConfReq{\rU}{Sequence}\otimes\Sigma_c = \ConfReq{Set<Int>}{Sequence}\\ +\SameReq{\rT.Element}{\rU.Element}\otimes\Sigma_c = \SameReq{<<error type>>}{Int} +\end{gather*} +The first conformance requirement is unsatisfied and will be diagnosed. The same-type requirement contains \index{error type}error types after substitution, and does not need to be diagnosed; in fact, its failure is a consequence of the first conformance requirement being unsatisfied. \end{example} +After applying our substitution map, but before checking the substituted requirement via the above algorithm, we look for error types. The presence of error types in the substituted requirement indicates one of several problems: +\begin{enumerate} +\item One of the generic argument types contained an error type, meaning an error was diagnosed earlier by type resolution. +\item The original type was a dependent member type, and the substituted base type does not conform to the member type's protocol.
This means that an earlier \index{conformance requirement}conformance requirement was unsatisfied, and hence already diagnosed. +\item The declaration of a \index{normal conformance}normal conformance may itself contain error types for any invalid or missing \index{type witness}type witnesses, in which case projecting a type witness may output an error type; again, we will have diagnosed an error earlier, when checking the conformance. +\end{enumerate} +In all cases, a \index{diagnostic!substitution failure}diagnostic was already emitted, so the requirement itself need not be diagnosed. We called this a \index{substitution failure}\emph{substitution failure} in \ChapRef{substmaps}. +\begin{algorithm}[Substitution map requirement check]\label{check generic arguments algorithm} +Takes two inputs: +\begin{enumerate} +\item A substitution map $\Sigma\in\SubMapObj{G}{\EquivClass{H}}$. +\item A list of elements of $\ReqObj{G}$. (When checking the generic arguments of a type representation, these are the explicit requirements of the generic signature~$G$.) +\end{enumerate} +As output, returns an \emph{unsatisfied} list and a \emph{failed} list. If both output lists are empty, all input requirements are satisfied by $\Sigma$. +\begin{enumerate} +\item (Setup) Initialize the two output lists to be empty. +\item (Done) If the input list is empty, return both output lists. +\item (Next) Remove the next requirement $R$ from the input list and compute $R\otimes\Sigma$. +\item (Failed) If $R\otimes\Sigma$ contains error types, move $R$ to the failed list and go back to Step~2. +\item (Check) Move $R$ to the unsatisfied list if \AlgRef{reqissatisfied} reports that $R\otimes\Sigma$ is unsatisfied. +\item (Loop) Go back to Step~2. +\end{enumerate} +\end{algorithm} +If any requirements appear in the unsatisfied list, type resolution diagnoses a series of errors at the source location of the generic type representation, one for each unsatisfied requirement.
Requirements on the failed list are dropped, because another diagnostic will have been emitted, as explained previously. If at least one requirement was unsatisfied or failed, the resolved type becomes the error type; this enforces the invariant that clients of type resolution will not encounter any types that do not satisfy generic requirements---with the important exception of the \index{error type}error type itself! + +\paragraph{There's more.} We use \AlgRef{check generic arguments algorithm} in several other places: +\begin{enumerate} +\item When \index{conformance checker}checking the declaration of a \index{normal conformance}normal conformance $\ConfReq{T}{P}$ where \texttt{T} is the declared interface type of some nominal type declaration, we must decide if a given set of type witnesses satisfies the associated requirements of $\protosym{P}$. In other words, we ask if the \index{protocol substitution map}protocol substitution map $\Sigma_{\ConfReq{T}{P}}$ satisfies the requirements of the \index{requirement signature}requirement signature of $\protosym{P}$ (\SecRef{requirement sig}). + +\item When a concrete type \texttt{T} conforms to a protocol $\protosym{P}$ via a \index{conditional conformance}conditional conformance, we check if the \index{context substitution map}context substitution map of \texttt{T} satisfies the conditional requirements of $\ConfReq{T}{P}$. We will describe this in \SecRef{conditional conformance}. + +\item The conditional requirements of a conditional conformance are also computed via the same algorithm when the declaration of the conformance is type checked. We ask which requirements in the generic signature of the constrained extension are \emph{not} satisfied by the generic signature of the \index{extended type}extended type.
+ +\item Checking if a subclass method is a well-formed override of a superclass method asks whether the generic signature of the subclass method satisfies each requirement of the generic signature of the superclass method (\SecRef{overridechecking}). +\end{enumerate} +There are also two related problems which follow different code paths but reason about requirements in the same way as above: +\begin{enumerate} +\item The expression type checker translates generic requirements to constraints when type checking a reference to a generic function; these constraints are then solved by the constraint solver and a substitution map is formed for the call. This is entirely analogous to what happens in type resolution when referencing a generic type declaration. + +\item Requirement inference is the step in building a new generic signature where we \emph{add} requirements to ensure that certain substituted requirements will be satisfied (\SecRef{requirementinference}). +\end{enumerate} \section{Unbound Generic Types}\label{unbound generic types} -\index{unbound generic type} -\index{type resolution context} -Recall from Section~\ref{misc types} that an \emph{unbound generic type} is a reference to a generic type declaration without generic arguments. You might remember from Section~\ref{identtyperepr} that generic arguments can be omitted when referencing a generic type declaration from inside its own context (or an extension). However, in that case type resolution does not return an actual unbound generic type; rather it is shorthand for forming a bound generic type whose generic arguments are the corresponding generic parameters of the type declaration. +Introduced in \SecRef{misc types}, the \index{unbound generic type}\emph{unbound generic type} represents a reference to a generic type declaration without generic arguments, and a \index{placeholder type}\emph{placeholder type} represents a specific missing generic argument.
Unbound generic types and placeholder types only appear when permitted by the \index{type resolution context}type resolution context. This corresponds to certain positions in the syntax where missing generic arguments can be filled in by some other mechanism, for example because we're resolving the type annotation for an expression whose type can be inferred. -Type representations only resolve to actual unbound generic types in a very limited set of positions. The position where a type representation appears is encoded by the \emph{type resolution context}, and the following type resolution contexts allow unbound generic types: +The following type resolution contexts allow unbound generic types or placeholder types, with each context handling them in some specific manner: \begin{enumerate} \index{variable declaration} \index{initial value expression} -\item The type annotation of a variable declaration with an initial value expression: -\begin{quote} -\begin{verbatim} +\item The type annotation of a variable declaration with an initial value expression may have either unbound generic types or placeholder types in various nested positions: +\begin{Verbatim} let callback: () -> (Array) = { return [1, 2, 3] } -\end{verbatim} -\end{quote} -The generic arguments of unbound generic types are inferred from the initial value expression when computing the variable declaration's interface type. +let dictionary: Dictionary<Int, _> = [0: "zero"] +\end{Verbatim} +The missing generic arguments are inferred from the initial value expression when computing the variable declaration's interface type. -\item Types appearing in the expressions, such as a metatype value (\texttt{GenericType.self}), the callee of a constructor invocation (\texttt{GenericType(...)}), or the target type of a cast (\texttt{x as GenericType}). In all cases, the generic arguments are inferred from the surrounding expression.
+\item Types written in expressions, such as metatype values (\texttt{GenericType.self}), the callees of constructor invocations (\texttt{GenericType(...)}), and target types of casts (\texttt{x as GenericType}). In all cases, the generic arguments are inferred from the surrounding expression. -\index{extension declaration} -\index{extended type} -\item The extended type of an extension: -\begin{quote} -\begin{verbatim} -struct GenericType { ... } +\item The \index{extended type}extended type of an \index{extension declaration}extension is typically written as an unbound generic type: +\begin{Verbatim} +struct GenericType<T> { ... } extension GenericType { ... } -\end{verbatim} -\end{quote} -The extended type of an extension names a type declaration, not a specific specialization, so no generic arguments need to be provided. (You will see in Section~\ref{constrained extensions} that providing generic arguments for the extended type also has a meaning, as a shorthand for writing a series of same-type requirements.) +\end{Verbatim} +We will see in \SecRef{constrained extensions} that writing generic arguments for the extended type also has meaning, as a shorthand for a series of same-type requirements. Placeholder types cannot appear here. -\index{type alias declaration} -\index{underlying type} -\item The underlying type of a type alias: -\begin{quote} -\begin{verbatim} +\item The \index{underlying type}underlying type of a \index{type alias declaration}type alias may contain an unbound generic type (but not a placeholder type).
This is a shorthand for a generic type alias that forwards its generic arguments, so the below are equivalent, given \texttt{GenericType} as above: +\begin{Verbatim} typealias GenericAlias = GenericType -\end{verbatim} -\end{quote} -This is a shorthand for a generic type alias that forwards its generic arguments, so the above is equivalent to: -\begin{quote} -\begin{verbatim} -typealias MyGenericAlias = MyGenericAlias -\end{verbatim} -\end{quote} -Note that such a type alias cannot have its own generic parameter list, so the following is invalid: -\begin{quote} -\begin{verbatim} -typealias MyGenericAlias = MyGenericAlias -\end{verbatim} -\end{quote} -Additionally, only the outermost type representation can resolve to an unbound generic type in this case. So the following is prohibited as well: -\begin{quote} -\begin{verbatim} -typealias MyGenericAlias = () -> (GenericType) -\end{verbatim} -\end{quote} +typealias GenericAlias<T> = GenericType<T> +\end{Verbatim} +If the underlying type of a type alias is an unbound generic type, it cannot have its own generic parameter list, and vice versa. Additionally, only the outermost type representation can resolve to an unbound generic type. Both of the following are thus invalid: +\begin{Verbatim} +typealias WrongGenericAlias<T> = GenericType +typealias WrongFunctionAlias = () -> (GenericType) +\end{Verbatim} \end{enumerate} Notice how the first two contexts allow any type representation to resolve to an unbound generic type, while in the last two, only the top-level type representation can resolve to an unbound generic type. -\index{limitation} -\paragraph{A limitation} An unbound generic type can refer to either a nominal type declaration, or a type alias declaration. However, an unbound generic type referring to a type alias declaration cannot be the parent type of a nominal type.
As Listing~\ref{unbound generic parent type} shows, it is an error to access a member type of a generic type alias without providing generic arguments, even if an unbound generic type referencing a nominal type declaration can appear in the same position. -\begin{listing}\captionabove{Member types of unbound generic types}\label{unbound generic parent type} +\paragraph{A limitation.} An unbound generic type can refer to either a nominal type declaration, or a type alias declaration. However, an unbound generic type referring to a type alias declaration cannot be the parent type of a nominal type. It is an error to access a member type of a generic type alias without providing generic arguments, \index{limitation!generic type alias with unbound generic type base}even when an unbound generic type referencing a nominal type declaration can appear in the same position: \begin{Verbatim} struct GenericType { struct Nested { @@ -897,147 +806,29 @@ \section{Unbound Generic Types}\label{unbound generic types} } typealias GenericAlias = GenericType -_ = GenericType.Nested(value: 123) // OK -_ = GenericAlias.Nested(value: 123) // OK -_ = GenericAlias.Nested(value: 123) // error -\end{Verbatim} -\end{listing} - -\index{type variable type} -\paragraph{A future direction} If you think of type representations as syntactic, and types as semantic, unbound generic types occupy a weird point in between. They are produced by type resolution and refer to type declarations, but they do not actually survive type checking, and neither expressions nor values can actually have an unbound generic type. - -A better model is to allow type resolution to take a callback that fills in missing generic arguments. The expression type checker would provide a callback which returns a fresh type variable type, other places might provide a callback which returns the corresponding generic parameter, and so on. This would eliminate unbound generic types altogether. 
- -This callback model is partially implemented today, but all existing callbacks simply return the unbound generic type; the presence of the callback effectively communicates to type resolution that an unbound generic type is permitted in this position. - -\section{Protocol Type Aliases}\label{protocol type alias} - -\index{type alias declaration} -\IndexDefinition{protocol type alias} -Now we're going to take a closer look at type alias members of protocols, or protocol type aliases for short. The other kinds of protocol members---associated types, methods, properties and subscripts---declare \emph{requirements} which must be implemented by the protocol's conforming types, a protocol type alias is merely a shorthand for writing out a possibly longer type by hand, much like any other type alias. - -\begin{listing}\captionabove{A protocol type alias as a member type of a type parameter and concrete type}\label{dependent protocol type alias listing} -\begin{Verbatim} -protocol Animal { - associatedtype Feed: AnimalFeed - typealias FeedStorage = Silo -} - -struct Cow: Animal { - typealias Feed = Grain -} - -func useAlias>(_: T) { - let x: T.Element.FeedStorage = ... // Silo - let y: Horse.FeedStorage = ... // Silo -} -\end{Verbatim} -\end{listing} - -A component of an identifier type representation can reference a protocol type alias as a member type of either of the following: -\begin{enumerate} -\item A type parameter conforming to the protocol. -\item A concrete type conforming to the protocol. -\item The protocol type itself. -\end{enumerate} -Listing~\ref{dependent protocol type alias listing} shows examples of the first two cases, which we're going to look at first. The third case is special; a protocol type alias is only visible as a member type of the protocol type itself if it is not \emph{dependent}. 
A protocol type alias is dependent if its underlying type contains the protocol \texttt{Self} type (or one of the protocol's associated types as a dependent member type of \texttt{Self}). - -\index{protocol substitution map} -Recall that when resolving an identifier type representation, each component after the first component is resolved by performing a qualified lookup into the previous component's type, and then applying a substitution map to the declared interface type of the found type declaration. When this declaration is a dependent protocol type alias, the substitution map is the protocol substitution map (Section~\ref{contextsubstmap}) constructed from the base type together with the base type's conformance to the protocol. - -\index{type parameter} -\index{type resolution stage} -\paragraph{Type parameter base} -If the base type is another type parameter, the behavior depends on the type resolution stage. -\index{structural resolution stage} -In the structural resolution stage, references to protocol type aliases resolve to unbound dependent member types, because there is not enough information to locate the type alias declaration via qualified lookup. This is entirely analogous to the situation with references to associated types, which require a generic signature before the associated type declaration can be found. - -In our example, the conformance of \texttt{T.Element} to \texttt{Animal} only becomes known after the generic signature has been built, so the type representation \texttt{T.Element.FeedStorage} resolves to the unbound dependent member type \texttt{T.Element.FeedStorage}. - -\index{abstract conformance} -\index{interface resolution stage} -In the interface resolution stage, the generic signature allows us to determine that \texttt{T.Element} conforms to \texttt{Animal}, and a qualified lookup finds the protocol type alias. 
Type resolution constructs the protocol substitution map with the type parameter and the abstract conformance of the type parameter to the protocol, and applies the substitution map to the underlying type of the type alias to get the substituted underlying type: -\[ -\texttt{Silo} \otimes \SubstMapC{\SubstType{Self}{T.Element}}{\SubstConf{Self}{T.Element}{Animal}} = \texttt{Silo} -\] -Applying this substitution map has the effect of replacing all occurrences of \texttt{Self} in the underlying type with this type parameter. The substituted type is then wrapped in a type alias type, which prints as \texttt{T.Element.FeedStorage} in diagnostics, but is canonically equal to \texttt{Silo}. - -\index{generic type alias} -\index{limitation} -Note that the special behavior of protocol type aliases with a type parameter base allows type resolution to proceed before a generic signature has been built. However, there is no way to construct an unbound dependent member type with applied generic arguments. This means that only non-generic protocol type aliases are supported in full generality. A type representation appearing in a position visited by the structural resolution stage (such as a function parameter or return type, or a requirement in a \texttt{where} clause) cannot reference a \emph{generic} type alias with a type parameter base. - -In our example, \texttt{FeedStorage} is not generic, and it is referenced from inside a function body, where it is not visited by the structural resolution stage regardless, so we're fine on both counts. Listing~\ref{unsupported generic type alias} shows a different example which hits the implementation limitation. 
-\begin{listing}\captionabove{Unsupported reference to generic type alias with type parameter base}\label{unsupported generic type alias} -\begin{Verbatim} -protocol P { - typealias G = (A) -> T - associatedtype A -} - -// `T.G' cannot be resolved from the `where' clause while building -// the generic signature of `f()' -func f(_: T, _: U) - where U.Element == T.G {} -\end{Verbatim} -\end{listing} -Note that generic protocol type aliases can still be accessed as member types of a concrete conforming type, because even in the structural resolution stage, there is enough information for name lookup to locate the type alias declaration. - -\index{global conformance lookup} -\index{concrete conformance} -\paragraph{Concrete base} -If the base type is a concrete type, the protocol substitution map is constructed with the base type and its conformance to the protocol, which is now a concrete conformance found by global conformance lookup: -\[ -\texttt{Silo} \otimes \SubstMapC{\SubstType{Self}{Horse}}{\SubstConf{Self}{Horse}{Animal}} = \texttt{Silo} -\] -Note that \texttt{Silo.Feed} was replaced with \texttt{Grain}. You might recall from Section~\ref{abstract conformances} that when a substitution map's replacement type for a generic parameter is a concrete type, applying the substitution map to a dependent member type outputs the corresponding type witness from the concrete conformance: -\begin{gather*} -\texttt{Self.Feed} \otimes \SubstMapC{\SubstType{Self}{Horse}}{\SubstConf{Self}{Horse}{Animal}}\\ -= \AssocType{[Animal]Feed} \otimes \ConfReq{Horse}{Animal} \\ -= \texttt{Grain} -\end{gather*} - -\index{protocol type} -\paragraph{Protocol base} The \texttt{FeedStorage} protocol type alias above cannot be used as a member type of \texttt{Animal} itself, because its underlying type contains the protocol \texttt{Self} type. The \texttt{Animal} protocol does not conform to itself, so a protocol substitution map cannot be formed for the base type \texttt{Animal}. 
Type resolution diagnoses an error when resolving the type representation \texttt{Animal.FeedStorage}. - -However, if a protocol type alias is not dependent---that is, if its underlying type does not contain any type parameters---then it can be referenced as a member type of the protocol itself. In this case, the underlying type is just a shortcut for some other type, and no substitution is performed when resolving the identifier type representation: -\begin{Verbatim} -protocol Animal { - typealias Age = Int -} - -// `Animal.Age' is just another spelling for `Int' -func celebrateBirthday(_ animal: A, age: Animal.Age) { +_ = GenericType.Nested(value: 123) // OK +_ = GenericAlias.Nested(value: 123) // OK - // Also just `Int'; type parameter base is ignored - let age: A.Age = ... - ... -} +_ = GenericAlias.Nested(value: 123) // error +_ = GenericAlias.Nested(value: 123) // OK \end{Verbatim} -A generic protocol type alias is always dependent, even if its underlying type does not reference \texttt{Self}, because resolving an identifier type representation that references a generic type alias must apply generic arguments, which requires forming a substitution map which is then applied to the underlying type of the type alias. Thus, this final kind of protocol type alias reference is only valid in a very narrow set of circumstances. - -For similar reasons, an associated type can never be referenced as a member type of the protocol. The declared interface type of an associated type \texttt{A} in a protocol \texttt{P} is the bound dependent member type \texttt{Self.[P]A}, which is always dependent since it contains \texttt{Self}. - -\index{protocol extension} -\paragraph{Protocol extensions} Type aliases can also appear in protocol extensions, where they have almost identical semantics to type aliases inside protocols. 
However, type aliases in protocol extensions cannot be referenced from a \texttt{where} clause, or a constraint type appearing in the inheritance clause of a generic parameter or associated type. The technical reason for this limitation will be explained in Section~\ref{building rules}. -\paragraph{History} In pre-evolution Swift, associated type declarations in protocols were declared with the \texttt{typealias} keyword, and protocol type aliases did not exist. Swift 2.2 \cite{se0011} added the distinct \texttt{associatedtype} keyword for declaring associated types, to leave room for protocol type aliases. Protocol type aliases were then introduced in Swift 3 \cite{se0092}. +\paragraph{A future direction.} If we think of type representations as syntactic, and types as semantic, unbound generic types occupy a weird point in between. They are produced by type resolution and refer to type declarations, but they do not actually survive type checking. When all is said and done, generic arguments in expressions will either be resolved or replaced with error types. We could eliminate unbound generic types from the implementation by reworking type resolution to take a callback that fills in missing generic arguments. Each context where an unbound generic type or placeholder type can appear would supply its own callback. For example, the expression type checker would provide a callback that returns a fresh \index{type variable type}type variable type. This callback model is partially implemented today, but all existing callbacks are trivial; the callback's presence simply communicates to type resolution that an unbound generic type is permitted in this position.
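The callback model can be sketched in miniature. Everything below is invented for illustration under a toy type model; it is not the compiler's API, but it shows the shape of the idea: the resolver asks a context-supplied callback to produce each missing generic argument, rather than handing an unbound generic type back to its caller.

```swift
// A miniature sketch of the callback model, with invented names.
indirect enum SimpleTy: Equatable {
    case named(String, [SimpleTy])
    case typeVariable(Int)          // stands in for a solver type variable
}

// Toy declaration table: generic argument count for each type name.
let arity = ["Array": 1, "Dictionary": 2]

func resolveName(_ name: String, args: [SimpleTy],
                 fillMissingArg: (Int) -> SimpleTy) -> SimpleTy {
    let expected = arity[name] ?? 0
    if args.isEmpty && expected > 0 {
        // Arguments omitted: fill each one in via the callback.
        return .named(name, (0..<expected).map(fillMissingArg))
    }
    return .named(name, args)
}

// An expression-checker-like client supplies fresh type variables,
// to be solved later against the surrounding expression.
var counter = 0
let resolved = resolveName("Dictionary", args: []) { _ in
    defer { counter += 1 }
    return .typeVariable(counter)
}
```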
\section{Source Code Reference}\label{type resolution source ref} -\subsection*{Type Resolution Interface} - Key source files: \begin{itemize} \item \SourceFile{include/swift/AST/TypeResolutionStage.h} \item \SourceFile{lib/Sema/TypeCheckType.h} \item \SourceFile{lib/Sema/TypeCheckType.cpp} \end{itemize} -First, we will look at how the rest of the compiler interfaces with type resolution. -\IndexSource{type resolution options} +A client of \IndexSource{type resolution}type resolution must first create a \texttt{TypeResolutionOptions}, and then a \texttt{TypeResolution} instance from that. The latter defines a \texttt{resolveType()} method to resolve a \texttt{TypeRepr *} to a \texttt{Type}. We start with \texttt{TypeResolutionOptions} and its constituent parts, \texttt{TypeResolverContext} and \texttt{TypeResolutionFlags}. + \apiref{TypeResolutionOptions}{class} -Encodes the semantic information that controls various different behaviors of type resolution, consisting of a base context, a current context, and flags. The base context and context are initially identical. When type resolution recurses on a nested type representation, it preserves the base context, but possibly changes the context, and clears the \texttt{Direct} flag. Thus, the base context encodes the overall syntactic position of the type representation, while the context encodes the role of the current type representation. +Encodes \IndexSource{type resolution options}the semantic behavior of type resolution, consisting of a base context, a current context, and flags. The base context and context are initially identical. When type resolution recurses on a nested type representation, it preserves the base context, possibly changes the current context, and clears the \texttt{Direct} flag. Thus, the base context encodes the overall syntactic position of the type representation, while the context encodes the role of the current type representation. 
\begin{itemize} \item \texttt{TypeResolutionOptions(TypeResolverContext)} creates a new instance with the given base context, and no flags. \item \texttt{getBaseContext()} returns the base \texttt{TypeResolverContext}. @@ -1045,9 +836,8 @@ \subsection*{Type Resolution Interface} \item \texttt{getFlags()} returns the \texttt{TypeResolutionFlags}. \end{itemize} -\IndexSource{type resolution context} \apiref{TypeResolverContext}{enum class} -The type resolution context, which encodes the position of a type representation: +The \IndexSource{type resolution context}type resolution context, which encodes the position of a type representation: \begin{itemize} \item \texttt{TypeResolverContext::None}: no special type handling is required. \item \texttt{TypeResolverContext::GenericArgument}: generic arguments of a bound generic type. @@ -1069,7 +859,7 @@ \subsection*{Type Resolution Interface} \item \texttt{TypeResolverContext::EnumPatternPayload}: the payload type of an enum element pattern. Tweaks the behavior of tuple element labels. \item \texttt{TypeResolverContext::TypeAliasDecl}: the underlying type of a non-generic type alias. \item \texttt{TypeResolverContext::GenericTypeAliasDecl}: the underlying type of a generic type alias. -\item \texttt{TypeResolverContext::ExistentialConstraint}: the constraint type of an existential type (Chapter~\ref{existentialtypes}); +\item \texttt{TypeResolverContext::ExistentialConstraint}: the constraint type of an existential type (\ChapRef{existentialtypes}). \item \texttt{TypeResolverContext::GenericRequirement}: the constraint type of a conformance requirement in a \texttt{where} clause. \item \texttt{TypeResolverContext::SameTypeRequirement}: the subject type or constraint type of a same-type requirement in a \texttt{where} clause. \item \texttt{TypeResolverContext::ProtocolMetatypeBase}: the instance type of a protocol metatype, like \texttt{P.Protocol}.
@@ -1098,28 +888,28 @@ \subsection*{Type Resolution Interface} \item \texttt{TypeResolutionFlags::Preconcurrency}: when resolving a reference to a type alias marked with the \texttt{@preconcurrency} attribute, strips off \texttt{@Sendable} and global actor attributes from function types appearing in the underlying type of the type alias. \end{itemize} -\IndexSource{structural resolution stage} -\IndexSource{interface resolution stage} -\IndexSource{contextual type} \apiref{TypeResolution}{class} -This class is the main entry point into type resolution. A pair of static factory methods create instances. Each method takes a \texttt{DeclContext}, \texttt{TypeResolutionOptions}, and a pair of callbacks for resolving placeholder types and unbound generic types: +After initializing options, a client must create an instance of this class by calling one of two factory methods, depending on the desired \IndexSource{type resolution stage}type resolution stage. Each method takes a \texttt{DeclContext}, \texttt{TypeResolutionOptions}, and a pair of callbacks for resolving placeholder types and unbound generic types: \begin{itemize} -\item \texttt{forStructural()} creates a type resolution in structural resolution stage. -\item \texttt{forInterface()} creates a type resolution in interface resolution stage. +\item \texttt{forStructural()} creates a type resolution in \IndexSource{structural resolution stage}structural resolution stage. +\item \texttt{forInterface()} creates a type resolution in \IndexSource{interface resolution stage}interface resolution stage. \end{itemize} -The main entry point is an instance method: +The main thing you can do with an instance is, of course, resolve types: \begin{itemize} \item \texttt{resolveType()} takes a \texttt{TypeRepr *} and returns a \texttt{Type}. 
\end{itemize} -A static utility method creates a new type resolution in interface resolution stage, resolves a type representation, and maps it into a generic environment; this is used by the expression type checker to resolve types appearing in expression context: + +Another way to use this class is to call a static utility method that creates a new type resolution in interface resolution stage, resolves a type representation, and maps it into a generic environment to get a \IndexSource{contextual type}contextual type. This is used by the expression type checker to resolve types appearing in expression context: \begin{itemize} \item \texttt{resolveContextualType()} \end{itemize} -Utility methods used by the implementation of \texttt{resolveType()}: +The actual implementation of \texttt{resolveType()} uses these getter methods: \begin{itemize} \item \texttt{getDeclContext()} returns the declaration context where this type representation appears. \item \texttt{getASTContext()} returns the global AST context singleton. \item \texttt{getStage()} returns the current \texttt{TypeResolutionStage}. +\item \texttt{getOptions()} returns the current \texttt{TypeResolutionOptions}. +\item \texttt{getGenericSignature()} returns the current \texttt{GenericSignature} in interface resolution stage. \end{itemize} \IndexSource{type resolution stage} @@ -1131,15 +921,13 @@ \subsection*{Type Resolution Interface} \end{itemize} \apiref{ResolveTypeRequest}{class} -The \texttt{TypeResolution::resolveType()} entry point evaluates the \texttt{ResolveTypeRequest}. While this request is not cached, it uses the request evaluator infrastructure to detect and diagnose circular type resolution. The evaluation function creates an instance of a visitor class, which recursively visits the given type representation. +The implementation of the \texttt{TypeResolution::resolveType()} entry point evaluates the \texttt{ResolveTypeRequest}. 
While this request is not cached, it uses the request evaluator infrastructure to detect and diagnose circular type resolution. The evaluation function destructures the given type representation by kind using a visitor class. \apiref{TypeResolver}{class} A recursive visitor, internal to the implementation of type resolution. The methods of this visitor implement type resolution for each kind of type representation. -\subsection*{Identifier Type Representations} +\subsection*{Type Representations} -\IndexSource{type representation} -\IndexSource{identifier type representation} Key source files: \begin{itemize} \item \SourceFile{include/swift/AST/TypeRepr.h} @@ -1147,18 +935,33 @@ \subsection*{Identifier Type Representations} \end{itemize} \apiref{TypeRepr}{class} -Abstract base class for type representations. +Abstract base class for \IndexSource{type representation}type representations. This class hierarchy resembles \texttt{TypeBase}, \texttt{Decl}, \texttt{ProtocolConformance}, and others. -\apiref{IdentTypeRepr}{class} -Abstract base class for identifier type representations. +Type representations store a source location and kind: +\begin{itemize} +\item \texttt{getLoc()}, \texttt{getSourceRange()} return the source location and source range of this type representation. +\item \texttt{getKind()} returns the \texttt{TypeReprKind}. +\end{itemize} +Each \texttt{TypeReprKind} corresponds to a subclass of \texttt{TypeRepr}. Instances of subclasses support safe downcasting via the \verb|isa<>|, \verb|cast<>| and \verb|dyn_cast<>| template functions: +\begin{Verbatim} +if (auto *identRepr = dyn_cast<IdentTypeRepr>(typeRepr)) + ... + +auto *funcRepr = cast<FunctionTypeRepr>(typeRepr); +... -\apiref{ComponentIdentTypeRepr}{class} -Abstract base class for components of identifier type representations. +assert(isa<MemberTypeRepr>(typeRepr)); +\end{Verbatim} + +\subsection*{Identifier Type Representations} + +\apiref{DeclRefTypeRepr}{class} +Abstract base class for identifier and member type representations.
\begin{itemize} -\item \texttt{getNameRef()} returns the name stored in this type representation. -\item \texttt{getNameLoc()} returns the source location of the name stored in this type representation. +\item \texttt{getNameRef()} returns the identifier. +\item \texttt{getGenericArgs()} returns the array of \IndexSource{generic arguments}generic argument type representations, if any were written. \end{itemize} -Recall that type representations cache the type declaration found by name lookup to speed up the case where multiple type resolution stages resolve the same type representation: +Recall that type representations cache the type declaration found by name lookup to speed up the case where both type resolution stages resolve the same type representation: \begin{itemize} \item \texttt{getBoundDecl()} returns the cached type declaration. \item \texttt{getDeclContext()} for the first component only, returns the outer declaration context where unqualified lookup found the type declaration. @@ -1166,24 +969,8 @@ \subsection*{Identifier Type Representations} \end{itemize} Note that \texttt{getDeclContext()} is not always the same as the declaration context of the type declaration, because the type declaration might be a member of a conformed protocol or superclass of the declaration context, having been reached indirectly. However, the declaration context is always one of the outer declaration contexts in which the type representation appears. -\apiref{SimpleIdentTypeRepr}{class} -A non-generic component of an identifier type representation. - -\IndexSource{generic arguments} -\apiref{GenericIdentTypeRepr}{class} -A generic component of an identifier type representation. -\begin{itemize} -\item \texttt{getGenericArgs()} returns the generic arguments of this component as an array of type representations. -\end{itemize} - -\apiref{CompoundIdentTypeRepr}{class} -An identifier type representation consisting of two or more components. 
-\begin{itemize} -\item \texttt{getComponents()} returns an array of \texttt{ComponentIdentTypeRepr}. -\end{itemize} - -\apiref{resolveIdentTypeComponent()}{function} -The \texttt{TypeResolver()} visitor calls this static function in \texttt{TypeCheckType.cpp} to resolve identifier type representations. +\apiref{IdentTypeRepr}{class} +Subclass of \texttt{DeclRefTypeRepr}. Represents something like \texttt{Int} or \texttt{Array<Int>}. \subsection*{Unqualified Lookup} @@ -1208,6 +995,14 @@ \subsection*{Unqualified Lookup} \apiref{resolveTypeDecl()}{function} Resolves an unqualified reference to a type declaration to a type. This handles the behavior where generic arguments omitted when referencing a type within its own context are deduced to be the corresponding generic parameters. It also handles various edge-case behaviors where the referenced type declaration was a nested type of a different nominal type, such as a superclass or conformed protocol, and requires substitutions. +\subsection*{Member Type Representations} + +\apiref{MemberTypeRepr}{class} +Subclass of \texttt{DeclRefTypeRepr} representing a member type representation. +\begin{itemize} +\item \texttt{getBase()} returns the base type representation. +\end{itemize} + \subsection*{Qualified Lookup} Key source files: @@ -1227,29 +1022,36 @@ \subsection*{Qualified Lookup} Wrapper around qualified lookup which only considers type declarations, and resolves type witnesses of concrete types when it finds an associated type declaration with a concrete base type. \apiref{TypeChecker::isUnsupportedMemberTypeAccess()}{function} -Determines if a member type access is invalid. This includes such oddities as accessing a member type of an unbound generic type referencing a type alias (Section~\ref{unbound generic types}), or accessing a dependent protocol type alias where the base type is the protocol itself (Section~\ref{protocol type alias}). +Determines if a member type access is invalid.
This includes such oddities as accessing a member type of an unbound generic type referencing a type alias (\SecRef{unbound generic types}), or accessing a dependent protocol type alias where the base type is the protocol itself (\SecRef{member type repr}). -\subsection*{Checking Generic Arguments} +\subsection*{Applying Generic Arguments} -\IndexSource{generic arguments} \begin{itemize} \item \SourceFile{include/swift/AST/Requirement.h} \item \SourceFile{lib/AST/Requirement.cpp} \item \SourceFile{lib/Sema/TypeCheckGeneric.cpp} \end{itemize} \apiref{applyGenericArguments()}{function} -Constructs a nominal type or type alias type, given a type declaration and a list of generic arguments. In the interface resolution stage, also checks that the generic arguments satisfy the requirements of the type declaration's generic signature. +Constructs a nominal type or type alias type, given a type declaration and a list of \IndexSource{generic arguments}generic arguments. In the interface resolution stage, also checks that the generic arguments satisfy the requirements of the type declaration's generic signature. + +\apiref{TypeResolution::applyUnboundGenericArguments()}{method} +Factored out of \texttt{applyGenericArguments()}; builds a substitution map from the base type and generic arguments, and checks the requirements of a \IndexSource{generic declaration}generic declaration. + +\apiref{TypeResolution::checkContextualRequirements()}{method} +Factored out of \texttt{applyGenericArguments()}; builds the substitution map from the base type, and checks the requirements of \IndexSource{contextually-generic declaration}contextually-generic declarations. -\IndexSource{requirement} -\IndexSource{substituted requirement} \apiref{Requirement}{class} -See also Section~\ref{genericsigsourceref}. +See also \SecRef{genericsigsourceref}.
Requirement substitution: +\begin{itemize} +\item \texttt{subst()} applies a substitution map to this \IndexSource{requirement}requirement, returning the \IndexSource{substituted requirement}substituted requirement. +\end{itemize} +Checking requirements: \begin{itemize} -\item \texttt{subst()} applies a substitution map to this requirement, returning the substituted requirement. -\item \texttt{isSatisfied()} answers if the substituted requirement is satisfied, implementing Algorithm~\ref{reqissatisfied}. +\item \texttt{checkRequirement()} answers if a single requirement is satisfied, implementing \AlgRef{reqissatisfied}. Any conditional requirements are returned via a \texttt{SmallVector} supplied by the caller, who must then check these requirements recursively. +\item \texttt{checkRequirements()} checks each requirement in an array; if any have their own conditional requirements, those are checked too. This implements \AlgRef{check generic arguments algorithm} for callers that need to check a condition without emitting diagnostics. For example, this is used by \texttt{checkConformance()} to check conditional requirements in \SecRef{extensionssourceref}. \end{itemize} \apiref{checkGenericArgumentsForDiagnostics()}{function} -Implements Algorithm~\ref{check generic arguments algorithm} using \texttt{Requirement::isSatisfied()}. +Uses \texttt{Requirement::checkRequirement()} to implement a variant of \AlgRef{check generic arguments algorithm} that records additional information for diagnostics. This is used by type resolution and the \IndexSource{conformance checker}conformance checker.
-\end{document} \ No newline at end of file +\end{document} diff --git a/docs/Generics/chapters/type-substitution-summary.tex b/docs/Generics/chapters/type-substitution-summary.tex index c0eb8a8be69f3..49768a938117d 100644 --- a/docs/Generics/chapters/type-substitution-summary.tex +++ b/docs/Generics/chapters/type-substitution-summary.tex @@ -4,39 +4,42 @@ \chapter{Type Substitution Summary}\label{notation summary} -We've been using an algebraic notation for expressing identities among types, conformances and substitution maps. The formalism was gradually introduced over the course of Chapter \ref{substmaps}~and~\ref{conformances}, and this chapter filled in the gaps around conformance paths and dependent member type substitution. Now, we're going to restate the rules of this algebra in full to serve as a convenient reference. +We've been using the \index{type substitution}type substitution algebra for describing identities among types, conformances and substitution maps. The formalism was introduced over the course of Chapters \ref{substmaps}, \ref{conformances}, and \ref{conformance paths}. The entities being operated upon: \begin{itemize} \item \IndexSet{proto}{\ProtoObj}$\ProtoObj$ is the set of all \index{protocol declaration}protocols. -\item \IndexSet{type}{\TypeObj{G}}$\TypeObj{G}$ is the set of all \index{interface type}interface types valid in the \index{generic signature}generic signature $G$ (see Section~\ref{derived req}). +\item \IndexSet{type}{\TypeObj{G}}$\TypeObj{G}$ is the set of all \index{interface type}interface types valid in the \index{generic signature}generic signature $G$ (see \SecRef{derived req}). \item \IndexSet{sub}{\SubMapObj{G}{H}}$\SubMapObj{G}{H}$ is the set of all \index{substitution map}substitution maps with \index{input generic signature}input generic signature $G$ and \index{output generic signature}output generic signature $H$. 
\item \IndexSet{conf}{\ConfObj{G}}$\ConfObj{G}$ is the set of all \index{conformance}conformances with output generic signature $G$. \item \IndexSet{confp}{\ConfPObj{P}{G}}$\ConfPObj{P}{G}$ is the set of all conformances to protocol \texttt{P} with output generic signature $G$. \item \IndexSet{assoctype}{\AssocTypeObj{P}}$\AssocTypeObj{P}$ is the set of all \index{associated type declaration}associated type declarations in protocol $\texttt{P}$. \item \IndexSet{assocconf}{\AssocConfObj{P}}$\AssocConfObj{P}$ is the set of all \index{associated conformance requirement}associated conformance requirements in protocol $\texttt{P}$. +\item \IndexSet{req}{\ReqObj{G}}$\ReqObj{G}$ is the set of all \index{requirement}requirements containing interface types of $G$. \end{itemize} A composition operation is defined on various kinds of entities, written as \index{$\otimes$}$A \otimes B$. If a juxtaposition of three entities can be evaluated in two different ways, the result is always the same, so the composition operator is \emph{associative}: $A\otimes (B\otimes C) = (A\otimes B) \otimes C$. -Substitution maps act on the right of types, conformances, and other substitution maps. This gives us \index{type substitution}type substitution (Section~\ref{substmaps}), \index{conformance substitution map}conformance substitution (Chapter~\ref{conformances}) and \index{substitution map composition}substitution map composition (Section~\ref{submapcomposition}). This action produces a new entity of the same kind as the original: +Substitution maps act on the right of types, conformances, other substitution maps, and requirements. This gives us \index{type substitution}type substitution (\SecRef{substmaps}), \index{conformance substitution map}conformance substitution (\ChapRef{conformances}), \index{substitution map composition}substitution map composition (\SecRef{submapcomposition}), and \index{substituted requirement}requirement substitution (\SecRef{checking generic arguments}). 
This action produces a new entity of the same kind as the original: \begin{gather*} \TypeObj{G}\otimes\SubMapObj{G}{H}\longrightarrow\TypeObj{H}\\ \ConfObj{G}\otimes\SubMapObj{G}{H}\longrightarrow\ConfObj{H}\\ -\SubMapObj{F}{G}\otimes\SubMapObj{G}{H}\longrightarrow\SubMapObj{F}{H} +\SubMapObj{F}{G}\otimes\SubMapObj{G}{H}\longrightarrow\SubMapObj{F}{H}\\ +\ReqObj{G}\otimes\SubMapObj{G}{H}\longrightarrow\ReqObj{H} \end{gather*} -Protocols act on the left of types (\index{global conformance lookup}global conformance lookup, Section~\ref{conformance lookup}): + +Protocols act on the left of types (\index{global conformance lookup}global conformance lookup, \SecRef{conformance lookup}): \[\ProtoObj\otimes\TypeObj{G}\longrightarrow\ConfObj{G}\] -Associated type declarations act on the left of conformances (\index{type witness}type witness projection, Section~\ref{type witnesses}): +Associated type declarations act on the left of conformances (\index{type witness}type witness projection, \SecRef{type witnesses}): \[\AssocTypeObj{P}\otimes\ConfPObj{P}{G}\longrightarrow\TypeObj{G}\] -Associated conformance requirements act on the left of conformances (\index{associated conformance projection}associated conformance projection, Section~\ref{associated conformances}): +Associated conformance requirements act on the left of conformances (\index{associated conformance projection}associated conformance projection, \SecRef{associated conformances}): \[\AssocConfObj{P}\otimes\ConfPObj{P}{G}\longrightarrow\ConfPObj{P}{G}\] -\paragraph{Substitution maps} +\paragraph{Substitution maps.} A substitution map is completely determined by its action on each generic parameter type and root abstract conformance.
Type substitution with a \index{generic parameter type}generic parameter type projects the replacement type from the substitution map: \[\ttgp{d}{i} \otimes \SubstMapC{\ldots,\,\SubstType{\ttgp{d}{i}}{Int},\,\ldots}{\ldots} := \texttt{Int}\] Conformance substitution with a \index{root abstract conformance}root abstract conformance projects the corresponding conformance from the substitution map: \[ -\Conf{\texttt{T}}{Q} \otimes \SubstMapC{\ldots}{\ldots,\,\SubstConf{T}{G}{Q},\,\ldots} := \Conf{\texttt{G}}{Q} +\ConfReq{T}{Q} \otimes \SubstMapC{\ldots}{\ldots,\,\SubstConf{T}{G}{Q},\,\ldots} := \ConfReq{G}{Q} \] If $\texttt{T}\in\TypeObj{G_1}$, $\ConfReq{T}{P}\in\ConfPObj{P}{G_1}$, $\Sigma_1\in\SubMapObj{G_1}{G_2}$, $\Sigma_2\in\SubMapObj{G_2}{G_3}$, $\Sigma_3\in\SubMapObj{G_3}{G_4}$, then \begin{gather*} @@ -45,27 +48,27 @@ \chapter{Type Substitution Summary}\label{notation summary} (\Sigma_1\otimes \Sigma_2) \otimes \Sigma_3=\Sigma_1\otimes(\Sigma_2\otimes \Sigma_3) \end{gather*} -\paragraph{Conformances} Type witness and associated conformance projection is compatible with conformance substitution. Suppose $\ConfReq{T}{P}\in\ConfPObj{P}{G}$, $\AssocType{[P]A}\in\AssocTypeObj{P}$, $\AssocConf{Self.U}{Q}\in\AssocConfObj{P}$, and $\Sigma\in\SubMapObj{G}{H}$, then +\paragraph{Conformances.} Type witness and associated conformance projection is compatible with conformance substitution. 
Suppose $\ConfReq{T}{P}\in\ConfPObj{P}{G}$, $\AssocType{[P]A}\in\AssocTypeObj{P}$, $\AssocConf{Self.U}{Q}\in\AssocConfObj{P}$, and $\Sigma\in\SubMapObj{G}{H}$, then \begin{gather*} \bigl(\AssocType{[P]A}\otimes \ConfReq{T}{P}\bigr)\otimes \Sigma=\AssocType{[P]A}\otimes (\ConfReq{T}{P}\otimes \Sigma)\\ \bigl(\AssocConf{Self.U}{Q}\otimes \ConfReq{T}{P}\bigr)\otimes \Sigma=\AssocConf{Self.U}{Q}\otimes \bigl(\ConfReq{T}{P}\otimes \Sigma\bigr) \end{gather*} -If $d$ is a \index{nominal type declaration}nominal type declaration conforming to \texttt{P}, and $G$ is the generic signature of the conformance context, then there exists a \index{normal conformance}normal conformance $\Conf{\texttt{T}_d}{P}\in\ConfPObj{P}{G}$, and -\[\protosym{P}\otimes\texttt{T}_d=\Conf{\texttt{T}_d}{P}\] -If $H$ is some other generic signature, $\Sigma\in\SubMapObj{G}{H}$, and $\texttt{T}=\texttt{T}_d\otimes \Sigma$, then there exists a \index{specialized conformance}specialized conformance $\Conf{\texttt{T}}{P}\in\ConfPObj{P}{H}$, and -\[\protosym{P}\otimes\texttt{T}=\Conf{\texttt{T}_d}{P}\otimes \Sigma = \Conf{\texttt{T}}{P}\] -If \texttt{T} is a \index{type parameter}type parameter of $G$ and the conformance requirement $\ConfReq{T}{P}$ can be derived from $G$, then there exists an \index{abstract conformance}abstract conformance $\Conf{\texttt{T}}{P}\in\ConfPObj{G}{P}$, and -\[\protosym{P}\otimes\texttt{T} = \Conf{\texttt{T}}{P}\] +If $d$ is a \index{nominal type declaration}nominal type declaration conforming to \texttt{P}, and $G$ is the generic signature of the conformance context, then there exists a \index{normal conformance}normal conformance $\ConfReq{$\texttt{T}_d$}{P}\in\ConfPObj{P}{G}$, and +\[\protosym{P}\otimes\texttt{T}_d=\ConfReq{$\texttt{T}_d$}{P}\] +If $H$ is some other generic signature, $\Sigma\in\SubMapObj{G}{H}$, and $\texttt{T}=\texttt{T}_d\otimes \Sigma$, then there exists a \index{specialized conformance}specialized conformance $\ConfReq{T}{P}\in\ConfPObj{P}{H}$, and 
+\[\protosym{P}\otimes\texttt{T}=\ConfReq{$\texttt{T}_d$}{P}\otimes \Sigma = \ConfReq{T}{P}\] +If \texttt{T} is a \index{type parameter}type parameter of $G$ and the conformance requirement $\ConfReq{T}{P}$ can be derived from $G$, then there exists an \index{abstract conformance}abstract conformance $\ConfReq{T}{P}\in\ConfPObj{P}{G}$, and +\[\protosym{P}\otimes\texttt{T} = \ConfReq{T}{P}\] Global conformance lookup is compatible with type substitution: \[(\protosym{P}\otimes\texttt{T})\otimes \Sigma = \protosym{P}\otimes(\texttt{T}\otimes \Sigma)\] -\paragraph{Abstract conformances} The type witnesses of an abstract conformance are \index{dependent member type}dependent member types, and the associated conformances are other abstract conformances: +\paragraph{Abstract conformances.} The type witnesses of an abstract conformance are \index{dependent member type}dependent member types, and the associated conformances are other abstract conformances: \begin{gather*} -\AssocType{[P]A} \otimes \Conf{\texttt{T}}{P} = \texttt{T.[P]A}\\ -\AssocConf{Self.U}{Q} \otimes \Conf{\texttt{T}}{P} = \Conf{\texttt{T.U}}{Q} +\AssocType{[P]A} \otimes \ConfReq{T}{P} = \texttt{T.[P]A}\\ +\AssocConf{Self.U}{Q} \otimes \ConfReq{T}{P} = \ConfReq{T.U}{Q} \end{gather*} -Applying a substitution map to an abstract conformance is called \index{local conformance lookup}local conformance lookup (Section~\ref{abstract conformances}).
The first identity above defines substitution of dependent member types in terms of local conformance lookup: +\[\texttt{T.[P]A} \otimes \Sigma = \AssocType{[P]A} \otimes (\ConfReq{T}{P} \otimes \Sigma)\] Every abstract conformance has a unique factorization as a \index{reduced conformance path}reduced conformance path. A conformance path is a sequence of zero or more associated conformance projections applied to a root abstract conformance: \[ \AssocConf{Self.[$\texttt{P}_{n+1}$]$\texttt{A}_n$}{$\texttt{P}_n$} @@ -74,7 +77,7 @@ \chapter{Type Substitution Summary}\label{notation summary} \otimes \AssocConf{Self.[$\texttt{P}_2$]$\texttt{A}_1$}{$\texttt{P}_1$} \otimes -\Conf{\texttt{T}_0}{$\texttt{P}_0$} +\ConfReq{$\texttt{T}_0$}{$\texttt{P}_0$} \] When applied to a non-root abstract conformance, local conformance lookup first factors the abstract conformance as a conformance path. Thus, applying a substitution map to a dependent member type breaks down into this juxtaposition of entities, from \emph{right} to \emph{left}: \begin{itemize} @@ -84,4 +87,4 @@ \chapter{Type Substitution Summary}\label{notation summary} \end{itemize} The substituted type of a dependent member type is computed by taking a root conformance from the substitution map, following a chain of zero or more associated conformance projections, and finally projecting the type witness. -\end{document} \ No newline at end of file +\end{document} diff --git a/docs/Generics/chapters/types.tex b/docs/Generics/chapters/types.tex index 2e7ff20fa36d8..a9e572094d2db 100644 --- a/docs/Generics/chapters/types.tex +++ b/docs/Generics/chapters/types.tex @@ -119,7 +119,7 @@ \chapter{Types}\label{types} \end{minipage} \medskip -Generic parameters are declared in a generic parameter list attached to a declaration. They are scoped to the body of this declaration, and can be uniquely identified by a pair of integers, the \emph{depth} and \emph{index}.
Generic parameter types can also store a name, but the name only has significance when printed in diagnostics. +Generic parameters are declared in a generic parameter list attached to a declaration. They are scoped to the body of this declaration, and can be uniquely identified by a pair of integers, the \emph{depth} and \emph{index}. Generic parameter types can also store a name, but the name only has significance when printed in \index{diagnostic!printing generic parameter type}diagnostics. \medskip @@ -144,19 +144,19 @@ \chapter{Types}\label{types} \end{minipage} \medskip -Outside of type resolution, type representations do not play a big role in the compiler, so we punt on the topic of type representations until Chapter~\ref{typeresolution} and just focus on types for now. For our current purposes, it suffices to say that type resolution is really just one possible mechanism by which types are constructed. The expression checker builds types by solving a constraint system, and the generics system builds types via substitution, to give two examples. +Outside of type resolution, type representations do not play a big role in the compiler, so we punt on the topic of type representations until \ChapRef{typeresolution} and just focus on types for now. For our current purposes, it suffices to say that type resolution is really just one possible mechanism by which types are constructed. The expression checker builds types by solving a constraint system, and the generics system builds types via substitution, to give two examples. \paragraph{Structural components.} A type is constructed from structural components, which may either be other types, or non-type information. 
Common examples include: nominal types, which consist of a pointer to a declaration, together with a list of \index{generic argument}generic argument types; \index{tuple type}tuple types, which have element types and labels; and \index{function type}function types, which contain parameter types, return types, and various additional bits like \texttt{@escaping} and \texttt{inout}. We will give a full accounting of all type kinds and their structural components in the second half of this chapter. Once created, types are immutable. To say that a type \emph{contains} another type means that the latter appears as a structural component of the former, perhaps nested several levels deep. We will often talk about \emph{replacing} a type contained by another type. This is understood as constructing a new type with the same kind as the original type, preserving all structural components except for the one being replaced. The original type is never mutated directly. -More generally, types can be transformed by taking the type apart by kind, recursively transforming each structural component, and forming a new type of the same kind from the new components. To preview Chapter~\ref{substmaps}, if \texttt{Element} is a generic parameter type, the type \texttt{Array<Int>} can be formed from \texttt{Array<Element>} by replacing \texttt{Element} with \texttt{Int}; this is called \emph{type substitution}. The compiler provides various utilities to simplify the task of implementing recursive walks and transformations over kinds of types; type substitution is one example of such a transformation. +More generally, types can be transformed by taking the type apart by kind, recursively transforming each structural component, and forming a new type of the same kind from the new components.
To preview \ChapRef{substmaps}, if \texttt{Element} is a generic parameter type, the type \texttt{Array<Int>} can be formed from \texttt{Array<Element>} by replacing \texttt{Element} with \texttt{Int}; this is called \emph{type substitution}. The compiler provides various utilities to simplify the task of implementing recursive walks and transformations over kinds of types; type substitution is one example of such a transformation. \paragraph{Canonical types.} It is possible for two types to differ by their spelling, and yet be equivalent semantically: \begin{itemize} \item The Swift language defines some shorthands for common types, such as \texttt{T?} for \texttt{Optional<T>}, \texttt{[T]} for \texttt{Array<T>}, and \texttt{[K:\ V]} for \texttt{Dictionary<K,\ V>}. \item \index{type alias declaration}Type alias declarations introduce a new name for some existing underlying type, equivalent to writing out the \index{underlying type}underlying type in place of the type alias. The standard library, for example, declares a type alias \IndexDefinition{Void type@\texttt{Void} type}\texttt{Void} with underlying type \texttt{()}. -\item Another form of fiction along these lines is the preservation of generic parameter names. \index{generic parameter type}Generic parameter types written in source have a name, like ``\texttt{Element},'' and should be printed back as such in diagnostics, but internally they are uniquely identified in their generic signature by a pair of integers, the \index{depth}depth and the \index{index}index. This is detailed in Chapter~\ref{generic declarations}. +\item Another form of fiction along these lines is the preservation of generic parameter names. \index{generic parameter type}Generic parameter types written in source have a name, like ``\texttt{Element},'' and should be printed back as such in diagnostics, but internally they are uniquely identified in their generic signature by a pair of integers, the \index{depth}depth and the \index{index}index.
This is detailed in \SecRef{generic params}. \end{itemize} These constructions are the so-called \IndexDefinition{sugared type}\emph{sugared types}. A sugared type has a desugaring into a more primitive form in terms of its structural components. The compiler constructs type sugar in \index{type resolution}type resolution, and attempts to preserve it as much as possible when transforming types. Preserving sugar in diagnostics can be especially helpful with more complex type aliases and such. @@ -164,7 +164,7 @@ \chapter{Types}\label{types} The compiler can transform an arbitrary type into a canonical type by the process of \emph{canonicalization}, which recursively replaces sugared types with their desugared form; in this way, \texttt{[(Int?, Void)]} becomes \verb|Array<(Optional<Int>, ())>|. This operation is very cheap; each type caches a pointer to its canonical type, which is computed as needed (so types are not completely immutable, as we said previously; but the mutability cannot be observed from outside). -One notable exception where the type checker does depend on type sugar is the rule for default initialization of variables: if the variable's type is declared as the sugared optional type \texttt{T?} for some \texttt{T}, the variable's \index{initial value expression}initial value \index{expression}expression is assumed to be \texttt{nil} if none was provided. Spelling the type as \texttt{Optional<Int>} avoids the default initialization behavior: +One notable exception where the type checker does depend on type sugar is the rule for default initialization of variables: if the variable's type is declared as the \index{optional sugared type}sugared optional type \texttt{T?} for some \texttt{T}, the variable's \index{initial value expression}initial value \index{expression}expression is assumed to be \texttt{nil} if none was provided. Spelling the type as \texttt{Optional<Int>} avoids the default initialization behavior: \begin{Verbatim} var x: Int?
print(x) // prints `nil' @@ -173,7 +173,7 @@ \chapter{Types}\label{types} print(y) // error: use of uninitialized variable `y' \end{Verbatim} -Another exception is \index{requirement inference}requirement inference with a \index{generic type alias}generic type alias (Section~\ref{requirementinference}). +Another exception is \index{requirement inference}requirement inference with a \index{generic type alias}generic type alias (\SecRef{requirementinference}). \paragraph{Type equality.} Types are uniquely allocated, which is made possible by them being immutable. A type \texttt{(Int) -> ()} has a unique pointer identity within a compilation; inside the tuple type \texttt{((Int) -> (), (Int) -> ())}, the two element types have the same pointer value in memory. From this, three levels of equality are defined on types: \begin{enumerate} @@ -214,7 +214,7 @@ \chapter{Types}\label{types} \end{tikzpicture} \end{figure} -Figure~\ref{type equality fig} demonstrates type equality in the following extension declaration, where the \texttt{Key} (\ttgp{0}{0}) and \texttt{Value} (\ttgp{0}{1}) generic parameters of \texttt{Dictionary} are declared equivalent with a same-type requirement: +\FigRef{type equality fig} demonstrates type equality in the following extension declaration, where the \texttt{Key} (\ttgp{0}{0}) and \texttt{Value} (\ttgp{0}{1}) generic parameters of \texttt{Dictionary} are declared equivalent with a same-type requirement: \begin{Verbatim} extension Dictionary where Key == Value { func foo(a: Key?, b: Optional<Key>, c: Value?, d: Optional<Value>) {} @@ -228,7 +228,7 @@ \chapter{Types}\label{types} \item When we consider the generic signature of the extension, we're left with just one reduced type, because both canonical types reduce to \texttt{Optional<\ttgp{0}{0}>} via the same-type requirement. Thus, all four of the original types are equal under reduced type equality.
\end{itemize} -Intuitively, reduced type equality means ``equivalent as a consequence of one or more same-type requirements.'' In Section~\ref{derived req} we make this notion precise via the derived requirements formalism, and then define reduced type equality in Section~\ref{reducedtypes}. Presenting a computable algorithm for finding reduced types is one of the overarching goals of this book; key developments take place in Section~\ref{rewritesystemintro} and Chapter~\ref{symbols terms rules}. +Reduced type equality means ``equivalent as a consequence of one or more same-type requirements.'' We will define this equivalence of type parameters in \SecRef{type params} using the derived requirements formalism, and then generalize it to all interface types in \SecRef{genericsigqueries}. Presenting a computable algorithm for reduced type equality is one of our main results in this book; key developments take place in \SecRef{rewritesystemintro} and \ChapRef{symbols terms rules}. \section{Fundamental Types}\label{fundamental types} @@ -289,7 +289,7 @@ \section{Fundamental Types}\label{fundamental types} \smallskip -The specialized type where each generic argument is set to the corresponding generic parameter type is the \index{declared interface type}\emph{declared interface type} of the nominal type declaration. It is ``universal'': any specialized type is obtainable from the declared interface type by assigning a replacement type to each generic parameter type. This assignment is known as a \index{substitution map}\emph{substitution map}, and the substitution map defined by the generic arguments of a specialized type is its \index{context substitution map}\emph{context substitution map} (Section~\ref{contextsubstmap}). Finally, in the case where neither the nominal type declaration nor any of its parents are generic, the declaration defines just one \index{fully concrete type}\emph{fully concrete} specialized type, with an empty context substitution map.
+The specialized type where each generic argument is set to the corresponding generic parameter type is the \index{declared interface type}\emph{declared interface type} of the nominal type declaration. It is ``universal'': any specialized type is obtainable from the declared interface type by assigning a replacement type to each generic parameter type. This assignment is known as a \index{substitution map}\emph{substitution map}, and the substitution map defined by the generic arguments of a specialized type is its \index{context substitution map}\emph{context substitution map} (\SecRef{contextsubstmap}). Finally, in the case where neither the nominal type declaration nor any of its parents are generic, the declaration defines just one \index{fully-concrete type}\emph{fully-concrete} specialized type, with an empty context substitution map. \begin{figure}[b!]\captionabove{Bound and unbound dependent member types}\label{type params fig} @@ -345,10 +345,10 @@ \section{Fundamental Types}\label{fundamental types} \end{center} \end{figure} -\paragraph{Generic parameter types.} Conceptually, a \IndexDefinition{generic parameter type}generic parameter type abstracts over a generic argument provided by the caller. Generic parameter types are declared by \index{generic parameter declaration}generic parameter declarations. The sugared form references the declaration, and prints as the declaration's name; the canonical form only stores a depth and an index. Care must be taken not to print canonical generic parameter types in diagnostics, to avoid surfacing the ``\ttgp{1}{2}'' notation to the user. (Section~\ref{genericsigsourceref} shows a trick to transform a canonical generic parameter type back into its sugared form using a generic signature.) +\paragraph{Generic parameter types.} A \IndexDefinition{generic parameter type}generic parameter type abstracts over a generic argument provided by the caller. 
Generic parameter types are declared by \index{generic parameter declaration}generic parameter declarations, described in \SecRef{generic params}. The sugared form references the declaration, and prints as the declaration's name; the canonical form only stores a depth and an index. Care must be taken not to print canonical generic parameter types in \index{diagnostic!printing generic parameter type}diagnostics, to avoid surfacing the ``\ttgp{1}{2}'' notation to the user. (We will show how to transform a canonical generic parameter type into its sugared form at the end of \SecRef{genericsigsourceref}.) \paragraph{Dependent member types.} -A \IndexDefinition{dependent member type}dependent member type abstracts over a concrete type that fulfills an associated type requirement. There are two structural components: +A \IndexDefinition{dependent member type}dependent member type abstracts over a concrete type that fulfills an associated type requirement. It has two structural components: \begin{itemize} \item A base type, which is a generic parameter type or another dependent member type. \item An \index{identifier}identifier (in which case this is an \IndexDefinition{unbound dependent member type}\emph{unbound} dependent member type), or an \index{associated type declaration}associated type declaration (in which case it is \IndexDefinition{bound dependent member type}\emph{bound}). @@ -356,30 +356,30 @@ \section{Fundamental Types}\label{fundamental types} Unbound dependent member types fit somewhere in between type representations and ``true'' types, in that they are a syntactic construct. They appear in the \index{structural resolution stage}structural resolution stage when building a generic signature. Most type resolution happens in the \index{interface resolution stage}interface resolution stage, after a generic signature is available, and thus dependent member types appearing in the interface types of declarations are always bound.
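As a small illustration (the protocol and function names here are invented for this sketch), a dependent member type arises whenever a generic function mentions an associated type of one of its generic parameters:
\begin{Verbatim}
protocol Container {
  associatedtype Element
}

func firstElement<C: Container>(_ c: C) -> C.Element? {
  return nil  // body elided; only the signature matters here
}
\end{Verbatim}
In the interface resolution stage, the written type \texttt{C.Element} resolves to a bound dependent member type: its base type is the generic parameter type for \texttt{C}, and its second component is the \texttt{Element} associated type declaration of \texttt{Container}. In the structural resolution stage, before a generic signature is available, only the identifier ``\texttt{Element}'' would be recorded, giving the unbound form.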
-If \ttbf{T} is the base type, \texttt{P} is a protocol and \texttt{A} is an associated type declared inside this protocol, we denote the bound dependent member type by \texttt{\textbf{T}.[P]A}, and the unbound dependent member type by \texttt{\textbf{T}.A}. The base type \ttbf{T} may be another dependent member type, so we can have a series of member type accesses, like \texttt{\ttgp{0}{0}.[P]A.[Q]B}. Figure~\ref{type params fig} shows the recursive structure of two dependent member types, bound and unbound. +If \texttt{T} is the base type, \texttt{P} is a protocol and \texttt{A} is an associated type declared inside this protocol, we denote the bound dependent member type by \texttt{T.[P]A}, and the unbound dependent member type by \texttt{T.A}. The base type \texttt{T} may be another dependent member type, so we can have a series of member type accesses, like \texttt{\ttgp{0}{0}.[P]A.[Q]B}. \FigRef{type params fig} shows the recursive structure of two dependent member types, bound and unbound. -A dependent member type is ``dependent'' in the C++ sense, \emph{not} a \index{dependent type}type dependent on a value in the \index{lambda cube}``lambda cube'' sense. Generic parameter types and dependent member types are together known as \emph{type parameters}. A type that might contain type parameters but is not necessarily a type parameter itself is called an \IndexDefinition{interface type}\emph{interface type}. The above summary necessarily leaves many questions unanswered, and understanding the behaviors of type parameters is a recurring theme throughout the book: +A dependent member type is ``dependent'' in the C++ sense, \emph{not} a \index{dependent type}type dependent on a value in the \index{lambda cube}``lambda cube'' sense. Generic parameter types and dependent member types are together known as \emph{type parameters}. 
A type that might contain type parameters but is not necessarily a type parameter itself is called an \IndexDefinition{interface type}\emph{interface type}. The above summary necessarily leaves many questions unanswered, because a significant portion of the rest of the book is devoted to type parameters. Key topics include: \begin{itemize} -\item What it means for a type parameter to be semantically valid (Section~\ref{derived req}). -\item Generic signature queries (Section~\ref{genericsigqueries}). -\item Dependent member type substitution (Section~\ref{abstract conformances} and Chapter~\ref{conformance paths}). -\item Type resolution with bound and unbound type parameters (Section~\ref{typeresolution}). +\item Semantic validity of type parameters (\SecRef{derived req}, \SecRef{type params}). +\item Generic signature queries (\SecRef{genericsigqueries}). +\item Dependent member type substitution (\SecRef{abstract conformances}, \ChapRef{conformance paths}). +\item Type resolution with bound and unbound type parameters (\ChapRef{typeresolution}). \end{itemize} \paragraph{Archetype types.} -Type parameters derive their meaning from the requirements of a generic signature; they are only ``names'' of external entities, in a sense. \IndexDefinition{archetype type}Archetypes are an alternate ``self-describing'' representation. Archetypes are instantiated from a \emph{generic environment}, which stores a generic signature together with other information (Chapter~\ref{genericenv}). +Type parameters derive their meaning from the requirements of a generic signature; they are only ``names'' of external entities, in a sense. \IndexDefinition{archetype type}Archetypes are an alternate ``self-describing'' representation. Archetypes are instantiated from a \emph{generic environment}, which stores a generic signature together with other information (\ChapRef{genericenv}). -Archetypes occur inside \index{expression}expressions and \index{SIL}SIL instructions. 
Archetypes also represent references to opaque return types (Section~\ref{opaquearchetype}) and the type of the payload inside of an existential (Section~\ref{open existential archetypes}). In diagnostics, an archetype is printed as the type parameter it represents. We will denote by $\archetype{T}$ the archetype for the type parameter \texttt{T} in some generic environment understood from context. A type that might contain archetypes but is not necessarily an archetype itself is called a \index{contextual type}\emph{contextual type}. +Archetypes occur inside \index{expression}expressions and \index{SIL}SIL instructions. Archetypes also represent references to opaque return types (\SecRef{opaquearchetype}) and the type of the payload inside of an existential (\SecRef{open existential archetypes}). In \index{diagnostic!printing archetype type}diagnostics, an archetype is printed as the type parameter it represents. We will denote by $\archetype{T}$ the archetype for the type parameter \texttt{T} in some generic environment understood from context. A type that might contain archetypes but is not necessarily an archetype itself is called a \index{contextual type}\emph{contextual type}. \medskip -The fundamental type kinds we surveyed above---nominal types, type parameters, and archetypes---are \textsl{the Swift types that can conform to protocols}. In other words, they can satisfy the left-hand side of a \index{conformance requirement}conformance requirement, with the details for each type given later in Section~\ref{conformance lookup}. Also important are \index{constraint type}\emph{constraint types}, appearing on the \emph{right hand side} of a conformance requirement. Constraint types themselves are never the types of value-producing expressions. (We will see shortly that type-erased values are represented by an existential type, wrapping a constraint type in a level of indirection.) 
+The fundamental type kinds we surveyed above---nominal types, type parameters, and archetypes---are \textsl{the Swift types that can conform to protocols}. In other words, they can satisfy the left-hand side of a \index{conformance requirement}conformance requirement, with the details for each type given later in \SecRef{conformance lookup}. Also important are \index{constraint type}\emph{constraint types}; these are \textsl{the types appearing on the right hand side of conformance requirements}. Constraint types themselves are never the types of value-producing expressions. (Type-erased values are represented by existential types, wrapping a constraint type in a level of indirection.) \paragraph{Protocol types.} -A protocol type is the most fundamental kind of constraint type; a conformance requirement involving any other kind of constraint type can always be split up into simpler conformance requirements. A \IndexDefinition{protocol type}protocol type is a kind of \index{nominal type}nominal type, so it will have a \index{parent type}parent type if the protocol declaration is nested inside of another nominal type declaration. Unlike other nominal types, protocols cannot be nested in generic contexts (Section~\ref{nested nominal types}), so neither the protocol type itself nor any of its parents can have generic arguments. Thus, there is exactly one protocol type corresponding to each protocol declaration. +A protocol type is the most fundamental kind of constraint type; a conformance requirement involving any other kind of constraint type can always be split up into simpler conformance requirements. A \IndexDefinition{protocol type}protocol type is a kind of \index{nominal type}nominal type, so it will have a \index{parent type}parent type if the protocol declaration is nested inside of another nominal type declaration. 
Unlike other nominal types, protocols cannot be nested in generic contexts (\SecRef{nested nominal types}), so neither the protocol type itself nor any of its parents can have generic arguments. Thus, there is exactly one protocol type corresponding to each protocol declaration. \paragraph{Protocol composition types.} -A \IndexDefinition{protocol composition type}protocol composition type is a constraint type with a list of members. On the right hand side of a conformance requirement, protocol compositions \emph{expand} into a series of requirements for each member of the composition (Section~\ref{requirement desugaring}). The members can include protocol types, a class type (at most one), and the \Index{AnyObject@\texttt{AnyObject}}\texttt{AnyObject} layout constraint: +A \IndexDefinition{protocol composition type}protocol composition type is a constraint type with a list of members. On the right hand side of a conformance requirement, protocol compositions \emph{decompose} into a series of requirements for each member of the composition (\SecRef{requirement desugaring}). The members can include protocol types, a class type (at most one), and the \Index{AnyObject@\texttt{AnyObject}}\texttt{AnyObject} layout constraint: \begin{quote} \begin{verbatim} P & Q @@ -389,11 +389,11 @@ \section{Fundamental Types}\label{fundamental types} \end{quote} \paragraph{Parameterized protocol types.} -A \index{constrained protocol type|see{parameterized protocol type}}\IndexDefinition{parameterized protocol type}parameterized protocol type stores a protocol type together with a list of generic arguments. As a constraint type, it expands into a conformance requirement together with one or more same-type requirements, for each of the protocol's \emph{primary associated types} (Section~\ref{protocols}). The written representation looks just like a generic nominal type, except the named declaration is a protocol, for example, \texttt{Sequence}. 
Parameterized protocol types were introduced in Swift 5.7 \cite{se0346} (the evolution proposal calls them ``constrained protocol types''). +A \index{constrained protocol type|see{parameterized protocol type}}\IndexDefinition{parameterized protocol type}parameterized protocol type stores a protocol type together with a list of generic arguments. As a constraint type, it expands into a conformance requirement together with one or more same-type requirements, for each of the protocol's \emph{primary associated types} (\SecRef{protocols}). The written representation looks just like a generic nominal type, except the named declaration is a protocol, for example, \texttt{Sequence<Int>}. Parameterized protocol types were introduced in \IndexSwift{5.7}Swift 5.7~\cite{se0346} (the evolution proposal calls them ``constrained protocol types''). \section{More Types}\label{more types} -Now we will look at the various \IndexDefinition{structural type}\emph{structural types} which are part of the language. (Not to be confused with the types produced by the \emph{structural resolution stage}, which is discussed in Chapter~\ref{typeresolution}.) +Now we will look at the various \IndexDefinition{structural type}\emph{structural types} which are part of the language. (Not to be confused with the types produced by the \emph{structural resolution stage}, which is discussed in \ChapRef{typeresolution}.) \begin{wrapfigure}[15]{r}{6.5cm} \begin{center} @@ -411,11 +411,11 @@ \section{More Types}\label{more types} \end{wrapfigure} \paragraph{Existential types.} -An \index{existential type}existential type has one structural component, the \emph{constraint type}. An existential value is a container for a value with some unknown dynamic type that is known to satisfy the constraint; to the right we show the existential type \verb|any (P & Q)|, which stores a value conforming to both \texttt{P} and \texttt{Q}.
+An \index{existential type}existential type has one structural component, the \emph{constraint type}. An existential value is a container holding a value of some unknown dynamic type that is known to satisfy the constraint; to the right we show the existential type \verb|any (P & Q)|, which stores a value conforming to both \texttt{P} and \texttt{Q}. -The \texttt{any} keyword was added in Swift~5.6~\cite{se0355}; in \index{history}Swift releases prior, existential types and constraint types were the same concept in the language and implementation. (For the sake of source compatibility, a constraint type without the \texttt{any} keyword still resolves to an existential type except when it appears on the right-hand side of a conformance requirement.) +The \texttt{any} keyword was added in \IndexSwift{5.6}Swift~5.6~\cite{se0335}; in Swift releases prior, existential types and constraint types were the same concept in the language and implementation. (For the sake of source compatibility, a constraint type without the \texttt{any} keyword still resolves to an existential type except when it appears on the right-hand side of a conformance requirement.) -Existential types are covered in Chapter~\ref{existentialtypes}. +Existential types are covered in \ChapRef{existentialtypes}. \begin{wrapfigure}[9]{r}{3cm} \begin{tikzpicture} @@ -426,7 +426,7 @@ \section{More Types}\label{more types} \end{tikzpicture} \end{wrapfigure} -\paragraph{Metatype types.} Types can be used as callees in \index{call expression}call expressions, \texttt{\textbf{T}(...)}; this is shorthand for a constructor invocation \texttt{\textbf{T}.init(...)}. They can serve as the base of a static method call, \texttt{\textbf{T}.foo(...)}, where the type is passed as the \texttt{self} parameter. Finally, types can be directly referenced by the expression \texttt{\textbf{T}.self}.
In all cases, the type becomes a \emph{value}, and this value must itself be assigned a type; this type is called a \emph{metatype}. The metatype of a type \ttbf{T} is written as \texttt{\ttbf{T}.Type}. The type \ttbf{T} is the \IndexDefinition{instance type}\emph{instance type} of the metatype. For example, the type of the expression ``\verb|Int.self|'' is the metatype \texttt{Int.Type}, whose instance type is \verb|Int|. +\paragraph{Metatype types.} A type \texttt{T} can be the callee in a \index{call expression}call expression, \texttt{T(...)}; this is shorthand for a constructor call \texttt{T.init(...)}. It can serve as the base of a static method call, \texttt{T.foo(...)}, where the type is passed as the \texttt{self} parameter. Finally, it can be directly referenced by the expression \texttt{T.self}. In all cases, the type becomes a \emph{value}, and this value must itself be assigned a type; this type is called a \emph{metatype}. The metatype of a type \texttt{T} is written as \texttt{T.Type}. The type \texttt{T} is the \IndexDefinition{instance type}\emph{instance type} of the metatype. For example, the type of the expression ``\verb|Int.self|'' is the metatype \texttt{Int.Type}, whose instance type is \verb|Int|. Metatypes are sometimes referred to as \IndexDefinition{concrete metatype type}\emph{concrete metatypes}, to distinguish them from existential metatypes, which we introduce below. Most concrete metatypes are singleton types with one value, the instance type itself. One exception is that the class metatype for a non-final class also has all subclasses of the class as values. @@ -454,20 +454,20 @@ \section{More Types}\label{more types} \paragraph{Existential metatype types.} An \index{existential metatype type}existential metatype is a container for an unknown concrete metatype, whose instance type is known to satisfy the constraint type.
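For instance (with \texttt{P} and \texttt{S} as invented names for a protocol and a conforming type), an existential metatype value can be formed like this:
\begin{Verbatim}
protocol P {}
struct S: P {}

let m: any P.Type = S.self  // an existential metatype value
\end{Verbatim}
The static type of \texttt{m} is the existential metatype \texttt{any P.Type}; its value is the concrete metatype \texttt{S.self}, which is permitted because \texttt{S} conforms to \texttt{P}.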
-Existential metatypes are not the same as metatypes whose instance type is existential, as we can demonstrate by considering language semantics. If \texttt{P} is a protocol and \texttt{S} is a type conforming to \texttt{P}, then the existential metatype \texttt{any P.Type} can store the value \texttt{S.self}. As the existential type \texttt{any P} does not conform to \texttt{P}, the value \texttt{(any P).self} cannot be stored inside of the existential metatype \texttt{any P.Type}. In fact, the type of this value is the \emph{concrete} metatype \texttt{(any P).Type}. Figure~\ref{existential metatype fig} compares the recursive structure of \texttt{any P.Type} against \texttt{(any P).Type}. +Existential metatypes are not the same as metatypes whose instance type is existential, as we can demonstrate by considering language semantics. If \texttt{P} is a protocol and \texttt{S} is a type conforming to \texttt{P}, then the existential metatype \texttt{any P.Type} can store the value \texttt{S.self}. As the existential type \texttt{any P} does not conform to \texttt{P}, the value \texttt{(any P).self} cannot be stored inside of the existential metatype \texttt{any P.Type}. In fact, the type of this value is the \emph{concrete} metatype \texttt{(any P).Type}. \FigRef{existential metatype fig} compares the recursive structure of \texttt{any P.Type} against \texttt{(any P).Type}. -Prior to the introduction of the \texttt{any} keyword, existential metatypes were written as \texttt{P.Type}, and the concrete metatype of an existential as \texttt{P.Protocol}. This was a source of confusion, because for all non-protocol types \ttbf{T}, \texttt{\textbf{T}.Type} is always a concrete metatype. The below table compares the old and new syntax: -\begin{quote} +Prior to the introduction of the \texttt{any} keyword, existential metatypes were written as \texttt{P.Type}, and the concrete metatype of an existential as \texttt{P.Protocol}. 
This was a source of confusion, because when \texttt{T} is a non-protocol type, \texttt{T.Type} is otherwise always a concrete metatype. The below table compares the old and new syntax: +\begin{center} \begin{tabular}{lll} \toprule \textbf{Old syntax}&\textbf{New syntax}&\textbf{Type kind}\\ \midrule -\texttt{(\textbf{T}).Type}&\texttt{(\textbf{T}).Type}&Concrete metatype\\ +\texttt{(T).Type}&\texttt{(T).Type}&Concrete metatype (unchanged)\\ \texttt{P.Protocol}&\texttt{(any P).Type}&Concrete metatype of existential\\ \texttt{P.Type}&\texttt{any P.Type}&Existential metatype\\ \bottomrule \end{tabular} -\end{quote} +\end{center} \begin{wrapfigure}[12]{r}{6cm} \begin{center} @@ -487,9 +487,9 @@ \section{More Types}\label{more types} \paragraph{Tuple types.} A \IndexDefinition{tuple type}tuple type is a list of element types together with optional labels. If the list of element types is empty, we get the unique empty tuple type \texttt{()}. -An unlabeled one-element tuple type cannot be formed at all; \texttt{(\textbf{T})} resolves to the same type as \ttbf{T}. Labeled one-element tuple types \texttt{(foo:\ \textbf{T})} are valid in the grammar, but are rejected by type resolution. \index{SILGen}SILGen creates them internally when it materializes the payload of an enum case (for instance, ``\texttt{case person(name:\ String)}''), but they do not appear as the types of expressions. +An unlabeled one-element tuple type cannot be formed at all; \texttt{(T)} resolves to the same type as \texttt{T}. Labeled one-element tuple types \texttt{(foo:\ T)} are valid in the grammar, but are rejected by type resolution. \index{SILGen}SILGen creates them internally when it materializes the payload of an enum case (for instance, ``\texttt{case person(name:\ String)}''), but they do not appear as the types of expressions. -\paragraph{Function types.} A \IndexDefinition{function type}function type is the type of the callee in a \index{call expression}call expression. 
Consists of a parameter list, a return type, and non-type attributes. The latter includes the function's effect, lifetime, and calling convention. The effects are \texttt{throws} and \texttt{async} (part of the Swift~5.5 concurrency model \cite{se0296}). Function values with non-escaping lifetime are second-class; they can only be passed to another function, captured by a non-escaping closure, or called. Only escaping functions can be returned or stored inside other values. The four calling conventions are: +\paragraph{Function types.} A \IndexDefinition{function type}function type is the type of the callee in a \index{call expression}call expression. It contains a parameter list, a return type, and non-type attributes. The attributes include the function's effect, lifetime, and calling convention. The effects are \texttt{throws} and \texttt{async} (part of the \IndexSwift{5.5}Swift~5.5 concurrency model \cite{se0296}). Function values with \index{non-escaping function type}non-escaping lifetime are second-class; they can only be passed to another function, captured by a non-escaping closure, or called. Only \index{escaping function type}escaping functions can be returned or stored inside other values. The four calling conventions are: \begin{itemize} \item The default ``thick'' convention, where the function is passed as a function pointer together with a reference-counted closure context. \item \texttt{@convention(thin)}: the function is passed as a single function pointer, without a closure context. Thin functions cannot capture values from outer scopes. @@ -501,14 +501,14 @@ \section{More Types}\label{more types} \begin{itemize} \item The \textbf{value ownership kind}, which can be the default, \texttt{inout}, \texttt{borrowing} or \texttt{consuming}. -The \texttt{inout} kind is key to Swift's mutable value type model; the interested reader can consult \cite{valuesemantics} for details. The last two were introduced in Swift~5.9 \cite{se0377}. 
+The \texttt{inout} kind is key to Swift's mutable value type model; the interested reader can consult \cite{valuesemantics} for details. The last two were introduced in \IndexSwift{5.9}Swift~5.9 \cite{se0377}. \item The \textbf{variadic} flag, in which case the parameter type must be an array type. When type checking a call to a function value with a variadic parameter, the type checker collects multiple expressions from the call argument list into an implicit array expression. Otherwise, variadic parameters behave exactly like arrays once we get to \index{SILGen}SILGen and below. -\item The \IndexDefinition{autoclosure function type}\texttt{@autoclosure} attribute, in which case the parameter type must be another function type of the form \texttt{() -> \textbf{T}} for some type \ttbf{T}. +\item The \IndexDefinition{autoclosure function type}\texttt{@autoclosure} attribute, in which case the parameter type must be another function type of the form \verb|() -> T| for some type \texttt{T}. -This instructs the type checker to treat the corresponding argument in the caller as if it was a value of type \ttbf{T}, rather than a function type \texttt{()~->~\textbf{T}}. The argument is then wrapped inside an implicit \index{closure expression}closure \index{expression}expression. In the body of the callee, an \texttt{@autoclosure} parameter behaves exactly like an ordinary function value, and can be called to evaluate the expression provided by the caller. +This instructs the type checker to treat the corresponding argument in the caller as if it were a value of type \texttt{T}, rather than a function type \verb|() -> T|. The argument is then wrapped inside an implicit \index{closure expression}closure \index{expression}expression. In the body of the callee, an \texttt{@autoclosure} parameter behaves exactly like an ordinary function value, and can be called to evaluate the expression provided by the caller.
\end{itemize} \begin{figure}[b!]\captionabove{Function type with two parameters, or a single tuple parameter}\label{function param tuple fig} @@ -540,7 +540,7 @@ \section{More Types}\label{more types} \end{center} \end{figure} -In functional languages such as \index{ML}ML and \index{Haskell}Haskell, all function types conceptually take a single parameter type. This is not the case in Swift, however. Figure~\ref{function param tuple fig} illustrates the distinction between \verb|(Int, Float) -> Bool| and \verb|((Int, Float)) -> Bool|; the first has two parameters, and the second has a single parameter that is of tuple type. +In functional languages such as \index{ML}ML and \index{Haskell}Haskell, all function types conceptually take a single parameter type. This is not the case in Swift, however. \FigRef{function param tuple fig} illustrates the distinction between \verb|(Int, Float) -> Bool| and \verb|((Int, Float)) -> Bool|; the first has two parameters, and the second has a single parameter that is of tuple type. For convenience, the type checker defines an implicit conversion, called a \index{tuple splat}``tuple splat,'' between function types taking a single tuple and function types of multiple arguments. This implicit conversion is only available when passing a function value as an argument to a call, and only when the function type's parameter list can be represented as a tuple (hence, it has no parameter attributes). An example: \begin{Verbatim} @@ -571,7 +571,7 @@ \section{More Types}\label{more types} The history of Swift function types is an interesting case study in language evolution. Originally, Swift followed the classical functional language model, where a function type always had a \emph{single} parameter type, which could be a tuple type to simulate a function of multiple parameters. 
Tuple types used to be able to contain \texttt{inout} and variadic elements, and furthermore, the argument labels of a function declaration were part of the function declaration's type. The existence of such ``non-materializable'' tuple types introduced complications throughout the type system, and argument labels had inconsistent behavior in different contexts. -The syntax for referencing a declaration name with argument labels was adopted in \index{history}Swift~2.2~\cite{se0021}. Subsequently, argument labels were dropped from function types in Swift~3~\cite{se0111}. The distinction between a function taking multiple arguments and a function taking a single tuple argument was first hinted at in Swift~3 with \cite{se0029} and \cite{se0066}, and became explicit in Swift~4 \cite{se0110}. At the same time, Swift~4 also introduced the ``tuple splat'' function conversion which simulated the Swift~3 model in a limited way for the cases where the old behavior was convenient. For example, the element type of \texttt{Dictionary} is a key/value tuple, but often it is more convenient to call the \texttt{Collection.map()} method with a closure taking two arguments, and not a closure with a single tuple argument. Even after the above proposals were implemented, the compiler continued to model a function type as having a single input type for quite some time, despite this being completely hidden from the user. After Swift~5, the function type representation fully converged with the semantic model of the language. +The syntax for referencing a declaration name with argument labels was adopted in \IndexSwift{2.2}Swift~2.2~\cite{se0021}. Subsequently, argument labels were dropped from function types in \IndexSwift{3.0}Swift~3~\cite{se0111}. The distinction between a function taking multiple arguments and a function taking a single tuple argument was first hinted at in Swift~3 with \cite{se0029} and \cite{se0066}, and became explicit in \IndexSwift{4.0}Swift~4 \cite{se0110}. 
At the same time, Swift~4 also introduced the ``tuple splat'' function conversion which simulated the Swift~3 model in a limited way for the cases where the old behavior was convenient. For example, the element type of \texttt{Dictionary} is a key/value tuple, but often it is more convenient to call the \texttt{Collection.map()} method with a closure taking two arguments, and not a closure with a single tuple argument. Even after the above proposals were implemented, the compiler continued to model a function type as having a single input type for quite some time, despite this being completely hidden from the user. After \IndexSwift{5.0}Swift~5, the function type representation fully converged with the semantic model of the language. \medskip @@ -579,19 +579,19 @@ \section{More Types}\label{more types} \medskip -We finish this section by turning to sugared types. Sugared generic parameter types were already described in Section~\ref{fundamental types}. Of the remaining kinds of \index{sugared type}sugared types, type alias types are defined by the user, and the other three are built-in to the language. +We finish this section by turning to sugared types. Sugared generic parameter types were already described in \SecRef{fundamental types}. Of the remaining kinds of \index{sugared type}sugared types, type alias types are defined by the user, and the other three are built-in to the language. -\paragraph{Type alias types.} A \IndexDefinition{type alias type}type alias type represents a reference to a type alias declaration. It contains an optional parent type, a substitution map, and the substituted \IndexDefinition{underlying type}underlying type. The canonical type of a type alias type is the substituted underlying type. The substitution map is formed in type resolution, from any generic arguments applied to the type alias type declaration itself, together with the generic arguments of the parent type (Section~\ref{identtyperepr}). 
Type resolution applies this substitution map to the original underlying type of the type alias declaration to compute the substituted underlying type. The substitution map is preserved for printing, and for requirement inference (Section~\ref{requirementinference}). +\paragraph{Type alias types.} A \IndexDefinition{type alias type}type alias type represents a reference to a type alias declaration. It contains an optional parent type, a substitution map, and the substituted \IndexDefinition{underlying type}underlying type. The canonical type of a type alias type is the substituted underlying type. The substitution map is formed in type resolution, from any generic arguments applied to the type alias type declaration itself, together with the generic arguments of the parent type (\SecRef{identtyperepr}). Type resolution applies this substitution map to the original underlying type of the type alias declaration to compute the substituted underlying type. The substitution map is preserved for printing, and for requirement inference (\SecRef{requirementinference}). -\paragraph{Optional types.} The \IndexDefinition{optional sugared type}optional type is written as \texttt{\textbf{T}?} for some object type \T; its canonical type is \texttt{Optional<\textbf{T}>}. +\paragraph{Optional types.} The \IndexDefinition{optional sugared type}optional type is written as \texttt{T?} for some object type \texttt{T}; its canonical type is \texttt{Optional<T>}. -\paragraph{Array types.} The \IndexDefinition{array sugared type}array type is written as \texttt{[\textbf{E}]} for some element type \ttbf{E}; its canonical type is \texttt{Array<\textbf{E}>}. +\paragraph{Array types.} The \IndexDefinition{array sugared type}array type is written as \texttt{[E]} for some element type \texttt{E}; its canonical type is \texttt{Array<E>}.
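To make the desugaring concrete, here is a small illustrative snippet (the names are hypothetical); each comment shows the canonical type that the written sugared type desugars to:
\begin{Verbatim}
typealias Velocity = Double

let v: Velocity = 9.8    // canonical type: Double
let opt: Int? = nil      // canonical type: Optional<Int>
let arr: [String] = []   // canonical type: Array<String>
\end{Verbatim}
In each case the sugared spelling and the canonical type are interchangeable; the distinction only matters for printing and diagnostics.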
-\paragraph{Dictionary types.} The \IndexDefinition{dictionary sugared type}dictionary type is written as \texttt{[\textbf{K}: \textbf{V}]} for some key type \ttbf{K} and value type \ttbf{V}; its canonical type is \texttt{Dictionary<\textbf{K}, \textbf{V}>}. +\paragraph{Dictionary types.} The \IndexDefinition{dictionary sugared type}dictionary type is written as \texttt{[K: V]} for some key type \texttt{K} and value type \texttt{V}; its canonical type is \texttt{Dictionary<K, V>}. -\section{Sporadic Types}\label{misc types} +\section{Special Types}\label{misc types} -Each of the sporadic types (this is not a technical term) is weird in its own unique way. They tend to only be valid in specific contexts, and some do not represent actual types of values at all. Their unexpected appearance can be a source of counter-examples and failed assertions. They all play important roles in the expression type checker, but again, do not really give us anything new if we consider the generics model from a purely formal viewpoint. +We now discuss a few of the remaining types, which are all weird in their own unique ways. They tend to only be valid in specific contexts, and some do not represent actual types of values at all. Their unexpected appearance can be a source of counter-examples and failed assertions. They all play important roles in the expression type checker, but again, do not really give us anything new if we consider the generics model from a purely formal viewpoint. \paragraph{Generic function types.} A \IndexDefinition{generic function type}generic function type has the same structural components as a function type, except it also stores a generic signature: @@ -603,26 +603,26 @@ \section{Sporadic Types}\label{misc types} Generic function types are the \index{interface type}interface types of generic \index{function declaration}function and \index{subscript declaration}subscript declarations.
A reference to a generic function declaration from an expression always applies substitutions first, so generic function types do not appear as the types of expressions. In particular, an unsubstituted generic function value cannot be a parameter to another function, thus the Swift type system does not support \index{higher-rank polymorphism}higher-rank polymorphism. Type inference with higher-rank types is known to be \index{halting problem}\index{undecidable problem}undecidable; see \cite{wells} and \cite{practicalhigherrank}. -Generic function types have a special behavior when their canonical type is computed. Since generic function types carry a generic signature, the parameter types and return type of a \emph{canonical} generic function type are actually \emph{reduced} types with respect to this generic signature (Section~\ref{reducedtypes}). +Generic function types have a special behavior when their canonical type is computed. Since generic function types carry a generic signature, the parameter types and return type of a \emph{canonical} generic function type are actually \emph{reduced} types with respect to this generic signature (\SecRef{reduced types}). \paragraph{Reference storage types.} A \IndexDefinition{reference storage type}reference storage type is the type of a variable declaration adorned with the \IndexDefinition{weak reference type}\texttt{weak}, \IndexDefinition{unowned reference type}\texttt{unowned} or \texttt{unowned(unsafe)} attribute. The wrapped type must be a class type, a class-constrained archetype, or class-constrained existential type. Reference storage types arise as the interface types of variable declarations, and as the types of SIL instructions. The types of \index{expression}expressions never contain reference storage types. \paragraph{Placeholder types.} -A \IndexDefinition{placeholder type}placeholder type represents a generic argument to be inferred by the type checker. 
The written representation is the underscore ``\texttt{\_}''. They can only appear in a handful of restricted contexts and do not remain after type checking. The expression type checker replaces placeholder types with type variables, solves the constraint system, and finally replaces the type variables with their fixed concrete types. For example, here the interface type of the \texttt{myPets} local variable is inferred as \texttt{Array<String>}: +A \IndexDefinition{placeholder type}placeholder type represents a generic argument to be inferred by the type checker. The written representation is the underscore ``\texttt{\_}''. They can only appear in a handful of restricted contexts, and do not appear in the types of expressions or the interface types of declarations after type checking. The constraint solver replaces placeholder types with type variables when solving the constraint system. For example, here the interface type of the \texttt{myPets} local variable is inferred as \texttt{Array<String>}: \begin{Verbatim} let myPets: Array<_> = ["Zelda", "Giblet"] \end{Verbatim} -Placeholder types were introduced in Swift 5.6~\cite{se0315}. +Placeholder types were introduced in \IndexSwift{5.6}Swift 5.6~\cite{se0315}. \paragraph{Unbound generic types.} -\IndexDefinition{unbound generic type}Unbound generic types predate placeholder types, and can be seen as a special case. An unbound generic type is written as a named reference to a generic type declaration, without generic arguments applied. An unbound generic type behaves like a generic nominal type where all generic arguments are placeholder types. In the example above, the generic nominal type \texttt{Array<\_>} contains a placeholder type. The unbound generic type \texttt{Array} could have been used instead: +\IndexDefinition{unbound generic type}Unbound generic types predate placeholder types, and can be seen as a special case.
An unbound generic type is written as a named reference to a generic type declaration, without generic arguments applied. An unbound generic type behaves like a generic nominal type where all generic arguments are placeholder types. In the previous example, we wrote the generic nominal type \texttt{Array<\_>} with a placeholder type. The unbound generic type \texttt{Array} could have been used instead: \begin{Verbatim} let myPets: Array = ["Zelda", "Giblet"] \end{Verbatim} -Unbound generic types are also occasionally useful in diagnostics, to print the name of a type declaration only (like \texttt{Outer.Inner}) without the generic parameters of its declared interface type (\texttt{Outer<T>.Inner<U>}, for example). Unbound generic types are discussed in the context of type resolution in Section~\ref{unbound generic types}. +Unbound generic types are also occasionally useful in \index{diagnostic!printing unbound generic type}diagnostics, to print the name of a type declaration only (like \texttt{Outer.Inner}) without the generic parameters of its declared interface type (\texttt{Outer<T>.Inner<U>}, for example). Unbound generic types are discussed in the context of type resolution in \SecRef{unbound generic types}. -\begin{listing}\captionabove{Dynamic Self type example}\label{dynamic self example} +\begin{listing}\captionabove{Dynamic \texttt{Self} type example}\label{dynamic self example} \begin{Verbatim} class Base { required init() {}
This concept comes from \index{Objective-C}Objective-C, where it is called \texttt{instancetype}. The dynamic Self type in many ways behaves like a generic parameter, but it is not represented as one; the type checker and \index{SILGen}SILGen implement support for it directly. Note that \texttt{Self} only has this interpretation inside a class declaration. In a protocol declaration, \texttt{Self} is the implicit generic parameter (Section~\ref{protocols}). In a struct or enum declaration, \texttt{Self} is the declared interface type (Section~\ref{identtyperepr}). +The \IndexDefinition{dynamic Self type@dynamic \texttt{Self} type}dynamic \texttt{Self} type appears when a class method declares a return type of \texttt{Self}. In this case, the object is known to have the same dynamic type as the base of the method call, which might be a subclass of the method's class. The dynamic \texttt{Self} type has one structural component, a class type, which is the static upper bound for the type. This concept comes from \index{Objective-C}Objective-C, where it is called \texttt{instancetype}. The dynamic \texttt{Self} type in many ways behaves like a generic parameter, but it is not represented as one; the type checker and \index{SILGen}SILGen implement support for it directly. Note that the identifier ``\texttt{Self}'' only has this interpretation inside a class. In a protocol declaration, \texttt{Self} is the implicit generic parameter (\SecRef{protocols}). In a struct or enum declaration, \texttt{Self} is the declared interface type (\SecRef{identtyperepr}). -Listing~\ref{dynamic self example} demonstrates some of the behaviors of the dynamic Self type. Two invalid cases are shown; \texttt{invalid1()} is rejected because the type checker cannot prove that the return type is always an instance of the dynamic type of \texttt{self}, and \texttt{invalid2()} is rejected because \texttt{Self} appears in contravariant position. 
+\ListingRef{dynamic self example} demonstrates some of the behaviors of the dynamic \texttt{Self} type. Two invalid cases are shown; \texttt{invalid1()} is rejected because the type checker cannot prove that the return type is always an instance of the dynamic type of \texttt{self}, and \texttt{invalid2()} is rejected because \texttt{Self} appears in contravariant position. \paragraph{Type variable types.} A \IndexDefinition{type variable type}type variable represents the future inferred type of an \index{expression}expression in the expression type checker's constraint system. The expression type checker builds the constraint system by walking an expression recursively, assigning new type variables to the types of sub-expressions and recording constraints between these type variables. Solving the constraint system can have three possible outcomes: @@ -664,7 +664,7 @@ \section{Sporadic Types}\label{misc types} \item \textbf{Multiple solutions}---the constraint system is underspecified and some type variables can have multiple valid fixed type assignments. \end{itemize} -In the case of multiple solutions, the type checker uses heuristics to pick the ``best'' solution for the entire expression; if none of the solutions are clearly better than the others, an ambiguity error is diagnosed. Otherwise, we proceed as if the solver only found the best solution. The final step applies the solution to the expression by replacing type variables appearing in the types of sub-expressions with their fixed types. +In the case of multiple solutions, the type checker uses heuristics to pick the ``best'' solution for the entire expression; if none of the solutions are clearly better than the others, an ambiguity error is \index{diagnostic!multiple solutions}diagnosed. Otherwise, we take the best solution and apply it to the expression, replacing type variables with their assigned fixed types. 
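As a small illustration of solution ranking (a sketch only; the actual ranking rules are much more involved), an integer literal can be assigned many types, and absent other constraints the solver prefers the default literal type:
\begin{Verbatim}
let a = 1 + 2           // best solution assigns Int, the default literal type
let b: Double = 1 + 2   // the annotation forces a different solution: Double
\end{Verbatim}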
The utmost care must be taken when working with type variables because unlike all other types, they are not allocated with indefinite lifetime. Type variables live in the \IndexDefinition{constraint solver arena}constraint solver arena, which grows and shrinks as the solver explores branches of the solution space. Types that \emph{contain} type variables also need to be allocated in the constraint solver arena. This is also true of structures that contain types, such as substitution maps. Type variables ``escaping'' from the constraint solver can crash the compiler in odd ways. Assertions should be used to rule out type variables from appearing in the wrong places. @@ -672,16 +672,16 @@ \section{Sporadic Types}\label{misc types} The printed representation of a type variable is \texttt{\$Tn}, where \texttt{n} is an incrementing integer local to the constraint system. One way to see type variables in action is to pass the \texttt{-Xfrontend~-debug-constraints} compiler flag. \paragraph{L-value types.} -An \IndexDefinition{l-value type}l-value type represents the type of an \index{expression}expression appearing on the left hand side of an assignment operator (hence the ``l'' in l-value), or as an argument to an \texttt{inout} parameter in a function call. L-value types wrap an \IndexDefinition{object type}\emph{object type} which is the type of the stored value; they print out as \texttt{@lvalue~\textbf{T}} where \ttbf{T} is the object type, but this is not valid syntax in the language. +An \IndexDefinition{l-value type}l-value type represents the type of an \index{expression}expression appearing on the left hand side of an assignment operator (hence the ``l'' in l-value), or as an argument to an \texttt{inout} parameter in a function call. L-value types wrap an \IndexDefinition{object type}\emph{object type} which is the type of the stored value; they print out as \verb|@lvalue T| where \texttt{T} is the object type, but this is not valid syntax in the language. 
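For instance, in the following (hypothetical) snippet, the marked sub-expressions are assigned l-value types during type checking:
\begin{Verbatim}
func increment(_ x: inout Int) { x += 1 }

var count = 0
count = 5          // 'count' has type '@lvalue Int' here
increment(&count)  // the inout argument also requires an l-value
\end{Verbatim}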
-L-value types appear in type-checked expressions. The reader familiar with C++ might think of an l-value type as somewhat analogous to a C++ mutable reference type ``\texttt{\textbf{T}~\&}''---unlike C++ though, they are not directly visible in the source language. L-value types do not appear in the types of SIL instructions; \index{SILGen}SILGen lowers l-value accesses into accessor calls or direct manipulation of memory. +L-value types appear in type-checked expressions. The reader familiar with C++ might think of an l-value type as somewhat analogous to a C++ mutable reference type ``\verb|T &|''---unlike C++ though, they are not directly visible in the source language. L-value types do not appear in the types of SIL instructions; \index{SILGen}SILGen lowers l-value accesses into accessor calls or direct manipulation of memory. \paragraph{Error types.} \IndexDefinition{error type} \index{expression} -Error types are returned when type substitution encounters an invalid or missing conformance (Chapter~\ref{substmaps}). In this case, the error type wraps the original type, and prints as the original type to make types coming from malformed conformances more readable in diagnostics. +Error types are returned when type substitution encounters an invalid or missing conformance (\ChapRef{substmaps}). In this case, the error type wraps the original type, and prints as the original type to make types coming from malformed conformances more readable in \index{diagnostic!printing error type}diagnostics. -The expression type checker also assigns error types to invalid declaration references. This uses the singleton form of the error type, which prints as \texttt{<>}. To avoid user confusion, diagnostics containing the singleton error type should not be emitted. Generally, any expression whose type contains an error type does not need to be diagnosed, because a diagnostic should have been emitted elsewhere. 
+Error types are also returned by \index{type resolution}type resolution if the \index{type representation}type representation is invalid in some way. This uses the singleton form of the error type, which prints as \texttt{<>}. To avoid user confusion, diagnostics containing the singleton error type should not be emitted. Generally, any expression whose type contains an error type does not need to be diagnosed, because a diagnostic should have been emitted elsewhere. \paragraph{Built-in types.} \IndexDefinition{compiler intrinsic} @@ -720,7 +720,7 @@ \section{Source Code Reference}\label{typesourceref} \begin{itemize} \item \textbf{Various traversals:} \texttt{walk()} is a general pre-order traversal where the callback returns a tri-state value---continue, stop, or skip a sub-tree. Built on top of this are two simpler variants; \texttt{findIf()} takes a boolean predicate, and \texttt{visit()} takes a void-returning callback which offers no way to terminate the traversal. \item \textbf{Transformations:} \texttt{transformWithPosition()}, \texttt{transformRec()}, \texttt{transform()}. As with the traversals, the first of the three is the most general, and the other two are built on top. In all three cases, the callback is invoked on all types contained within a type, recursively. It can either elect to replace a type with a new type, or leave a type unchanged and instead try to transform any of its child types. -\item \textbf{Substitution:} \texttt{subst()} implements type substitution, which is a particularly common kind of transform which replaces generic parameters or archetypes with concrete types (Section~\ref{substmapsourcecoderef}). +\item \textbf{Substitution:} \texttt{subst()} implements type substitution, which is a particularly common kind of transform which replaces generic parameters or archetypes with concrete types (\SecRef{substmapsourcecoderef}). 
\item \textbf{Printing:} \texttt{print()} outputs the string form of a type, with many customization options; \texttt{dump()} prints the \index{tree}tree structure of a type in an \index{s-expression}s-expression form. The latter is extremely useful for invoking from inside a debugger, or ad-hoc print debug statements. \end{itemize} \IndexSource{type pointer equality} @@ -824,7 +824,7 @@ \section{Source Code Reference}\label{typesourceref} \end{Verbatim} These template methods desugar the type if it is a sugared type, and the casted type can never itself be a sugared type. This is usually correct; for example, if \texttt{type} is the \texttt{Swift.Void} type alias type, then \texttt{type->is()} returns true, because it is for all intents and purposes a tuple (an empty tuple), except when printed in diagnostics. -There are also top-level template functions \verb|isa<>|, \verb|dyn_cast<>| and \verb|cast<>| that operate on \texttt{TypeBase *}. Using these with \texttt{Type} is an error; the pointer must be explicitly unwrapped with \texttt{getPointer()} first. These casts do not desugar, and permit casting to sugared types. This is the mechanism used when sugared types must be distinguished from canonical types for some reason: +There are also top-level template functions \verb|isa<>|, \verb|dyn_cast<>| and \verb|cast<>| that operate on \texttt{TypeBase *}. Using these with \texttt{Type} is an error; the pointer must be explicitly unwrapped with \texttt{getPointer()} first. These casts do not desugar, and permit casting to sugared types. This is the mechanism used when \IndexSource{sugared type}sugared types must be distinguished from canonical types for some reason: \begin{Verbatim} Type type = ...; @@ -889,7 +889,7 @@ \section{Source Code Reference}\label{typesourceref} \paragraph{Utility operations.} These encapsulate frequently-useful patterns. 
\begin{itemize} -\item \texttt{getOptionalObjectType()} returns the type \texttt{T} if the type is some \texttt{Optional<T>}, otherwise it returns the null type. +\item \texttt{getOptionalObjectType()} \IndexSource{optional sugared type}returns the type \texttt{T} if the type is some \texttt{Optional<T>}, otherwise it returns the null type. \item \texttt{getMetatypeInstanceType()} returns the type \texttt{T} if the type is some \texttt{T.Type}, otherwise it returns \texttt{T}. \item \texttt{mayHaveMembers()} answers if this is a nominal type, archetype, existential type or dynamic Self type. \end{itemize} @@ -950,4 +950,4 @@ \section{Source Code Reference}\label{typesourceref} \apiref{AnyFunctionType::ExtInfo}{class} This represents the non-type attributes of a function type. -\end{document} \ No newline at end of file +\end{document} diff --git a/docs/Generics/chapters/witness-thunks.tex b/docs/Generics/chapters/witness-thunks.tex deleted file mode 100644 index 57752a9fd795d..0000000000000 --- a/docs/Generics/chapters/witness-thunks.tex +++ /dev/null @@ -1,501 +0,0 @@ -\documentclass[../generics]{subfiles} - -\begin{document} - -\chapter{Witness Thunks}\label{valuerequirements} - -\ifWIP - -When protocol conformances were introduced in Chapter~\ref{conformances}, our main focus was the mapping from associated type requirements to type witnesses, and how conformances participate in type substitution. Now let's look at the other facet of conformances, which is how they map value requirements to value witnesses.\footnote{The term ``value witness'' is overloaded to have two meanings in Swift. The first is a witness to a value requirement in a protocol. The second is an implementation of an intrinsic operation all types support, like copy, move, destroy, etc., appearing in the value witness table of runtime type metadata. Here I'm talking about the first meaning.} Recording a witness for a protocol requirement requires more detail than simply stating the witness.
- -What is the relationship between the generic signature of a protocol requirement and the generic signature of the witness? Well, ``it's complicated.'' A protocol requirement's generic signature has a \texttt{Self} generic parameter constrained to that protocol. If the witness is a default implementation from a protocol extension, it will have a \texttt{Self} generic parameter, too, but it might conform to a \emph{different} protocol. Or if the witness is a member of the conforming type and the conforming type has generic parameters of its own, it will have its own set of generic parameters, with different requirements. A witness might be ``more generic'' than a protocol requirement, where the requirement is satisfied by a fixed specialization of the witness. Conditional conformance and class inheritance introduce even more possibilities. (There will be examples of all of these different cases at the end of Section~\ref{witnessthunksignature}.) - -\index{SILGen} -All of this means that when the compiler generates a witness table to represent a conformance at runtime, the entries in the witness table cannot simply point directly to the witness implementations. The protocol requirement and the witness will have different calling conventions, so SILGen must emit a \emph{witness thunk} to translate the calling convention of the requirement into that of each witness. Conformance checking records a mapping between protocol requirements and witnesses together with the necessary details for witness thunk emission inside each normal conformance. - -The \texttt{ProtocolConformance::getWitness()} method takes the declaration of a protocol value requirement, and returns an instance of \texttt{Witness}, which stores all of this information, obtainable by calling getter methods: -\begin{description} -\item[\texttt{getDecl()}] The witness declaration itself.
-\item[\texttt{getWitnessThunkSignature()}] The \emph{witness thunk generic signature}, which bridges the gap between the protocol requirement's generic signature and the witness generic signature. Adopting this generic signature is what allows the witness thunk to have the correct calling convention that matches the caller's invocation of the protocol requirement, while providing the necessary type parameters and conformances to invoke a member of the concrete conforming type. -\item[\texttt{getSubstitutions()}] The \emph{witness substitution map}. Maps the witness generic signature to the type parameters of the witness thunk generic signature. This is the substitution map at the call of the actual witness from inside the witness thunk. -\item[\texttt{getRequirementToWitnessThunkSubs()}] The \emph{requirement substitution map}. Maps the protocol requirement generic signature to the type parameters of the witness thunk generic signature. This substitution map is used by SILGen to compute the interface type of the witness thunk, by applying it to the interface type of the protocol requirement. -\end{description} - -TODO: -\begin{itemize} -\item diagram with the protocol requirement caller, the protocol requirement type, the witness thunk signature/type, and the witness signature/type. -\item more details about how the witness\_method CC recovers self generic parameters in a special way -\end{itemize} - -\section{Covariant Self Problem} - -In Swift, subclasses inherit protocol conformances from their superclass. If a class conforms to a protocol, a requirement of this protocol can be called on an instance of a subclass. When the protocol requirement is witnessed by a default implementation in a protocol extension, the \texttt{Self} parameter of the protocol extension method is bound to the specific subclass substituted at the call site.
The subclass can be observed if, for example, the protocol requirement returns an instance of \texttt{Self}, and the default implementation constructs a new instance via an \texttt{init()} requirement on the protocol.
-
-The protocol requirement can be invoked in one of two ways:
-\begin{enumerate}
-\item Directly on an instance of the class or one of its subclasses. Since the implementation is known to always be the default implementation, the call is statically dispatched to the default implementation without any indirection through the witness thunk.
-\item Indirectly via some other generic function with a generic parameter constrained to the protocol. Since the implementation is unknown, the call inside the generic function is dynamically dispatched via the witness thunk stored in the witness table for the conformance. If the generic function is in turn called with an instance of the class or one of its subclasses, the witness thunk then statically dispatches to the default implementation.
-\end{enumerate}
-The two cases are demonstrated in Listing~\ref{covariantselfexample}. The \texttt{Animal} protocol defines a \texttt{clone()} requirement returning an instance of \texttt{Self}. This requirement has a default implementation which constructs a new instance of \texttt{Self} via the \texttt{init()} requirement on the protocol. The \texttt{Horse} class conforms to \texttt{Animal}, using the default implementation for \texttt{clone()}. The \texttt{Horse} class also has a subclass, \texttt{Pony}. It follows from substitution semantics that both \texttt{newPonyDirect} and \texttt{newPonyIndirect} should have a type of \texttt{Pony}:
-\begin{itemize}
-\item The definition of \texttt{newPonyDirect} calls \texttt{clone()} with the substitution map $\texttt{Self} := \texttt{Pony}$. The original return type of \texttt{clone()} is \texttt{Self}, so the substituted type is \texttt{Pony}.
-\item Similarly, the definition of \texttt{newPonyIndirect} calls \texttt{cloneAnimal()} with the substitution map $\texttt{A} := \texttt{Pony}$. The original return type of \texttt{cloneAnimal()} is \texttt{A}, so the substituted type is also \texttt{Pony}.
-\end{itemize}
-The second call dispatches through the witness thunk, so the witness thunk must also ultimately call the default implementation of \texttt{Animal.clone()} with the substitution map $\texttt{Self} := \texttt{Pony}$. When the conforming type is a struct or an enum, the \texttt{self} parameter of a witness thunk has a concrete type. When the conforming type is a class, though, it would not be correct to use the concrete \texttt{Horse} type, because the witness thunk would then invoke the default implementation with the substitution map $\texttt{Self} := \texttt{Horse}$, and the second call would return an instance of \texttt{Horse} at runtime and not \texttt{Pony}, which would be a type soundness hole.
-
-\begin{listing}\captionabove{Statically and dynamically dispatched calls to a default implementation}\label{covariantselfexample}
-\begin{Verbatim}
-protocol Animal {
-  init()
-  func clone() -> Self
-}
-
-extension Animal {
-  func clone() -> Self {
-    return Self()
-  }
-}
-
-class Horse: Animal {}
-class Pony: Horse {}
-
-func cloneAnimal<A: Animal>(_ animal: A) -> A {
-  return animal.clone()
-}
-
-let newPonyDirect = Pony().clone()
-let newPonyIndirect = cloneAnimal(Pony())
-\end{Verbatim}
-\end{listing}
-
-\Index{protocol Self type@protocol \texttt{Self} type}
-This soundness hole was finally discovered and addressed in Swift~4.1 \cite{sr617}. The solution is to model the covariant behavior of \texttt{Self} with a superclass-constrained generic parameter.
When the conforming type is a class, witness thunks dispatching to a default implementation have this special generic parameter, in addition to the generic parameters of the class itself (there are none in our example, so the witness thunk just has the single generic parameter for \texttt{Self}). In the next section, the algorithms for building the substitution map and generic signature all take a boolean flag indicating if a covariant \texttt{Self} type should be introduced. The specific conditions under which this flag is set are a bit subtle:
-\begin{enumerate}
-\item The conforming type must be a non-final class. If the class is final, there is no need to preserve variance since \texttt{Self} is always the exact class type.
-\item The witness must be in a protocol extension. If the witness is a method on the class, there is no way to observe the concrete substitution for the protocol \texttt{Self} type, because it is not a generic parameter of the class method.
-\item (The hack) The interface type of the protocol requirement must not mention any associated types.
-\end{enumerate}
-The determination of whether to use a static or covariant \texttt{Self} type for a class conformance is implemented by the type checker function \texttt{matchWitness()}.
-
-Indeed, Condition~3 is a hack; it opens up an exception where the soundness hole we worked so hard to close is once again allowed. In an ideal world, Conditions 1~and~2 would be sufficient, but by the time the soundness hole was discovered and closed, existing code had already been written taking advantage of it.
The scenario necessitating Condition~3 is when the default implementation appears in a \emph{constrained} protocol extension:
-\begin{Verbatim}
-protocol P {
-  associatedtype T = Self
-  func f() -> T
-}
-
-extension P where Self.T == Self {
-  func f() -> Self { return self }
-}
-
-class C: P {}
-class D: C {}
-\end{Verbatim}
-The non-final class \texttt{C} does not declare a type witness for associated type \texttt{T} of protocol~\texttt{P}. The associated type specifies a default, so conformance checking proceeds with the default type witness. The language model is that a conformance is checked once, at the declaration of \texttt{C}, so the default type \texttt{Self} is the ``static'' \texttt{Self} type of the conformance, which is \texttt{C}. Moving on to value requirements, class \texttt{C} does not provide an implementation of the protocol requirement \texttt{f()} either, and the original intent of this code is that the default implementation of \texttt{f()} from the constrained extension of \texttt{P} should be used.
-
-Without Condition~3, the requirement \texttt{Self.T == Self} would not be satisfied when matching the requirement \texttt{f()} with its witness; the left-hand side of the requirement, \texttt{C}, is not exactly equal to the right-hand side, which is the covariant \texttt{Self} type that is only known to be \emph{some subclass} of \texttt{C}. The conformance would be rejected unless \texttt{C} was declared final. With Condition~3, \texttt{Self.T == Self} is satisfied because the static type \texttt{C} is used in place of \texttt{Self} during witness matching.
-
-The compiler therefore continued to accept the above code, because it worked prior to Swift~4.1. Unfortunately, it means that a call to \texttt{D().f()} via the witness thunk will still return an instance of \texttt{C}, and not \texttt{D} as expected.
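-
-To observe the residual hole, call \texttt{f()} on an instance of \texttt{D} through a generic function, so that the call dispatches via the witness thunk. (The helper function \texttt{callThroughWitness()} below is a hypothetical sketch, not part of the original example.)
-\begin{Verbatim}
-func callThroughWitness<S: P>(_ value: S) -> S.T {
-  return value.f()  // dynamically dispatched via the witness thunk
-}
-
-// The type witness for 'T' was fixed to 'C' when the conformance of 'C'
-// was checked, so the result is typed as 'C' even when passed a 'D'.
-let result = callThroughWitness(D())
-\end{Verbatim}
-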
One day, we might remove this exception and close the soundness hole completely, breaking source compatibility for the above example until the developer makes it type safe by declaring \texttt{C} as final. For now, a good guideline to ensure type safety when mixing classes with protocols is \textsl{only final classes should conform to protocols with associated types}.
-
-\fi
-
-\section{Witness Thunk Signatures}\label{witnessthunksignature}
-
-\ifWIP
-
-Now we turn our attention to the construction of the data recorded in the \texttt{Witness} type. This is done with the aid of the \texttt{RequirementEnvironment} class, which implements the ``builder'' pattern.
-
-Building the witness thunk signature is an expensive operation. The algorithms below depend only on the conformance being checked, the generic signature of a protocol requirement, and whether the witness requires the use of a covariant \texttt{Self} type. These three pieces of information can be used as a uniquing key to cache the results of these algorithms. Conformance checking might need to consider a number of protocol requirements, each requirement having multiple candidate witnesses that have to be checked to find the best one. In the common case, many protocol requirements will share a generic signature---for example, any protocol requirement without generic parameters of its own has the simple generic signature \texttt{<Self where Self:\ P>} of its protocol \texttt{P}. Therefore this caching can eliminate a fair amount of duplicated work.
-
-The \textbf{witness substitution map} is built by the constraint solver when matching the interface type of a witness to the interface type of a requirement. A description of this process is outside the scope of this manual.
-
-The \textbf{requirement substitution map} is built by mapping the requirement's \texttt{Self} parameter either to the witness thunk's \texttt{Self} parameter (if the witness has a covariant class \texttt{Self} type), or to the concrete conforming type otherwise.
All other generic parameters of the requirement map over to generic parameters of the witness thunk, possibly at a different depth. The requirement's \texttt{Self} conformance is always a concrete conformance, even in the covariant \texttt{Self} case, because \texttt{Self} is subject to a superclass requirement in that case. All other conformance requirements of the requirement's generic signature remain abstract.
-
-The \textbf{witness thunk generic signature} is constructed by stitching together the generic signature of the conformance context with the generic signature of the protocol requirement.
-
-\begin{algorithm}[Build the requirement to witness thunk substitution map] As input, takes a normal conformance~\texttt{N}, the generic signature of a protocol requirement~\texttt{G}, and a flag indicating if the witness has a covariant class \texttt{Self} type,~\texttt{F}. Outputs a substitution map for \texttt{G}.
-\begin{enumerate}
-\item Initialize \texttt{R} to an empty list of replacement types.
-\item Initialize \texttt{C} to an empty list of conformances.
-\item (Remapping) First compute the depth at which non-\texttt{Self} generic parameters of \texttt{G} appear in the witness thunk signature. Let $\texttt{G}'$ be the generic signature of \texttt{N}, and let \texttt{D} be one greater than the depth of the last generic parameter of $\texttt{G}'$. If $\texttt{G}'$ has no generic parameters, set $\texttt{D}=0$. If \texttt{F} is set, increment \texttt{D} again.
-\item (Self replacement) If \texttt{F} is set, record the replacement $\ttgp{0}{0} := \ttgp{0}{0}$ in \texttt{R}. Otherwise, let \texttt{T} be the type of \texttt{N}, and record the replacement $\ttgp{0}{0} := \texttt{T}$ in \texttt{R}.
-\item (Remaining replacements) Any remaining generic parameters of \texttt{G} must have a depth of 1. For each remaining generic parameter \ttgp{1}{i}, record the replacement $\ttgp{1}{i}~:=~\ttgp{D}{i}$ in \texttt{R}.
-
-\item (Self conformance) If \texttt{F} is set, build a substitution map $\texttt{S}$ for $\texttt{G}'$ mapping each generic parameter \ttgp{d}{i} to \ttgp{(d+1)}{i}. Apply this substitution map to \texttt{N} to get a specialized conformance, and record this specialized conformance in \texttt{C}.
-\item (Self conformance) Otherwise if \texttt{F} is not set, just record \texttt{N} in \texttt{C}.
-\item (Remaining conformances) Any remaining conformance requirements in \texttt{G} have a subject type rooted in a generic parameter at depth~1. For each remaining conformance requirement \texttt{T:~P}, record an abstract conformance to \texttt{P} in \texttt{C}. Abstract conformances do not store a conforming type, but if they did, the same remapping process would be applied here.
-\item (Return) Build a substitution map for \texttt{G} from \texttt{R} and \texttt{C}.
-\end{enumerate}
-\end{algorithm}
-
-\begin{algorithm}[Build the witness thunk generic signature] As input, takes a normal conformance~\texttt{N}, the generic signature of a protocol requirement~\texttt{G}, and a flag indicating if the witness has a covariant class \texttt{Self} type,~\texttt{F}. Outputs the witness thunk generic signature.
-\begin{enumerate}
-\item Initialize \texttt{P} to an empty list of generic parameter types.
-\item Initialize \texttt{R} to an empty list of generic requirements.
-\item (Remapping) First compute the depth at which non-\texttt{Self} generic parameters of \texttt{G} appear in the witness thunk signature. Let $\texttt{G}'$ be the generic signature of \texttt{N}, and let \texttt{d} be one greater than the depth of the last generic parameter of $\texttt{G}'$. If $\texttt{G}'$ has no generic parameters, set $\texttt{d}=0$. If \texttt{F} is set, increment \texttt{d} again.
-
-\item If \texttt{F} is set, we must first introduce a generic parameter and superclass requirement for the covariant \texttt{Self} type:
-\begin{enumerate}
-\item (Self parameter) Add the generic parameter \ttgp{0}{0} to \texttt{P}. This generic parameter will represent the covariant \texttt{Self} type.
-\item (Remap Self type) Build a substitution map \texttt{S} for $\texttt{G}'$ mapping each generic parameter \ttgp{d}{i} to \ttgp{(d+1)}{i}. Apply this substitution map to the type of \texttt{N}, and call the result \texttt{T}.
-\item (Self requirement) Add a superclass requirement \texttt{\ttgp{0}{0}:\ T} to \texttt{R}.
-\item (Context generic parameters) For each generic parameter \ttgp{d}{i} in $\texttt{G}'$, add the generic parameter \ttgp{(d+1)}{i} to \texttt{P}.
-\item (Context generic requirements) For each requirement of $\texttt{G}'$, apply \texttt{S} to the requirement and add the substituted requirement to \texttt{R}.
-\end{enumerate}
-\item If \texttt{F} is not set, the generic parameters and requirements of the conformance context carry over unchanged:
-\begin{enumerate}
-\item (Context generic parameters) Add all generic parameters of $\texttt{G}'$ to \texttt{P}.
-\item (Context generic requirements) Add all generic requirements of $\texttt{G}'$ to \texttt{R}.
-\end{enumerate}
-\item (Remaining generic parameters) All non-\texttt{Self} generic parameters of \texttt{G} must have a depth of 1. For each remaining generic parameter \ttgp{1}{i}, add \ttgp{d}{i} to \texttt{P}.
-\item (Trivial case) If no generic parameters have been added to \texttt{P} so far, the witness thunk generic signature is empty. Return.
-\item (Remaining generic requirements) For each generic requirement of \texttt{G}, apply the requirement-to-witness-thunk substitution map to the requirement, and add the substituted requirement to \texttt{R}.
-\item (Return) Build a minimized generic signature from \texttt{P} and \texttt{R} and return the result.
-
-\end{enumerate}
-\end{algorithm}
-
-\vfill
-\eject
-
-\begin{example} If neither the conforming type nor the witness is generic, and there is no covariant \texttt{Self} parameter, the witness thunk signature is trivial.
-\begin{Verbatim}
-protocol Animal {
-  associatedtype CommodityType: Commodity
-  func produce() -> CommodityType
-}
-
-struct Chicken: Animal {
-  func produce() -> Egg {...}
-}
-\end{Verbatim}
-\begin{description}
-\item[Witness thunk signature] None.
-\item[Witness generic signature] None.
-\item[Witness substitution map] None.
-\item[Requirement generic signature] \vphantom{a}
-\begin{quote}
-\texttt{<Self where Self:\ Animal>}
-\end{quote}
-\item[Requirement substitution map] The protocol requirement does not have its own generic parameter list, but it still inherits a generic signature from the protocol declaration.
-\[
-\SubstMapC{
-\SubstType{Self}{Chicken}
-}{
-\SubstConf{Self}{Chicken}{Animal}
-}
-\]
-\end{description}
-\end{example}
-
-\vfill
-\eject
-
-\begin{example} Generic conforming type.
-\begin{Verbatim}
-protocol Habitat {
-  associatedtype AnimalType: Animal
-  func adopt(_: AnimalType)
-}
-
-struct Barn<AnimalType: Animal, StallType>: Habitat {
-  func adopt(_: AnimalType) {...}
-}
-\end{Verbatim}
-\begin{description}
-\item[Witness thunk signature] \vphantom{a}
-\begin{quote}
-\texttt{<\ttgp{0}{0}, \ttgp{0}{1} where \ttgp{0}{0}:\ Animal>}
-\end{quote}
-\item[Witness generic signature] \vphantom{a}
-\begin{quote}
-\texttt{<AnimalType, StallType where AnimalType:\ Animal>}
-\end{quote}
-\item[Witness substitution map] This is actually the identity substitution map because each generic parameter is replaced with its canonical form.
-\[ -\SubstMapC{ -\SubstType{AnimalType}{\ttgp{0}{0}}\\ -\SubstType{StallType}{\ttgp{0}{1}} -}{ -\SubstConf{AnimalType}{AnimalType}{Animal} -} -\] - -\item[Requirement generic signature] \vphantom{a} -\begin{quote} -\texttt{} -\end{quote} -\item[Requirement substitution map] \phantom{a} -\[ -\SubstMapC{ -\SubstType{Self}{Barn<\ttgp{0}{0}, \ttgp{0}{1}>} -}{ -\SubstConf{Self}{Barn<\ttgp{0}{0}, \ttgp{0}{1}>}{Habitat} -} -\] -\end{description} -\end{example} - -\vfill -\eject - -\begin{example} Conditional conformance. -\begin{Verbatim} -struct Dictionary {...} - -extension Dictionary: Equatable where Value: Equatable { - static func ==(lhs: Self, rhs: Self) -> Bool {...} -} -\end{Verbatim} -\begin{description} -\item[Witness thunk signature] \vphantom{a} -\begin{quote} -\texttt{<\ttgp{0}{0}, \ttgp{0}{1} where \ttgp{0}{0}:\ Hashable, \ttgp{0}{1}:\ Equatable>} -\end{quote} -\item[Witness generic signature] \vphantom{a} -\begin{quote} -\texttt{} -\end{quote} -\item[Witness substitution map] This is again the identity substitution map because each generic parameter is replaced with its canonical form. -\[ -\SubstMapLongC{ -\SubstType{Key}{\ttgp{0}{0}}\\ -\SubstType{Value}{\ttgp{0}{1}} -}{ -\SubstConf{Key}{\ttgp{0}{0}}{Hashable}\\ -\SubstConf{Value}{\ttgp{0}{1}}{Equatable} -} -\] - -\item[Requirement generic signature] \vphantom{a} -\begin{quote} -\texttt{} -\end{quote} -\item[Requirement substitution map] \vphantom{a} -\[ -\SubstMapC{ -\SubstType{Self}{Dictionary<\ttgp{0}{0}, \ttgp{0}{1}>} -}{ -\SubstConf{Self}{Dictionary<\ttgp{0}{0}, \ttgp{0}{1}>}{Equatable}\\ -\text{with conditional requirement \texttt{\ttgp{0}{1}:\ Equatable}} -} -\] -\end{description} -\end{example} - -\vfill -\eject - -\begin{example} Witness is in a protocol extension. 
-
-\begin{Verbatim}
-protocol Shape {
-  var children: [any Shape] { get }
-}
-
-protocol PrimitiveShape: Shape {}
-
-extension PrimitiveShape {
-  var children: [any Shape] { return [] }
-}
-
-struct Empty: PrimitiveShape {}
-\end{Verbatim}
-\begin{description}
-\item[Witness thunk signature] None.
-\item[Witness generic signature] \vphantom{a}
-\begin{quote}
-\texttt{<Self where Self:\ PrimitiveShape>}
-\end{quote}
-\item[Witness substitution map] \vphantom{a}
-\[
-\SubstMapC{
-\SubstType{Self}{Empty}
-}{
-\SubstConf{Self}{Empty}{PrimitiveShape}
-}
-\]
-
-\item[Requirement generic signature] \vphantom{a}
-\begin{quote}
-\texttt{<Self where Self:\ Shape>}
-\end{quote}
-\item[Requirement substitution map] \phantom{a}
-\[
-\SubstMapC{
-\SubstType{Self}{Empty}
-}{
-\SubstConf{Self}{Empty}{Shape}
-}
-\]
-\end{description}
-\end{example}
-
-\vfill
-\eject
-
-\begin{example} Conforming type is a generic class, and the witness is in a protocol extension.
-\begin{Verbatim}
-protocol Cloneable {
-  init(from: Self)
-  func clone() -> Self
-}
-
-extension Cloneable {
-  func clone() -> Self {
-    return Self(from: self)
-  }
-}
-
-class Box<Contents>: Cloneable {
-  var contents: Contents
-
-  required init(from other: Self) {
-    self.contents = other.contents
-  }
-}
-\end{Verbatim}
-\begin{description}
-\item[Witness thunk signature] \vphantom{a}
-\begin{quote}
-\texttt{<\ttgp{0}{0}, \ttgp{1}{0} where \ttgp{0}{0}:\ Box<\ttgp{1}{0}>>}
-\end{quote}
-\item[Witness generic signature] \vphantom{a}
-\begin{quote}
-\texttt{<Self where Self:\ Cloneable>}
-\end{quote}
-\item[Witness substitution map] \vphantom{a}
-\[
-\SubstMapC{
-\SubstType{Self}{\ttgp{0}{0}}
-}{
-\SubstConf{Self}{Box<\ttgp{1}{0}>}{Cloneable}
-}
-\]
-
-\item[Requirement generic signature] \vphantom{a}
-\begin{quote}
-\texttt{<Self where Self:\ Cloneable>}
-\end{quote}
-\item[Requirement substitution map] \phantom{a}
-\[
-\SubstMapC{
-\SubstType{Self}{\ttgp{0}{0}}
-}{
-\SubstConf{Self}{Box<\ttgp{1}{0}>}{Cloneable}
-}
-\]
-\end{description}
-\end{example}
-
-\vfill
-\eject
-
-\begin{example} Requirement is generic.
-
-\begin{Verbatim}
-protocol Q {}
-
-protocol P {
-  func f<A: Q>(_: A)
-}
-
-struct Outer<T> {
-  struct Inner<U>: P {
-    func f<A: Q>(_: A) {}
-  }
-}
-\end{Verbatim}
-\begin{description}
-\item[Witness thunk signature] \vphantom{a}
-\begin{quote}
-\texttt{<\ttgp{0}{0}, \ttgp{1}{0}, \ttgp{2}{0} where \ttgp{2}{0}:\ Q>}
-\end{quote}
-\item[Witness generic signature] \vphantom{a}
-\begin{quote}
-\texttt{<T, U, A where A:\ Q>}
-\end{quote}
-\item[Witness substitution map] \vphantom{a}
-\[
-\SubstMapC{
-\SubstType{T}{\ttgp{0}{0}}\\
-\SubstType{U}{\ttgp{1}{0}}\\
-\SubstType{A}{\ttgp{2}{0}}
-}{
-\SubstConf{A}{\ttgp{2}{0}}{Q}
-}
-\]
-
-\item[Requirement generic signature] \vphantom{a}
-\begin{quote}
-\texttt{<Self, A where Self:\ P, A:\ Q>}
-\end{quote}
-\item[Requirement substitution map] \phantom{a}
-\[
-\SubstMapC{
-\SubstType{Self}{Outer<\ttgp{0}{0}>.Inner<\ttgp{1}{0}>}\\
-\SubstType{A}{\ttgp{2}{0}}
-}{
-\SubstConf{A}{\ttgp{2}{0}}{Q}
-}
-\]
-\end{description}
-\end{example}
-
-\vfill
-\eject
-
-\begin{example} Witness is more generic than the requirement.
-\begin{Verbatim}
-protocol P {
-  associatedtype A: Equatable
-  associatedtype B: Equatable
-
-  func f(_: A, _: B)
-}
-
-struct S<A: Equatable>: P {
-  typealias B = Int
-
-  func f<T: Equatable, U: Equatable>(_: T, _: U) {}
-}
-\end{Verbatim}
-The type witness for \texttt{A} is the generic parameter \texttt{A}, and the type witness for \texttt{B} is the concrete type \texttt{Int}.
-The witness \texttt{S.f()} for \texttt{P.f()} is generic, and can be called with any two types that conform to \texttt{Equatable}. Since the type witnesses for \texttt{A} and \texttt{B} are both \texttt{Equatable}, a fixed specialization of \texttt{S.f()} witnesses \texttt{P.f()}.
-
-\begin{description}
-\item[Witness thunk signature] \vphantom{a}
-\begin{quote}
-\texttt{<\ttgp{0}{0} where \ttgp{0}{0}:\ Equatable>}
-\end{quote}
-\item[Witness generic signature] \vphantom{a}
-\begin{quote}
-\texttt{<A, T, U where A:\ Equatable, T:\ Equatable, U:\ Equatable>}
-\end{quote}
-\item[Witness substitution map] \vphantom{a}
-\[
-\SubstMapC{
-\SubstType{A}{\ttgp{0}{0}}\\
-\SubstType{T}{\ttgp{0}{0}}\\
-\SubstType{U}{Int}
-}{
-\SubstConf{A}{\ttgp{0}{0}}{Equatable}\\
-\SubstConf{T}{\ttgp{0}{0}}{Equatable}\\
-\SubstConf{U}{Int}{Equatable}
-}
-\]
-
-\item[Requirement generic signature] \vphantom{a}
-\begin{quote}
-\texttt{<Self where Self:\ P>}
-\end{quote}
-\item[Requirement substitution map] \phantom{a}
-\[
-\SubstMapC{
-\SubstType{Self}{S<\ttgp{0}{0}>}
-}{
-\SubstConf{Self}{S<\ttgp{0}{0}>}{P}
-}
-\]
-\end{description}
-\end{example}
-
-\fi
-
-\section{Source Code Reference}
-
-\end{document}
\ No newline at end of file
diff --git a/docs/Generics/generics.bib b/docs/Generics/generics.bib
index e9043796a5e01..b4f976d48d5bd 100644
--- a/docs/Generics/generics.bib
+++ b/docs/Generics/generics.bib
@@ -9,6 +9,45 @@ @misc{tspl
 year = {2014}
 }
 
+@book{muchnick1997advanced,
+ title={Advanced Compiler Design and Implementation},
+ author={Muchnick, S.},
+ isbn={9781558603202},
+ lccn={97013063},
+ url={https://www.goodreads.com/en/book/show/887908},
+ year={1997},
+ publisher={Morgan Kaufmann Publishers}
+}
+
+@book{cooper2004engineering,
+ title={Engineering a Compiler},
+ author={Cooper, K.D. and Torczon, L.},
+ isbn={9781558606982},
+ lccn={2004268209},
+ url={https://dl.acm.org/doi/pdf/10.5555/2737838},
+ year={2004},
+ publisher={Elsevier Science}
+}
+
+@book{incrementalracket,
+ title={Essentials of Compilation},
+ subtitle={An incremental approach in {R}acket},
+ author={Jeremy G.
Siek}, + isbn={9780262047760}, + url={https://mitpress.mit.edu/9780262047760/essentials-of-compilation/}, + year={2023}, + publisher={The MIT Press} +} + +@book{craftinginterpreter, + title={Crafting Interpreters}, + author={Robert Nystrom}, + year={2021}, + publisher={Genever Benning}, + isbn={9780990582946}, + url={https://craftinginterpreters.com} +} + @book{gregor, title={C++ Templates: The Complete Guide}, author={Vandevoorde, D. and Josuttis, N.M. and Gregor, D.}, @@ -22,6 +61,7 @@ @book{grimaldi title={Discrete and Combinatorial Mathematics: An Applied Introduction}, author={Grimaldi, R.P.}, year={1998}, + isbn={9780201199123}, publisher={Addison-Wesley Longman}, url={https://www.goodreads.com/en/book/show/1575542} } @@ -40,6 +80,7 @@ @book{postmodern title={Post-Modern Algebra}, author={Smith, J.D.H. and Romanowska, A.B.}, year={2011}, + isbn={9780471127383}, publisher={Wiley}, url={https://www.wiley.com/en-us/Post+Modern+Algebra-p-9780471127383} } @@ -56,7 +97,8 @@ @book{catprogrammer @book{alggraph, title={Algebraic Graph Theory: Morphisms, Monoids and Matrices}, author={Knauer, U.}, - year={2011}, + year={2019}, + isbn={9783110616125}, publisher={De Gruyter}, url={https://www.degruyter.com/document/doi/10.1515/9783110617368/html?lang=en} } @@ -77,6 +119,7 @@ @book{combinatorialgroup author={Magnus, W. and Karrass, A. 
and Solitar, D.}, year={1976}, publisher={Dover Publications}, + isbn={9780486438306}, url={https://www.goodreads.com/book/show/331129.Combinatorial_Group_Theory} } @@ -91,8 +134,16 @@ @book{book2012string publisher={Springer New York} } -@book{andallthat, place={Cambridge}, title={Term Rewriting and All That}, DOI={10.1017/CBO9781139172752}, publisher={Cambridge University Press}, author={Baader, Franz and Nipkow, Tobias}, year={1998}, -url={https://www21.in.tum.de/~nipkow/TRaAT/}} +@book{andallthat, + place={Cambridge}, + title={Term Rewriting and All That}, + DOI={10.1017/CBO9781139172752}, + publisher={Cambridge University Press}, + author={Baader, Franz and Nipkow, Tobias}, + year={1998}, + isbn={9780521779203}, + url={https://www21.in.tum.de/~nipkow/TRaAT/} +} @book{art1, title={The Art of Computer Programming: Volume 1: Fundamental Algorithms}, @@ -116,6 +167,7 @@ @book{konig title={Theory of Finite and Infinite Graphs}, author={K{\H{o}}nig, D. and McCoart, R. and Tutte, W.T.}, year={2013}, + isbn={9780817633899}, publisher={Birkh{\"a}user Boston}, url={https://www.goodreads.com/book/show/3359970-theory-of-finite-and-infinite-graphs} } @@ -124,8 +176,9 @@ @book{cutland title={Computability: An Introduction to Recursive Function Theory}, author={Nigel Cutland}, year={1980}, + isbn={9780521294652}, publisher={Cambridge University Press}, - url={https://www.goodreads.com/book/show/1190249.Computability} + url={https://www.cambridge.org/us/universitypress/subjects/computer-science/programming-languages-and-applied-logic/computability-introduction-recursive-function-theory?format=PB} } @book{collatzbook, @@ -137,6 +190,15 @@ @book{collatzbook url={https://bookstore.ams.org/mbk-78} } +@book{curry, + title={Foundations of Mathematical Logic}, + author={Curry, H.B.}, + isbn={9780486634623}, + year={1977}, + publisher={Dover Publications}, + url={https://store.doverpublications.com/products/9780486634623} +} + @book{combinatory, title={Combinatory Logic}, 
author={Curry, H.B. and Feys, R.},
@@ -836,18 +898,53 @@ @misc{java_faq
 }
 
 @misc{rust_chalk,
- author="",
+ author="{Rust Traits Working Group}",
 title = "The {Chalk} Book",
 url = "https://rust-lang.github.io/chalk/book/",
 year = {2015}
 }
 
+@misc{rust_bug,
+ author = "Aaron Turon",
+ title="{``where''} clauses are only elaborated for supertraits, and not other things",
+ url = "https://github.com/rust-lang/rust/issues/20671",
+ year = {2015}
+}
+@misc{rust_same,
+ author = "Jared Roesch",
+ title="Parse and accept type equality constraints in {``where''} clauses",
+ url = "https://github.com/rust-lang/rust/issues/20041",
+ year = {2014}
+}
+
+@misc{rust_const,
+ title="{Rust} {RFC} 2000: Const Generics",
+ url = "https://rust-lang.github.io/rfcs/2000-const-generics.html",
+ author="{Rust Traits Working Group}",
+ year = {2017}
+}
+
+@misc{rust_gat,
+ title="{Rust} {RFC} 1598: Generic Associated Types",
+ author="{Rust Traits Working Group}",
+ url = "https://rust-lang.github.io/rfcs/1598-generic_associated_types.html",
+ year = {2016}
+}
+
 @misc{llvmtalk,
 author = "John McCall and Slava Pestov",
 title = "Implementing {S}wift generics",
 url = "https://www.youtube.com/watch?v=ctS8FzqcRug",
 year = {2017}
 }
+
+@misc{cvwtalk,
+ author = "Dario Rexin",
+ title = "Compact value witnesses in {S}wift",
+ url = "https://www.youtube.com/watch?v=ctS8FzqcRug",
+ year = {2023}
+}
+
 @misc{siltalk,
 author = "Joe Groff and Chris Lattner",
 title = "{S}wift's High-Level {IR}: A Case Study",
@@ -910,6 +1007,13 @@ @misc{implrecursive
 year = {2016}
 }
 
+@misc{swift57,
+ author = "Holly Borla",
+ title = "Swift 5.7 released",
+ year = {2022},
+ url = {https://www.swift.org/blog/swift-5.7-released/},
+}
+
 @misc{sr617,
 title = "{SR-617}: \texttt{Self} not always resolved dynamically with Generics",
 url = "https://github.com/apple/swift/issues/43234",
@@ -936,6 +1040,12 @@ @misc{sr12120
 year = {2020}
 }
 
+@misc{evolution,
+ title = "Swift evolution process",
+ url = 
"https://www.swift.org/swift-evolution/", + year = {2016} +} + @misc{se0011, author = "Loïc Lecrenier", title = "{SE-0011}: Replace \texttt{typealias} keyword with \texttt{associatedtype} for associated type declarations", @@ -954,6 +1064,12 @@ @misc{se0029 url = "https://github.com/apple/swift-evolution/blob/main/proposals/0029-remove-implicit-tuple-splat.md", year = {2016} } +@misc{se0035, + author = "Joe Groff", + title = "{SE-0035}: Limiting \texttt{inout} capture to \texttt{@noescape} contexts", + url = "https://github.com/apple/swift-evolution/blob/main/proposals/0035-limit-inout-capture.md", + year = {2016} +} @misc{se0048, author = "Chris Lattner", title = "{SE-0048}: Generic type aliases", @@ -1062,6 +1178,12 @@ @misc{se0244 url = "https://github.com/apple/swift-evolution/blob/main/proposals/0244-opaque-result-types.md", year = {2019} } +@misc{se0254, + author = "Becca Royal-Gordon", + title = "{SE-0254}: Static and class subscripts", + url = "https://github.com/apple/swift-evolution/blob/main/proposals/0254-static-subscripts.md", + year = {2019} +} @misc{se0260, author = "Jordan Rose and Ben Cohen", title = "{SE-0260}: Library Evolution for Stable {ABIs}", @@ -1087,6 +1209,14 @@ @misc{se0296 url = "https://github.com/apple/swift-evolution/blob/main/proposals/0296-async-await.md", year = {2020} } + +@misc{se0306, + author = "John McCall and Doug Gregor and Konrad Malawski and Chris Lattner", + title = "{SE-0306}: Actors", + url = "https://github.com/apple/swift-evolution/blob/main/proposals/0306-actors.md", + year = {2020} +} + @misc{se0309, author = "Anthony Latsis and Filip Sakel and Suyash Srijan", title = "{SE-0309}: Unlock existentials for all protocols", @@ -1153,6 +1283,12 @@ @misc{se0377 url = "https://github.com/apple/swift-evolution/blob/main/proposals/0377-parameter-ownership-modifiers.md", year = {2023} } +@misc{se0383, + author = "Robert Widmann", + title = "{SE-0383}: Deprecate {@UIApplicationMain} and {@NSApplicationMain}", + url = 
"https://github.com/apple/swift-evolution/blob/main/proposals/0383-deprecate-uiapplicationmain-and-nsapplicationmain.md", + year = {2023} +} @misc{se0390, author = "Joe Groff and Michael Gottesman and Andrew Trick and Kavon Farvardin", title = "{SE-0390}: Noncopyable structs and enums", @@ -1183,3 +1319,15 @@ @misc{se0404 url = "https://github.com/apple/swift-evolution/blob/main/proposals/0404-nested-protocols.md", year = {2023} } +@misc{se0413, + author = "Jorge Revuelta and Torsten Lehmann and Doug Gregor", + title = "{SE-0413}: Typed throws", + url = "https://github.com/apple/swift-evolution/blob/main/proposals/0413-typed-throws.md", + year = {2023} +} +@misc{se0427, + author = "Kavon Farvardin and Tim Kientzle and Slava Pestov", + title = "{SE-0427}: Noncopyable generics", + url = "https://github.com/apple/swift-evolution/blob/main/proposals/0427-noncopyable-generics.md", + year = {2024} +} diff --git a/docs/Generics/generics.tex b/docs/Generics/generics.tex index b2200791704ac..3b51092841b9b 100644 --- a/docs/Generics/generics.tex +++ b/docs/Generics/generics.tex @@ -2,7 +2,7 @@ % % This source file is part of the Swift.org open source project % -% Copyright (c) 2021 - 2022 Apple Inc. and the Swift project authors +% Copyright (c) 2021 - 2024 Apple Inc. and the Swift project authors % Licensed under Apache License v2.0 with Runtime Library Exception % % See https://swift.org/LICENSE.txt for license information @@ -22,7 +22,7 @@ % - Use 'flat' ToC style because it looks better with double-digit chapter numbers % - Use 'fleqn' to left-align display math to make it consistent with the 'quote' environment which I use for tables and diagrams and stuff % -% - Changing `twoside=semi' to `twoside' is safe. The default mode is meant for reading on-screen. If you make this change, even pages have a wider left margin, and odd pages have a wider right margin, which looks silly when scrolling vertically, but makes sense if you actually print and bind the book. 
+% - Changing `twoside=semi' to `twoside' is safe. The default mode is meant for reading on-screen. If you make this change, even pages have a wider left margin, and odd pages have a narrower left margin, which looks silly when scrolling vertically, but makes sense if you actually print and bind the book. \documentclass[a4paper,headsepline,headings=standardclasses,headings=big,chapterprefix=false,bibliography=totoc,toc=flat,fleqn,twoside=semi]{scrbook} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% @@ -60,8 +60,8 @@ \makeatother -% Hyperlinks in PDF files. We enable numbering in PDF bookmark titles, and backreferences from bibliography entries. -\usepackage[pdfborder={0 0 0},linkcolor=blue,citecolor=blue,linktocpage=true,colorlinks=true,bookmarksnumbered,bookmarksopen=true,bookmarksopenlevel=0,pagebackref]{hyperref} +% Hyperlinks in PDF files. We enable numbering in PDF bookmark titles, backreferences from bibliography entries, and clickable section titles in the TOC. +\usepackage[pdfborder={0 0 0},linktocpage=true,colorlinks=true,bookmarksnumbered,bookmarksopen=true,bookmarksopenlevel=0,pagebackref,linktoc=all]{hyperref} % More control over PDF bookmarks. 
\usepackage{bookmark} @@ -85,7 +85,7 @@ \usepackage{tikz} \usepackage{tikz-cd} \usetikzlibrary{positioning,shapes,arrows,trees,calc,fit,backgrounds,shapes.multipart} -\tikzstyle{class}=[rounded corners,draw=black,thick,anchor=west] +\tikzstyle{class}=[rounded corners,draw=black,thick,anchor=west, outer sep=0.2em] \tikzstyle{stage}=[rounded corners, draw=darkgray, thick, fill=lightgray, text centered, outer sep=0.2em] \tikzstyle{data}=[draw=darkgray, thick, fill=light-gray, text centered, outer sep=0.2em] \tikzstyle{arrow}=[->,>=stealth] @@ -95,10 +95,10 @@ \tikzstyle{genericenvpart}=[rounded corners, fill=light-gray] \tikzstyle{abstractvertex}=[circle, fill=gray] \tikzstyle{abstractvertex2}=[circle, fill=lightgray] -\tikzstyle{SourceFile}=[draw=black,thick,fill=lightgray] +\tikzstyle{SourceFile}=[draw=black,thick,fill=lightgray, outer sep=0.2em] \tikzstyle{decl}=[draw=black,thick] \tikzstyle{type}=[rounded corners,draw=black,thick,fill=lighter-gray] -\tikzstyle{OtherEntity}=[draw=black,thick,fill=lighter-gray] +\tikzstyle{OtherEntity}=[draw=black,thick,fill=lighter-gray, outer sep=0.2em] % \DeclareMathOperator, \boxed, etc \usepackage{mathtools} @@ -131,8 +131,8 @@ % Text flowing around figures \usepackage{wrapfig} -% For \textbf inside \texttt -\usepackage{bold-extra} +% Set up hyperref link colors +\hypersetup{linkcolor=[rgb]{0,0.2,0.6},citecolor=[rgb]{0.5,0,0.5},urlcolor=[rgb]{0,0.4,0.2}} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % Environments @@ -146,10 +146,6 @@ \newtheorem{algorithm}{Algorithm}[chapter] \renewcommand\thealgorithm{\thechapter\Alph{algorithm}} -% Hack so that the sample 'Algorithm 0A' from the Preface doesn't show up in the list. -\newtheorem{algorithmx}{Algorithm}[chapter] -\renewcommand\thealgorithmx{\thechapter\Alph{algorithmx}} - % Use the same counter for all three so you get Definition 12.1, Proposition 12.2, Example 12.3 etc. 
\newtheorem{example}[definition]{Example} @@ -157,6 +153,7 @@ % \theoremstyle{theorem} \newtheorem{proposition}[definition]{Proposition} \newtheorem{lemma}[definition]{Lemma} +\newtheorem{corollary}[definition]{Corollary} \newtheorem{theorem}[definition]{Theorem} % 'listing' floating environment for source listings @@ -167,17 +164,32 @@ ]{listing} % 'Verbatim' environment for fancy source listings -\RecustomVerbatimEnvironment{Verbatim}{Verbatim}{frame=single,rulecolor=\color{gray},numbers=left} +\RecustomVerbatimEnvironment{Verbatim}{Verbatim}{frame=single} % 'More Details' boxes in Introduction -\newenvironment{MoreDetails}{\medskip\begin{mdframed}[rightline=true,frametitlerule=true,frametitlerulecolor=gray,frametitlebackgroundcolor=light-gray,frametitlerulewidth=2pt,backgroundcolor=light-gray,linecolor=gray,frametitle={More details}] -\begin{itemize}}{\end{itemize} -\end{mdframed}} +\newenvironment{MoreDetails}{\medskip\begin{mdframed}[rightline=true,frametitlerule=true,frametitlerulecolor=gray,frametitlebackgroundcolor=light-gray,frametitlerulewidth=2pt,backgroundcolor=light-gray,linecolor=gray,frametitle={More details}]\begin{itemize}}{\end{itemize}\end{mdframed}} + +% Derived requirements +\newenvironment{derivation}{\begin{gather*}}{\end{gather*}} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % Macros %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +% Cross-references. 
+\newcommand{\PartRef}[1]{\hyperref[#1]{Part~\ref*{#1}}} +\newcommand{\ChapRef}[1]{\hyperref[#1]{Chapter~\ref*{#1}}} +\newcommand{\SecRef}[1]{\hyperref[#1]{Section~\ref*{#1}}} +\newcommand{\AppendixRef}[1]{\hyperref[#1]{Appendix~\ref*{#1}}} +\newcommand{\DefRef}[1]{\hyperref[#1]{Definition~\ref*{#1}}} +\newcommand{\PropRef}[1]{\hyperref[#1]{Proposition~\ref*{#1}}} +\newcommand{\ThmRef}[1]{\hyperref[#1]{Theorem~\ref*{#1}}} +\newcommand{\LemmaRef}[1]{\hyperref[#1]{Lemma~\ref*{#1}}} +\newcommand{\AlgRef}[1]{\hyperref[#1]{Algorithm~\ref*{#1}}} +\newcommand{\ExRef}[1]{\hyperref[#1]{Example~\ref*{#1}}} +\newcommand{\FigRef}[1]{\hyperref[#1]{Figure~\ref*{#1}}} +\newcommand{\ListingRef}[1]{\hyperref[#1]{Listing~\ref*{#1}}} + % I use this in a few places. \definecolor{light-gray}{gray}{0.90} \definecolor{lighter-gray}{gray}{0.97} @@ -205,12 +217,70 @@ % Request evalator requests \newcommand{\Request}[1]{\textbf{#1}} -% Formal requirements in proof calculus -\newcommand{\FormalReq}[1]{[\texttt{#1}]} +% Derived requirements and type substitution notation +\newcommand{\SameReq}[2]{[\texttt{#1 == #2}]} \newcommand{\ConfReq}[2]{[\texttt{#1:~#2}]} -% Substitution algebra notation -\newcommand{\Type}[1]{[{\texttt{#1}}]} +\newcommand{\GenericStepDef}{\ttgp{d}{i}\tag{\textsc{Generic}}} + +\newcommand{\ConfStepDef}{\ConfReq{T}{P}\tag{\textsc{Conf}}} +\newcommand{\SameStepDef}{\SameReq{T}{U}\tag{\textsc{Same}}} +\newcommand{\ConcreteStepDef}{\SameReq{T}{X}\tag{\textsc{Concrete}}} +\newcommand{\SuperStepDef}{\ConfReq{T}{C}\tag{\textsc{Super}}} +\newcommand{\LayoutStepDef}{\ConfReq{T}{AnyObject}\tag{\textsc{Layout}}} + +\newcommand{\AssocNameStepDef}{\texttt{T.A}\tag{\textsc{AssocName} $\ConfReq{T}{P}$}} +\newcommand{\AssocDeclStepDef}{\texttt{T.[P]A}\tag{\textsc{AssocDecl} $\ConfReq{T}{P}$}} +\newcommand{\AssocBindStepDef}{\SameReq{T.[P]A}{T.A}\tag{\textsc{AssocBind} $\ConfReq{T}{P}$}} + +\newcommand{\AssocConfStepDef}{\ConfReq{T.U}{Q}\tag{\textsc{AssocConf} 
$\ConfReq{Self.U}{Q}_\texttt{P}$ $\ConfReq{T}{P}$}} +\newcommand{\AssocSameStepDef}{\SameReq{T.U}{T.V}\tag{\textsc{AssocSame} $\SameReq{Self.U}{Self.V}_\texttt{P}$ $\ConfReq{T}{P}$}} +\newcommand{\AssocConcreteStepDef}{\SameReq{T.U}{$\Xprime$}\tag{\textsc{AssocConcrete} $\SameReq{Self.U}{X}_\texttt{P}$ $\ConfReq{T}{P}$}} +\newcommand{\AssocSuperStepDef}{\ConfReq{T.U}{$\Cprime$}\tag{\textsc{AssocSuper} $\ConfReq{Self.U}{C}_\texttt{P}$ $\ConfReq{T}{P}$}} +\newcommand{\AssocLayoutStepDef}{\ConfReq{T.U}{AnyObject}\tag{\textsc{AssocLayout} $\ConfReq{Self.U}{AnyObject}_\texttt{P}$ $\ConfReq{T}{P}$}} + +\newcommand{\ReflexStepDef}{\SameReq{T}{T}\tag{\textsc{Reflex} $\texttt{T}$}} +\newcommand{\SymStepDef}{\SameReq{U}{T}\tag{\textsc{Sym} $\SameReq{T}{U}$}} +\newcommand{\TransStepDef}{\SameReq{T}{V}\tag{\textsc{Trans} $\SameReq{T}{U}$ $\SameReq{U}{V}$}} + +\newcommand{\SameConfStepDef}{\ConfReq{T}{P}\tag{\textsc{SameConf} $\ConfReq{U}{P}$ $\SameReq{T}{U}$}} +\newcommand{\SameConcreteStepDef}{\SameReq{T}{X}\tag{\textsc{SameConcrete} $\SameReq{U}{X}$ $\SameReq{T}{U}$}} +\newcommand{\SameSuperStepDef}{\ConfReq{T}{C}\tag{\textsc{SameSuper} $\ConfReq{U}{C}$ $\SameReq{T}{U}$}} +\newcommand{\SameLayoutStepDef}{\ConfReq{T}{AnyObject}\tag{\textsc{SameLayout} $\ConfReq{U}{AnyObject}$ $\SameReq{T}{U}$}} + +\newcommand{\SameNameStepDef}{\SameReq{T.A}{U.A}\tag{\textsc{SameName} $\ConfReq{U}{P}$ $\SameReq{T}{U}$}} +\newcommand{\SameDeclStepDef}{\SameReq{T.[P]A}{U.[P]A}\tag{\textsc{SameDecl} $\ConfReq{U}{P}$ $\SameReq{T}{U}$}} + +\newcommand{\AnyStep}[2]{{#2}.\ #1\tag{$\ldots$}} + +\newcommand{\GenericStep}[2]{{#2}.\ \texttt{#1}\tag{\textsc{Generic}}} + +\newcommand{\ConfStep}[3]{{#3}.\ \ConfReq{#1}{#2}\tag{\textsc{Conf}}} +\newcommand{\SameStep}[3]{{#3}.\ \SameReq{#1}{#2}\tag{\textsc{Same}}} +\newcommand{\ConcreteStep}[3]{{#3}.\ \SameReq{#1}{#2}\tag{\textsc{Concrete}}} + +\newcommand{\AssocNameStep}[3]{{#3}.\ \texttt{#2}\tag{\textsc{AssocName} {#1}}} 
+\newcommand{\AssocDeclStep}[3]{{#3}.\ \texttt{#2}\tag{\textsc{AssocDecl} {#1}}} +\newcommand{\AssocBindStep}[4]{{#4}.\ \SameReq{#2}{#3}\tag{\textsc{AssocBind} {#1}}} + +\newcommand{\AssocConfStep}[4]{{#4}.\ \ConfReq{#2}{#3}\tag{\textsc{AssocConf} {#1}}} +\newcommand{\AssocSameStep}[4]{{#4}.\ \SameReq{#2}{#3}\tag{\textsc{AssocSame} {#1}}} +\newcommand{\AssocConcreteStep}[4]{{#4}.\ \SameReq{#2}{#3}\tag{\textsc{AssocConcrete} {#1}}} +\newcommand{\AssocSuperStep}[4]{{#4}.\ \ConfReq{#2}{#3}\tag{\textsc{AssocSuper} {#1}}} +\newcommand{\AssocLayoutStep}[3]{{#3}.\ \ConfReq{#2}{AnyObject}\tag{\textsc{AssocLayout} {#1}}} + +\newcommand{\SameConfStep}[5]{{#5}.\ \ConfReq{#3}{#4}\tag{\textsc{SameConf} {#1} {#2}}} + +\newcommand{\ReflexStep}[3]{{#3}.\ \SameReq{#2}{#2}\tag{\textsc{Reflex} {#1}}} +\newcommand{\SymStep}[4]{{#4}.\ \SameReq{#2}{#3}\tag{\textsc{Sym} {#1}}} +\newcommand{\TransStep}[5]{{#5}.\ \SameReq{#3}{#4}\tag{\textsc{Trans} {#1} {#2}}} + +\newcommand{\SameNameStep}[5]{{#5}.\ \SameReq{#3}{#4}\tag{\textsc{SameName} {#1} {#2}}} +\newcommand{\SameDeclStep}[5]{{#5}.\ \SameReq{#3}{#4}\tag{\textsc{SameDecl} {#1} {#2}}} + +\newcommand{\Tprime}{\texttt{T}^\prime} +\newcommand{\Xprime}{\texttt{X}^\prime} +\newcommand{\Cprime}{\texttt{C}^\prime} \newcommand{\SubstMap}[1]{\{#1\}} \newcommand{\SubstMapC}[2]{\{#1;\,#2\}} @@ -221,33 +291,43 @@ \newcommand{\SubstMapLong}[1]{\left\{\begin{array}{l}#1\end{array}\right\}} \newcommand{\SubstMapLongC}[2]{\left\{\begin{array}{l}#1;\\[\medskipamount] #2\end{array}\right\}} -\newcommand{\Proto}[1]{[\texttt{#1}]} - \newcommand{\ProtoObj}{\textsc{Proto}} -\newcommand{\Conf}[2]{[#1\texttt{:~#2}]} - \newcommand{\AssocType}[1]{\pi(\texttt{#1})} \newcommand{\AssocConf}[2]{\pi(\texttt{#1:~#2})} \newcommand{\SigObj}{\textsc{Sig}} \newcommand{\TypeObj}[1]{\textsc{Type}({#1})} \newcommand{\SubMapObj}[2]{\textsc{Sub}({#1, #2})} +\newcommand{\ReqObj}[1]{\textsc{Req}({#1})} \newcommand{\ConfObj}[1]{\textsc{Conf}({#1})} 
\newcommand{\ConfPObj}[2]{\textsc{Conf}_{\texttt{#1}}({#2})} \newcommand{\AssocTypeObj}[1]{\textsc{AssocType}_{\texttt{#1}}} \newcommand{\AssocConfObj}[1]{\textsc{AssocConf}_{\texttt{#1}}} -\newcommand{\ttbf}[1]{\texttt{\textbf{#1}}} -\newcommand{\T}{\ttbf{T}} +\newcommand{\ttgp}[2]{\texttt{$\uptau$\_#1\_#2}} +\newcommand{\rT}{\ttgp{0}{0}} +\newcommand{\rU}{\ttgp{0}{1}} +\newcommand{\rV}{\ttgp{0}{2}} \newcommand{\PhiInv}{\varphi^{-1}} \newcommand{\AR}{\langle A;\,R\rangle} +% Used in `Generic Signature Queries' section +\newcommand{\Query}[2]{\texttt{#1}(#2)} + +\newcommand{\QueryDef}[5]{ +\item \textbf{Query:} \IndexDefinition{#1()@\texttt{#1()}}$\texttt{#1}(#2)$ + +\textbf{Input:} #3 + +\textbf{Output:} #4 + +#5 +} % Old substitution algebra notation \newcommand{\mathboxed}[1]{\boxed{\mbox{\vphantom{pI\texttt{pI}}#1}}} \newcommand{\ttbox}[1]{\boxed{\mbox{\vphantom{pI\texttt{pI}}\texttt{#1}}}} -\newcommand{\ttgp}[2]{\texttt{$\uptau$\_#1\_#2}} % Archetypes print [[like this]]. Must be in math mode \newcommand{\archetype}[1]{[\![\texttt{#1}]\!]} @@ -344,12 +424,16 @@ % Derivation steps \newcommand{\IndexStep}[1]{\index{#1 derivation step@\textsc{#1} derivation step|textrm}} +\newcommand{\IndexStepTwo}[2]{\index{#1 derivation step@\textsc{#1} derivation step!#2}} \newcommand{\IndexStepDefinition}[1]{\index{#1 derivation step@\textsc{#1} derivation step|textbf}} % Sets in the type substitution algebra \newcommand{\IndexSet}[2]{\index{#1@$#2$|textrm}} \newcommand{\IndexSetDefinition}[2]{\index{#1@$#2$|textbf}} +% Swift version history +\newcommand{\IndexSwift}[1]{\index{Swift!#1}\index{history}} + % Normally a seealso index entry elides the page number, but sometimes we want index entries that look like this: % % foo, 123 @@ -368,6 +452,7 @@ \titlehead{\GenericsLogo} \title{Compiling Swift Generics} +\subtitle{Implementation Reference Manual (DRAFT v2)} \author{Slava Pestov} % Put chapter/section headings at the top of each page @@ -405,7 +490,7 @@ % The Good Stuff 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\part{Fundamental Concepts}\label{part fundamentals} +\part{Syntax}\label{part fundamentals} \subfile{chapters/introduction} @@ -415,7 +500,7 @@ \part{Fundamental Concepts}\label{part fundamentals} \subfile{chapters/declarations} -\subfile{chapters/generic-declarations} +\part{Semantics}\label{part blocks} \subfile{chapters/generic-signatures} @@ -423,9 +508,9 @@ \part{Fundamental Concepts}\label{part fundamentals} \subfile{chapters/conformances} -\subfile{chapters/generic-environments} +\subfile{chapters/archetypes} -\part{Further Explorations}\label{part odds and ends} +\part{Processes}\label{part odds and ends} \subfile{chapters/type-resolution} @@ -435,14 +520,14 @@ \part{Further Explorations}\label{part odds and ends} \subfile{chapters/conformance-paths} +\part[]{More Features}\label{part features} + \subfile{chapters/opaque-return-types} \subfile{chapters/existential-types} \subfile{chapters/class-inheritance} -\subfile{chapters/witness-thunks} - \part{The Requirement Machine}\label{part rqm} \subfile{chapters/basic-operation} @@ -459,17 +544,17 @@ \part{The Requirement Machine}\label{part rqm} \subfile{chapters/rewrite-system-minimization} -% Move subsequent PDF bookmarks to the top level since the Appendix, Bibliography and Index are not logically contained in Part III +% Move subsequent PDF bookmarks to the top level since the Appendix, Bibliography and Index are not logically contained in Part V \bookmarksetup{startatroot} \appendix +\subfile{chapters/math-summary} + \subfile{chapters/derived-requirements-summary} \subfile{chapters/type-substitution-summary} -\subfile{chapters/runtime-representation} - %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % Postmatter %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
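[Editorial aside, not part of the patch.] To make the new macro layer concrete, here is a hypothetical usage sketch of the cross-reference and query macros defined in the hunks above; the chapter label \texttt{genericenv} comes from the renamed Archetypes chapter, while the query name \texttt{requiresProtocol} and the surrounding prose are invented for illustration:

```latex
% Usage sketch (assumes the book's preamble). \ChapRef makes the entire
% phrase "Chapter N" a single hyperlink, unlike a bare \ref:
As discussed in \ChapRef{genericenv}, an archetype pairs a reduced type
parameter with a generic environment.

% \Query{name}{args} typesets a generic signature query in math mode;
% \rT expands to \ttgp{0}{0}, printing the canonical parameter tau_0_0:
The conformance test is written $\Query{requiresProtocol}{\rT,\ \texttt{Sequence}}$.
```

The starred \verb|\ref*| inside these macros matters: it suppresses hyperref's own link so the surrounding \verb|\hyperref| can wrap the whole "Chapter~N" text without nesting two links.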