This repository has been archived by the owner on Jan 1, 2022. It is now read-only.
In an effort to get 3.x underway this will be the summary issue linking to all tracking issues. Many of the issues that will come out of this will be excellent "first time" issues that I'd be willing to mentor!
I will be updating this summary periodically (in addition to working on all the other issues in the queue) as well as linking to the individual tracking issues as I create them.
Necessities (to whom they matter, they really matter)
Public dependencies of a stable crate are stable (C-STABLE)
Crate and its dependencies have a permissive license (C-PERMISSIVE)
Organization
Crate root re-exports common functionality (C-REEXPORT)
Crates pub use the most common types for convenience, so that clients do not
have to remember or write the crate's module hierarchy to use these types.
Re-exporting is covered in more detail in The Rust Programming Language
under Crates and Modules.
Examples from serde_json
The serde_json::Value type is the most commonly used type from serde_json.
It is a re-export of a type that lives elsewhere in the module hierarchy, at serde_json::value::Value. The serde_json::value module defines
other JSON-value-related things that are not re-exported. For example serde_json::value::Index is the trait that defines types that can be used to
index into a Value using square bracket indexing notation. The Index trait
is not re-exported at the crate root because it would be comparatively rare for
a client crate to need to refer to it.
In addition to types, functions can be re-exported as well. In serde_json the serde_json::from_str function is a re-export of a function from the serde_json::de deserialization module, which contains other less common
deserialization-related functionality that is not re-exported.
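The layout described above can be sketched in miniature. This is only an illustration, not serde_json's actual source: `my_json` is a hypothetical module standing in for a crate root, and the toy `from_str` understands just three literals.

```rust
// Hypothetical miniature of the layout described above; `my_json` stands in
// for a crate root and is not part of any real crate.
pub mod my_json {
    pub mod value {
        /// The commonly used type, re-exported at the root below.
        #[derive(Debug, PartialEq)]
        pub enum Value {
            Null,
            Bool(bool),
        }

        /// Rarely named by clients, so it is not re-exported.
        pub trait Index {}
    }

    pub mod de {
        use super::value::Value;

        /// Toy parser that only understands `null`, `true` and `false`.
        pub fn from_str(s: &str) -> Option<Value> {
            match s.trim() {
                "null" => Some(Value::Null),
                "true" => Some(Value::Bool(true)),
                "false" => Some(Value::Bool(false)),
                _ => None,
            }
        }
    }

    // Re-exports: clients write `my_json::Value` and `my_json::from_str`
    // without having to know about the `value` and `de` modules.
    pub use self::de::from_str;
    pub use self::value::Value;
}
```

Clients can still reach `my_json::value::Index` by its full path on the rare occasions they need it.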
Modules provide a sensible API hierarchy (C-HIERARCHY)
Examples from Serde
The serde crate is two independent frameworks in one crate - a serialization
half and a deserialization half. The crate is divided accordingly into serde::ser and serde::de. Part of the deserialization framework is
isolated under serde::de::value because it is a relatively large API surface
that is relatively unimportant, and it would crowd the more common, more
important functionality located in serde::de if it were to share the same
namespace.
Naming
Casing conforms to RFC 430 (C-CASE)
Basic Rust naming conventions are described in RFC 430.
In general, Rust tends to use CamelCase for "type-level" constructs (types and
traits) and snake_case for "value-level" constructs. More precisely:
Type parameters: concise CamelCase, usually single uppercase letter: T
Lifetimes: short lowercase, usually a single letter: 'a, 'de, 'src
In CamelCase, acronyms count as one word: use Uuid rather than UUID. In snake_case, acronyms are lower-cased: is_xid_start.
In snake_case or SCREAMING_SNAKE_CASE, a "word" should never consist of a
single letter unless it is the last "word". So, we have btree_map rather than b_tree_map, but PI_2 rather than PI2.
Examples from the standard library
The whole standard library. This guideline should be easy!
Conversions should be provided as methods, with names prefixed as follows:
| Prefix | Cost      | Ownership            |
| ------ | --------- | -------------------- |
| as_    | Free      | borrowed -> borrowed |
| to_    | Expensive | borrowed -> owned    |
| into_  | Variable  | owned -> owned       |
For example:
str::as_bytes() gives a &[u8] view into a &str, which is free.
str::to_owned() copies a &str to a new String, which may require memory
allocation.
String::into_bytes() takes ownership of a String and yields the underlying Vec<u8>, which is free.
BufReader::into_inner() takes ownership of a buffered reader and extracts
out the underlying reader, which is free. Data in the buffer is
discarded.
BufWriter::into_inner() takes ownership of a buffered writer and extracts
out the underlying writer, which requires a potentially expensive flush of any
buffered data.
Conversions prefixed as_ and into_ typically decrease abstraction, either
exposing a view into the underlying representation (as) or deconstructing data
into its underlying representation (into). Conversions prefixed to_, on the
other hand, typically stay at the same level of abstraction but do some work to
change one representation into another.
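As a rough illustration of the three prefixes on a user-defined type, here is a minimal sketch. The `Name` wrapper and its methods are hypothetical and exist only to show which prefix goes with which cost and ownership pattern.

```rust
/// Hypothetical wrapper around a `String`, used only to illustrate the prefixes.
pub struct Name(String);

impl Name {
    pub fn new(s: &str) -> Name {
        Name(s.to_owned())
    }

    /// `as_`: free, borrowed -> borrowed view of the underlying representation.
    pub fn as_str(&self) -> &str {
        &self.0
    }

    /// `to_`: does real work (allocates), borrowed -> owned, and stays at the
    /// same level of abstraction by returning another `Name`.
    pub fn to_uppercase(&self) -> Name {
        Name(self.0.to_uppercase())
    }

    /// `into_`: owned -> owned, gives up the wrapper for its representation.
    pub fn into_string(self) -> String {
        self.0
    }
}
```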
This guideline applies to data structures that are conceptually homogeneous
collections. As a counterexample, the str type is a slice of bytes that are
guaranteed to be valid UTF-8. This is conceptually more nuanced than a
homogeneous collection so rather than providing the iter/iter_mut/into_iter group of iterator methods, it provides str::bytes to iterate as bytes and str::chars to iterate as chars.
This guideline applies to methods only, not functions. For example percent_encode from the url crate returns an iterator over percent-encoded
string fragments. There would be no clarity to be had by using an iter/iter_mut/into_iter convention.
Iterator type names match the methods that produce them (C-ITER-TY)
A method called into_iter() should return a type called IntoIter and
similarly for all other methods that return iterators.
This guideline applies chiefly to methods, but often makes sense for functions
as well. For example the percent_encode function from the url crate
returns an iterator type called PercentEncode.
These type names make the most sense when prefixed with their owning module, for
example vec::IntoIter.
Functions often come in multiple variants: immutably borrowed, mutably borrowed,
and owned.
The right default depends on the function in question. Variants should be marked
through suffixes.
Exceptions
In the case of iterators, the moving variant can also be understood as an into
conversion, into_iter, and for x in v.into_iter() reads arguably better than for x in v.iter_move(), so the convention is into_iter.
For mutably borrowed variants, if the mut qualifier is part of a type name,
it should appear as it would appear in the type. For example Vec::as_mut_slice returns a mut slice; it does what it says.
Immutably borrowed by default
If foo uses/produces an immutable borrow by default, use:
The _mut suffix (e.g. foo_mut) for the mutably borrowed variant.
The _move suffix (e.g. foo_move) for the owned variant.
Single-element containers where accessing the element cannot fail should implement get and get_mut, with the following signatures.
    fn get(&self) -> &V;
    fn get_mut(&mut self) -> &mut V;
Single-element containers where the element is Copy (e.g. Cell-like
containers) should instead return the value directly, and not implement a
mutable accessor. TODO rust-api-guidelines#44
    fn get(&self) -> V;
For getters that do runtime validation, consider adding unsafe _unchecked
variants.
Types eagerly implement common traits (C-COMMON-TRAITS)
Rust's trait system does not allow orphans: roughly, every impl must live
either in the crate that defines the trait or the implementing type.
Consequently, crates that define new types should eagerly implement all
applicable, common traits.
To see why, consider the following situation:
Crate std defines trait Display.
Crate url defines type Url, without implementing Display.
Crate webapp imports from both std and url.
There is no way for webapp to add Display to Url, since it defines
neither. (Note: the newtype pattern can provide an efficient, but inconvenient
workaround.)
The most important common traits to implement from std are:
FromIterator is for creating a new collection containing items from an
iterator, and Extend is for adding items from an iterator onto an existing
collection.
Examples from the standard library
Vec<T> implements both FromIterator<T> and Extend<T>.
Data structures implement Serde's Serialize, Deserialize (C-SERDE)
Types that play the role of a data structure should implement Serialize and Deserialize.
An example of a type that does not play the role of a data structure is byteorder::LittleEndian.
Crate has a "serde" cfg option that enables Serde (C-SERDE-CFG)
If the crate relies on serde_derive to provide Serde impls, the name of the
cfg can still be simply "serde" by using this workaround. Do not use a
different name for the cfg like "serde_impls" or "serde_serialization".
Types are Send and Sync where possible (C-SEND-SYNC)
Send and Sync are automatically implemented when the compiler determines
it is appropriate.
In types that manipulate raw pointers, be vigilant that the Send and Sync
status of your type accurately reflects its thread safety characteristics. Tests
like the following can help catch unintentional regressions in whether the type
implements Send or Sync.
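One possible shape for such a check is sketched below. `MyType` is a hypothetical placeholder for the type under test; the helper functions only compile when the bound holds, so a regression shows up as a build failure rather than a runtime failure.

```rust
use std::sync::Arc;

/// Hypothetical type whose thread-safety we want to pin down.
pub struct MyType {
    pub shared: Arc<Vec<u8>>,
}

// These helpers compile only if `T` has the required auto trait.
fn assert_send<T: Send>() {}
fn assert_sync<T: Sync>() {}

/// Fails to compile if `MyType` ever stops being `Send` or `Sync`,
/// for example after a raw pointer field is added.
pub fn check_thread_safety() {
    assert_send::<MyType>();
    assert_sync::<MyType>();
}
```

In a real crate the body of `check_thread_safety` would typically live in a `#[test]` function.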
An error that is not Send cannot be returned by a thread run with thread::spawn. An error that is not Sync cannot be passed across threads
using an Arc. These are common requirements for basic error handling in a
multithreaded application.
Binary number types provide Hex, Octal, Binary formatting (C-NUM-FMT)
These traits control the representation of a type under the {:X}, {:x}, {:o}, and {:b} format specifiers.
Implement these traits for any number type on which you would consider doing
bitwise manipulations like | or &. This is especially appropriate for
bitflag types. Numeric quantity types like struct Nanoseconds(u64) probably do
not need these.
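A minimal sketch of these impls for a flag-like newtype follows. `Flags` is a hypothetical type; delegating to the inner integer is one simple way to get width, padding and the `#` prefix handled consistently with the built-in integer types.

```rust
use std::fmt;

/// Hypothetical bitflag-style newtype.
pub struct Flags(pub u32);

// Each impl delegates to the inner integer so that width, zero padding
// and the alternate (`#`) flag behave exactly as they do for `u32`.
impl fmt::LowerHex for Flags {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        fmt::LowerHex::fmt(&self.0, f)
    }
}

impl fmt::UpperHex for Flags {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        fmt::UpperHex::fmt(&self.0, f)
    }
}

impl fmt::Octal for Flags {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        fmt::Octal::fmt(&self.0, f)
    }
}

impl fmt::Binary for Flags {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        fmt::Binary::fmt(&self.0, f)
    }
}
```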
When defining functions that return Result, and the error carries no
useful additional information, do not use () as the error type. ()
does not implement std::error::Error, and this causes problems for
callers that expect to be able to convert errors to Error. Common
error handling libraries like error-chain expect errors to implement Error.
Instead, define a meaningful error type specific to your crate.
Examples from the standard library
ParseBoolError is returned when failing to parse a bool from a string.
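For a user-defined type, the same idea might look like the sketch below. `ParseSwitchError`, `parse_switch` and `is_on` are hypothetical names chosen for illustration; the point is that even an error with no payload gets its own type that implements `Error`.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical crate-specific error that carries no extra data.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ParseSwitchError;

impl fmt::Display for ParseSwitchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("expected `on` or `off`")
    }
}

impl Error for ParseSwitchError {}

/// Returns a meaningful error type instead of `Result<bool, ()>`.
pub fn parse_switch(s: &str) -> Result<bool, ParseSwitchError> {
    match s {
        "on" => Ok(true),
        "off" => Ok(false),
        _ => Err(ParseSwitchError),
    }
}

/// Because the error implements `Error`, `?` can convert it into a boxed
/// error; this would not compile if the error type were `()`.
pub fn is_on(s: &str) -> Result<bool, Box<dyn Error + Send + Sync>> {
    Ok(parse_switch(s)?)
}
```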
Macros
Input syntax is evocative of the output (C-EVOCATIVE)
Rust macros let you dream up practically whatever input syntax you want. Aim to
keep input syntax familiar and cohesive with the rest of your users' code by
mirroring existing Rust syntax where possible. Pay attention to the choice and
placement of keywords and punctuation.
A good guide is to use syntax, especially keywords and punctuation, that is
similar to what will be produced in the output of the macro.
For example if your macro declares a struct with a particular name given in the
input, preface the name with the keyword struct to signal to readers that a
struct is being declared with the given name.
    // Prefer this...
    bitflags! {
        struct S: u32 { /* ... */ }
    }

    // ...over no keyword...
    bitflags! {
        S: u32 { /* ... */ }
    }

    // ...or some ad-hoc word.
    bitflags! {
        flags S: u32 { /* ... */ }
    }
Another example is semicolons vs commas. Constants in Rust are followed by
semicolons so if your macro declares a chain of constants, they should likely be
followed by semicolons even if the syntax is otherwise slightly different from
Rust's.
Macros are so diverse that these specific examples won't be relevant, but think
about how to apply the same principles to your situation.
Item macros compose well with attributes (C-MACRO-ATTR)
Macros that produce more than one output item should support adding attributes
to any one of those items. One common use case would be putting individual items
behind a cfg.
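A minimal sketch of a macro that forwards per-item attributes is shown below. The `unit_structs!` macro and the struct names are hypothetical; the relevant part is the `$(#[$attr:meta])*` matcher, which is repeated for every item rather than once for the whole invocation.

```rust
/// Hypothetical macro that declares several unit structs; each one may carry
/// its own attributes, which are forwarded to the generated item.
macro_rules! unit_structs {
    ($($(#[$attr:meta])* struct $name:ident;)*) => {
        $(
            $(#[$attr])*
            pub struct $name;
        )*
    };
}

unit_structs! {
    #[derive(Debug, Default, PartialEq)]
    struct Always;

    // Only compiled on Windows, so the type does not exist on other targets.
    #[cfg(windows)]
    #[derive(Debug)]
    struct WindowsOnly;
}
```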
Item macros work anywhere that items are allowed (C-ANYWHERE)
Rust allows items to be placed at the module level or within a tighter scope
like a function. Item macros should work equally well as ordinary items in all
of these places. The test suite should include invocations of the macro in at
least the module scope and function scope.
As a simple example of how things can go wrong, this macro works great in a
module scope but fails in a function scope.
    macro_rules! broken {
        ($m:ident :: $t:ident) => {
            pub struct $t;
            pub mod $m {
                pub use super::$t;
            }
        }
    }

    broken!(m::T); // okay, expands to T and m::T

    fn g() {
        broken!(m::U); // fails to compile, super::U refers to the containing module not g
    }
Item macros support visibility specifiers (C-MACRO-VIS)
Follow Rust syntax for visibility of items produced by a macro. Private by
default, public if pub is specified.
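One way to follow this rule is the `vis` fragment specifier, sketched below. The `marker!` macro and the `shapes` module are hypothetical; `$vis:vis` matches an optional visibility, so omitting `pub` yields a private item just as it would for a hand-written struct.

```rust
/// Hypothetical macro that declares a unit struct with the visibility the
/// caller writes, defaulting to private just like an ordinary item.
macro_rules! marker {
    ($(#[$attr:meta])* $vis:vis struct $name:ident;) => {
        $(#[$attr])*
        $vis struct $name;
    };
}

pub mod shapes {
    marker! { #[derive(Debug)] pub struct Public; }
    marker! { #[derive(Debug)] struct Private; }

    /// `Private` is only nameable inside this module.
    pub fn private_name() -> String {
        format!("{:?}", Private)
    }
}
```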
Every public module, trait, struct, enum, function, method, macro, and type
definition should have an example that exercises the functionality.
The purpose of an example is not always to show how to use the item. For
example users can be expected to know how to instantiate and match on an enum
like enum E { A, B }. Rather, an example is often intended to show why
someone would want to use the item.
This guideline should be applied within reason.
A link to an applicable example on another item may be sufficient. For example
if exactly one function uses a particular type, it may be appropriate to write a
single example on either the function or the type and link to it from the other.
Examples use ?, not try!, not unwrap (C-QUESTION-MARK)
Like it or not, example code is often copied verbatim by users. Unwrapping an
error should be a conscious decision that the user needs to make.
A common way of structuring fallible example code is the following. The lines
beginning with # are compiled by cargo test when building the example but
will not appear in user-visible rustdoc.
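A sketch of that structure on a documented function is given below. The function `double` and the crate name `my_crate` are hypothetical; the hidden `#` lines wrap the visible example in a fallible `try_main` so that the example itself can use `?`.

```rust
use std::num::ParseIntError;

/// Parses a decimal string and doubles it (hypothetical example function).
///
/// # Examples
///
/// ```
/// # use std::error::Error;
/// #
/// # fn try_main() -> Result<(), Box<dyn Error>> {
/// let n = my_crate::double("21")?;
/// assert_eq!(n, 42);
/// #
/// #     Ok(())
/// # }
/// #
/// # fn main() {
/// #     try_main().unwrap();
/// # }
/// ```
pub fn double(s: &str) -> Result<i64, ParseIntError> {
    let n: i64 = s.trim().parse()?;
    Ok(n * 2)
}
```

Readers of the rendered documentation see only the two lines without a leading `#`.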
This applies to trait methods as well. Trait methods for which the
implementation is allowed or expected to return an error should be documented
with an "Errors" section.
Examples from the standard library
Some implementations of the std::io::Read::read trait method may return an
error.
/// Pull some bytes from this source into the specified buffer, returning
/// how many bytes were read.
///
/// ... lots more info ...
///
/// # Errors
///
/// If this function encounters any form of I/O or other error, an error
/// variant will be returned. If an error is returned then it must be
/// guaranteed that no bytes were read.
Function docs include panic conditions in "Panics" section (C-PANIC-DOC)
This applies to trait methods as well. Trait methods for which the
implementation is allowed or expected to panic should be documented with a
"Panics" section.
/// Inserts an element at position `index` within the vector, shifting all
/// elements after it to the right.
///
/// # Panics
///
/// Panics if `index` is out of bounds.
Prose contains hyperlinks to relevant things (C-LINK)
Links to methods within the same type usually look like this:
If this were defined as an inherent method instead, it would be confusing at the
call site whether the method being called is a method on Box<T> or a method on T.
    impl<T> Box<T> where T: ?Sized {
        // Do not do this.
        fn into_raw(self) -> *mut T { /* ... */ }
    }

    let boxed_str: Box<str> = /* ... */;

    // This is a method on str accessed through the smart pointer Deref impl.
    boxed_str.chars()

    // This is a method on Box<str>...?
    boxed_str.into_raw()
Conversions live on the most specific type involved (C-CONV-SPECIFIC)
When in doubt, prefer to_/as_/into_ to from_, because they are more
ergonomic to use (and can be chained with other methods).
For many conversions between two types, one of the types is clearly more
"specific": it provides some additional invariant or interpretation that is not
present in the other type. For example, str is more specific than &[u8],
since it is a UTF-8 encoded sequence of bytes.
Conversions should live with the more specific of the involved types. Thus, str provides both the as_bytes method and the from_utf8 constructor
for converting to and from &[u8] values. Besides being intuitive, this
convention avoids polluting concrete types like &[u8] with endless conversion
methods.
Functions with a clear receiver are methods (C-METHOD)
Prefer
    impl Foo {
        pub fn frob(&self, w: widget) { /* ... */ }
    }
over
    pub fn frob(foo: &Foo, w: widget) { /* ... */ }
for any operation that is clearly associated with a particular type.
Methods have numerous advantages over functions:
They do not need to be imported or qualified to be used: all you need is a
value of the appropriate type.
Their invocation performs autoborrowing (including mutable borrows).
They make it easy to answer the question "what can I do with a value of type T" (especially when using rustdoc).
They provide self notation, which is more concise and often more clearly
conveys ownership distinctions.
Functions do not take out-parameters (C-NO-OUT)
Prefer
    fn foo() -> (Bar, Bar)
over
    fn foo(output: &mut Bar) -> Bar
for returning multiple Bar values.
Compound return types like tuples and structs are efficiently compiled and do
not require heap allocation. If a function needs to return multiple values, it
should do so via one of these types.
The primary exception: sometimes a function is meant to modify data that the
caller already owns, for example to re-use a buffer:
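A minimal sketch of the buffer re-use case, built on the standard `Read::read` method, which fills a caller-provided slice. The helper `count_bytes` is hypothetical.

```rust
use std::io::{self, Read};

/// Reads everything from `source` in chunks and returns the total byte count.
/// The caller-owned `buf` is filled again on every iteration instead of
/// allocating a fresh buffer per read.
pub fn count_bytes<R: Read>(mut source: R, buf: &mut [u8]) -> io::Result<usize> {
    let mut total = 0;
    loop {
        // `read` writes into the caller's buffer and reports how much it wrote.
        let n = source.read(buf)?;
        if n == 0 {
            return Ok(total);
        }
        total += n;
    }
}
```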
Operators with built in syntax (*, |, and so on) can be provided for a type
by implementing the traits in std::ops. These operators come with strong
expectations: implement Mul only for an operation that bears some resemblance
to multiplication (and shares the expected properties, e.g. associativity), and
so on for the other traits.
Only smart pointers implement Deref and DerefMut (C-DEREF)
The Deref traits are used implicitly by the compiler in many circumstances,
and interact with method resolution. The relevant rules are designed
specifically to accommodate smart pointers, and so the traits should be used
only for that purpose.
Constructors are static (no self) inherent methods for the type that they
construct. Combined with the practice of fully importing type names, this
convention leads to informative but concise construction:
    use example::Example;

    // Construct a new Example.
    let ex = Example::new();
This convention also applies to conversion constructors (prefix from rather
than new).
Constructors for structs with sensible defaults allow clients to concisely
override using the struct update syntax.
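As an illustration of the struct update syntax, here is a minimal sketch. The `Config` type, its fields and the default values are all hypothetical.

```rust
/// Hypothetical configuration type with sensible defaults.
#[derive(Debug, Clone, PartialEq)]
pub struct Config {
    pub width: u32,
    pub height: u32,
    pub title: String,
}

impl Default for Config {
    fn default() -> Config {
        Config {
            width: 800,
            height: 600,
            title: String::from("untitled"),
        }
    }
}

/// Overrides only the title; the remaining fields come from `Default`.
pub fn titled(title: &str) -> Config {
    Config {
        title: title.to_owned(),
        ..Config::default()
    }
}
```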
Functions expose intermediate results to avoid duplicate work (C-INTERMEDIATE)
Many functions that answer a question also compute interesting related data. If
this data is potentially of interest to the client, consider exposing it in the
API.
Examples from the standard library
Vec::binary_search does not return a bool of whether the value was
found, nor an Option<usize> of the index at which the value was maybe found.
Instead it returns information about the index if found, and also the index at
which the value would need to be inserted if not found.
String::from_utf8 may fail if the input bytes are not UTF-8. In the error
case it returns an intermediate result that exposes the byte offset up to
which the input was valid UTF-8, as well as handing back ownership of the
input bytes.
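Both standard library examples can be exercised in a few lines. The helper functions `insert_sorted` and `valid_prefix` are hypothetical; they only show how a caller can put the intermediate results to use.

```rust
/// Inserts `value` into a sorted vector, keeping it sorted and free of
/// duplicates. Returns `true` if the value was newly inserted.
pub fn insert_sorted(v: &mut Vec<i32>, value: i32) -> bool {
    match v.binary_search(&value) {
        // Found: the index of the existing element is available if needed.
        Ok(_index) => false,
        // Not found: the error carries the insertion point, so no second search.
        Err(index) => {
            v.insert(index, value);
            true
        }
    }
}

/// Recovers the longest valid UTF-8 prefix using the error's intermediate results.
pub fn valid_prefix(bytes: Vec<u8>) -> String {
    match String::from_utf8(bytes) {
        Ok(s) => s,
        Err(e) => {
            let valid = e.utf8_error().valid_up_to();
            // Ownership of the original bytes is handed back by the error.
            let mut bytes = e.into_bytes();
            bytes.truncate(valid);
            String::from_utf8(bytes).expect("prefix was reported as valid UTF-8")
        }
    }
}
```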
Caller decides where to copy and place data (C-CALLER-CONTROL)
If a function requires ownership of an argument, it should take ownership of the
argument rather than borrowing and cloning the argument.
    // Prefer this:
    fn foo(b: Bar) {
        /* use b as owned, directly */
    }

    // Over this:
    fn foo(b: &Bar) {
        let b = b.clone();
        /* use b as owned after cloning */
    }
If a function does not require ownership of an argument, it should take a
shared or exclusive borrow of the argument rather than taking ownership and
dropping the argument.
    // Prefer this:
    fn foo(b: &Bar) {
        /* use b as borrowed */
    }

    // Over this:
    fn foo(b: Bar) {
        /* use b as borrowed, it is implicitly dropped before function returns */
    }
The Copy trait should only be used as a bound when absolutely needed, not as a
way of signaling that copies should be cheap to make.
Functions minimize assumptions about parameters by using generics (C-GENERIC)
The fewer assumptions a function makes about its inputs, the more widely usable
it becomes.
For example, prefer a parameter bounded by IntoIterator over a concrete collection type if the function only needs to iterate over the data.
More generally, consider using generics to pinpoint the assumptions a function
needs to make about its arguments.
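A minimal sketch of a function that assumes nothing beyond "can be iterated to yield i64 values". The name `sum_all` is hypothetical.

```rust
/// Sums any source of `i64` items: vectors, ranges, sets, or iterator
/// adapter chains. The only assumption is the `IntoIterator` bound.
pub fn sum_all<I: IntoIterator<Item = i64>>(iter: I) -> i64 {
    iter.into_iter().sum()
}
```

The same function then accepts `vec![1, 2, 3]`, `1..=4`, or `slice.iter().copied()` without any change to its signature.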
Advantages of generics
Reusability. Generic functions can be applied to an open-ended collection of
types, while giving a clear contract for the functionality those types must
provide.
Static dispatch and optimization. Each use of a generic function is
specialized ("monomorphized") to the particular types implementing the trait
bounds, which means that (1) invocations of trait methods are static, direct
calls to the implementation and (2) the compiler can inline and otherwise
optimize these calls.
Inline layout. If a struct or enum type is generic over some type
parameter T, values of type T will be laid out inline in the struct/enum, without any indirection.
Inference. Since the type parameters to generic functions can usually be
inferred, generic functions can help cut down on verbosity in code where
explicit conversions or other method calls would usually be necessary.
Precise types. Because generics give a name to the specific type
implementing a trait, it is possible to be precise about places where that
exact type is required or produced. For example, a function
    fn binary<T: Trait>(x: T, y: T) -> T
is guaranteed to consume and produce elements of exactly the same type T; it
cannot be invoked with parameters of different types that both implement Trait.
Disadvantages of generics
Code size. Specializing generic functions means that the function body is
duplicated. The increase in code size must be weighed against the performance
benefits of static dispatch.
Homogeneous types. This is the other side of the "precise types" coin: if T is a type parameter, it stands for a single actual type. So for example
a Vec<T> contains elements of a single concrete type (and, indeed, the
vector representation is specialized to lay these out in line). Sometimes
heterogeneous collections are useful; see trait objects.
Signature verbosity. Heavy use of generics can make it more difficult to
read and understand a function's signature.
Examples from the standard library
std::fs::File::open takes an argument of generic type AsRef<Path>. This
allows files to be opened conveniently from a string literal "f.txt", a Path, an OsString, and a few other types.
Traits are object-safe if they may be useful as a trait object (C-OBJECT)
Trait objects have some significant limitations: methods invoked through a trait
object cannot use generics, and cannot use Self except in receiver position.
When designing a trait, decide early on whether the trait will be used as an
object or as a bound on generics.
If a trait is meant to be used as an object, its methods should take and return
trait objects rather than use generics.
A where clause of Self: Sized may be used to exclude specific methods from
the trait's object. The following trait is not object-safe due to the generic
method.
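A sketch of the `where Self: Sized` technique, using a hypothetical `Shape` trait. As written the trait is object-safe. Removing the `where` clause from the generic method would make `dyn Shape` a compile error.

```rust
pub trait Shape {
    fn area(&self) -> f64;

    /// Generic method; `where Self: Sized` excludes it from the trait object,
    /// which keeps the trait as a whole usable as `dyn Shape`.
    fn scaled_by<T: Into<f64>>(&self, factor: T) -> f64
    where
        Self: Sized,
    {
        self.area() * factor.into()
    }
}

/// Hypothetical implementor: a square with the given side length.
pub struct Square(pub f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

/// Works through trait objects because `Shape` is object-safe.
pub fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}
```

The generic method remains callable on concrete types such as `Square`, just not through `dyn Shape`.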
The Iterator trait has several generic methods marked with where Self: Sized to retain the ability to use Iterator as an object.
Type safety
Newtypes provide static distinctions (C-NEWTYPE)
Newtypes can statically distinguish between different interpretations of an
underlying type.
For example, a f64 value might be used to represent a quantity in miles or in
kilometers. Using newtypes, we can keep track of the intended interpretation:
With this distinction in place, a function that expects a distance in miles cannot accidentally be called with a Kilometers value. The compiler will
remind us to perform the conversion, thus averting certain catastrophic bugs.
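The miles and kilometers distinction can be sketched as follows. The type names, the conversion methods and the `are_we_there_yet` function with its 100-mile threshold are assumptions made for illustration.

```rust
/// A distance in miles.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Miles(pub f64);

/// A distance in kilometers.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Kilometers(pub f64);

impl Miles {
    pub fn to_kilometers(self) -> Kilometers {
        Kilometers(self.0 * 1.609344)
    }
}

impl Kilometers {
    pub fn to_miles(self) -> Miles {
        Miles(self.0 / 1.609344)
    }
}

/// Hypothetical function; passing a `Kilometers` here is a compile error,
/// so the caller has to convert explicitly.
pub fn are_we_there_yet(distance_travelled: Miles) -> bool {
    distance_travelled.0 >= 100.0
}
```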
Arguments convey meaning through types, not bool or Option (C-CUSTOM-TYPE)
Prefer
    let w = Widget::new(Small, Round)
over
    let w = Widget::new(true, false)
Core types like bool, u8 and Option have many possible interpretations.
Use custom types (whether enums, structs, or tuples) to convey interpretation
and invariants. In the above example, it is not immediately clear what true
and false are conveying without looking up the argument names, but Small and Round are more suggestive.
Using custom types makes it easier to expand the options later on, for example
by adding an ExtraLarge variant.
See the newtype pattern for a no-cost way to wrap existing types
with a distinguished name.
Types for a set of flags are bitflags, not enums (C-BITFLAG)
Rust supports enum types with explicitly specified discriminants:
Custom discriminants are useful when an enum type needs to be serialized to an
integer value compatibly with some other system/language. They support
"typesafe" APIs: by taking a Color, rather than an integer, a function is
guaranteed to get well-formed inputs, even if it later views those inputs as
integers.
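An enum with explicit discriminants might look like the sketch below. The name `Color` follows the paragraph above; the variants, their values and the `to_rgb` function are assumptions.

```rust
/// Hypothetical enum; the variant names and values are illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Color {
    Red = 0xff0000,
    Green = 0x00ff00,
    Blue = 0x0000ff,
}

/// Takes a well-formed `Color` and only then views it as an integer.
pub fn to_rgb(color: Color) -> u32 {
    color as u32
}
```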
An enum allows an API to request exactly one choice from among many. Sometimes
an API's input is instead the presence or absence of a set of flags. In C code,
this is often done by having each flag correspond to a particular bit, allowing
a single integer to represent, say, 32 or 64 flags. Rust's bitflags crate
provides a typesafe representation of this pattern.
Builders enable construction of complex values (C-BUILDER)
Some data structures are complicated to construct, due to their construction
needing:
a large number of inputs
compound data (e.g. slices)
optional configuration data
choice between several flavors
which can easily lead to a large number of distinct constructors with many
arguments each.
If T is such a data structure, consider introducing a T builder:
Introduce a separate data type TBuilder for incrementally configuring a T
value. When possible, choose a better name: e.g. Command is the builder
for a child process, Url can be created from a ParseOptions.
The builder constructor should take as parameters only the data required to
make a T.
The builder should offer a suite of convenient methods for configuration,
including setting up compound inputs (like slices) incrementally. These
methods should return self to allow chaining.
The builder should provide one or more "terminal" methods for actually
building a T.
The builder pattern is especially appropriate when building a T involves side
effects, such as spawning a task or launching a process.
In Rust, there are two variants of the builder pattern, differing in the
treatment of ownership, as described below.
Non-consuming builders (preferred):
In some cases, constructing the final T does not require the builder itself to
be consumed. The following variant on std::process::Command is one example:
    // NOTE: the actual Command API does not use owned Strings;
    // this is a simplified version.

    pub struct Command {
        program: String,
        args: Vec<String>,
        cwd: Option<String>,
        // etc
    }

    impl Command {
        pub fn new(program: String) -> Command {
            Command {
                program: program,
                args: Vec::new(),
                cwd: None,
            }
        }

        /// Add an argument to pass to the program.
        pub fn arg(&mut self, arg: String) -> &mut Command {
            self.args.push(arg);
            self
        }

        /// Add multiple arguments to pass to the program.
        pub fn args(&mut self, args: &[String]) -> &mut Command {
            self.args.extend_from_slice(args);
            self
        }

        /// Set the working directory for the child process.
        pub fn current_dir(&mut self, dir: String) -> &mut Command {
            self.cwd = Some(dir);
            self
        }

        /// Executes the command as a child process, which is returned.
        pub fn spawn(&self) -> io::Result<Child> {
            /* ... */
        }
    }
Note that the spawn method, which actually uses the builder configuration to
spawn a process, takes the builder by immutable reference. This is possible
because spawning the process does not require ownership of the configuration
data.
Because the terminal spawn method only needs a reference, the configuration
methods take and return a mutable borrow of self.
The benefit
By using borrows throughout, Command can be used conveniently for both
one-liner and more complex constructions:
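Both styles can be sketched with a cut-down builder. This is not the real std::process::Command: the type below is a hypothetical stand-in, and its terminal method `command_line` replaces `spawn` so that the sketch runs without launching a process.

```rust
/// Hypothetical stand-in for a non-consuming builder.
pub struct Command {
    program: String,
    args: Vec<String>,
}

impl Command {
    pub fn new(program: &str) -> Command {
        Command { program: program.to_owned(), args: Vec::new() }
    }

    /// Configuration methods take and return `&mut self` to allow chaining.
    pub fn arg(&mut self, arg: &str) -> &mut Command {
        self.args.push(arg.to_owned());
        self
    }

    /// Terminal method; stands in for `spawn` and only needs `&self`.
    pub fn command_line(&self) -> String {
        let mut line = self.program.clone();
        for arg in &self.args {
            line.push(' ');
            line.push_str(arg);
        }
        line
    }
}

/// One-liner: the temporary builder lives until the end of the expression.
pub fn one_liner() -> String {
    Command::new("/bin/cat").arg("file.txt").command_line()
}

/// Complex configuration: the builder is a named value mutated in place.
pub fn complex(size_sorted: bool) -> String {
    let mut cmd = Command::new("/bin/ls");
    cmd.arg(".");
    if size_sorted {
        cmd.arg("-S");
    }
    cmd.command_line()
}
```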
Sometimes builders must transfer ownership when constructing the final type T,
meaning that the terminal methods must take self rather than &self.
    impl TaskBuilder {
        /// Name the task-to-be.
        pub fn named(mut self, name: String) -> TaskBuilder {
            self.name = Some(name);
            self
        }

        /// Redirect task-local stdout.
        pub fn stdout(mut self, stdout: Box<dyn io::Write + Send>) -> TaskBuilder {
            self.stdout = Some(stdout);
            self
        }

        /// Creates and executes a new child task.
        pub fn spawn<F>(self, f: F) where F: FnOnce() + Send {
            /* ... */
        }
    }
Here, the stdout configuration involves passing ownership of an io::Write,
which must be transferred to the task upon construction (in spawn).
When the terminal methods of the builder require ownership, there is a basic
tradeoff:
If the other builder methods take/return a mutable borrow, the complex
configuration case will work well, but one-liner configuration becomes
impossible.
If the other builder methods take/return an owned self, one-liners continue
to work well but complex configuration is less convenient.
Under the rubric of making easy things easy and hard things possible, all
builder methods for a consuming builder should take and return an owned self. Then client code works as follows:
One-liners work as before, because ownership is threaded through each of the
builder methods until being consumed by spawn. Complex configuration, however,
is more verbose: it requires re-assigning the builder at each step.
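The two usage styles for a consuming builder can be sketched as follows. The `TaskBuilder` here is a hypothetical, cut-down type: the `stack_size` option is invented for illustration, and the terminal method `describe` stands in for `spawn` so that nothing is actually started.

```rust
/// Hypothetical consuming builder; every method takes and returns `self`.
#[derive(Default)]
pub struct TaskBuilder {
    name: Option<String>,
    stack_size: Option<usize>,
}

impl TaskBuilder {
    pub fn new() -> TaskBuilder {
        TaskBuilder::default()
    }

    pub fn named(mut self, name: &str) -> TaskBuilder {
        self.name = Some(name.to_owned());
        self
    }

    pub fn stack_size(mut self, bytes: usize) -> TaskBuilder {
        self.stack_size = Some(bytes);
        self
    }

    /// Terminal method; consumes the builder. Stands in for `spawn`.
    pub fn describe(self) -> String {
        let name = self.name.unwrap_or_else(|| "unnamed".to_owned());
        format!("{} ({} bytes)", name, self.stack_size.unwrap_or(0))
    }
}

/// One-liner: ownership is threaded through each call.
pub fn task_one_liner() -> String {
    TaskBuilder::new().named("my_task").stack_size(1024).describe()
}

/// Complex configuration: the builder must be re-assigned at each step.
pub fn task_complex(big: bool) -> String {
    let mut task = TaskBuilder::new();
    task = task.named("my_task_2"); // must re-assign to retain ownership
    if big {
        task = task.stack_size(4096);
    }
    task.describe()
}
```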
Dependability
Functions validate their arguments (C-VALIDATE)
Rust APIs do not generally follow the robustness principle: "be conservative
in what you send; be liberal in what you accept".
Instead, Rust code should enforce the validity of input whenever practical.
Enforcement can be achieved through the following mechanisms (listed in order of
preference).
Static enforcement:
Choose an argument type that rules out bad inputs.
For example, prefer
    fn foo(a: Ascii) { /* ... */ }
over
    fn foo(a: u8) { /* ... */ }
where Ascii is a wrapper around u8 that guarantees the highest bit is
zero; see newtype patterns for more details on creating typesafe
wrappers.
Static enforcement usually comes at little run-time cost: it pushes the costs to
the boundaries (e.g. when a u8 is first converted into an Ascii). It also
catches bugs early, during compilation, rather than through run-time failures.
On the other hand, some properties are difficult or impossible to express using
types.
Dynamic enforcement:
Validate the input as it is processed (or ahead of time, if necessary). Dynamic
checking is often easier to implement than static checking, but has several
downsides:
Runtime overhead (unless checking can be done as part of processing the
input).
Delayed detection of bugs.
Introduces failure cases, either via panic! or Result/Option types,
which must then be dealt with by client code.
Dynamic enforcement with debug_assert!:
Same as dynamic enforcement, but with the possibility of easily turning off
expensive checks for production builds.
Dynamic enforcement with opt-out:
Same as dynamic enforcement, but adds sibling functions that opt out of the
checking.
The convention is to mark these opt-out functions with a suffix like _unchecked or by placing them in a raw submodule.
The unchecked functions can be used judiciously in cases where (1) performance
dictates avoiding checks and (2) the client is otherwise confident that the
inputs are valid.
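The enforcement mechanisms above can be combined on one small type. The sketch below assumes a hypothetical `Ascii` wrapper: `new` validates at the boundary, `new_unchecked` is the opt-out sibling guarded by `debug_assert!`, and `to_upper` relies on the type alone.

```rust
/// Hypothetical wrapper guaranteeing that the highest bit of the byte is zero.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Ascii(u8);

impl Ascii {
    /// Dynamic enforcement at the boundary: validates once, on conversion.
    pub fn new(byte: u8) -> Option<Ascii> {
        if byte < 0x80 {
            Some(Ascii(byte))
        } else {
            None
        }
    }

    /// Opt-out sibling that skips the check in release builds.
    ///
    /// # Safety
    ///
    /// The caller must guarantee that `byte < 0x80`.
    pub unsafe fn new_unchecked(byte: u8) -> Ascii {
        debug_assert!(byte < 0x80, "byte is not ASCII");
        Ascii(byte)
    }

    pub fn as_byte(self) -> u8 {
        self.0
    }
}

/// Static enforcement: this function cannot receive a non-ASCII byte,
/// so it needs no validation of its own.
pub fn to_upper(a: Ascii) -> Ascii {
    Ascii(a.0.to_ascii_uppercase())
}
```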
Destructors never fail (C-DTOR-FAIL)
Destructors also run while a thread unwinds from a panic, and in that context a panicking
destructor causes the program to abort.
Instead of failing in a destructor, provide a separate method for checking for
clean teardown, e.g. a close method, that returns a Result to signal
problems.
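A minimal sketch of this split between a fallible `close` and a silent destructor. The `Log` wrapper is hypothetical; its `Drop` impl makes a best-effort flush and deliberately ignores any error.

```rust
use std::io::{self, Write};

/// Hypothetical writer wrapper with an explicit, fallible `close`.
pub struct Log<W: Write> {
    inner: Option<W>,
}

impl<W: Write> Log<W> {
    pub fn new(inner: W) -> Log<W> {
        Log { inner: Some(inner) }
    }

    pub fn record(&mut self, line: &str) -> io::Result<()> {
        match self.inner.as_mut() {
            Some(w) => writeln!(w, "{}", line),
            None => Ok(()),
        }
    }

    /// Clean teardown: flushes and reports any error to the caller.
    pub fn close(mut self) -> io::Result<W> {
        let mut w = self.inner.take().expect("inner is present until close");
        w.flush()?;
        Ok(w)
    }
}

impl<W: Write> Drop for Log<W> {
    /// Best-effort fallback that never panics: errors are ignored here.
    fn drop(&mut self) {
        if let Some(w) = self.inner.as_mut() {
            let _ = w.flush();
        }
    }
}
```

Callers that care about teardown errors call `close`; everyone else still gets a flush attempt when the value is dropped.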
Destructors that may block have alternatives (C-DTOR-BLOCK)
Similarly, destructors should not invoke blocking operations, which can make
debugging much more difficult. Again, consider providing a separate method for
preparing for an infallible, nonblocking teardown.
Debuggability
All public types implement Debug (C-DEBUG)
If there are exceptions, they are rare.
Debug representation is never empty (C-DEBUG-NONEMPTY)
Even for conceptually empty values, the Debug representation should never be
empty.
    let empty_str = "";
    assert_eq!(format!("{:?}", empty_str), "\"\"");

    let empty_vec = Vec::<bool>::new();
    assert_eq!(format!("{:?}", empty_vec), "[]");
Future proofing
Structs have private fields (C-STRUCT-PRIVATE)
Making a field public is a strong commitment: it pins down a representation
choice, and prevents the type from providing any validation or maintaining any
invariants on the contents of the field, since clients can mutate it arbitrarily.
Public fields are most appropriate for struct types in the C spirit: compound,
passive data structures. Otherwise, consider providing getter/setter methods and
hiding fields instead.
A newtype can be used to hide representation details while making precise
promises to the client.
For example, consider a function my_transform that returns a compound iterator
type.
use std::iter::{Enumerate, Skip};

pub fn my_transform<I: Iterator>(input: I) -> Enumerate<Skip<I>> {
    input.skip(3).enumerate()
}
We wish to hide this type from the client, so that the client's view of the
return type is roughly Iterator<Item = (usize, T)>. We can do so using the
newtype pattern:
use std::iter::{Enumerate, Skip};

pub struct MyTransformResult<I>(Enumerate<Skip<I>>);

impl<I: Iterator> Iterator for MyTransformResult<I> {
    type Item = (usize, I::Item);

    fn next(&mut self) -> Option<Self::Item> {
        self.0.next()
    }
}

pub fn my_transform<I: Iterator>(input: I) -> MyTransformResult<I> {
    MyTransformResult(input.skip(3).enumerate())
}
Aside from simplifying the signature, this use of newtypes allows us to promise
less to the client. The client does not know how the result iterator is
constructed or represented, which means the representation can change in the
future without breaking client code.
The same thing can be accomplished more concisely with the impl Trait feature, which has been stable in return position since Rust 1.26, although impl Trait is more limited in what it can express than a named newtype.
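Assuming a compiler with return-position impl Trait (Rust 1.26 or later), the my_transform function from the example above could be sketched as follows. The client again sees only "some iterator of (usize, I::Item)", without a dedicated wrapper type.

```rust
/// Same behavior as the newtype version: skip three items, then enumerate.
/// The concrete iterator type stays hidden behind `impl Iterator`.
pub fn my_transform<I: Iterator>(input: I) -> impl Iterator<Item = (usize, I::Item)> {
    input.skip(3).enumerate()
}
```

The trade-off is that the returned type cannot be named by clients and cannot selectively expose other traits the way a newtype can.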
A crate containing this function cannot be stable unless other_crate is also
stable.
Be careful because public dependencies can sneak in at unexpected places.
pub struct Error {
    private: ErrorImpl,
}

enum ErrorImpl {
    Io(io::Error),
    // Should be okay even if other_crate isn't
    // stable, because ErrorImpl is private.
    Dep(other_crate::Error),
}

// Oh no! This puts other_crate into the public API
// of the current crate.
impl From<other_crate::Error> for Error {
    fn from(err: other_crate::Error) -> Self {
        Error { private: ErrorImpl::Dep(err) }
    }
}
Crate and its dependencies have a permissive license (C-PERMISSIVE)
The software produced by the Rust project is dual-licensed, under
either the MIT or Apache 2.0 licenses. Crates that simply need the
maximum compatibility with the Rust ecosystem are recommended to do
the same, in the manner described herein. Other options are described
below.
These API guidelines do not provide a detailed explanation of Rust's
license, but there is a small amount said in the Rust FAQ. These
guidelines are concerned with matters of interoperability with Rust,
and are not comprehensive over licensing options.
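To apply the Rust license to your project, set the license field
in your Cargo.toml to "MIT OR Apache-2.0" (older manifests write this as
"MIT/Apache-2.0"), and add the following toward the end of your README.md: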
## License
Licensed under either of
* Apache License, Version 2.0
([LICENSE-APACHE](LICENSE-APACHE) or http://www.apache.org/licenses/LICENSE-2.0)
* MIT license
([LICENSE-MIT](LICENSE-MIT) or http://opensource.org/licenses/MIT)
at your option.
### Contribution
Unless you explicitly state otherwise, any contribution intentionally submitted
for inclusion in the work by you, as defined in the Apache-2.0 license, shall be
dual licensed as above, without any additional terms or conditions.
Besides the dual MIT/Apache-2.0 license, another common licensing approach
used by Rust crate authors is to apply a single permissive license such as
MIT or BSD. This license scheme is also entirely compatible with Rust's,
because it imposes the minimal restrictions of Rust's MIT license.
Crates that desire perfect license compatibility with Rust are not
recommended to choose only the Apache license. The Apache license,
though it is a permissive license, imposes restrictions beyond the MIT
and BSD licenses that can discourage or prevent their use in some
scenarios, so Apache-only software cannot be used in some situations
where most of the Rust runtime stack can.
The license of a crate's dependencies can affect the restrictions on
distribution of the crate itself, so a permissively-licensed crate
should generally only depend on permissively-licensed crates.
Unless you explicitly state otherwise, any contribution intentionally submitted
for inclusion in this document by you, as defined in the Apache-2.0 license,
shall be dual licensed as above, without any additional terms or conditions.
Issue by kbknapp
Tuesday May 09, 2017 at 19:50 GMT
Originally opened as clap-rs/clap#950
From Rust API Guidelines
Rust API guidelines
Crate conformance checklist
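Ad-hoc conversions follow as_, to_, into_ conventions (C-CONV)
Methods on collections that produce iterators follow iter, iter_mut, into_iter (C-ITER)
Ownership suffixes use _mut and _ref (C-OWN-SUFFIX)
Types eagerly implement common traits (C-COMMON-TRAITS): Copy, Clone, Eq, PartialEq, Ord, PartialOrd, Hash, Debug, Display, Default
Conversions use the standard traits From, AsRef, AsMut (C-CONV-TRAITS)
Collections implement FromIterator and Extend (C-COLLECT)
Data structures implement Serde's Serialize, Deserialize (C-SERDE)
Crate has a "serde" cfg option that enables Serde (C-SERDE-CFG)
Types are Send and Sync where possible (C-SEND-SYNC)
Error types are Send and Sync (C-SEND-SYNC-ERR)
Error types are meaningful, not () (C-MEANINGFUL-ERR)
Binary number types provide Hex, Octal, Binary formatting (C-NUM-FMT)
Examples use ?, not try!, not unwrap (C-QUESTION-MARK)
Cargo.toml includes all common metadata (C-METADATA): readme, keywords, categories
Only smart pointers implement Deref and DerefMut (C-DEREF)
Deref and DerefMut never fail (C-DEREF-FAIL)
Arguments convey meaning through types, not bool or Option (C-CUSTOM-TYPE)
Types for a set of flags are bitflags, not enums (C-BITFLAG)
All public types implement Debug (C-DEBUG)
Debug representation is never empty (C-DEBUG-NONEMPTY)
Organization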
Crate root re-exports common functionality (C-REEXPORT)
Crates pub use the most common types for convenience, so that clients do not
have to remember or write the crate's module hierarchy to use these types.
Re-exporting is covered in more detail in The Rust Programming Language
under Crates and Modules.
Examples from serde_json
The serde_json::Value type is the most commonly used type from serde_json.
It is a re-export of a type that lives elsewhere in the module hierarchy, at
serde_json::value::Value. The serde_json::value module defines other
JSON-value-related things that are not re-exported. For example
serde_json::value::Index is the trait that defines types that can be used to
index into a Value using square bracket indexing notation. The Index trait
is not re-exported at the crate root because it would be comparatively rare for
a client crate to need to refer to it.
In addition to types, functions can be re-exported as well. In serde_json the
serde_json::from_str function is a re-export of a function from the
serde_json::de deserialization module, which contains other less common
deserialization-related functionality that is not re-exported.
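The layout described above can be sketched with a small stand-in. The my_json module below is hypothetical and only mimics the shape of serde_json (it is not serde_json's actual code): the commonly used Value type and from_str function are re-exported at the root, while the rarely needed Index trait stays in its submodule.

```rust
/// Hypothetical crate root, modelled as a module so the sketch is self-contained.
pub mod my_json {
    pub mod value {
        #[derive(Debug, Clone, PartialEq)]
        pub enum Value {
            Null,
            Bool(bool),
        }

        /// Rarely needed by clients, so it is not re-exported at the root.
        pub trait Index {}
    }

    pub mod de {
        use super::value::Value;

        /// Toy parser that understands only `null`, `true` and `false`.
        pub fn from_str(input: &str) -> Option<Value> {
            match input.trim() {
                "null" => Some(Value::Null),
                "true" => Some(Value::Bool(true)),
                "false" => Some(Value::Bool(false)),
                _ => None,
            }
        }
    }

    // Re-export the most common items at the root.
    pub use self::de::from_str;
    pub use self::value::Value;
}
```

Clients can then write my_json::Value and my_json::from_str without knowing the module hierarchy.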
Modules provide a sensible API hierarchy (C-HIERARCHY)
Examples from Serde
The serde crate is two independent frameworks in one crate - a serialization
half and a deserialization half. The crate is divided accordingly into
serde::ser and serde::de. Part of the deserialization framework is
isolated under serde::de::value because it is a relatively large API surface
that is relatively unimportant, and it would crowd the more common, more
important functionality located in serde::de if it were to share the same
namespace.
Naming
Casing conforms to RFC 430 (C-CASE)
Basic Rust naming conventions are described in RFC 430.
In general, Rust tends to use CamelCase for "type-level" constructs (types and
traits) and snake_case for "value-level" constructs. More precisely:
Modules: snake_case
Types: CamelCase
Traits: CamelCase
Enum variants: CamelCase
Functions: snake_case
Methods: snake_case
General constructors: new or with_more_details
Conversion constructors: from_some_other_type
Local variables: snake_case
Static variables: SCREAMING_SNAKE_CASE
Constant variables: SCREAMING_SNAKE_CASE
Type parameters: concise CamelCase, usually single uppercase letter: T
Lifetimes: short lowercase, usually a single letter: 'a, 'de, 'src
In CamelCase, acronyms count as one word: use Uuid rather than UUID. In
snake_case, acronyms are lower-cased: is_xid_start.
In snake_case or SCREAMING_SNAKE_CASE, a "word" should never consist of a
single letter unless it is the last "word". So, we have btree_map rather than
b_tree_map, but PI_2 rather than PI2.
Examples from the standard library
The whole standard library. This guideline should be easy!
Ad-hoc conversions follow as_, to_, into_ conventions (C-CONV)
Conversions should be provided as methods, with names prefixed as follows:
as_: free; does not consume the convertee
to_: expensive; does not consume the convertee
into_: variable cost; probably consumes the convertee
For example:
str::as_bytes() gives a &[u8] view into a &str, which is free.
str::to_owned() copies a &str to a new String, which may require memory
allocation.
String::into_bytes() takes ownership of a String and yields the underlying
Vec<u8>, which is free.
BufReader::into_inner() takes ownership of a buffered reader and extracts
out the underlying reader, which is free. Data in the buffer is
discarded.
BufWriter::into_inner() takes ownership of a buffered writer and extracts
out the underlying writer, which requires a potentially expensive flush of any
buffered data.
Conversions prefixed as_ and into_ typically decrease abstraction, either
exposing a view into the underlying representation (as) or deconstructing data
into its underlying representation (into). Conversions prefixed to_, on the
other hand, typically stay at the same level of abstraction but do some work to
change one representation into another.
More examples from the standard library
Result::as_ref
RefCell::as_ptr
Path::to_str
slice::to_vec
Option::into_iter
AtomicBool::into_inner
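To illustrate the three prefixes on a user-defined type, here is a small sketch built around a hypothetical Tag newtype (the type and its methods are assumptions made for this example, not taken from any library).

```rust
/// Hypothetical newtype around an owned string.
#[derive(Debug, Clone, PartialEq)]
pub struct Tag(String);

impl Tag {
    pub fn new(text: &str) -> Self {
        Tag(text.to_owned())
    }

    /// as_: free, borrows a view of the underlying representation.
    pub fn as_str(&self) -> &str {
        &self.0
    }

    /// to_: does real work (allocates a new String), leaves self intact.
    pub fn to_uppercase(&self) -> String {
        self.0.to_uppercase()
    }

    /// into_: consumes self and hands back the underlying String.
    pub fn into_string(self) -> String {
        self.0
    }
}
```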
Methods on collections that produce iterators follow iter, iter_mut, into_iter (C-ITER)
Per RFC 199.
For a container with elements of type U, iterator methods should be named
iter (taking &self and yielding &U), iter_mut (taking &mut self and yielding
&mut U), and into_iter (taking self and yielding U).
This guideline applies to data structures that are conceptually homogeneous
collections. As a counterexample, the str type is a slice of bytes that are
guaranteed to be valid UTF-8. This is conceptually more nuanced than a
homogeneous collection so rather than providing the iter/iter_mut/into_iter
group of iterator methods, it provides str::bytes to iterate as bytes and
str::chars to iterate as chars.
This guideline applies to methods only, not functions. For example
percent_encode from the url crate returns an iterator over percent-encoded
string fragments. There would be no clarity to be had by using an
iter/iter_mut/into_iter convention.
Examples from the standard library
Vec::iter
Vec::iter_mut
Vec::into_iter
BTreeMap::iter
BTreeMap::iter_mut
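The naming convention can be sketched for a hypothetical Stack container as follows. The Stack type and its wrapper iterators are illustrative assumptions; the wrapper iterator types are named after the methods that produce them.

```rust
/// Hypothetical homogeneous container.
pub struct Stack<U> {
    items: Vec<U>,
}

// Iterator types named after the methods that return them.
pub struct Iter<'a, U>(std::slice::Iter<'a, U>);
pub struct IterMut<'a, U>(std::slice::IterMut<'a, U>);
pub struct IntoIter<U>(std::vec::IntoIter<U>);

impl<'a, U> Iterator for Iter<'a, U> {
    type Item = &'a U;
    fn next(&mut self) -> Option<&'a U> {
        self.0.next()
    }
}

impl<'a, U> Iterator for IterMut<'a, U> {
    type Item = &'a mut U;
    fn next(&mut self) -> Option<&'a mut U> {
        self.0.next()
    }
}

impl<U> Iterator for IntoIter<U> {
    type Item = U;
    fn next(&mut self) -> Option<U> {
        self.0.next()
    }
}

impl<U> Stack<U> {
    pub fn new() -> Self {
        Stack { items: Vec::new() }
    }

    pub fn push(&mut self, item: U) {
        self.items.push(item);
    }

    /// Yields `&U`.
    pub fn iter(&self) -> Iter<'_, U> {
        Iter(self.items.iter())
    }

    /// Yields `&mut U`.
    pub fn iter_mut(&mut self) -> IterMut<'_, U> {
        IterMut(self.items.iter_mut())
    }
}

/// `into_iter` is provided through the standard trait and yields `U`.
impl<U> IntoIterator for Stack<U> {
    type Item = U;
    type IntoIter = IntoIter<U>;
    fn into_iter(self) -> IntoIter<U> {
        IntoIter(self.items.into_iter())
    }
}
```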
Iterator type names match the methods that produce them (C-ITER-TY)
A method called into_iter() should return a type called IntoIter and
similarly for all other methods that return iterators.
This guideline applies chiefly to methods, but often makes sense for functions
as well. For example the percent_encode function from the url crate
returns an iterator type called PercentEncode.
These type names make the most sense when prefixed with their owning module, for
example vec::IntoIter.
Examples from the standard library
Vec::iter returns Iter
Vec::iter_mut returns IterMut
Vec::into_iter returns IntoIter
BTreeMap::keys returns Keys
BTreeMap::values returns Values
Ownership suffixes use _mut, _ref (C-OWN-SUFFIX)
Functions often come in multiple variants: immutably borrowed, mutably borrowed,
and owned.
The right default depends on the function in question. Variants should be marked
through suffixes.
Exceptions
In the case of iterators, the moving variant can also be understood as an into
conversion, into_iter, and for x in v.into_iter() reads arguably better than
for x in v.iter_move(), so the convention is into_iter.
For mutably borrowed variants, if the mut qualifier is part of a type name,
it should appear as it would appear in the type. For example Vec::as_mut_slice
returns a mut slice; it does what it says.
Immutably borrowed by default
If foo uses/produces an immutable borrow by default, use:
_mut suffix (e.g. foo_mut) for the mutably borrowed variant.
_move suffix (e.g. foo_move) for the owned variant.
Examples from the standard library
TODO rust-api-guidelines#37
Owned by default
If foo uses/produces owned data by default, use:
_ref suffix (e.g. foo_ref) for the immutably borrowed variant.
_mut suffix (e.g. foo_mut) for the mutably borrowed variant.
Examples from the standard library
std::io::BufReader::get_ref
std::io::BufReader::get_mut
Single-element containers implement appropriate getters (C-GETTERS)
Single-element containers where accessing the element cannot fail should
implement get and get_mut, where get takes &self and returns &V, and get_mut
takes &mut self and returns &mut V.
Single-element containers where the element is Copy (e.g. Cell-like
containers) should instead return the value directly, and not implement a
mutable accessor. TODO rust-api-guidelines#44
For getters that do runtime validation, consider adding unsafe _unchecked
variants.
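A minimal sketch of these getters, using a hypothetical single-element Wrapper type (an assumption for illustration only):

```rust
/// Hypothetical single-element container whose element is always present.
pub struct Wrapper<V> {
    value: V,
}

impl<V> Wrapper<V> {
    pub fn new(value: V) -> Self {
        Wrapper { value }
    }

    /// Shared access to the element; cannot fail.
    pub fn get(&self) -> &V {
        &self.value
    }

    /// Exclusive access to the element; cannot fail.
    pub fn get_mut(&mut self) -> &mut V {
        &mut self.value
    }
}
```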
Examples from the standard library
std::io::Cursor::get_mut
std::ptr::Unique::get_mut
std::sync::PoisonError::get_mut
std::sync::atomic::AtomicBool::get_mut
std::collections::hash_map::OccupiedEntry::get_mut
<[_]>::get_unchecked
Interoperability
Types eagerly implement common traits (C-COMMON-TRAITS)
Rust's trait system does not allow orphans: roughly, every impl must live
either in the crate that defines the trait or the implementing type.
Consequently, crates that define new types should eagerly implement all
applicable, common traits.
To see why, consider the following situation:
std defines trait Display.
url defines type Url, without implementing Display.
webapp imports from both std and url.
There is no way for webapp to add Display to Url, since it defines
neither. (Note: the newtype pattern can provide an efficient, but inconvenient
workaround.)
The most important common traits to implement from std are:
Copy
Clone
Eq
PartialEq
Ord
PartialOrd
Hash
Debug
Display
Default
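The newtype workaround mentioned in the note above might look like the following sketch. ForeignUrl is a hypothetical stand-in for a type from another crate that lacks Display; DisplayUrl is the local wrapper that a downstream crate would have to introduce. Eagerly deriving the common traits on ForeignUrl shows what the upstream crate can do cheaply.

```rust
use std::fmt;

/// Stand-in for a type defined in another crate (hypothetical).
/// The derives are the "eager" part the upstream crate controls.
#[derive(Debug, Clone, PartialEq, Eq, Hash, Default)]
pub struct ForeignUrl {
    pub host: String,
    pub path: String,
}

/// Local newtype: the downstream crate owns this type, so it may implement
/// Display for it even though it owns neither Display nor ForeignUrl.
pub struct DisplayUrl(pub ForeignUrl);

impl fmt::Display for DisplayUrl {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "https://{}{}", self.0.host, self.0.path)
    }
}
```

The inconvenience is visible at every use site: values must be wrapped before they can be formatted.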
Conversions use the standard traits From, AsRef, AsMut (C-CONV-TRAITS)
The following conversion traits should be implemented where it makes sense:
From
TryFrom
AsRef
AsMut
The following conversion traits should never be implemented:
Into
TryInto
These traits have a blanket impl based on From and TryFrom. Implement those
instead.
Examples from the standard library
From<u16> is implemented for u32 because a smaller integer can always be
converted to a bigger integer.
From<u32> is not implemented for u16 because the conversion may not be
possible if the integer is too big.
TryFrom<u32> is implemented for u16 and returns an error if the integer is
too big to fit in u16.
From<Ipv6Addr> is implemented for IpAddr, which is a type that can
represent both v4 and v6 IP addresses.
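A sketch of the same split for a user-defined type: the hypothetical Percent newtype below implements TryFrom<u32> because the conversion can fail, and From<Percent> for u32 because the reverse direction always succeeds. Into and TryInto then come for free through the blanket impls.

```rust
use std::convert::TryFrom;

/// Hypothetical value restricted to 0..=100.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Percent(u8);

/// Error returned when the input is above 100 (hypothetical).
#[derive(Debug, PartialEq)]
pub struct OutOfRange(pub u32);

impl TryFrom<u32> for Percent {
    type Error = OutOfRange;

    fn try_from(value: u32) -> Result<Self, OutOfRange> {
        if value <= 100 {
            Ok(Percent(value as u8))
        } else {
            Err(OutOfRange(value))
        }
    }
}

impl From<Percent> for u32 {
    fn from(percent: Percent) -> u32 {
        u32::from(percent.0)
    }
}
```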
Collections implement FromIterator and Extend (C-COLLECT)
FromIterator and Extend enable collections to be used conveniently with
the following iterator methods:
Iterator::collect
Iterator::partition
Iterator::unzip
FromIterator is for creating a new collection containing items from an
iterator, and Extend is for adding items from an iterator onto an existing
collection.
Examples from the standard library
Vec<T> implements both FromIterator<T> and Extend<T>.
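For a user-defined collection, the two traits can be sketched as below. SortedBag is a hypothetical collection that keeps its items sorted; implementing Extend first and building FromIterator on top of it is one possible arrangement, not the only one.

```rust
use std::iter::FromIterator;

/// Hypothetical collection that keeps its items in ascending order.
#[derive(Debug, Default, PartialEq)]
pub struct SortedBag {
    items: Vec<i32>,
}

impl SortedBag {
    pub fn as_slice(&self) -> &[i32] {
        &self.items
    }
}

impl Extend<i32> for SortedBag {
    /// Adds items from an iterator onto an existing collection.
    fn extend<T: IntoIterator<Item = i32>>(&mut self, iter: T) {
        self.items.extend(iter);
        self.items.sort();
    }
}

impl FromIterator<i32> for SortedBag {
    /// Creates a new collection from an iterator; enables `collect()`.
    fn from_iter<T: IntoIterator<Item = i32>>(iter: T) -> Self {
        let mut bag = SortedBag::default();
        bag.extend(iter);
        bag
    }
}
```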
Data structures implement Serde's Serialize, Deserialize (C-SERDE)
Types that play the role of a data structure should implement Serialize and
Deserialize.
An example of a type that plays the role of a data structure is
linked_hash_map::LinkedHashMap.
An example of a type that does not play the role of a data structure is
byteorder::LittleEndian.
Crate has a "serde" cfg option that enables Serde (C-SERDE-CFG)
If the crate relies on serde_derive to provide Serde impls, the name of the
cfg can still be simply "serde" by using this workaround. Do not use a
different name for the cfg like "serde_impls" or "serde_serialization".
Types are Send and Sync where possible (C-SEND-SYNC)
Send and Sync are automatically implemented when the compiler determines
it is appropriate.
In types that manipulate raw pointers, be vigilant that the Send and Sync
status of your type accurately reflects its thread safety characteristics. Tests
like the following can help catch unintentional regressions in whether the type
implements Send or Sync.
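One way to write such tests is sketched below. MyStrangeType is a hypothetical placeholder for the type under test; the helper functions compile only if their type argument satisfies the bound, so a regression shows up as a build failure of the test suite.

```rust
use std::sync::Arc;

/// Hypothetical type whose thread-safety we want to pin down.
pub struct MyStrangeType {
    pub shared: Arc<Vec<u8>>,
}

// These compile only when T satisfies the respective auto trait.
fn assert_send<T: Send>() {}
fn assert_sync<T: Sync>() {}

#[test]
fn test_send() {
    assert_send::<MyStrangeType>();
}

#[test]
fn test_sync() {
    assert_sync::<MyStrangeType>();
}
```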
Error types are Send and Sync (C-SEND-SYNC-ERR)
An error that is not Send cannot be returned by a thread run with
thread::spawn. An error that is not Sync cannot be passed across threads
using an Arc. These are common requirements for basic error handling in a
multithreaded application.
Binary number types provide Hex, Octal, Binary formatting (C-NUM-FMT)
std::fmt::UpperHex
std::fmt::LowerHex
std::fmt::Octal
std::fmt::Binary
These traits control the representation of a type under the {:X}, {:x}, {:o},
and {:b} format specifiers.
Implement these traits for any number type on which you would consider doing
bitwise manipulations like | or &. This is especially appropriate for
bitflag types. Numeric quantity types like struct Nanoseconds(u64) probably do
not need these.
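For a newtype over an integer, these impls can usually delegate to the inner value, which also preserves width and "#" flags. The Permissions type below is a hypothetical bit-mask newtype used only for illustration.

```rust
use std::fmt;

/// Hypothetical bit-mask newtype.
#[derive(Clone, Copy, PartialEq, Eq)]
pub struct Permissions(pub u8);

impl fmt::LowerHex for Permissions {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Delegating keeps width, fill and `#` handling consistent with u8.
        fmt::LowerHex::fmt(&self.0, f)
    }
}

impl fmt::UpperHex for Permissions {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        fmt::UpperHex::fmt(&self.0, f)
    }
}

impl fmt::Octal for Permissions {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        fmt::Octal::fmt(&self.0, f)
    }
}

impl fmt::Binary for Permissions {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        fmt::Binary::fmt(&self.0, f)
    }
}
```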
Error types are meaningful, not () (C-MEANINGFUL-ERR)
When defining functions that return Result, and the error carries no
useful additional information, do not use () as the error type.
() does not implement std::error::Error, and this causes problems for
callers that expect to be able to convert errors to Error. Common
error handling libraries like error-chain expect errors to implement Error.
Instead, define a meaningful error type specific to your crate.
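A minimal sketch of such an error type follows. EmptyInputError and first_word are hypothetical names chosen for the example; the error carries no data, yet it still implements Display and std::error::Error so callers can box or convert it.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical unit-like error: no payload, but still a real error type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct EmptyInputError;

impl fmt::Display for EmptyInputError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "input was empty")
    }
}

impl Error for EmptyInputError {}

/// Returns the first whitespace-separated word, or an error if there is none.
pub fn first_word(input: &str) -> Result<&str, EmptyInputError> {
    input.split_whitespace().next().ok_or(EmptyInputError)
}
```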
Examples from the standard library
ParseBoolError is returned when failing to parse a bool from a string.
Macros
Input syntax is evocative of the output (C-EVOCATIVE)
Rust macros let you dream up practically whatever input syntax you want. Aim to
keep input syntax familiar and cohesive with the rest of your users' code by
mirroring existing Rust syntax where possible. Pay attention to the choice and
placement of keywords and punctuation.
A good guide is to use syntax, especially keywords and punctuation, that is
similar to what will be produced in the output of the macro.
For example if your macro declares a struct with a particular name given in the
input, preface the name with the keyword struct to signal to readers that a
struct is being declared with the given name.
Another example is semicolons vs commas. Constants in Rust are followed by
semicolons so if your macro declares a chain of constants, they should likely be
followed by semicolons even if the syntax is otherwise slightly different from
Rust's.
Macros are so diverse that these specific examples won't be relevant, but think
about how to apply the same principles to your situation.
Item macros compose well with attributes (C-MACRO-ATTR)
Macros that produce more than one output item should support adding attributes
to any one of those items. One common use case would be putting individual items
behind a cfg.
Macros that produce a struct or enum as output should support attributes so that
the output can be used with derive.
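As a sketch of a macro that accepts attributes, the hypothetical newtype_id! macro below forwards any number of attributes (including doc comments and derive) onto the struct it generates. It also accepts an optional visibility and uses the struct keyword in its input, which is an assumption about what a friendly input syntax might look like rather than a prescribed form.

```rust
/// Hypothetical item macro: declares a newtype wrapper around u64.
macro_rules! newtype_id {
    ($(#[$attr:meta])* $vis:vis struct $name:ident;) => {
        // Forward every attribute the caller supplied onto the output item.
        $(#[$attr])*
        $vis struct $name(pub u64);
    };
}

newtype_id! {
    /// Identifier for a user (hypothetical example type).
    #[derive(Debug, Clone, Copy, PartialEq)]
    pub struct UserId;
}
```

Because attributes are forwarded verbatim, callers can also gate the generated item behind a cfg attribute.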
Item macros work anywhere that items are allowed (C-ANYWHERE)
Rust allows items to be placed at the module level or within a tighter scope
like a function. Item macros should work equally well as ordinary items in all
of these places. The test suite should include invocations of the macro in at
least the module scope and function scope.
As a simple example of how things can go wrong, this macro works great in a
module scope but fails in a function scope.
Item macros support visibility specifiers (C-MACRO-VIS)
Follow Rust syntax for visibility of items produced by a macro. Private by
default, public if pub is specified.
Type fragments are flexible (C-MACRO-TY)
If your macro accepts a type fragment like $t:ty in the input, it should be
usable with all of the following:
Primitives: u8, &str
Relative paths: m::Data
Absolute paths: ::base::Data
Upward relative paths: super::Data
Generics: Vec<String>
As a simple example of how things can go wrong, this macro works great with
primitives and absolute paths but fails with relative paths.
Documentation
Crate level docs are thorough and include examples (C-CRATE-DOC)
See RFC 1687.
All items have a rustdoc example (C-EXAMPLE)
Every public module, trait, struct, enum, function, method, macro, and type
definition should have an example that exercises the functionality.
The purpose of an example is not always to show how to use the item. For
example users can be expected to know how to instantiate and match on an enum
like enum E { A, B }. Rather, an example is often intended to show why
someone would want to use the item.
This guideline should be applied within reason.
A link to an applicable example on another item may be sufficient. For example
if exactly one function uses a particular type, it may be appropriate to write a
single example on either the function or the type and link to it from the other.
Examples use ?, not try!, not unwrap (C-QUESTION-MARK)
Like it or not, example code is often copied verbatim by users. Unwrapping an
error should be a conscious decision that the user needs to make.
A common way of structuring fallible example code is the following. The lines
beginning with # are compiled by cargo test when building the example but
will not appear in user-visible rustdoc.
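One possible shape for such a doc example is sketched here on a hypothetical parse_port function. The hidden lines wrap the visible code in a main that returns a Result, so the example can use ? instead of unwrap. To keep the sketch self-contained, the hidden lines redefine the function; in a real crate they would instead import it from the crate being documented.

```rust
use std::num::ParseIntError;

/// Parses a decimal port number.
///
/// # Examples
///
/// ```
/// # use std::num::ParseIntError;
/// # // In a real crate this hidden line would be a `use` of the crate itself.
/// # fn parse_port(s: &str) -> Result<u16, ParseIntError> { s.trim().parse() }
/// # fn main() -> Result<(), ParseIntError> {
/// let port = parse_port("8080")?;
/// assert_eq!(port, 8080);
/// # Ok(())
/// # }
/// ```
pub fn parse_port(s: &str) -> Result<u16, ParseIntError> {
    s.trim().parse()
}
```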
Function docs include error conditions in "Errors" section (C-ERROR-DOC)
Per RFC 1574.
This applies to trait methods as well. Trait methods for which the
implementation is allowed or expected to return an error should be documented
with an "Errors" section.
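A doc comment following this convention might look like the sketch below. The parse_percent function is hypothetical; it documents its error condition under an "Errors" heading and, since RFC 1574 treats panics the same way, its panic condition under a "Panics" heading.

```rust
use std::num::ParseIntError;

/// Parses `text` as a percentage between 0 and 100.
///
/// # Errors
///
/// Returns a `ParseIntError` if `text` is not a valid `u8`.
///
/// # Panics
///
/// Panics if the parsed value is greater than 100.
pub fn parse_percent(text: &str) -> Result<u8, ParseIntError> {
    let value: u8 = text.trim().parse()?;
    assert!(value <= 100, "percentage out of range: {}", value);
    Ok(value)
}
```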
Examples from the standard library
Some implementations of the std::io::Read::read trait method may return an
error.
Function docs include panic conditions in "Panics" section (C-PANIC-DOC)
Per RFC 1574.
This applies to trait methods as well. Trait methods for which the
implementation is allowed or expected to panic should be documented with a
"Panics" section.
Examples from the standard library
The Vec::insert method may panic.
Prose contains hyperlinks to relevant things (C-LINK)
Links to methods within the same type usually look like this:
Links to other types usually look like this:
Links may also point to a parent or child module:
This guideline is officially recommended by RFC 1574 under the heading "Link
all the things".
Cargo.toml publishes CI badges for tier 1 platforms (C-CI)
The Rust compiler regards tier 1 platforms as "guaranteed to work."
Specifically they will each satisfy the following requirements:
tests passing.
Stable, high-profile crates should meet the same level of rigor when it comes to
tier 1. To prove it, Cargo.toml should publish CI badges.
Cargo.toml includes all common metadata (C-METADATA)
authors
description
license
homepage (though see rust-api-guidelines#26)
documentation
repository
readme
keywords
categories
Crate sets html_root_url attribute (C-HTML-ROOT)
It should point to "https://docs.rs/$crate/$version".
Cargo.toml should contain a note next to the version to remember to bump the
html_root_url when bumping the crate version.
Cargo.toml documentation key points to docs.rs (C-DOCS-RS)
It should point to "https://docs.rs/$crate".
Predictability
Smart pointers do not add inherent methods (C-SMART-PTR)
For example, this is why the Box::into_raw function is defined the way it
is.
If this were defined as an inherent method instead, it would be confusing at the
call site whether the method being called is a method on Box<T> or a method on
T.
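The same idea applied to a user-defined smart pointer can be sketched as follows. MyBox is a hypothetical wrapper; its into_inner is an associated function taking this rather than a method taking self, so a call such as MyBox::into_inner(b) can never be confused with a method on the pointee.

```rust
use std::ops::Deref;

/// Hypothetical smart pointer.
pub struct MyBox<T>(Box<T>);

impl<T> MyBox<T> {
    pub fn new(value: T) -> Self {
        MyBox(Box::new(value))
    }

    /// Associated function, not a method: call as `MyBox::into_inner(b)`.
    pub fn into_inner(this: Self) -> T {
        let boxed: Box<T> = this.0;
        *boxed
    }
}

impl<T> Deref for MyBox<T> {
    type Target = T;

    fn deref(&self) -> &T {
        &self.0
    }
}
```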
Conversions live on the most specific type involved (C-CONV-SPECIFIC)
When in doubt, prefer to_/as_/into_ to from_, because they are more
ergonomic to use (and can be chained with other methods).
For many conversions between two types, one of the types is clearly more
"specific": it provides some additional invariant or interpretation that is not
present in the other type. For example, str is more specific than &[u8],
since it is a UTF-8 encoded sequence of bytes.
Conversions should live with the more specific of the involved types. Thus,
str provides both the as_bytes method and the from_utf8 constructor
for converting to and from &[u8] values. Besides being intuitive, this
convention avoids polluting concrete types like &[u8] with endless conversion
methods.
Functions with a clear receiver are methods (C-METHOD)
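Prefer a method that takes self over a free function that takes the value as
its first argument, for any operation that is clearly associated with a
particular type.
Methods have numerous advantages over functions:
They do not need to be imported or qualified to be used: all you need is a
value of the appropriate type.
They make it easy to answer the question "what can I do with a value of type
T" (especially when using rustdoc).
They provide self notation, which is more concise and often more clearly
conveys ownership distinctions.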
Functions do not take out-parameters (C-NO-OUT)
Prefer returning a compound value such as (Bar, Bar) over writing results
through a &mut Bar out-parameter for returning multiple Bar values.
Compound return types like tuples and structs are efficiently compiled and do
not require heap allocation. If a function needs to return multiple values, it
should do so via one of these types.
The primary exception: sometimes a function is meant to modify data that the
caller already owns, for example to re-use a buffer:
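Both halves of this guideline can be sketched with two hypothetical functions: min_max returns its two results as a tuple, while fill_squares takes a &mut Vec because its purpose is to re-use a buffer the caller already owns.

```rust
/// Returns both results as a tuple instead of using out-parameters.
pub fn min_max(values: &[i32]) -> Option<(i32, i32)> {
    let first = *values.first()?;
    Some(
        values
            .iter()
            .fold((first, first), |(lo, hi), &v| (lo.min(v), hi.max(v))),
    )
}

/// Exception: the caller owns `buf` and wants its allocation re-used.
pub fn fill_squares(buf: &mut Vec<u64>, count: u64) {
    buf.clear();
    buf.extend((0..count).map(|i| i * i));
}
```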
Operator overloads are unsurprising (C-OVERLOAD)
Operators with built in syntax (*, |, and so on) can be provided for a type
by implementing the traits in std::ops. These operators come with strong
expectations: implement Mul only for an operation that bears some resemblance
to multiplication (and shares the expected properties, e.g. associativity), and
so on for the other traits.
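As an illustration of an unsurprising overload, the hypothetical Vec2 type below implements Add as component-wise addition, which keeps the properties readers expect from +.

```rust
use std::ops::Add;

/// Hypothetical 2D integer vector.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Vec2 {
    pub x: i32,
    pub y: i32,
}

impl Add for Vec2 {
    type Output = Vec2;

    /// Component-wise addition: associative and commutative, like `+` on integers.
    fn add(self, rhs: Vec2) -> Vec2 {
        Vec2 { x: self.x + rhs.x, y: self.y + rhs.y }
    }
}
```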
Only smart pointers implement Deref and DerefMut (C-DEREF)
The Deref traits are used implicitly by the compiler in many circumstances,
and interact with method resolution. The relevant rules are designed
specifically to accommodate smart pointers, and so the traits should be used
only for that purpose.
Examples from the standard library
Box<T>
String is a smart pointer to str
Rc<T>
Arc<T>
Cow<'a, T>
Deref and DerefMut never fail (C-DEREF-FAIL)
Because the Deref traits are invoked implicitly by the compiler in sometimes
subtle ways, failure during dereferencing can be extremely confusing.
Constructors are static, inherent methods (C-CTOR)
In Rust, "constructors" are just a convention:
Constructors are static (no self) inherent methods for the type that they
construct. Combined with the practice of fully importing type names, this
convention leads to informative but concise construction:
This convention also applies to conversion constructors (prefix from rather
than new).
Constructors for structs with sensible defaults allow clients to concisely
override using the struct update syntax.
Examples from the standard library
std::io::Error::new is the commonly used constructor for an IO error.
std::io::Error::from_raw_os_error is a constructor based on an error code
received from the operating system.
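The constructor conventions above can be sketched on a hypothetical Config struct. It offers a new constructor, a from_ conversion constructor, and a Default impl, so clients can override individual fields with struct update syntax. Its fields are public here only because it is plain, passive data.

```rust
/// Hypothetical passive configuration struct.
#[derive(Debug, Clone, PartialEq)]
pub struct Config {
    pub name: String,
    pub retries: u32,
    pub verbose: bool,
}

impl Default for Config {
    fn default() -> Self {
        Config { name: String::from("default"), retries: 3, verbose: false }
    }
}

impl Config {
    /// General constructor: static, inherent, named `new`.
    pub fn new(name: &str) -> Self {
        Config { name: name.to_owned(), ..Config::default() }
    }

    /// Conversion constructor: prefixed with `from`.
    pub fn from_retries(retries: u32) -> Self {
        Config { retries, ..Config::default() }
    }
}
```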
Flexibility
Functions expose intermediate results to avoid duplicate work (C-INTERMEDIATE)
Many functions that answer a question also compute interesting related data. If
this data is potentially of interest to the client, consider exposing it in the
API.
Examples from the standard library
Vec::binary_search does not return a bool of whether the value was
found, nor an Option<usize> of the index at which the value was maybe found.
Instead it returns information about the index if found, and also the index at
which the value would need to be inserted if not found.
String::from_utf8 may fail if the input bytes are not UTF-8. In the error
case it returns an intermediate result that exposes the byte offset up to
which the input was valid UTF-8, as well as handing back ownership of the
input bytes.
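The two helper functions below are hypothetical, but they rely only on the standard library behavior described above. They show how a client can put those intermediate results to work: the Err index from binary_search becomes an insertion point, and the error from String::from_utf8 yields both the valid prefix length and the original bytes.

```rust
/// Inserts `x` into a sorted vector unless it is already present.
/// Returns true if the value was inserted.
pub fn insert_sorted(v: &mut Vec<i32>, x: i32) -> bool {
    match v.binary_search(&x) {
        Ok(_) => false,
        // The Err payload is exactly the index where `x` belongs.
        Err(pos) => {
            v.insert(pos, x);
            true
        }
    }
}

/// Splits bytes into the longest valid UTF-8 prefix and the remaining bytes.
pub fn split_valid_prefix(bytes: Vec<u8>) -> (String, Vec<u8>) {
    match String::from_utf8(bytes) {
        Ok(text) => (text, Vec::new()),
        Err(err) => {
            let valid = err.utf8_error().valid_up_to();
            // Ownership of the input comes back with the error.
            let mut bytes = err.into_bytes();
            let rest = bytes.split_off(valid);
            let text = String::from_utf8(bytes).expect("prefix was already validated");
            (text, rest)
        }
    }
}
```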
Caller decides where to copy and place data (C-CALLER-CONTROL)
If a function requires ownership of an argument, it should take ownership of the
argument rather than borrowing and cloning the argument.
If a function does not require ownership of an argument, it should take a
shared or exclusive borrow of the argument rather than taking ownership and
dropping the argument.
The Copy trait should only be used as a bound when absolutely needed, not as a
way of signaling that copies should be cheap to make.
Functions minimize assumptions about parameters by using generics (C-GENERIC)
The fewer assumptions a function makes about its inputs, the more widely usable
it becomes.
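Prefer a generic parameter bounded by an iterator trait over any concrete
collection parameter (such as a slice or a reference to a Vec)
if the function only needs to iterate over the data.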
More generally, consider using generics to pinpoint the assumptions a function
needs to make about its arguments.
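A minimal sketch of this preference: the hypothetical total function below asks only for "something that can be iterated to yield i64 values", so it works with vectors, slices, ranges, and adapters alike.

```rust
/// Sums any source of i64 values; the only assumption is iterability.
pub fn total<I>(values: I) -> i64
where
    I: IntoIterator<Item = i64>,
{
    values.into_iter().sum()
}
```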
Advantages of generics
Reusability. Generic functions can be applied to an open-ended collection of
types, while giving a clear contract for the functionality those types must
provide.
Static dispatch and optimization. Each use of a generic function is
specialized ("monomorphized") to the particular types implementing the trait
bounds, which means that (1) invocations of trait methods are static, direct
calls to the implementation and (2) the compiler can inline and otherwise
optimize these calls.
Inline layout. If a struct or enum type is generic over some type
parameter T, values of type T will be laid out inline in the
struct/enum, without any indirection.
Inference. Since the type parameters to generic functions can usually be
inferred, generic functions can help cut down on verbosity in code where
explicit conversions or other method calls would usually be necessary.
Precise types. Because generics give a name to the specific type
implementing a trait, it is possible to be precise about places where that
exact type is required or produced. For example, a function with a signature
such as fn binary<T: Trait>(x: T, y: T) -> T
is guaranteed to consume and produce elements of exactly the same type T; it
cannot be invoked with parameters of different types that both implement
Trait.
Disadvantages of generics
Code size. Specializing generic functions means that the function body is
duplicated. The increase in code size must be weighed against the performance
benefits of static dispatch.
Homogeneous types. This is the other side of the "precise types" coin: if
T is a type parameter, it stands for a single actual type. So for example
a Vec<T> contains elements of a single concrete type (and, indeed, the
vector representation is specialized to lay these out in line). Sometimes
heterogeneous collections are useful; see trait objects.
Signature verbosity. Heavy use of generics can make it more difficult to
read and understand a function's signature.
Examples from the standard library
std::fs::File::open takes an argument of generic type AsRef<Path>. This
allows files to be opened conveniently from a string literal "f.txt", a
Path, an OsString, and a few other types.
Traits are object-safe if they may be useful as a trait object (C-OBJECT)
Trait objects have some significant limitations: methods invoked through a trait
object cannot use generics, and cannot use Self except in receiver position.
When designing a trait, decide early on whether the trait will be used as an
object or as a bound on generics.
If a trait is meant to be used as an object, its methods should take and return
trait objects rather than use generics.
A where clause of Self: Sized may be used to exclude specific methods from
the trait's object. The following trait is not object-safe due to the generic
method.
Adding a requirement of Self: Sized to the generic method excludes it from the
trait object and makes the trait object-safe.
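The effect of the Self: Sized bound can be sketched with a hypothetical Shape trait. Without the where clause on scaled_area, the generic method would prevent dyn Shape from being used; with it, the method is simply unavailable on trait objects while remaining usable on concrete types.

```rust
/// Hypothetical trait that is meant to be usable as a trait object.
pub trait Shape {
    fn area(&self) -> f64;

    /// Generic method: excluded from the trait object by `Self: Sized`.
    fn scaled_area<N: Into<f64>>(&self, factor: N) -> f64
    where
        Self: Sized,
    {
        self.area() * factor.into()
    }
}

pub struct Square(pub f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

/// Works on a heterogeneous list of shapes through dynamic dispatch.
pub fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|shape| shape.area()).sum()
}
```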
Advantages of trait objects
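Code size. Unlike generics, trait objects do not generate specialized
(monomorphized) versions of code, which can greatly reduce code size.
Disadvantages of trait objects
Dynamic dispatch and fat pointers. Trait objects inherently involve
indirection and vtable dispatch, which can carry a performance penalty.
No Self. Except for the method receiver argument, methods on trait objects
cannot use the Self type.
Examples from the standard library
The io::Read and io::Write traits are often used as objects.
The Iterator trait has several generic methods marked with where Self: Sized
to retain the ability to use Iterator as an object.
Type safety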
as an object.Type safety
Newtypes provide static distinctions (C-NEWTYPE)
Newtypes can statically distinguish between different interpretations of an
underlying type.
For example, a f64 value might be used to represent a quantity in miles or in
kilometers. Using newtypes, we can keep track of the intended interpretation.
Once we have separated these two types, we can statically ensure that we do not
confuse them. For example, a function that takes a Miles argument
cannot accidentally be called with a Kilometers value. The compiler will
remind us to perform the conversion, thus averting certain catastrophic bugs.
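A sketch of the miles and kilometers example follows. The conversion methods and the are_we_there_yet function with its 100-mile threshold are assumptions made for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Miles(pub f64);

#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub struct Kilometers(pub f64);

impl Miles {
    pub fn to_kilometers(self) -> Kilometers {
        Kilometers(self.0 * 1.609344)
    }
}

impl Kilometers {
    pub fn to_miles(self) -> Miles {
        Miles(self.0 / 1.609344)
    }
}

/// Accepts only Miles; passing Kilometers directly is a compile error.
/// The 100-mile threshold is an arbitrary value for this sketch.
pub fn are_we_there_yet(distance_travelled: Miles) -> bool {
    distance_travelled >= Miles(100.0)
}
```

Calling are_we_there_yet(Kilometers(120.0)) does not compile; the caller has to write Kilometers(120.0).to_miles() and thereby make the conversion explicit.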
Arguments convey meaning through types, not bool or Option (C-CUSTOM-TYPE)
Prefer a call whose arguments are custom types, such as Small and Round, over
one that passes bare values such as true and false.
Core types like bool, u8 and Option have many possible interpretations.
Use custom types (whether enums, structs, or tuples) to convey interpretation
and invariants. In the above example, it is not immediately clear what true
and false are conveying without looking up the argument names, but Small
and Round are more suggestive.
Using custom types makes it easier to expand the options later on, for example
by adding an ExtraLarge variant.
See the newtype pattern for a no-cost way to wrap existing types
with a distinguished name.
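A small sketch of the idea, assuming a hypothetical Widget type with Size and Outline enums (none of these names come from a real crate):

```rust
// Hypothetical enums that replace two `bool` parameters.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Size {
    Small,
    Medium,
    Large,
    // An `ExtraLarge` variant could be added here later without changing
    // the shape of the constructor.
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Outline {
    Round,
    Square,
}

#[derive(Debug, PartialEq)]
struct Widget {
    size: Size,
    outline: Outline,
}

impl Widget {
    // Compare with a hypothetical `new(small: bool, round: bool)`: the call
    // site `Widget::new(true, false)` says nothing about what is requested.
    fn new(size: Size, outline: Outline) -> Widget {
        Widget { size, outline }
    }
}
```

At the call site this reads as Widget::new(Size::Small, Outline::Round), which documents itself without the reader looking up parameter names.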
Types for a set of flags are bitflags, not enums (C-BITFLAG)
Rust supports enum types with explicitly specified discriminants:
Custom discriminants are useful when an enum type needs to be serialized to an
integer value compatibly with some other system/language. They support
"typesafe" APIs: by taking a Color, rather than an integer, a function is
guaranteed to get well-formed inputs, even if it later views those inputs as
integers.
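For illustration, an enum with explicit discriminants might look like the sketch below. The Color name comes from the text; the specific variants, their RGB values and the to_rgb helper are assumptions.

```rust
// Each variant carries an explicitly chosen integer discriminant.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Color {
    Red = 0xff0000,
    Green = 0x00ff00,
    Blue = 0x0000ff,
}

// The function can only receive one of the three well-formed values, even
// though it immediately views the input as an integer.
fn to_rgb(color: Color) -> u32 {
    color as u32
}
```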
An enum allows an API to request exactly one choice from among many. Sometimes
an API's input is instead the presence or absence of a set of flags. In C code,
this is often done by having each flag correspond to a particular bit, allowing
a single integer to represent, say, 32 or 64 flags. Rust's bitflags crate
provides a typesafe representation of this pattern.
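To keep the example free of external dependencies, the sketch below hand-rolls the flag-set pattern with a newtype over u32 instead of calling the bitflags crate. It is only meant to show the kind of type that crate generates for you; the Flags name, the three flag constants and the contains method are assumptions for this illustration.

```rust
use std::ops::BitOr;

// A set of flags stored in a single integer, wrapped so that arbitrary
// integers cannot be passed where flags are expected.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Flags(u32);

impl Flags {
    const FLAG_A: Flags = Flags(0b001);
    const FLAG_B: Flags = Flags(0b010);
    const FLAG_C: Flags = Flags(0b100);

    // True if every bit set in `other` is also set in `self`.
    fn contains(self, other: Flags) -> bool {
        (self.0 & other.0) == other.0
    }
}

// `|` combines two flag sets, mirroring the C idiom in a typed way.
impl BitOr for Flags {
    type Output = Flags;

    fn bitor(self, rhs: Flags) -> Flags {
        Flags(self.0 | rhs.0)
    }
}
```

In real code the bitflags crate is the better choice, since it also provides the remaining set operations and formatting.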
Builders enable construction of complex values (C-BUILDER)
Some data structures are complicated to construct,
which can easily lead to a large number of distinct constructors with many
arguments each.
If T is such a data structure, consider introducing a T builder:
Introduce a separate data type TBuilder for incrementally configuring a T
value. When possible, choose a better name: e.g. Command is the builder
for a child process, Url can be created from a ParseOptions.
The builder constructor should take as parameters only the data required to
make a T.
The builder should offer a suite of convenient methods for configuration,
including setting up compound inputs (like slices) incrementally. These
methods should return self to allow chaining.
The builder should provide one or more terminal methods for actually
building a T.
The builder pattern is especially appropriate when building a T involves side
effects, such as spawning a task or launching a process.
In Rust, there are two variants of the builder pattern, differing in the
treatment of ownership, as described below.
Non-consuming builders (preferred):
In some cases, constructing the final T does not require the builder itself to
be consumed. The following variant on std::process::Command is one example:
Note that the spawn method, which actually uses the builder configuration to
spawn a process, takes the builder by immutable reference. This is possible
because spawning the process does not require ownership of the configuration
data.
Because the terminal spawn method only needs a reference, the configuration
methods take and return a mutable borrow of self.
The benefit
By using borrows throughout, Command can be used conveniently for both
one-liner and more complex constructions:
Consuming builders:
Sometimes builders must transfer ownership when constructing the final type T,
meaning that the terminal methods must take self rather than &self.
Here, the stdout configuration involves passing ownership of an io::Write,
which must be transferred to the task upon construction (in spawn).
When the terminal methods of the builder require ownership, there is a basic
tradeoff:
If the other builder methods take/return a mutable borrow, the complex
configuration case will work well, but one-liner configuration becomes
impossible.
If the other builder methods take/return an owned self, one-liners continue
to work well but complex configuration is less convenient.
Under the rubric of making easy things easy and hard things possible, all
builder methods for a consuming builder should take and return an owned
self. Then client code works as follows:
One-liners work as before, because ownership is threaded through each of the
builder methods until being consumed by spawn. Complex configuration, however,
is more verbose: it requires re-assigning the builder at each step.
Dependability
Functions validate their arguments (C-VALIDATE)
Rust APIs do not generally follow the robustness principle: "be conservative
in what you send; be liberal in what you accept".
Instead, Rust code should enforce the validity of input whenever practical.
Enforcement can be achieved through the following mechanisms (listed in order of
preference).
Static enforcement:
Choose an argument type that rules out bad inputs.
For example, prefer a function that takes an Ascii argument over one that takes
a plain u8, where Ascii is a wrapper around u8 that guarantees the highest bit
is zero; see newtype patterns for more details on creating typesafe wrappers.
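One way the Ascii wrapper could look is sketched here. The constructor name, the as_char accessor and the to_upper function are assumptions for illustration; only the invariant (highest bit is zero) comes from the text.

```rust
// The field is private, so outside this module an `Ascii` can only be
// obtained through `Ascii::new`, which checks the invariant once.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Ascii(u8);

impl Ascii {
    // Returns `None` if the highest bit is set.
    pub fn new(byte: u8) -> Option<Ascii> {
        if (byte & 0x80) == 0 {
            Some(Ascii(byte))
        } else {
            None
        }
    }

    pub fn as_char(self) -> char {
        self.0 as char
    }
}

// No validation is needed here: the argument type already rules out
// non-ASCII bytes.
pub fn to_upper(a: Ascii) -> Ascii {
    Ascii(a.0.to_ascii_uppercase())
}
```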
Static enforcement usually comes at little run-time cost: it pushes the costs to
the boundaries (e.g. when a u8 is first converted into an Ascii). It also
catches bugs early, during compilation, rather than through run-time failures.
On the other hand, some properties are difficult or impossible to express using
types.
Dynamic enforcement:
Validate the input as it is processed (or ahead of time, if necessary). Dynamic
checking is often easier to implement than static checking, but has several
downsides:
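Runtime overhead (unless checking can be done as part of processing the
input).
Introduces failure cases, either via panic! or Result/Option types,
which must then be dealt with by client code.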
Dynamic enforcement with debug_assert!:
Same as dynamic enforcement, but with the possibility of easily turning off
expensive checks for production builds.
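As a small illustration, assuming a hypothetical contains_sorted function whose precondition is a sorted slice, the linear-time sortedness check below runs only when debug assertions are enabled:

```rust
// Binary search is only meaningful on sorted input. Verifying that costs
// O(n), so the check is compiled out when debug assertions are disabled
// (the default for release builds).
fn contains_sorted(haystack: &[i32], needle: i32) -> bool {
    debug_assert!(
        haystack.windows(2).all(|w| w[0] <= w[1]),
        "haystack must be sorted"
    );
    haystack.binary_search(&needle).is_ok()
}
```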
Dynamic enforcement with opt-out:
Same as dynamic enforcement, but adds sibling functions that opt out of the
checking.
The convention is to mark these opt-out functions with a suffix like
_unchecked or by placing them in a raw submodule.
The unchecked functions can be used judiciously in cases where (1) performance
dictates avoiding checks and (2) the client is otherwise confident that the
inputs are valid.
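The standard library follows this convention with pairs such as str::from_utf8 and str::from_utf8_unchecked. A minimal sketch of the same shape, using hypothetical nth_byte and nth_byte_unchecked functions, might be:

```rust
// Checked entry point: validates the index and reports failure via `Option`.
pub fn nth_byte(data: &[u8], index: usize) -> Option<u8> {
    if index < data.len() {
        // SAFETY: the bounds check above guarantees `index` is in range.
        Some(unsafe { nth_byte_unchecked(data, index) })
    } else {
        None
    }
}

/// Opt-out sibling that skips the bounds check.
///
/// # Safety
/// `index` must be less than `data.len()`.
pub unsafe fn nth_byte_unchecked(data: &[u8], index: usize) -> u8 {
    // SAFETY: the caller promises that `index` is in bounds.
    unsafe { *data.get_unchecked(index) }
}
```

Marking the opt-out function unsafe is a design choice for this sketch: here a violated precondition is undefined behavior, so the caller has to acknowledge the contract explicitly.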
Destructors never fail (C-DTOR-FAIL)
Destructors are executed while a thread unwinds from a panic, and in that
context a panicking destructor causes the program to abort.
Instead of failing in a destructor, provide a separate method for checking for
clean teardown, e.g. a close method, that returns a Result to signal
problems.
Destructors that may block have alternatives (C-DTOR-BLOCK)
Similarly, destructors should not invoke blocking operations, which can make
debugging much more difficult. Again, consider providing a separate method for
preparing for an infallible, nonblocking teardown.
Debuggability
All public types implement Debug (C-DEBUG)
If there are exceptions, they are rare.
Debug representation is never empty (C-DEBUG-NONEMPTY)
Even for conceptually empty values, the Debug representation should never be
empty.
Future proofing
Structs have private fields (C-STRUCT-PRIVATE)
Making a field public is a strong commitment: it pins down a representation
choice, and prevents the type from providing any validation or maintaining any
invariants on the contents of the field, since clients can mutate it arbitrarily.
Public fields are most appropriate for struct types in the C spirit: compound,
passive data structures. Otherwise, consider providing getter/setter methods and
hiding fields instead.
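As an illustration of what a private field buys, the hypothetical Percentage type below keeps its value within 0 to 100. That guarantee would be impossible to maintain if the field were public.

```rust
pub struct Percentage {
    // Private: always within 0.0..=100.0.
    value: f64,
}

impl Percentage {
    // Returns `None` for out-of-range (or NaN) input.
    pub fn new(value: f64) -> Option<Percentage> {
        if (0.0..=100.0).contains(&value) {
            Some(Percentage { value })
        } else {
            None
        }
    }

    pub fn get(&self) -> f64 {
        self.value
    }

    // Returns `false` and leaves the value unchanged if `value` is invalid.
    pub fn set(&mut self, value: f64) -> bool {
        if (0.0..=100.0).contains(&value) {
            self.value = value;
            true
        } else {
            false
        }
    }
}
```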
Newtypes encapsulate implementation details (C-NEWTYPE-HIDE)
A newtype can be used to hide representation details while making precise
promises to the client.
For example, consider a function my_transform that returns a compound iterator
type.
We wish to hide this type from the client, so that the client's view of the
return type is roughly Iterator<Item = (usize, T)>. We can do so using the
newtype pattern:
Aside from simplifying the signature, this use of newtypes allows us to promise
less to the client. The client does not know how the result iterator is
constructed or represented, which means the representation can change in the
future without breaking client code.
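A sketch of how this could look follows. The my_transform name comes from the text; the wrapper name MyTransformResult and the choice of skip(3) followed by enumerate() as the hidden implementation are assumptions made for the example.

```rust
use std::iter::{Enumerate, Skip};

// The wrapped iterator type is a private detail of the newtype.
pub struct MyTransformResult<I>(Enumerate<Skip<I>>);

// The only promise made to clients: this is an iterator of (usize, Item).
impl<I: Iterator> Iterator for MyTransformResult<I> {
    type Item = (usize, I::Item);

    fn next(&mut self) -> Option<Self::Item> {
        self.0.next()
    }
}

pub fn my_transform<I: Iterator>(input: I) -> MyTransformResult<I> {
    // The adapter chain can change later without altering the signature.
    MyTransformResult(input.skip(3).enumerate())
}
```

Without the wrapper, the signature would have to name Enumerate<Skip<I>> directly, and swapping the adapters would be a breaking change.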
The same thing can be accomplished more concisely with the impl Trait
feature, which was unstable when this was written and has been stable in
return position since Rust 1.26.
Necessities
Public dependencies of a stable crate are stable (C-STABLE)
A crate cannot be stable (>=1.0.0) without all of its public dependencies being
stable.
Public dependencies are crates from which types are used in the public API of
the current crate.
A crate containing this function cannot be stable unless other_crate is also
stable.
Be careful because public dependencies can sneak in at unexpected places.
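The sketch below models both cases in a single file. The inline other_crate module is a stand-in for a real external dependency, and do_my_thing, TheirThing, Error and ErrorKind are hypothetical names.

```rust
// Stand-in for an external crate; in real code this would be a separate
// dependency rather than an inline module.
pub mod other_crate {
    pub struct TheirThing(pub u32);

    #[derive(Debug)]
    pub struct Error;
}

// Obvious case: a type from the dependency appears in a public signature.
pub fn do_my_thing(arg: other_crate::TheirThing) -> u32 {
    arg.0
}

// Less obvious case: the dependency's error type is stored privately...
#[derive(Debug)]
pub struct Error {
    kind: ErrorKind,
}

#[derive(Debug)]
enum ErrorKind {
    Dep(other_crate::Error),
}

// ...but this public trait impl still names `other_crate::Error`, which
// makes other_crate part of this crate's public API.
impl From<other_crate::Error> for Error {
    fn from(e: other_crate::Error) -> Self {
        Error { kind: ErrorKind::Dep(e) }
    }
}
```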
Crate and its dependencies have a permissive license (C-PERMISSIVE)
The software produced by the Rust project is dual-licensed, under
either the MIT or Apache 2.0 licenses. Crates that simply need the
maximum compatibility with the Rust ecosystem are recommended to do
the same, in the manner described herein. Other options are described
below.
These API guidelines do not provide a detailed explanation of Rust's
license, but there is a small amount said in the Rust FAQ. These
guidelines are concerned with matters of interoperability with Rust,
and are not comprehensive over licensing options.
To apply the Rust license to your project, define the license field
in your Cargo.toml as:
And toward the end of your README.md:
Besides the dual MIT/Apache-2.0 license, another common licensing approach
used by Rust crate authors is to apply a single permissive license such as
MIT or BSD. This license scheme is also entirely compatible with Rust's,
because it imposes the minimal restrictions of Rust's MIT license.
Crates that desire perfect license compatibility with Rust are not
recommended to choose only the Apache license. The Apache license,
though it is a permissive license, imposes restrictions beyond the MIT
and BSD licenses that can discourage or prevent their use in some
scenarios, so Apache-only software cannot be used in some situations
where most of the Rust runtime stack can.
The license of a crate's dependencies can affect the restrictions on
distribution of the crate itself, so a permissively-licensed crate
should generally only depend on permissively-licensed crates.
External Links
License
This guidelines document is licensed under either of
Apache License, Version 2.0 (http://www.apache.org/licenses/LICENSE-2.0)
MIT license (http://opensource.org/licenses/MIT)
at your option.
Contribution
Unless you explicitly state otherwise, any contribution intentionally submitted
for inclusion in this document by you, as defined in the Apache-2.0 license,
shall be dual licensed as above, without any additional terms or conditions.