Skip to content

Commit

Permalink
Auto merge of rust-lang#126963 - runtimeverification:smir_serde_deriv…
Browse files Browse the repository at this point in the history
…e, r=celinval

Add basic Serde serialization capabilities to Stable MIR

This PR adds basic Serde serialization capabilities to Stable MIR. It is intentionally minimal (just wrapping all stable MIR types with a Serde `derive`), so that any important design decisions can be discussed before going further. A simple test is included with this PR to validate that JSON can actually be emitted.

## Notes

When I wrapped the Stable MIR error types in `compiler/stable_mir/src/error.rs`, it caused test failures (though I'm not sure why) so I backed those out.

## Future Work

So, this PR will support serializing basic stable MIR, but it _does not_ support serializing interned values beneath `Ty`s and `AllocId`s, etc... My current thinking about how to handle this is as follows:

1.  Add new `visited_X` fields to the `Tables` struct for each interned category of interest.

2.  As serialization is occuring, serialize interned values as usual _and_ also record the interned value we referenced in `visited_X`.

    (Possibly) In addition, if an interned value recursively references other interned values, record those interned values as well.

3.  Teach the stable MIR `Context` how to access the `visited_X` values and expose them with wrappers in `stable_mir/src/lib.rs` to users (e.g. to serialize and/or further analyze them).

### Pros

This approach does not commit to any specific serialization format regarding interned values or other more complex cases, which avoids us locking into any behaviors that may not be desired long-term.

### Cons

The user will need to manually handle serializing interned values.

### Alternatives

1.  We can directly provide access to the underlying `Tables` maps for interned values; the disadvantage of this approach is that it either requires extra processing for users to filter out to only use the values that they need _or_ users may serialize extra values that they don't need. The advantage is that the implementation is even simpler. The other pros/cons are similar to the above.

2.  We can directly serialize interned values by expanding them in-place. The pro is that this may make some basic inputs easier to consume. However, the cons are that there will need to be special provisions for dealing with cyclical values on both the producer and consumer _and_ global values will possibly need to be de-duplicated on the consumer side.
  • Loading branch information
bors committed Jul 25, 2024
2 parents aa877bc + 414ebea commit 7120fda
Show file tree
Hide file tree
Showing 11 changed files with 242 additions and 139 deletions.
1 change: 1 addition & 0 deletions Cargo.lock
Original file line number Diff line number Diff line change
Expand Up @@ -5205,6 +5205,7 @@ name = "stable_mir"
version = "0.1.0-preview"
dependencies = [
"scoped-tls",
"serde",
]

[[package]]
Expand Down
1 change: 1 addition & 0 deletions compiler/stable_mir/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -5,3 +5,4 @@ edition = "2021"

[dependencies]
scoped-tls = "1.0"
serde = { version = "1.0.125", features = [ "derive" ] }
35 changes: 18 additions & 17 deletions compiler/stable_mir/src/abi.rs
Original file line number Diff line number Diff line change
Expand Up @@ -5,12 +5,13 @@ use crate::target::{MachineInfo, MachineSize as Size};
use crate::ty::{Align, IndexedVal, Ty, VariantIdx};
use crate::Error;
use crate::Opaque;
use serde::Serialize;
use std::fmt::{self, Debug};
use std::num::NonZero;
use std::ops::RangeInclusive;

/// A function ABI definition.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
#[derive(Clone, Debug, PartialEq, Eq, Hash, Serialize)]
pub struct FnAbi {
/// The types of each argument.
pub args: Vec<ArgAbi>,
Expand All @@ -31,15 +32,15 @@ pub struct FnAbi {
}

/// Information about the ABI of a function's argument, or return value.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
#[derive(Clone, Debug, PartialEq, Eq, Hash, Serialize)]
pub struct ArgAbi {
pub ty: Ty,
pub layout: Layout,
pub mode: PassMode,
}

/// How a function argument should be passed in to the target function.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
#[derive(Clone, Debug, PartialEq, Eq, Hash, Serialize)]
pub enum PassMode {
/// Ignore the argument.
///
Expand All @@ -60,14 +61,14 @@ pub enum PassMode {
}

/// The layout of a type, alongside the type itself.
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash, Serialize)]
pub struct TyAndLayout {
pub ty: Ty,
pub layout: Layout,
}

/// The layout of a type in memory.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
#[derive(Clone, Debug, PartialEq, Eq, Hash, Serialize)]
pub struct LayoutShape {
/// The fields location withing the layout
pub fields: FieldsShape,
Expand Down Expand Up @@ -108,7 +109,7 @@ impl LayoutShape {
}
}

#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash, Serialize)]
pub struct Layout(usize);

impl Layout {
Expand All @@ -127,7 +128,7 @@ impl IndexedVal for Layout {
}

/// Describes how the fields of a type are shaped in memory.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
#[derive(Clone, Debug, PartialEq, Eq, Hash, Serialize)]
pub enum FieldsShape {
/// Scalar primitives and `!`, which never have fields.
Primitive,
Expand Down Expand Up @@ -177,7 +178,7 @@ impl FieldsShape {
}
}

#[derive(Clone, Debug, PartialEq, Eq, Hash)]
#[derive(Clone, Debug, PartialEq, Eq, Hash, Serialize)]
pub enum VariantsShape {
/// Single enum variants, structs/tuples, unions, and all non-ADTs.
Single { index: VariantIdx },
Expand All @@ -196,7 +197,7 @@ pub enum VariantsShape {
},
}

#[derive(Clone, Debug, PartialEq, Eq, Hash)]
#[derive(Clone, Debug, PartialEq, Eq, Hash, Serialize)]
pub enum TagEncoding {
/// The tag directly stores the discriminant, but possibly with a smaller layout
/// (so converting the tag to the discriminant can require sign extension).
Expand All @@ -221,7 +222,7 @@ pub enum TagEncoding {

/// Describes how values of the type are passed by target ABIs,
/// in terms of categories of C types there are ABI rules for.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
#[derive(Clone, Debug, PartialEq, Eq, Hash, Serialize)]
pub enum ValueAbi {
Uninhabited,
Scalar(Scalar),
Expand Down Expand Up @@ -250,7 +251,7 @@ impl ValueAbi {
}

/// Information about one scalar component of a Rust type.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug, Serialize)]
pub enum Scalar {
Initialized {
/// The primitive type used to represent this value.
Expand Down Expand Up @@ -280,7 +281,7 @@ impl Scalar {
}

/// Fundamental unit of memory access and layout.
#[derive(Copy, Clone, PartialEq, Eq, Hash, Debug)]
#[derive(Copy, Clone, PartialEq, Eq, Hash, Debug, Serialize)]
pub enum Primitive {
/// The `bool` is the signedness of the `Integer` type.
///
Expand Down Expand Up @@ -310,7 +311,7 @@ impl Primitive {
}

/// Enum representing the existing integer lengths.
#[derive(Copy, Clone, PartialEq, Eq, PartialOrd, Ord, Hash, Debug)]
#[derive(Copy, Clone, PartialEq, Eq, PartialOrd, Ord, Hash, Debug, Serialize)]
pub enum IntegerLength {
I8,
I16,
Expand All @@ -320,7 +321,7 @@ pub enum IntegerLength {
}

/// Enum representing the existing float lengths.
#[derive(Copy, Clone, PartialEq, Eq, PartialOrd, Ord, Hash, Debug)]
#[derive(Copy, Clone, PartialEq, Eq, PartialOrd, Ord, Hash, Debug, Serialize)]
pub enum FloatLength {
F16,
F32,
Expand Down Expand Up @@ -354,7 +355,7 @@ impl FloatLength {
/// An identifier that specifies the address space that some operation
/// should operate on. Special address spaces have an effect on code generation,
/// depending on the target and the address spaces it implements.
#[derive(Copy, Clone, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]
#[derive(Copy, Clone, Debug, PartialEq, Eq, PartialOrd, Ord, Hash, Serialize)]
pub struct AddressSpace(pub u32);

impl AddressSpace {
Expand All @@ -369,7 +370,7 @@ impl AddressSpace {
/// sequence:
///
/// 254 (-2), 255 (-1), 0, 1, 2
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
#[derive(Clone, Copy, PartialEq, Eq, Hash, Serialize)]
pub struct WrappingRange {
pub start: u128,
pub end: u128,
Expand Down Expand Up @@ -420,7 +421,7 @@ impl Debug for WrappingRange {
}

/// General language calling conventions.
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash, Serialize)]
pub enum CallConvention {
C,
Rust,
Expand Down
3 changes: 2 additions & 1 deletion compiler/stable_mir/src/crate_def.rs
Original file line number Diff line number Diff line change
Expand Up @@ -3,9 +3,10 @@
use crate::ty::{GenericArgs, Span, Ty};
use crate::{with, Crate, Symbol};
use serde::Serialize;

/// A unique identification number for each item accessible for the current compilation unit.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
#[derive(Clone, Copy, PartialEq, Eq, Hash, Serialize)]
pub struct DefId(pub(crate) usize);

/// A trait for retrieving information about a particular definition.
Expand Down
10 changes: 6 additions & 4 deletions compiler/stable_mir/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,7 @@ pub use crate::error::*;
use crate::mir::Body;
use crate::mir::Mutability;
use crate::ty::{ForeignModuleDef, ImplDef, IndexedVal, Span, TraitDef, Ty};
use serde::Serialize;

pub mod abi;
#[macro_use]
Expand Down Expand Up @@ -74,7 +75,7 @@ pub type TraitDecls = Vec<TraitDef>;
pub type ImplTraitDecls = Vec<ImplDef>;

/// Holds information about a crate.
#[derive(Clone, PartialEq, Eq, Debug)]
#[derive(Clone, PartialEq, Eq, Debug, Serialize)]
pub struct Crate {
pub id: CrateNum,
pub name: Symbol,
Expand All @@ -98,15 +99,15 @@ impl Crate {
}
}

#[derive(Copy, Clone, PartialEq, Eq, Debug, Hash)]
#[derive(Copy, Clone, PartialEq, Eq, Debug, Hash, Serialize)]
pub enum ItemKind {
Fn,
Static,
Const,
Ctor(CtorKind),
}

#[derive(Copy, Clone, PartialEq, Eq, Debug, Hash)]
#[derive(Copy, Clone, PartialEq, Eq, Debug, Hash, Serialize)]
pub enum CtorKind {
Const,
Fn,
Expand All @@ -116,6 +117,7 @@ pub type Filename = String;

crate_def_with_ty! {
/// Holds information about an item in a crate.
#[derive(Serialize)]
pub CrateItem;
}

Expand Down Expand Up @@ -188,7 +190,7 @@ pub fn all_trait_impls() -> ImplTraitDecls {
}

/// A type that provides internal information but that can still be used for debug purpose.
#[derive(Clone, PartialEq, Eq, Hash)]
#[derive(Clone, PartialEq, Eq, Hash, Serialize)]
pub struct Opaque(String);

impl std::fmt::Display for Opaque {
Expand Down
5 changes: 3 additions & 2 deletions compiler/stable_mir/src/mir/alloc.rs
Original file line number Diff line number Diff line change
Expand Up @@ -4,11 +4,12 @@ use crate::mir::mono::{Instance, StaticDef};
use crate::target::{Endian, MachineInfo};
use crate::ty::{Allocation, Binder, ExistentialTraitRef, IndexedVal, Ty};
use crate::{with, Error};
use serde::Serialize;
use std::io::Read;

/// An allocation in the SMIR global memory can be either a function pointer,
/// a static, or a "real" allocation with some data in it.
#[derive(Debug, Clone, Eq, PartialEq)]
#[derive(Debug, Clone, Eq, PartialEq, Serialize)]
pub enum GlobalAlloc {
/// The alloc ID is used as a function pointer.
Function(Instance),
Expand Down Expand Up @@ -41,7 +42,7 @@ impl GlobalAlloc {
}

/// A unique identification number for each provenance
#[derive(Clone, Copy, PartialEq, Eq, Debug, Hash)]
#[derive(Clone, Copy, PartialEq, Eq, Debug, Hash, Serialize)]
pub struct AllocId(usize);

impl IndexedVal for AllocId {
Expand Down
Loading

0 comments on commit 7120fda

Please sign in to comment.