reproducer: fails to query trait imported from a dependency #333

Open
wants to merge 1 commit into
base: rustdoc-v29

@oknozor oknozor commented May 14, 2024

Hello,

I am trying to use Trustfall with rustdoc to get all implementors of a given trait.
Unfortunately, only traits from the crate itself and traits from std show up in the query results.

Instead of an issue I opened this as a pull request containing a reproducer, hoping that you might guide me through a fix.
This might just be me missing something really obvious in the query though.

Let me know!

@oknozor force-pushed the rustdoc-v29 branch 2 times, most recently from 43fc672 to 23c5d6d (May 14, 2024 13:27)
Author

oknozor commented May 14, 2024

Okay, I dug a bit further into the adapter implementation, and this problem is well documented there.

Would it be ok to make MANUAL_TRAIT_ITEMS mutable using once_cell, and to provide a public API to push arbitrary trait items?
This could be behind a feature flag.

What do you think?

Owner

@obi1kenobi obi1kenobi left a comment

Would it be ok to make MANUAL_TRAIT_ITEMS mutable using once_cell, and to provide a public API to push arbitrary trait items?

Unfortunately this wouldn't help. The external trait item is simply not present in the rustdoc the adapter sees, so there's nothing to push to MANUAL_TRAIT_ITEMS. We don't know its ID and we can't determine what types implement it.

The ultimate underlying issue here is described in this Rust T-compiler MCP: rust-lang/compiler-team#635

The TL;DR is that it's currently impossible to link the data from different rustdoc JSON files together into one cohesive whole. There is key information that rustc has at compile time that doesn't ever make it to rustdoc, and as a result we cannot reliably combine rustdoc JSON files to get the full picture back.

Until that underlying issue gets resolved, anything done in trustfall-rustdoc-adapter is at best a hacky and unreliable workaround, and one I would likely prefer to not merge since it would massively increase the maintenance and support cost of this free open-source library.

That said, I'd love to hear more about your use case! Perhaps there's a narrower solution than "reliably merge rustdoc JSON files together" that we can come up with to address it?

Comment on lines +104 to +111
let path = "./localdata/test_data/impl_for_ref/rustdoc.json";
let content = std::fs::read_to_string(path)
    .with_context(|| format!("Could not load {path} file, did you forget to run ./scripts/regenerate_test_rustdocs.sh ?"))
    .expect("failed to load rustdoc");

let crate_ = serde_json::from_str(&content).expect("failed to parse rustdoc");
let indexed_crate = IndexedCrate::new(&crate_);
let adapter = RustdocAdapter::new(&indexed_crate, None);
Owner

Note how only rustdoc from impl_for_ref is included in the adapter. No dummy_ext rustdoc is loaded, so it's as if none of its types exist from the perspective of the RustdocAdapter.

Author

I don't know the internals, but that's not entirely true.

I have two references to the external trait in the rustdoc JSON generated for impl_for_ref:

{
  "0:8": {
    "id": "0:8",
    "crate_id": 0,
    "name": null,
    "span": {
      "filename": "src/lib.rs",
      "begin": [
        9,
        0
      ],
      "end": [
        11,
        1
      ]
    },
    "visibility": "default",
    "docs": null,
    "links": {},
    "attrs": [],
    "deprecation": null,
    "inner": {
      "impl": {
        "is_unsafe": false,
        "generics": {
          "params": [],
          "where_predicates": []
        },
        "provided_trait_methods": [],
        "trait": {
          "name": "DummyExternalTrait",
          "id": "20:3:2305",
          "args": {
            "angle_bracketed": {
              "args": [],
              "bindings": []
            }
          }
        },
        "for": {
          "resolved_path": {
            "name": "StringHolder",
            "id": "0:5:2306",
            "args": {
              "angle_bracketed": {
                "args": [],
                "bindings": []
              }
            }
          }
        },
        "items": [
          "0:9:2308"
        ],
        "negative": false,
        "synthetic": false,
        "blanket_impl": null
      }
    }
  }
}

{
  "20:3:2305": {
    "crate_id": 20,
    "path": [
      "dummy_ext",
      "DummyExternalTrait"
    ],
    "kind": "trait"
  }
}

Author

Unfortunately this wouldn't help.

I am puzzled. I turned the whole thing into a mutable static and added this to the test, and now it passes.

    crate::indexed_crate::add_manual_trait_item(ManualTraitItem {
        name: "DummyExternalTrait",
        is_auto: false,
        is_unsafe: false,
    });
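
For readers following along, here is a minimal sketch of what such a mutable registry could look like. It is not the adapter's actual code: it uses `std::sync::OnceLock` in place of the `once_cell` crate so that it compiles without dependencies, the `ManualTraitItem` fields are copied from the call above, and `registry()` and `manual_trait_item()` are hypothetical helpers added for illustration. Note that only the trait's name and two flags are recorded; nothing else about the trait becomes known.

```rust
use std::sync::{Mutex, OnceLock};

/// Assumed shape of a manual trait entry; field names follow the call above.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ManualTraitItem {
    pub name: &'static str,
    pub is_auto: bool,
    pub is_unsafe: bool,
}

// Lazily initialized, lock-protected list of manually registered traits.
static MANUAL_TRAIT_ITEMS: OnceLock<Mutex<Vec<ManualTraitItem>>> = OnceLock::new();

// Hypothetical helper: returns the registry, creating it on first use.
fn registry() -> &'static Mutex<Vec<ManualTraitItem>> {
    MANUAL_TRAIT_ITEMS.get_or_init(|| Mutex::new(Vec::new()))
}

/// Registers a trait by name; a second registration with the same name is ignored.
pub fn add_manual_trait_item(item: ManualTraitItem) {
    let mut items = registry().lock().expect("registry lock poisoned");
    if !items.iter().any(|existing| existing.name == item.name) {
        items.push(item);
    }
}

/// Hypothetical lookup helper: finds a registered trait by name.
pub fn manual_trait_item(name: &str) -> Option<ManualTraitItem> {
    let items = registry().lock().expect("registry lock poisoned");
    items.iter().find(|item| item.name == name).cloned()
}
```

Because the registry is process-wide state, every adapter instance created afterwards would see the same entries, which is one reason to keep such an API behind a feature flag.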

Owner

What I mean is that the actual definition of item 20:3:2305 AKA DummyExternalTrait is not present in the file: we don't know its attributes, its doc comment, its items, etc. We just know it as a foreign item from an unknown version of a crate called dummy_ext.

With the approach you propose, queries like the following (playground link) will produce unexpected results:

query {
  Crate {
    item {
      ... on ImplOwner {
        name @output
        impl {
          implemented_trait {
            name @output(name: "impls")

            impls_: trait {
              docs @output  # <-- no data here
              doc_hidden @output  # <-- no data here either

              importable_path {  # <-- no known instances of this edge
                path @output
              }
            }
          }
        }
      }
    }
  }
}

It will claim that the implemented DummyExternalTrait trait is not actually importable from other crates (obviously false — it's implemented by a type in another crate!), because the data to determine its import paths was not present.

That sounds like a pretty serious footgun to me — one made even more unpredictable and dangerous if we add add_manual_trait_item() to the crate's public API. This is why I'd love to hear more about your use case — there may be a less footgun-y way to accomplish what you're after.

Author

This is why I'd love to hear more about your use case

I am trying to generate automated documentation for the Zed editor's actions (these are structs implementing the Action trait, spread across several crates in a Cargo workspace).
I could probably parse everything with syn, but I was curious about Trustfall.

Owner

In that use case, is it guaranteed that you'll have at most one major version of each crate you are querying in the workspace? (This isn't true in general; for example, it isn't true when semver-checking cargo-semver-checks.)

If that's the case, you should be able to accomplish the goal with Trustfall in a better (but slightly more work) way: you could generate the rustdoc for all the crates, and use the external item description

{
  "20:3:2305": {
    "crate_id": 20,
    "path": [
      "dummy_ext",
      "DummyExternalTrait"
    ],
    "kind": "trait"
  }
}

to then look up the item in the other crate's rustdoc using the import path index: crate dummy_ext, import path dummy_ext::DummyExternalTrait.
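
A rough sketch of that lookup, under stated assumptions: `ExternalItemSummary`, `CrateIndex`, and `resolve_external` are hypothetical names invented for illustration and are not part of trustfall-rustdoc-adapter. The sketch assumes that the first path segment is the crate name and that the workspace contains at most one version of each crate, as discussed above.

```rust
use std::collections::HashMap;

/// Foreign item description as found in one crate's rustdoc JSON
/// (hypothetical type mirroring the "path" and "kind" fields shown above).
#[derive(Debug, Clone)]
pub struct ExternalItemSummary {
    pub path: Vec<String>, // e.g. ["dummy_ext", "DummyExternalTrait"]
    pub kind: String,      // e.g. "trait"
}

/// Hypothetical per-crate index: importable path -> item ID in that crate's own rustdoc.
#[derive(Debug, Default)]
pub struct CrateIndex {
    pub items_by_import_path: HashMap<Vec<String>, String>,
}

/// Resolves a foreign item to its ID in the rustdoc of the crate that defines it.
/// Returns None if that crate's rustdoc was not loaded or the path is not importable.
pub fn resolve_external<'a>(
    crates: &'a HashMap<String, CrateIndex>,
    summary: &ExternalItemSummary,
) -> Option<&'a str> {
    // Assumption: the first path segment names the defining crate.
    let crate_name = summary.path.first()?;
    let index = crates.get(crate_name)?;
    index
        .items_by_import_path
        .get(&summary.path)
        .map(String::as_str)
}
```

Item IDs are only meaningful within a single rustdoc JSON file, so the ID returned here belongs to the defining crate's file and will generally differ from the ID the referencing crate used for the same item (20:3:2305 in the example above).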

Such a prototype would lay the groundwork for trustfall-rustdoc-adapter supporting cross-crate querying once the compiler MCP is approved and the rustdoc linking limitation is lifted. I'd love to see it and if you're interested in building it, I'd love to support you in it.

But I totally understand if that sounds like too much work, and you'd prefer to just fork the crate and make the add_manual_trait_item() function public in your build.

Author

I will give it a try, this sounds like a lot of fun! I'll let you know if I need your help later.

Thanks a lot for the help so far!

Owner

Awesome! Always happy to help :)

Owner

Hi! I wanted to check in and see how you're doing on this. Are you still working on the prototype, or did something else take priority? Anything I can do to help you along?

Owner

Hi @oknozor! Are you still looking into using Trustfall for your use case, or have you switched to a different approach?

There's a Rust project goal to start clearing out the blockers for querying across crate boundaries, but it will require many more months of work before we're able to resolve the remaining challenges across rustc + rustdoc + this repo.
