-
Semantic Façade
This is what Roslyn does:
var tree = CSharpSyntaxTree.ParseText(@"
public class MyClass
{
int MyMethod() { return 0; }
}");
var compilation = CSharpCompilation.Create("MyCompilation", ...);
var model = compilation.GetSemanticModel(tree);
Where the
So to implement the "go to definition" that we implemented in the LSP, we would do
In the code above we used: https://docs.microsoft.com/en-us/dotnet/api/microsoft.codeanalysis.modelextensions.getsymbolinfo
let result = parse(...);
let model: SemanticModel = result.root().semantic_model(...);
let node = get_syntaxnode_at(result.root(), line_number, column_number);
let semantic_info = model.get(&node);
dbg!(semantic_info.declared_at());
Roslyn also uses the semantic model as the façade for control/data flow. We can find unused variables doing this:
// Get a list of statements to run the dataflow analysis
var someMethod = tree.GetRoot().DescendantNodes().OfType<MethodDeclarationSyntax>().First();
var start = someMethod.Body.Statements.First();
var end = someMethod.Body.Statements.Last();
// Run the data flow
var df = model.AnalyzeDataFlow(start, end);
// Find which variables are not used
var unused = new System.Collections.Generic.HashSet<string>();
Console.WriteLine("Variables Declared");
Console.WriteLine("-----------");
foreach (var symbol in df.VariablesDeclared)
{
    unused.Add(symbol.Name);
    Console.WriteLine(symbol.Name);
}
Console.WriteLine();
Console.WriteLine("Read Inside");
Console.WriteLine("-----------");
foreach (var symbol in df.ReadInside)
{
    unused.Remove(symbol.Name);
    Console.WriteLine(symbol.Name);
}
Console.WriteLine();
Console.WriteLine("Unused Variables");
Console.WriteLine("-----------");
foreach (var symbol in unused)
{
    Console.WriteLine(symbol);
}
In our case, this could be
let result = parse(...);
let decl = get_function_syntax_node(result.root()).cast::<JsFunctionDeclaration>();
// `next()` and `last()` advance the iterator, so it must be mutable
let mut statements = decl.body().statements().into_iter();
let first = statements.next();
let last = statements.last();
let model: SemanticModel = result.root().semantic_model();
let result = model.dataflow(first, last);
// same strategy here
Of course we can split this into multiple façades if needed:
let analyzer: DataFlowAnalyzer = result.root().data_flow_analyzer(...);
let df = analyzer.run(first, last);
One important detail is that in both examples above I used
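The façade-based unused-variable check from this comment can be sketched end to end with toy types. Everything here (`Stmt`, `DataFlow`, the index-based `dataflow` signature) is a hypothetical stand-in for the real syntax nodes and semantic model, chosen only to show the "declared minus read" strategy from the Roslyn example.

```rust
use std::collections::BTreeSet;

/// Toy statement: a hypothetical stand-in for real syntax nodes.
#[derive(Debug, Clone)]
enum Stmt {
    /// `let name = ...;`
    Declare(&'static str),
    /// Any statement that reads `name`.
    Read(&'static str),
}

/// Result of a data flow run over a range of statements.
#[derive(Debug, Default)]
struct DataFlow {
    declared: BTreeSet<&'static str>,
    read_inside: BTreeSet<&'static str>,
}

/// Hypothetical façade: every semantic query goes through this type.
struct SemanticModel<'a> {
    statements: &'a [Stmt],
}

impl<'a> SemanticModel<'a> {
    fn new(statements: &'a [Stmt]) -> Self {
        Self { statements }
    }

    /// Analyzes the statements in the inclusive index range `first..=last`.
    fn dataflow(&self, first: usize, last: usize) -> DataFlow {
        let mut df = DataFlow::default();
        for stmt in &self.statements[first..=last] {
            match stmt {
                Stmt::Declare(name) => {
                    df.declared.insert(*name);
                }
                Stmt::Read(name) => {
                    df.read_inside.insert(*name);
                }
            }
        }
        df
    }
}

/// Same strategy as the C# example: declared variables minus the ones read.
fn unused_variables(df: &DataFlow) -> Vec<&'static str> {
    df.declared.difference(&df.read_inside).copied().collect()
}

fn main() {
    let body = [Stmt::Declare("a"), Stmt::Declare("b"), Stmt::Read("a")];
    let model = SemanticModel::new(&body);
    let df = model.dataflow(0, body.len() - 1);
    println!("{:?}", unused_variables(&df));
}
```

The caller only ever talks to the model; the syntax side stays free of semantic state, which is the main appeal of the façade.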
-
Semantic Tree
The semantic tree is probably the most obvious step because we already
let result = parse(...);
let node = get_syntaxnode_at(result.root(), line_number, column_number);
let semantic_info = node.semantic(...);
dbg!(semantic_info.declared_at());
and the unused case would be:
let result = parse(...);
let decl = get_function_syntax_node(result.root()).cast::<JsFunctionDeclaration>();
// `next()` and `last()` advance the iterator, so it must be mutable
let mut statements = decl.body().statements().into_iter();
let first = statements.next();
let last = statements.last();
let start = first.semantic(...);
let df = start.dataflow_until(last);
Nothing stops us from having both; they would only be different ways to access the semantic data. The "catch" here is that we also need to pass "something" into these functions. This happens because there is no easy way to make
Today this is how our tree is stored in memory:
This means that
So we have something like this:
The "green world" is cached; and thus, if we insert a "service pointer" there, we would kill the ability to cache nodes from different workspaces. In the red world, we have two options:
SyntaxNode is a very simple struct that is cloneable. This means that we can have many more instances of
#[derive(Clone)]
pub(crate) struct SyntaxNode {
    pub(super) ptr: Rc<NodeData>,
}
#[derive(Debug)]
struct NodeData {
    _c: Count<_SyntaxElement>,
    kind: NodeKind,
    slot: u32,
    offset: TextSize,
}
Including an
Challenges for this solution would be:
2 - How to type this data. We can make
We could type erase this context storing it as
This gives us a "cheap clone" and no mutation:
or the object allows mutation through
use std::sync::*;
use std::any::*;
#[derive(Debug)]
struct ServiceContext {
}
impl ServiceContext {
    pub fn new() -> Self {
        Self {}
    }
    pub fn to_arc(self) -> Arc<dyn Any> {
        Arc::new(self)
    }
}
struct NodeData {
    tag: Arc<dyn Any>,
}
impl NodeData {
    pub fn tag<T: 'static>(&self) -> Option<&T> {
        self.tag.downcast_ref()
    }
}
fn main() {
    let ctx = ServiceContext::new().to_arc();
    let node = NodeData {
        tag: ctx,
    };
    println!("{:?}", node.tag::<ServiceContext>());
}
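To make the "cheap clone and no mutation" point concrete, here is a small sketch built on the same type-erased tag idea. The `workspace` field and the `demo` function are hypothetical; the sketch only checks that cloning a node shares the context instead of copying it, and that asking for the wrong type yields `None` rather than panicking.

```rust
use std::any::Any;
use std::sync::Arc;

#[derive(Debug, PartialEq)]
struct ServiceContext {
    // Hypothetical field, only here so the context carries some data.
    workspace: &'static str,
}

#[derive(Clone)]
struct NodeData {
    // Type-erased context; cloning the node only bumps a reference count.
    tag: Arc<dyn Any>,
}

impl NodeData {
    fn tag<T: 'static>(&self) -> Option<&T> {
        self.tag.downcast_ref::<T>()
    }
}

/// Returns (clone shares the context, right type found, wrong type rejected).
fn demo() -> (bool, bool, bool) {
    let node = NodeData {
        tag: Arc::new(ServiceContext { workspace: "a" }),
    };
    let copy = node.clone();
    let shared = Arc::ptr_eq(&node.tag, &copy.tag);
    let right = copy.tag::<ServiceContext>().is_some();
    let wrong = copy.tag::<String>().is_none();
    (shared, right, wrong)
}

fn main() {
    println!("{:?}", demo());
}
```

The price of type erasure is that a mismatch only shows up at runtime as `None`, so callers need a convention for which context type a tree carries.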
-
How would things like |
-
If we use the Roslyn definition for symbols:
In the "Semantic Façade" we would do something similar to Roslyn.
pub enum JsSymbol {
    Module(...),
    Class(...),
    Function(...),
    Variable(JsVariableSymbol),
    ...
}
We can do something very similar with scope. For example, how Roslyn does this:
pub enum JsScope {
    Function { ... },
    If { ... },
    Try { ... },
    ...
}
And we could use this like:
let var_decl_node = ...;
let symbol: JsSymbol = model.symbol(var_decl_node);
// match on a reference so `symbol` is still usable below
if let JsSymbol::Variable(var_decl) = &symbol {
    if var_decl.is_constant() {
        // do whatever you need here
    }
}
let parent_try = symbol.scope().ancestors().find(|x| x.is_try())?;
dbg!(parent_try); // do whatever you need here
In the "Semantic Tree", instead of an enum, we would have:
pub struct SemanticJsModule { ... }
pub struct SemanticJsClassDeclaration { ... }
pub struct SemanticJsFunctionDeclaration { ... }
pub struct SemanticJsVariableDeclaration { ... }
Scope would be similar, and the usage would be:
let var_decl_node = ...;
let var_decl: SemanticJsVariableDeclaration = var_decl_node.cast::<JsVariableDeclaration>()?.semantic(model);
if var_decl.is_constant() {
    // do whatever you need here
}
let parent_try = var_decl.scope().ancestors().find(|x| x.is_try())?;
dbg!(parent_try); // do whatever you need here
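The symbol and scope queries in this comment can be sketched as a runnable toy. The arena of scopes, the `ScopeKind` enum, and `enclosing_try` are assumptions made for illustration; the comment itself leaves the real representation open.

```rust
/// Hypothetical scope kinds, mirroring the `JsScope` enum above.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ScopeKind {
    Function,
    If,
    Try,
}

/// Scopes live in an arena; `parent` indexes into the same Vec.
#[derive(Debug)]
struct Scope {
    kind: ScopeKind,
    parent: Option<usize>,
}

#[derive(Debug)]
struct JsVariableSymbol {
    name: &'static str,
    constant: bool,
    scope: usize,
}

#[derive(Debug)]
enum JsSymbol {
    Function { name: &'static str },
    Variable(JsVariableSymbol),
}

struct SemanticModel {
    scopes: Vec<Scope>,
}

impl SemanticModel {
    /// Walks from `scope` up to the root, yielding each scope index.
    fn ancestors(&self, scope: usize) -> impl Iterator<Item = usize> + '_ {
        std::iter::successors(Some(scope), move |&s| self.scopes[s].parent)
    }

    /// Finds the closest enclosing `try` scope of a variable, if any.
    fn enclosing_try(&self, symbol: &JsVariableSymbol) -> Option<usize> {
        self.ancestors(symbol.scope)
            .find(|&s| self.scopes[s].kind == ScopeKind::Try)
    }
}

fn example() -> (bool, Option<usize>) {
    // Models: function { try { if { const x } } }
    let model = SemanticModel {
        scopes: vec![
            Scope { kind: ScopeKind::Function, parent: None },
            Scope { kind: ScopeKind::Try, parent: Some(0) },
            Scope { kind: ScopeKind::If, parent: Some(1) },
        ],
    };
    let symbol = JsSymbol::Variable(JsVariableSymbol {
        name: "x",
        constant: true,
        scope: 2,
    });
    match &symbol {
        JsSymbol::Variable(var) => (var.constant, model.enclosing_try(var)),
        JsSymbol::Function { .. } => (false, None),
    }
}

fn main() {
    println!("{:?}", example());
}
```

With the enum, callers must match before reaching variable-specific data; the typed-struct variant moves that check into the cast, but the underlying scope walk would be the same.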
-
Overall I think having a generic semantic tree + a type safe facade on top sounds like the best option as this is how we already work with syntax trees (generic Rowan
For the generic tree I think having a single
The last one raises an interesting question in how we solve externals e.g. in
For the type-safe facade the exact design is a bit more unclear to me, in general I think we should try to have an abstract model that doesn't necessarily closely mirror the syntax (for instance
The last part of all this would be how it would actually be implemented: we probably won't be using the same red-green tree structure as the syntax tree for the semantic tree, then what data structures are internally being used to support this? Are the semantic nodes
-
This isn't just a problem with unknown globals; it applies to every case where our semantic analysis isn't able to resolve a symbol. That may be because of a syntax error or simply because it's (impossible?) to have a precise symbol resolution in JS.
For these semantic nodes? How do we access them, and what do we return for nodes that don't have any semantic information? For example, a token
I think this works with either of the proposed approaches? The facade approach could return different
What would be the unique identifier that
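One possible shape for the "what do we return" question is a result type that separates "could not resolve" from "nothing to resolve". This is only a sketch under assumptions: the `Resolution` enum, the toy `Node` type, and the use of a punctuation token as the node without semantic information are all hypothetical and not something the thread has settled on.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Resolution {
    /// The reference was bound to a declaration at this text offset.
    Resolved { declared_at: u32 },
    /// A reference we could not bind (unknown global, syntax error, ...).
    Unresolved,
    /// The node carries no semantic information at all.
    NotApplicable,
}

/// Toy node type; a hypothetical stand-in for real syntax nodes and tokens.
#[derive(Debug, Clone, Copy)]
enum Node {
    Reference(&'static str),
    Punctuation,
}

struct SemanticModel {
    /// Maps a declared name to the offset of its declaration.
    declarations: HashMap<&'static str, u32>,
}

impl SemanticModel {
    fn resolve(&self, node: Node) -> Resolution {
        match node {
            Node::Reference(name) => match self.declarations.get(name) {
                Some(&offset) => Resolution::Resolved { declared_at: offset },
                None => Resolution::Unresolved,
            },
            Node::Punctuation => Resolution::NotApplicable,
        }
    }
}

fn main() {
    let model = SemanticModel {
        declarations: HashMap::from([("a", 4)]),
    };
    println!("{:?}", model.resolve(Node::Reference("a")));
    println!("{:?}", model.resolve(Node::Reference("window")));
    println!("{:?}", model.resolve(Node::Punctuation));
}
```

A plain `Option` would collapse the last two cases into `None`; whether lint rules need to tell them apart is exactly the open question here.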
-
We have a discussion on how we want to use the semantic model inside linters: #2603
This will be implemented by #2488
This discussion is more specifically about the API around all the semantic data.
We have two options, that can somehow coexist.
1 - We offer a façade with specific methods that give access to the whole semantic model;
2 - We offer a semantic tree that gives access to common semantic functions for that specific node, for example SemanticJsFunctionDeclaration.
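A minimal sketch of how the two options could coexist, assuming toy `SyntaxNode`, `SemanticInfo`, and `SemanticModel` types that are not the real ones: the node-centric method simply delegates to the façade, so both paths read the same data.

```rust
use std::collections::HashMap;

/// Toy syntax node identified by its text offset (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct SyntaxNode {
    offset: u32,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct SemanticInfo {
    declared_at: u32,
}

struct SemanticModel {
    info: HashMap<SyntaxNode, SemanticInfo>,
}

impl SemanticModel {
    /// Option 1: the façade owns the data and answers queries about nodes.
    fn get(&self, node: &SyntaxNode) -> Option<SemanticInfo> {
        self.info.get(node).copied()
    }
}

impl SyntaxNode {
    /// Option 2: node-centric access. The model is passed in explicitly,
    /// because the node itself holds no pointer to semantic services.
    fn semantic(&self, model: &SemanticModel) -> Option<SemanticInfo> {
        model.get(self)
    }
}

fn main() {
    let node = SyntaxNode { offset: 10 };
    let model = SemanticModel {
        info: HashMap::from([(node, SemanticInfo { declared_at: 3 })]),
    };
    // Both access paths return the same information.
    println!("{:?} {:?}", model.get(&node), node.semantic(&model));
}
```

In this shape the semantic tree is only sugar over the façade, which keeps a single source of truth for the semantic data.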