Support that servers can provide file system implementations and read from remote file systems as well #1264

Open
dbaeumer opened this issue May 10, 2021 · 39 comments

@dbaeumer
Member

Currently, servers are restricted to reading files from their local file systems. LSP should offer ways for servers to:

  • implement file systems
  • read from and write to other remote file systems.
@dbaeumer
Member Author

@NTaylorMullen

@rwols
Contributor

rwols commented Jun 13, 2021

implement file systems

I don't understand this bullet point, can you clarify? Do you mean that a client can query the remote server's FS? Why would the client be interested in that?

read from and write to other remote file systems.

This makes sense... A client can provide a kind of "virtual FS" to the remote-running language server and the server can then query the VFS.
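
To make the "virtual FS" idea concrete, here is a minimal TypeScript sketch of an in-memory file system that a client could hold and answer server queries from. Everything in it is an assumption for illustration: the `VirtualFs` class, its method names, and the `memfs:` URIs are hypothetical and not part of LSP.

```typescript
// Hypothetical in-memory virtual file system held by the client.
// None of these names come from the LSP specification.
type FileKind = "file" | "directory";

class VirtualFs {
  private files = new Map<string, string>();

  writeFile(uri: string, text: string): void {
    this.files.set(uri, text);
  }

  // Returns the file text, or undefined when the URI is unknown.
  readFile(uri: string): string | undefined {
    return this.files.get(uri);
  }

  // Lists the direct children of a directory URI as [name, kind] pairs.
  readDirectory(dirUri: string): Array<[string, FileKind]> {
    const prefix = dirUri.endsWith("/") ? dirUri : dirUri + "/";
    const children = new Map<string, FileKind>();
    this.files.forEach((_text, uri) => {
      if (!uri.startsWith(prefix)) return;
      const rest = uri.slice(prefix.length);
      const slash = rest.indexOf("/");
      if (slash === -1) children.set(rest, "file");
      else children.set(rest.slice(0, slash), "directory");
    });
    return Array.from(children.entries()).sort((a, b) => a[0].localeCompare(b[0]));
  }
}

const vfs = new VirtualFs();
vfs.writeFile("memfs:/project/main.ts", "import './lib/util';");
vfs.writeFile("memfs:/project/lib/util.ts", "export const x = 1;");
```

A server that wants to follow the import in `main.ts` would ask the client for `readDirectory("memfs:/project")` and then `readFile(...)` instead of touching its own disk.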

@rwols
Contributor

rwols commented Jun 13, 2021

implement file systems

Ah, if the entire project is also remote, then it makes sense as a client would have to discover the directory layout somehow.

@NTaylorMullen
Contributor

@rwols if you're curious: https://github.com/NTaylorMullen/LSPVirtualDocuments/blob/master/Documents/FileSystemSpec.md

@aslakhellesoy
Contributor

Is anyone working actively on this? It would be amazing to have this feature in order to make my vscode extension work as a web extension (it relies on a language server that needs to read files).

atscott added a commit to atscott/vscode-ng-language-service that referenced this issue Jun 9, 2022
…lighting

You can find more information about virtual workspaces here: https://code.visualstudio.com/api/extension-guides/virtual-workspaces

The LSP does not support access to virtual resources: microsoft/language-server-protocol#1264
As a result, we cannot provide much in the way of features since the
extension relies on the LSP for every provider.

While we cannot provide any features from the @angular/language-server,
we still can provide "limited" support to enable syntax highlighting in
virtual workspaces.
atscott added a commit to angular/vscode-ng-language-service that referenced this issue Jun 13, 2022
…lighting (#1694)

You can find more information about virtual workspaces here: https://code.visualstudio.com/api/extension-guides/virtual-workspaces

The LSP does not support access to virtual resources: microsoft/language-server-protocol#1264
As a result, we cannot provide much in the way of features since the
extension relies on the LSP for every provider.

While we cannot provide any features from the @angular/language-server,
we still can provide "limited" support to enable syntax highlighting in
virtual workspaces.
@XeroOl

XeroOl commented Sep 10, 2022

I would love to see this. One possible use case is if your language server has a decompiler built in. The language server would be able to reference things in the decompiled version of the file to the editor without needing to place decompiled sources into the filesystem.

@nelak2

nelak2 commented Nov 17, 2022

Just to make sure I understand this issue: virtual file systems exist in VS Code's memory. Language servers run as a separate process, so they can't read that memory and therefore can't access the virtual file system. LSP handles transferring the text of the file currently being worked on between the two, so the language server will be able to process that just fine. It's the references to other files, in the form of includes or other project references, that will break because the server can't resolve a virtual file path.

My question then is: does VS Code send these URIs across to the language server at all, or do they get filtered out? I'm working on an internal-use-only language server. What I'm wondering is, if VS Code does send across the full URI, is there anything stopping me from adding logic to handle the virtual file system in my language server as well (beyond the duplication of work, of course, and the logic needed to keep the language server and client in sync)?
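
As a rough illustration of the server-side logic being asked about, the sketch below inspects the scheme of an incoming URI to decide whether the server may read it from local disk. The function names and the policy are assumptions of this example, not something LSP or VS Code prescribes; `vscode-vfs` appears only as an example of a non-`file` scheme.

```typescript
// Extracts the scheme of a URI string (the part before the first ':').
// Note: a bare Windows path such as "C:\\foo" would be read as scheme "c",
// so callers are assumed to pass real URIs, not file paths.
function schemeOf(uri: string): string | undefined {
  const match = /^([a-zA-Z][a-zA-Z0-9+.-]*):/.exec(uri);
  return match ? match[1].toLowerCase() : undefined;
}

// Hypothetical policy: only file: URIs can be read from the local disk;
// anything else would need a custom request back to the client.
function canReadFromLocalDisk(uri: string): boolean {
  return schemeOf(uri) === "file";
}
```

A server could use a check like this to route non-`file` URIs to whatever side channel it shares with its client.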

@r3m0t

r3m0t commented Nov 17, 2022 via email

@dselman

dselman commented Nov 22, 2022

Not sure if this is the right place to comment so please redirect if necessary.

My use case is that I'm trying to port an existing Node.js language server to a web extension. I've got the basics working thanks to the useful sample; however, the challenge now is how to implement global (cross-file) consistency checks for my language (think import checking, etc.).

I can no longer use fs in the Language Server, as I'd like the functionality to work in a web extension and the client documentSelector only sends LSP events for documents that are opened in the editor. I tried using vscode.workspace.findFiles on the client side and sending the files to the LSP server via a custom message, but that doesn't work with vscode-test-web because it doesn't support search for its mount scheme.

Is it possible to implement cross-file consistency checks within a web extension, or will it have to operate in a degraded "single file" mode?

@hugocaillard

@dselman I had the same issue while porting an LSP server (written in Rust) to the web through WASM.
I handled it by creating a few request handlers on the client (in TypeScript) that can be requested by the server to simulate the FS.
See:
client code
request from server
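
For readers who can't follow the links, the general shape of this approach might look like the sketch below: the client registers handlers under custom method names, and the server calls them whenever it needs a file. This is not the code from the linked repository; the class, the `custom/...` method names, and the sample file are all hypothetical, and the synchronous `dispatch` only stands in for what would be an asynchronous JSON-RPC round trip.

```typescript
// Hypothetical custom method names; they are not defined by LSP.
type Handler = (params: { uri: string }) => unknown;

class ClientRequestHandlers {
  private handlers = new Map<string, Handler>();

  onRequest(method: string, handler: Handler): void {
    this.handlers.set(method, handler);
  }

  // Stands in for the JSON-RPC transport: a real one would be asynchronous.
  dispatch(method: string, params: { uri: string }): unknown {
    const handler = this.handlers.get(method);
    if (!handler) throw new Error(`Unhandled method: ${method}`);
    return handler(params);
  }
}

// Files the client can see (for example through the editor's workspace API).
const workspaceFiles = new Map<string, string>([
  ["memfs:/contracts/token.clar", "(define-data-var total uint u0)"],
]);

const client = new ClientRequestHandlers();
client.onRequest("custom/fileExists", ({ uri }) => workspaceFiles.has(uri));
client.onRequest("custom/readFile", ({ uri }) => {
  const text = workspaceFiles.get(uri);
  if (text === undefined) throw new Error(`File not found: ${uri}`);
  return text;
});
```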

Feel free to DM me on twitter for more details

@dbaeumer
Member Author

@hugocaillard is the port of the LSP server to WASM available on GitHub? We are working on WASM support for VS Code and I would be interested in looking at what you did. Our implementation is here https://github.com/microsoft/vscode-wasm and an extension that executes Python in the browser is here: https://github.com/microsoft/vscode-python-web-wasm

@hugocaillard

hugocaillard commented Nov 22, 2022

@dbaeumer Yes, and we made a blog post about it: https://www.hiro.so/blog/write-clarity-smart-contracts-with-zero-installations-how-we-built-an-in-browser-language-server-using-wasm.

Everything is in this repo: https://github.com/hirosystems/clarinet
In ./components/clarity-vscode -> the TypeScript parts
In ./components/clarity-lsp -> the Rust part

The LSP server wasn't built for the web from the ground up and the project is still under active development, but it's running in production and used by many developers every day (marketplace)

@dselman

dselman commented Nov 22, 2022

@hugocaillard thanks to your code I now have something working. When the Language Server is initialised it requests that the client open all the .cto files, which triggers onDidChangeContent on the Language Server, allowing it to rebuild global state from all the .cto files in the workspace.
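
A minimal sketch of that flow, under stated assumptions: the server keeps a workspace-wide index that is updated on every content-change event, and cross-file checks run against it. The `WorkspaceIndex` class and the one-line `namespace`/`import` syntax are invented for the example and are much simpler than real .cto files.

```typescript
// Simplified, invented syntax: each file declares "namespace X" and may "import Y".
class WorkspaceIndex {
  private namespaces = new Map<string, string>(); // uri -> declared namespace
  private imports = new Map<string, string[]>(); // uri -> imported namespaces

  // Called for every document the client opens or changes.
  onDidChangeContent(uri: string, text: string): void {
    const ns = /^namespace\s+(\S+)/m.exec(text);
    if (ns) this.namespaces.set(uri, ns[1]);
    else this.namespaces.delete(uri);

    const found: string[] = [];
    const importRe = /^import\s+(\S+)/gm;
    let m: RegExpExecArray | null;
    while ((m = importRe.exec(text)) !== null) found.push(m[1]);
    this.imports.set(uri, found);
  }

  // Cross-file check: imports that no known document declares.
  unresolvedImports(uri: string): string[] {
    const known = new Set<string>();
    this.namespaces.forEach((ns) => known.add(ns));
    return (this.imports.get(uri) ?? []).filter((name) => !known.has(name));
  }
}

const index = new WorkspaceIndex();
index.onDidChangeContent("memfs:/a.cto", "namespace a\nimport b\nimport c\n");
```

Until the client has opened the file that declares `b`, `index.unresolvedImports("memfs:/a.cto")` reports both `b` and `c`; each further open event shrinks the list.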

@hugocaillard

@dselman Awesome, glad it helps! In the end you probably won't need to trigger false onDidChange, the server should be able to discover all the .cto files from the workspace URI or some other base location sent by the client.
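
One way such discovery could work, sketched with hypothetical names: starting from the workspace URI, the server walks the tree through a `readDirectory` callback (which in practice would be a request to the client) and collects every matching file. The `Entry` shape and the sample tree are assumptions of this example.

```typescript
type Entry = { name: string; kind: "file" | "directory" };
type ReadDirectory = (dirUri: string) => Entry[];

// Walks the tree below `root` and returns the URIs of files ending in `ext`.
function findFiles(root: string, ext: string, readDirectory: ReadDirectory): string[] {
  const results: string[] = [];
  const pending = [root];
  while (pending.length > 0) {
    const dir = pending.pop() as string;
    for (const entry of readDirectory(dir)) {
      const child = `${dir}/${entry.name}`;
      if (entry.kind === "directory") pending.push(child);
      else if (entry.name.endsWith(ext)) results.push(child);
    }
  }
  return results.sort();
}

// Stand-in for the client side: a fixed directory tree.
const tree: Record<string, Entry[]> = {
  "memfs:/ws": [
    { name: "models", kind: "directory" },
    { name: "readme.md", kind: "file" },
    { name: "root.cto", kind: "file" },
  ],
  "memfs:/ws/models": [{ name: "a.cto", kind: "file" }],
};
const readDirectory: ReadDirectory = (dirUri) => tree[dirUri] ?? [];
```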
I don't want to spam this issue, but my Twitter DMs are open if you want to pursue this discussion (link in github profiles)

@dbaeumer
Member Author

@hugocaillard thanks for the pointers. Looks actually really cool.

Do you know if your Rust code compiles to WASM-WASI? If so, you could get rid of all your custom file system provider calls. What I implemented is a WASI host that maps the whole WASI API to the VS Code API. So you can write normal Rust or C/C++ code with normal file system operations and it will transparently be mapped to the VS Code file system API.

What we want to achieve is that someone can take a normal Rust, ... program, compile it down to WASM-WASI and run it inside VS Code, where the file system available in the WASM execution is VS Code's workspace file system (and more, since the vscode-wasm implementation supports arbitrary mount points).
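
To illustrate what mapping mount points could involve, here is a small sketch that translates a path seen inside the sandbox into a backing URI by picking the longest matching mount. This is not the vscode-wasm API: the `MountPoint` shape, the `resolveMount` function, the longest-prefix rule, and the sample URIs are all assumptions made for the example.

```typescript
// Hypothetical mount table entry: a path inside the sandbox and the URI root backing it.
interface MountPoint {
  path: string;
  uri: string;
}

// Resolves a sandbox path to a backing URI using the longest matching mount.
function resolveMount(mounts: MountPoint[], filePath: string): string | undefined {
  let best: MountPoint | undefined;
  for (const mount of mounts) {
    const isMatch = filePath === mount.path || filePath.startsWith(mount.path + "/");
    if (isMatch && (best === undefined || mount.path.length > best.path.length)) {
      best = mount;
    }
  }
  if (best === undefined) return undefined;
  return best.uri + filePath.slice(best.path.length);
}

const mounts: MountPoint[] = [
  { path: "/workspace", uri: "vscode-vfs://github/example/repo" },
  { path: "/workspace/vendor", uri: "memfs:/vendor" },
];
```

With this table, `/workspace/src/main.rs` resolves into the first mount, while anything under `/workspace/vendor` is served from the second.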

@DanTup
Contributor

DanTup commented Nov 23, 2022

@dbaeumer this sounds great!

I'm looking at the code at https://github.com/microsoft/vscode-wasm/tree/a703168627ea8937829add349055a16640962227/wasm-wasi - do I understand correctly that the npm package here contains the VS Code WASI bindings, and that package wraps/hosts the CPython wasm binary in a way that provides it with the implementations for those file APIs? (eg. the compilation of CPython is just standard WASM/WASI and doesn't need anything VS Code-specific at compile time)?

@dbaeumer
Member Author

Yes, but you need sync-api-client and sync-api-service as well, which implement the VS Code API in a sync way since WASI is sync :-).

You might want to look inside testbeds to see how it is put together.

@DanTup
Contributor

DanTup commented Nov 23, 2022

I did wonder how that would work, thanks for the pointers! I don't know how likely it is that the Dart server will ever compile to wasm, but it's good to know that if it does, it may not need as many changes as I'd thought to be able to handle some of these use cases. Thanks! 🙂

@brettcannon
Member

I'm now wondering if you were talking about the browser at all, or whether this is intended for local VS Code's, just using WASM instead of JS?

In case you didn't notice, Dirk's demo was via vscode.dev. WASI works anywhere WebAssembly works, so both browser and Node in our case. Think of WASI as POSIX for WebAssembly; it's just a spec, and WebAssembly runtimes implement that spec to let code do stuff in a secure, portable way, like accessing files.

In other words, WASI works wherever VS Code works, desktop and web. 🙂

@dkattan

dkattan commented Feb 22, 2023

What are the odds that we can get some of @NTaylorMullen 's suggestions into 3.18?
Specifically reading directories/files

https://github.com/NTaylorMullen/LSPVirtualDocuments/blob/master/Documents/FileSystemSpec.md#readDirectory

@dbaeumer
Member Author

We would need someone who drives this in both the spec and an implementation.

@d01010101

Maybe I am wrong, but I am not convinced that things like this would scale well:

export interface ReadFileResponse {
    /**
     * The entire contents of the file `base64` encoded.
     */
    content: string;
}

That's OK for what language servers do now, but client FS access by itself suggests a more involved, more complete source code analysis by the language server, which in turn sounds a lot like what a debugger also needs (in-editor expression evaluation, for example). That may eventually make it interesting to integrate LSP with debugging/runtime protocols such as the Debug Adapter Protocol, which in turn might require a full-fledged parallel FS with locking, links, access rights and so on, which is not necessarily trivial to implement. Instead, an IDE may configure a local FS server such as SSH FS, and the language server may access it like any other local FS. Extending LSP to support file lock/update notifications might be useful for operations like refactoring.
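
To put a number on the scaling concern: padded base64 turns every 3 bytes into 4 characters, so a whole-file `content` string is about a third larger than the file itself, before any JSON framing. The helper below is just that arithmetic; its name is made up for this sketch.

```typescript
// Length in characters of the padded base64 encoding of `byteLength` bytes.
function base64Length(byteLength: number): number {
  return 4 * Math.ceil(byteLength / 3);
}
```

For example, a 3 MiB file (3,145,728 bytes) becomes a 4,194,304-character string in a single response.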

Then, I did not study the subject too much and maybe that shows.

@dbaeumer
Member Author

dbaeumer commented Apr 2, 2023

For simplicity reason we might want to think about starting with a read only access on the server. This might handle most use cases.
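
As a sketch of what "read only" could mean at the type level, the example below narrows a fuller file system interface to its read operation and hands only that view to the consumer. The interface and function names are hypothetical, not proposed spec text.

```typescript
// Hypothetical full interface; a real proposal would have more operations.
interface FullFileSystem {
  readFile(uri: string): string | undefined;
  writeFile(uri: string, text: string): void;
}

// The read-only surface is simply the subset of read operations.
type ReadOnlyFileSystem = Pick<FullFileSystem, "readFile">;

// Wraps a full file system so that callers only get the read operations.
function asReadOnly(fs: FullFileSystem): ReadOnlyFileSystem {
  return { readFile: (uri) => fs.readFile(uri) };
}

const store = new Map<string, string>([["memfs:/a.txt", "hello"]]);
const fullFs: FullFileSystem = {
  readFile: (uri) => store.get(uri),
  writeFile: (uri, text) => { store.set(uri, text); },
};
const readOnlyFs = asReadOnly(fullFs);
```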

@nelak2

nelak2 commented Apr 2, 2023

For simplicity reason we might want to think about starting with a read only access on the server. This might handle most use cases.

I think that makes the most sense. To me, a language server is intended to provide contextual data about a file to an editing tool, not be the editing tool itself.

@d01010101

For simplicity reason we might want to think about starting with a read only access on the server. This might handle most use cases.

How is a custom read-only FS tied to LSP simpler than, for example, LDAP (initialized or tunneled by LSP), when LDAP already has a lot of libraries and tools virtually everywhere, from client JS to server Java? Not an opinion, just a question.

@d01010101

On a second thought, a simple file read access as proposed by dbaeumer might still serve a lot of functions without, depending on the approach,

  • a possible burden of an additional network connection going via a restricted firewall/NAT,
  • possible issues with making existing tools compatible with a tunneled FS like LDAP.

@d01010101

To me, a language server is intended to provide contextual data about a file to an editing tool, not be the editing tool itself.

So perhaps you'd find this interesting https://news.ycombinator.com/item?id=16875685. With such an approach, even a source-wide refactoring could be done with contextual data only and without any write access.

@brettcannon
Member

So perhaps you'd find this interesting https://news.ycombinator.com/item?id=16875685. With such an approach, even a source-wide refactoring could be done with contextual data only and without any write access.

Do note that the AST approach has its own drawbacks, e.g., you need to make sure that AST representation can represent every potential structure needed for every language VS Code supports (which is a lot since it's any and all languages 😉). Plus not every e.g. refactoring will work the same for each language, so it doesn't necessarily save you from having to either re-implement or do a ton of special-casing for various languages (once again, needs to work with any language out there).

@nelak2

nelak2 commented Apr 11, 2023

To me, a language server is intended to provide contextual data about a file to an editing tool, not be the editing tool itself.

So perhaps you'd find this interesting https://news.ycombinator.com/item?id=16875685. With such an approach, even a source-wide refactoring could be done with contextual data only and without any write access.

The idea of just exposing an AST feels great in theory but I'm not sure it would work in practice. After all isn't that largely how editors handled different languages before the LSP?

Not to say that it's inherently the wrong approach or that approach wouldn't have worked if there was a standard protocol around it but in general that approach was explored for decades without a standard developing or without editors being full of special handling for different languages.

My conclusion after reading that discussion is that the world might need a language server framework to help language server developers build their AST which can be used by framework provided default implementations of common LSP features.

@d01010101

d01010101 commented Apr 11, 2023

Yes, the "ton of special casing" can be a problem which may happen when a protocol tries to actually define in detail the AST layer, in order to put above it yet another layer of generic language-independent operations, which the link seems to suggest.

In order to avoid the problem in question, AST could be an otherwise undefined layer above the raw edited files. LSP would only provide a grammar-independent low-level API for a bidirectional translating service raw <-> AST and possibly its own example or default implementation of a "typical" LL(k) translating service. Such services could reside by default on the IDE side but the server could use any independent service on its side.

Examples of what could be done when a grammar with error recovery and whitespace handling is transferred to the said LL(k) service:

  1. A faulty statement is replaced by a "warning node" and the parser skips after the next configurable token, like ";". Then the service automatically underscores the faulty statement via the raw layer.
  2. Refactoring an identifier is as simple as asking the service to enumerate the referred node and all its references and then rename them. All relevant source is updated automatically.
  3. Moving a class automatically moves a file in the directory structure because a file path is one of the said references, all in one grammar tree. If it's Java and you also need to update the imports, the server asks the service to enumerate all relevant import sections and modifies them respectively. Again, it is the service itself which updates the source. So each language indeed defines its own refactoring as it is now, but with the AST layer, it is much simpler.
  4. All whitespace retaining or generating is the task of the said service.
  5. Adding some basic model transform rules to the grammar would enable changing each fragment of an identifier (but not of a quoted string) of the form \alpha into α, without even engaging the server.
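
Point 1 in the list above can be sketched with a toy parser that uses panic-mode recovery: on a faulty statement it skips to just after the next synchronization token and records a "warning node" in place of the statement. The grammar (`let <identifier> = <number> ;` with whitespace-separated tokens) and all names are invented for illustration; a real LL(k) service would be driven by a transferred grammar rather than hard-coded checks.

```typescript
type AstNode =
  | { kind: "let"; name: string; value: number }
  | { kind: "warning"; tokens: string[] };

// Toy grammar (invented for illustration): let <identifier> = <number> ;
// Tokens must be separated by whitespace.
function parseStatements(source: string, syncToken = ";"): AstNode[] {
  const tokens = source.split(/\s+/).filter((t) => t.length > 0);
  const nodes: AstNode[] = [];
  let pos = 0;
  while (pos < tokens.length) {
    const [kw, name, eq, value, end] = tokens.slice(pos, pos + 5);
    const ok =
      kw === "let" &&
      /^[A-Za-z_]\w*$/.test(name ?? "") &&
      eq === "=" &&
      /^\d+$/.test(value ?? "") &&
      end === syncToken;
    if (ok) {
      nodes.push({ kind: "let", name, value: Number(value) });
      pos += 5;
      continue;
    }
    // Panic-mode recovery: skip to just after the next synchronization token.
    const start = pos;
    while (pos < tokens.length && tokens[pos] !== syncToken) pos++;
    if (pos < tokens.length) pos++; // consume the sync token itself
    nodes.push({ kind: "warning", tokens: tokens.slice(start, pos) });
  }
  return nodes;
}

const parsed = parseStatements("let a = 1 ; let = 2 ; let b = 3 ;");
```

Here the middle statement is missing its identifier, so it becomes a warning node covering `let = 2 ;`, and parsing resumes cleanly with `let b = 3 ;`. The token range stored in the warning node is what a service would use to underline the faulty statement.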

Note that I do not have much experience with language protocols, so I may be missing some detail here.

@DanTup
Contributor

DanTup commented Nov 2, 2023

Is anybody currently working on this? In particular, I'm interested in the server being able to provide file contents for some virtual files (and being able to include URIs for the scheme it uses in other requests - for example being able to Go-to-Definition from a real file:/// document on the client side to a foo://bar provided by the server).

I think @dbaeumer's comment above:

For simplicity reason we might want to think about starting with a read only access on the server. This might handle most use cases.

... probably covers what I need. If I understand correctly, I think VS Code also already has an implementation for this (registerFileSystemProvider).

I may be interested in helping (with this narrower-scoped version), but I don't want to duplicate effort if anything is already in progress.

@puremourning

Is anybody currently working on this? In particular, I'm interested in the server being able to provide file contents for some virtual files (and being able to include URIs for the scheme it uses in other requests - for example being able to Go-to-Definition from a real file:/// document on the client side to a foo://bar provided by the server).

I think @dbaeumer's comment above:

For simplicity reason we might want to think about starting with a read only access on the server. This might handle most use cases.

... probably covers what I need. If I understand correctly, I think VS Code also already has an implementation for this (registerFileSystemProvider).

I may be interested in helping (with this narrower-scoped version), but I don't want to duplicate effort if anything is already in progress.

For prior art, jdt.ls implements this in a custom thing. In fact they used to send their jdt:// URIs to all clients and had a side channel to retrieve the contents. I'd welcome standardisation of a mechanism for this. Pinging @snjeza and @fbricon

@mickaelistria

What JDT-LS implements is more or less what has already been discussed at length in #336: a custom operation that attempts to resolve a URI (of whichever scheme the client cannot process directly; in JDT-LS it's jdt:) to the actual document content to display when the document is opened. In that case, the client needs to replace the usual read from the file system with a custom query to the language server.
The topic of this particular issue is slightly different (and so far there is no obvious need for it in most LS I've used); #336 is really where the "get document content" discussion (and maybe a PR) should continue.
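
For illustration only, a "resolve URI to content" operation of the kind described above could be backed on the server by a registry keyed by URI scheme, as sketched here. This is not JDT-LS code; the class, the `foo` scheme, and the sample content are hypothetical.

```typescript
// Hypothetical registry; the scheme "foo" and all names are illustrative only.
type ContentProvider = (uri: string) => string | undefined;

class VirtualDocumentRegistry {
  private providers = new Map<string, ContentProvider>();

  register(scheme: string, provider: ContentProvider): void {
    this.providers.set(scheme, provider);
  }

  // Returns the document text, or undefined if no provider handles the URI.
  resolve(uri: string): string | undefined {
    const colon = uri.indexOf(":");
    if (colon <= 0) return undefined;
    const provider = this.providers.get(uri.slice(0, colon));
    return provider ? provider(uri) : undefined;
  }
}

const registry = new VirtualDocumentRegistry();
const generated = new Map<string, string>([["foo://bar", "// generated declaration"]]);
registry.register("foo", (uri) => generated.get(uri));
```

A client that receives a `foo://bar` location (say, from Go-to-Definition) and cannot open it itself would send the URI back to the server, which answers from `registry.resolve(uri)`.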

@dkattan

dkattan commented Nov 17, 2023

https://code.visualstudio.com/api/extension-guides/virtual-workspaces currently states

What about support in the Language Server Protocol (LSP) for accessing virtual resources?
Work is under way that will add file system provider support to LSP. Tracked in Language Server Protocol issue #1264.

Which brings us here. What is going on??

@dkattan

dkattan commented Nov 17, 2023

Perhaps the LSIF format will help:

The purpose of the Language Server Index Format (LSIF) is to define a standard format for language servers or other programming tools to dump their knowledge about a workspace.

The Project Context looks like it will help enumerate files/folders

The Embedding Contents feature could be used for retrieving the contents of a given file.

It can be valuable to embed the contents of a document or project file into the dump as well. For example, if the content of the document is a virtual document generated from program meta data. The index format therefore supports an optional contents property on the document and project vertex. If used the content needs to be base64 encoded.
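
As a rough sketch of what the quoted passage describes, the snippet below builds a document vertex whose optional `contents` property carries the base64-encoded text. The hand-written encoder, the ASCII-only `asciiBytes` helper, and the sample URI and text are assumptions of this example; consult the LSIF specification for the authoritative vertex shape.

```typescript
const ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Standard padded base64 over a byte array.
function base64Encode(bytes: number[]): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const b0 = bytes[i];
    const b1 = i + 1 < bytes.length ? bytes[i + 1] : 0;
    const b2 = i + 2 < bytes.length ? bytes[i + 2] : 0;
    out += ALPHABET[b0 >> 2];
    out += ALPHABET[((b0 & 0x03) << 4) | (b1 >> 4)];
    out += i + 1 < bytes.length ? ALPHABET[((b1 & 0x0f) << 2) | (b2 >> 6)] : "=";
    out += i + 2 < bytes.length ? ALPHABET[b2 & 0x3f] : "=";
  }
  return out;
}

// Assumption: the text is plain ASCII, so each character is one byte.
function asciiBytes(text: string): number[] {
  return text.split("").map((ch) => ch.charCodeAt(0));
}

// A document vertex with embedded contents, following the quoted description.
const documentVertex = {
  id: 1,
  type: "vertex",
  label: "document",
  uri: "foo://bar/generated.ts",
  languageId: "typescript",
  contents: base64Encode(asciiBytes("hello")),
};
```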

The explanation of the feature makes it sound like it is specifically for virtual files. It would help if it also included something to the effect of "This can also be used to facilitate web-based editors that lack filesystem access".

@dbaeumer this appears to be your baby, thoughts?
