Add treesitter textobjects (#728)
* Add treesitter textobject queries

Only for Go, Python and Rust for now.

* Add tree-sitter textobjects

Only has functions and class objects as of now.

* Fix tests

* Add docs for tree-sitter textobjects

* Add guide for creating new textobject queries

* Add parameter textobject

Only parameter.inside is implemented for now. parameter.around
will probably require custom predicates akin to nvim's `make-range`,
since we want to select a trailing comma too (a comma is an
anonymous node, and matching against anonymous nodes doesn't work
the same way as for named nodes).

* Simplify TextObject cell init
sudormrfbin authored Oct 23, 2021
1 parent c5298ca commit 4ee92ca
Showing 11 changed files with 219 additions and 5 deletions.
2 changes: 2 additions & 0 deletions book/src/SUMMARY.md
@@ -8,3 +8,5 @@
- [Keymap](./keymap.md)
- [Key Remapping](./remapping.md)
- [Hooks](./hooks.md)
- [Guides](./guides/README.md)
- [Adding Textobject Queries](./guides/textobject.md)
4 changes: 4 additions & 0 deletions book/src/guides/README.md
@@ -0,0 +1,4 @@
# Guides

This section contains guides for adding new language server configurations,
tree-sitter grammars, textobject queries, etc.
30 changes: 30 additions & 0 deletions book/src/guides/textobject.md
@@ -0,0 +1,30 @@
# Adding Textobject Queries

Textobjects that are language specific ([like functions, classes, etc][textobjects])
require an accompanying tree-sitter grammar and a `textobjects.scm` query file
to work properly. Tree-sitter allows us to query the source code syntax tree
and capture specific parts of it. The queries are written in a lisp dialect.
More information on how to write queries can be found in the [official tree-sitter
documentation][tree-sitter-queries].

Query files should be placed in `runtime/queries/{language}/textobjects.scm`
when contributing. To test query files locally, put them under your local
runtime directory (for example, `~/.config/helix/runtime` on Linux).

The following [captures][tree-sitter-captures] are recognized:

| Capture Name |
| --- |
| `function.inside` |
| `function.around` |
| `class.inside` |
| `class.around` |
| `parameter.inside` |

[Example query files][textobject-examples] can be found in the helix GitHub repository.

[textobjects]: ../usage.md#textobjects
[tree-sitter-queries]: https://tree-sitter.github.io/tree-sitter/using-parsers#query-syntax
[tree-sitter-captures]: https://tree-sitter.github.io/tree-sitter/using-parsers#capturing-nodes
[textobject-examples]: https://github.com/search?q=repo%3Ahelix-editor%2Fhelix+filename%3Atextobjects.scm&type=Code&ref=advsearch&l=&l=
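
As a rough illustration of the path convention described in this guide, the sketch below builds the expected location of a language's query file. The helper name `textobject_query_path` and the `runtime_dir` parameter are assumptions made for this example, not part of Helix. The lowercasing step mirrors what the commit does with the language id before reading the query.

```rust
use std::path::{Path, PathBuf};

/// Build the expected location of a language's textobject query file.
/// Hypothetical helper: `runtime_dir` stands in for the runtime directory.
fn textobject_query_path(runtime_dir: &Path, language: &str) -> PathBuf {
    runtime_dir
        .join("queries")
        // The language id is lowercased before the query file is looked up.
        .join(language.to_ascii_lowercase())
        .join("textobjects.scm")
}
```

Calling it with a runtime directory of `runtime` and the language `Rust` would yield `runtime/queries/rust/textobjects.scm`.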
13 changes: 10 additions & 3 deletions book/src/usage.md
@@ -51,9 +51,10 @@ Multiple characters are currently not supported, but planned.

## Textobjects

Currently supported: `word`, `surround`.
Currently supported: `word`, `surround`, `function`, `class`, `parameter`.

![textobject-demo](https://user-images.githubusercontent.com/23398472/124231131-81a4bb00-db2d-11eb-9d10-8e577ca7b177.gif)
![textobject-treesitter-demo](https://user-images.githubusercontent.com/23398472/132537398-2a2e0a54-582b-44ab-a77f-eb818942203d.gif)

- `ma` - Select around the object (`va` in vim, `<alt-a>` in kakoune)
- `mi` - Select inside the object (`vi` in vim, `<alt-i>` in kakoune)
@@ -62,5 +63,11 @@ Currently supported: `word`, `surround`.
| --- | --- |
| `w` | Word |
| `(`, `[`, `'`, etc | Specified surround pairs |

Textobjects based on treesitter, like `function`, `class`, etc are planned.
| `f` | Function |
| `c` | Class |
| `p` | Parameter |

Note: `f`, `c`, etc. need a tree-sitter grammar active for the current
document and a special tree-sitter query file to work properly. [Only
some grammars](https://github.com/search?q=repo%3Ahelix-editor%2Fhelix+filename%3Atextobjects.scm&type=Code&ref=advsearch&l=&l=)
currently have the query file implemented. Contributions are welcome!
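
To make the relationship between the keys above and the query captures concrete, here is a minimal sketch that maps `mi`/`ma` plus an object key to the capture name that would be looked up. Both function names are hypothetical and exist only for this illustration. The real dispatch lives in `select_textobject` in `helix-term/src/commands.rs`.

```rust
/// Map a textobject key to the base name used in query captures.
/// Only the tree-sitter based objects are covered here.
fn object_name_for_key(key: char) -> Option<&'static str> {
    match key {
        'f' => Some("function"),
        'c' => Some("class"),
        'p' => Some("parameter"),
        _ => None,
    }
}

/// Build the capture name for `mi<key>` (inside) or `ma<key>` (around).
fn capture_name(prefix: &str, key: char) -> Option<String> {
    let kind = match prefix {
        "mi" => "inside",
        "ma" => "around",
        _ => return None,
    };
    Some(format!("{}.{}", object_name_for_key(key)?, kind))
}
```

Keys such as `w` are not tree-sitter based, so this sketch returns `None` for them.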
1 change: 1 addition & 0 deletions helix-core/src/indent.rs
@@ -464,6 +464,7 @@ where
unit: String::from(" "),
}),
indent_query: OnceCell::new(),
textobject_query: OnceCell::new(),
}],
});

43 changes: 41 additions & 2 deletions helix-core/src/syntax.rs
@@ -49,7 +49,7 @@ pub struct Configuration {
#[serde(rename_all = "kebab-case")]
pub struct LanguageConfiguration {
    #[serde(rename = "name")]
    pub(crate) language_id: String,
    pub language_id: String,
    pub scope: String, // source.rust
    pub file_types: Vec<String>, // filename ends_with? <Gemfile, rb, etc>
    pub roots: Vec<String>, // these indicate project roots <.git, Cargo.toml>
@@ -76,6 +76,8 @@ pub struct LanguageConfiguration {

    #[serde(skip)]
    pub(crate) indent_query: OnceCell<Option<IndentQuery>>,
    #[serde(skip)]
    pub(crate) textobject_query: OnceCell<Option<TextObjectQuery>>,
}

#[derive(Debug, Serialize, Deserialize)]
@@ -105,6 +107,32 @@ pub struct IndentQuery {
    pub outdent: HashSet<String>,
}

#[derive(Debug)]
pub struct TextObjectQuery {
    pub query: Query,
}

impl TextObjectQuery {
    /// Run the query on the given node and return sub nodes which match given
    /// capture ("function.inside", "class.around", etc).
    pub fn capture_nodes<'a>(
        &'a self,
        capture_name: &str,
        node: Node<'a>,
        slice: RopeSlice<'a>,
        cursor: &'a mut QueryCursor,
    ) -> Option<impl Iterator<Item = Node<'a>>> {
        let capture_idx = self.query.capture_index_for_name(capture_name)?;
        let captures = cursor.captures(&self.query, node, RopeProvider(slice));

        captures
            .filter_map(move |(mat, idx)| {
                (mat.captures[idx].index == capture_idx).then(|| mat.captures[idx].node)
            })
            .into()
    }
}
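
The filtering done by `capture_nodes` can be pictured without tree-sitter at all. The sketch below uses stand-in types (`Capture`, `node_id`, and a plain list of capture names are assumptions for this example) to show the same two steps: resolve the capture name to an index, returning `None` if the query does not define it, and then keep only the nodes captured under that index.

```rust
/// Stand-in for a tree-sitter capture: which capture fired and on which node.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Capture {
    index: u32,
    node_id: usize,
}

/// Keep only the nodes captured under `capture_name`.
/// Returns `None` when the query does not define that capture at all.
fn nodes_for_capture<'a>(
    capture_names: &[&str],
    captures: &'a [Capture],
    capture_name: &str,
) -> Option<impl Iterator<Item = usize> + 'a> {
    // Resolve the name to its index once, before iterating.
    let wanted = capture_names.iter().position(|n| *n == capture_name)? as u32;
    Some(
        captures
            .iter()
            .filter_map(move |c| (c.index == wanted).then(|| c.node_id)),
    )
}
```

Returning `Option<impl Iterator>` lets a caller distinguish "this capture is not defined for the language" from "the capture is defined but matched nothing", which is the same distinction the signature in the diff makes.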

fn load_runtime_file(language: &str, filename: &str) -> Result<String, std::io::Error> {
    let path = crate::RUNTIME_DIR
        .join("queries")
@@ -153,7 +181,6 @@ impl LanguageConfiguration {
        // highlights_query += "\n(ERROR) @error";

        let injections_query = read_query(&language, "injections.scm");

        let locals_query = read_query(&language, "locals.scm");

        if highlights_query.is_empty() {
@@ -203,6 +230,18 @@ impl LanguageConfiguration {
            .as_ref()
    }

    pub fn textobject_query(&self) -> Option<&TextObjectQuery> {
        self.textobject_query
            .get_or_init(|| -> Option<TextObjectQuery> {
                let lang_name = self.language_id.to_ascii_lowercase();
                let query_text = read_query(&lang_name, "textobjects.scm");
                let lang = self.highlight_config.get()?.as_ref()?.language;
                let query = Query::new(lang, &query_text).ok()?;
                Some(TextObjectQuery { query })
            })
            .as_ref()
    }

    pub fn scope(&self) -> &str {
        &self.scope
    }
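
The `textobject_query` accessor above stores an `Option` inside a once-initialized cell, so both a successful load and a failed one are computed a single time. The following sketch illustrates that pattern using `std::cell::OnceCell` from the standard library, which may differ from the `OnceCell` type the diff imports. `LazyQuery`, its `loads` counter, and the use of `String` in place of a compiled query are all assumptions made for the example.

```rust
use std::cell::{Cell, OnceCell};

/// Hypothetical stand-in for a configuration holding a lazily loaded query.
struct LazyQuery {
    // Caches the outcome of loading, including a failed load (`None`).
    cell: OnceCell<Option<String>>,
    // Counts how often the loader runs; only here to make caching visible.
    loads: Cell<u32>,
}

impl LazyQuery {
    fn new() -> Self {
        LazyQuery {
            cell: OnceCell::new(),
            loads: Cell::new(0),
        }
    }

    /// Load on the first call, then return the cached result afterwards.
    fn get(&self, source: Option<&str>) -> Option<&String> {
        self.cell
            .get_or_init(|| -> Option<String> {
                self.loads.set(self.loads.get() + 1);
                // A missing source ends the closure early and caches `None`.
                let text = source?;
                Some(text.to_string())
            })
            .as_ref()
    }
}
```

One consequence worth noting in this sketch is that a failed first load is remembered: later calls return `None` without retrying, even if a valid source is supplied.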
51 changes: 51 additions & 0 deletions helix-core/src/textobject.rs
@@ -1,9 +1,13 @@
use std::fmt::Display;

use ropey::RopeSlice;
use tree_sitter::{Node, QueryCursor};

use crate::chars::{categorize_char, char_is_whitespace, CharCategory};
use crate::graphemes::next_grapheme_boundary;
use crate::movement::Direction;
use crate::surround;
use crate::syntax::LanguageConfiguration;
use crate::Range;

fn find_word_boundary(slice: RopeSlice, mut pos: usize, direction: Direction) -> usize {
@@ -51,6 +55,15 @@ pub enum TextObject {
    Inside,
}

impl Display for TextObject {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        f.write_str(match self {
            Self::Around => "around",
            Self::Inside => "inside",
        })
    }
}

// count doesn't do anything yet
pub fn textobject_word(
    slice: RopeSlice,
@@ -108,6 +121,44 @@ pub fn textobject_surround(
        .unwrap_or(range)
}

/// Transform the given range to select text objects based on tree-sitter.
/// `object_name` is a query capture base name like "function", "class", etc.
/// `slice_tree` is the tree-sitter node corresponding to given text slice.
pub fn textobject_treesitter(
    slice: RopeSlice,
    range: Range,
    textobject: TextObject,
    object_name: &str,
    slice_tree: Node,
    lang_config: &LanguageConfiguration,
    _count: usize,
) -> Range {
    let get_range = move || -> Option<Range> {
        let byte_pos = slice.char_to_byte(range.cursor(slice));

        let capture_name = format!("{}.{}", object_name, textobject); // eg. function.inside
        let mut cursor = QueryCursor::new();
        let node = lang_config
            .textobject_query()?
            .capture_nodes(&capture_name, slice_tree, slice, &mut cursor)?
            .filter(|node| node.byte_range().contains(&byte_pos))
            .min_by_key(|node| node.byte_range().len())?;

        let len = slice.len_bytes();
        let start_byte = node.start_byte();
        let end_byte = node.end_byte();
        if start_byte >= len || end_byte >= len {
            return None;
        }

        let start_char = slice.byte_to_char(start_byte);
        let end_char = slice.byte_to_char(end_byte);

        Some(Range::new(start_char, end_char))
    };
    get_range().unwrap_or(range)
}
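
The node selection in `textobject_treesitter` comes down to "among all captured nodes that contain the cursor, take the smallest". A minimal sketch of that step follows, with plain byte ranges standing in for tree-sitter nodes. The function name `innermost_containing` is hypothetical.

```rust
use std::ops::Range;

/// Among candidate byte ranges, pick the smallest one containing `byte_pos`.
/// Ranges are half-open, so a range's `end` itself is not contained.
fn innermost_containing(candidates: &[Range<usize>], byte_pos: usize) -> Option<Range<usize>> {
    candidates
        .iter()
        .filter(|r| r.contains(&byte_pos))
        .min_by_key(|r| r.end - r.start)
        .cloned()
}
```

With nested functions, this is what makes the innermost function win: the outer function's range also contains the cursor, but it is longer.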

#[cfg(test)]
mod test {
    use super::TextObject::*;
19 changes: 19 additions & 0 deletions helix-term/src/commands.rs
@@ -4465,9 +4465,28 @@ fn select_textobject(cx: &mut Context, objtype: textobject::TextObject) {
            let (view, doc) = current!(cx.editor);
            let text = doc.text().slice(..);

            let textobject_treesitter = |obj_name: &str, range: Range| -> Range {
                let (lang_config, syntax) = match doc.language_config().zip(doc.syntax()) {
                    Some(t) => t,
                    None => return range,
                };
                textobject::textobject_treesitter(
                    text,
                    range,
                    objtype,
                    obj_name,
                    syntax.tree().root_node(),
                    lang_config,
                    count,
                )
            };
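
The closure above leaves the range untouched unless the document has both a language configuration and a syntax tree. That fallback can be sketched on its own with `Option::zip`. The generic helper `transform_or_keep` and the tuple used as a range are assumptions for this illustration, not Helix types.

```rust
/// Apply `transform` only when both prerequisites are present;
/// otherwise hand the input range back unchanged.
fn transform_or_keep<C, S>(
    range: (usize, usize),
    lang_config: Option<C>,
    syntax: Option<S>,
    transform: impl FnOnce((usize, usize), C, S) -> (usize, usize),
) -> (usize, usize) {
    // `zip` yields `Some` only if both options are `Some`.
    match lang_config.zip(syntax) {
        Some((config, tree)) => transform(range, config, tree),
        None => range,
    }
}
```

Under this sketch, a plain-text document without a grammar would silently keep its selection, which matches the behavior of the early `return range` in the diff.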

            let selection = doc.selection(view.id).clone().transform(|range| {
                match ch {
                    'w' => textobject::textobject_word(text, range, objtype, count),
                    'c' => textobject_treesitter("class", range),
                    'f' => textobject_treesitter("function", range),
                    'p' => textobject_treesitter("parameter", range),
                    // TODO: cancel new ranges if inconsistent surround matches across lines
                    ch if !ch.is_ascii_alphanumeric() => {
                        textobject::textobject_surround(text, range, objtype, ch, count)
21 changes: 21 additions & 0 deletions runtime/queries/go/textobjects.scm
@@ -0,0 +1,21 @@
(function_declaration
  body: (block)? @function.inside) @function.around

(func_literal
  (_)? @function.inside) @function.around

(method_declaration
  body: (block)? @function.inside) @function.around

;; struct and interface declaration as class textobject?
(type_declaration
  (type_spec (type_identifier) (struct_type (field_declaration_list (_)?) @class.inside))) @class.around

(type_declaration
  (type_spec (type_identifier) (interface_type (method_spec_list (_)?) @class.inside))) @class.around

(parameter_list
  (_) @parameter.inside)

(argument_list
  (_) @parameter.inside)
14 changes: 14 additions & 0 deletions runtime/queries/python/textobjects.scm
@@ -0,0 +1,14 @@
(function_definition
  body: (block)? @function.inside) @function.around

(class_definition
  body: (block)? @class.inside) @class.around

(parameters
  (_) @parameter.inside)

(lambda_parameters
  (_) @parameter.inside)

(argument_list
  (_) @parameter.inside)
26 changes: 26 additions & 0 deletions runtime/queries/rust/textobjects.scm
@@ -0,0 +1,26 @@
(function_item
  body: (_) @function.inside) @function.around

(struct_item
  body: (_) @class.inside) @class.around

(enum_item
  body: (_) @class.inside) @class.around

(union_item
  body: (_) @class.inside) @class.around

(trait_item
  body: (_) @class.inside) @class.around

(impl_item
  body: (_) @class.inside) @class.around

(parameters
  (_) @parameter.inside)

(closure_parameters
  (_) @parameter.inside)

(arguments
  (_) @parameter.inside)
