Copilot AI (Contributor) commented Jun 18, 2025

This PR implements a minimal viable product (MVP) for a static type checker for R, following the specification outlined in crates/typing/README.md. The type checker uses JSDoc-style comment annotations to provide type safety without requiring changes to R's syntax.

Features

Type System

  • Basic Types: numeric, character, logical, integer, null
  • Array Types: numeric[], character[], logical[], integer[]
  • Complex Types: list[T], list{name: type, ...}, list(type1, type2, ...)
  • Special Types: unknown, any
  • Function Types: fn(params...) -> return_type
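
As a rough illustration of how the type forms listed above might be modelled, here is a minimal sketch. The `Type` enum, its variant names, and the `Display` rendering are assumptions made for this example; they are not taken from the `typing` crate.

```rust
use std::fmt;

/// Hypothetical model of the annotation types listed above.
#[derive(Debug, Clone, PartialEq)]
pub enum Type {
    Numeric,
    Character,
    Logical,
    Integer,
    Null,
    Unknown,
    Any,
    /// `T[]`
    Array(Box<Type>),
    /// `list[T]`
    List(Box<Type>),
    /// `list{name: type, ...}`
    Record(Vec<(String, Type)>),
    /// `list(type1, type2, ...)`
    Tuple(Vec<Type>),
    /// `fn(params...) -> return_type`
    Function { params: Vec<Type>, ret: Box<Type> },
}

/// Renders a slice of types as a comma-separated list.
fn join(items: &[Type]) -> String {
    items.iter().map(|t| t.to_string()).collect::<Vec<_>>().join(", ")
}

impl fmt::Display for Type {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Type::Numeric => write!(f, "numeric"),
            Type::Character => write!(f, "character"),
            Type::Logical => write!(f, "logical"),
            Type::Integer => write!(f, "integer"),
            Type::Null => write!(f, "null"),
            Type::Unknown => write!(f, "unknown"),
            Type::Any => write!(f, "any"),
            Type::Array(inner) => write!(f, "{inner}[]"),
            Type::List(inner) => write!(f, "list[{inner}]"),
            Type::Record(fields) => {
                let body = fields
                    .iter()
                    .map(|(name, ty)| format!("{name}: {ty}"))
                    .collect::<Vec<_>>()
                    .join(", ");
                write!(f, "list{{{body}}}")
            }
            Type::Tuple(items) => write!(f, "list({})", join(items)),
            Type::Function { params, ret } => write!(f, "fn({}) -> {ret}", join(params)),
        }
    }
}
```

Rendering each variant back to its annotation syntax keeps error messages in the same notation the user wrote.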

Type Annotations

# Variable type hints
x <- 4 #: numeric
items <- c("a", "b", "c") #: character[]

# Function parameter and return type annotations
#: @param age numeric
#: @param name character
#: @return character
greet <- function(age, name) {
  paste("Hello", name, "age", age)
}
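
The annotation comments above could be recognised with a small line-based scanner. The sketch below is only an assumption about how that might look: `Annotation`, `annotation_text`, and `parse_annotation` are hypothetical names, and the type text is kept as an unparsed string.

```rust
/// Hypothetical classification of a `#:` annotation comment.
#[derive(Debug, PartialEq)]
pub enum Annotation<'a> {
    /// `#: @param <name> <type>`
    Param { name: &'a str, ty: &'a str },
    /// `#: @return <type>`
    Return(&'a str),
    /// Trailing variable hint, e.g. `x <- 4 #: numeric`
    Hint(&'a str),
}

/// Returns the text after the first `#:` marker on a line, if any.
/// Simplification: a `#:` inside a string literal is not distinguished
/// from a real annotation comment.
pub fn annotation_text(line: &str) -> Option<&str> {
    let (_, rest) = line.split_once("#:")?;
    let rest = rest.trim();
    if rest.is_empty() { None } else { Some(rest) }
}

/// Classifies one source line as a parameter, return, or variable hint.
pub fn parse_annotation(line: &str) -> Option<Annotation<'_>> {
    let text = annotation_text(line)?;
    // Split off the first word so that `@param` and `@return` match exactly.
    let (tag, rest) = match text.split_once(char::is_whitespace) {
        Some((tag, rest)) => (tag, rest.trim()),
        None => (text, ""),
    };
    match tag {
        "@param" => {
            let (name, ty) = rest.split_once(char::is_whitespace)?;
            Some(Annotation::Param { name, ty: ty.trim() })
        }
        "@return" => (!rest.is_empty()).then_some(Annotation::Return(rest)),
        // Anything else is treated as a bare type hint.
        _ => Some(Annotation::Hint(text)),
    }
}
```

Ordinary comments and roxygen comments (`#'`) contain no `#:` marker, so this scanner returns `None` for them.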

Type Checking

  • Assignment validation: Ensures assigned values match declared types
  • Type coercion: Integer values can be assigned to numeric types
  • Compatibility rules: Scalars can be promoted to arrays
  • Clear error messages: Shows type mismatches with source location
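
The coercion and compatibility rules above can be expressed as a single assignability check. This is a minimal sketch under assumptions: the reduced `Type` enum and the `is_assignable` function are hypothetical, and treating `any` as accepting every type is an assumption the description does not spell out.

```rust
/// Reduced, hypothetical type model; just enough to show the rules.
#[derive(Debug, Clone, PartialEq)]
pub enum Type {
    Numeric,
    Integer,
    Character,
    Logical,
    Any,
    Array(Box<Type>),
}

/// Returns true if a value of type `found` may be assigned where
/// `expected` was declared.
pub fn is_assignable(expected: &Type, found: &Type) -> bool {
    match (expected, found) {
        // Assumption: `any` accepts every type.
        (Type::Any, _) => true,
        // Identical types always match.
        (e, f) if e == f => true,
        // Type coercion: integer values may be assigned to numeric.
        (Type::Numeric, Type::Integer) => true,
        // Arrays are compatible when their element types are.
        (Type::Array(e), Type::Array(f)) => is_assignable(e, f),
        // Compatibility rule: a scalar may be promoted to an array.
        (Type::Array(e), f) => is_assignable(e, f),
        _ => false,
    }
}
```

The check is deliberately one-directional: `integer` is accepted where `numeric` is declared, but not the other way around.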

Usage

The type checker is available through the existing roughly check command with the --experimental flag:

roughly check --experimental myfile.R

Example error output:

error: Type mismatch in assignment to 'wrong_var': Type mismatch: expected numeric, found character
  --> myfile.R:5:0
5 | wrong_var <- "hello" #: numeric
  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  | Type mismatch in assignment to 'wrong_var': Type mismatch: expected numeric, found character

Implementation

  • New typing crate: Contains core type system, parser, and checker logic
  • Diagnostic integration: Seamlessly integrated into existing diagnostic pipeline
  • Comment-based syntax: Uses #: prefix to distinguish from roxygen comments
  • Experimental flag: Safely gated behind experimental feature to avoid breaking existing workflows

Testing

  • 11 unit tests in the typing crate covering type parsing, compatibility, and checking
  • 3 integration tests verifying end-to-end functionality
  • All existing tests continue to pass (67 total tests across the project)

The implementation provides a solid foundation for static type checking in R while maintaining full compatibility with existing R code.

Fixes #22.


Deploying roughly with Cloudflare Pages

Latest commit: cd11f00
Status: ✅  Deploy successful!
Preview URL: https://033263c4.roughly.pages.dev
Branch Preview URL: https://copilot-fix-22.roughly.pages.dev

Copilot AI and others added 2 commits June 18, 2025 21:49
@Copilot Copilot AI changed the title from "[WIP] Type checker MVP" to "Implement Type Checker MVP for R with JSDoc-style annotations" Jun 18, 2025
Copilot finished work on behalf of felix-andreas-copilot June 18, 2025 21:52
@felix-andreas-copilot
Collaborator

please try again

Please use tree-sitter for parsing:

here is a prototype that I wrote:

use tree_sitter::{Node, Parser, Tree};

fn main() {
    let mut parser = new_parser();

    let exprs = vec![r#"4 + 4"#, r#""foo" + 4"#, r#"fn(foo)"#];
    for expr in exprs {
        let tree = parse(&mut parser, expr, None);
        let result = check(tree.root_node());
        match result {
            Ok(typ) => eprintln!("expr: {expr}\ntype: {typ:?}\n"),
            Err((node, err)) => eprintln!("expr: {expr}\nnode: {node:?}\nerror: {}\n", err.0),
        }
    }

    // TODO: use miette
}

pub fn new_parser() -> Parser {
    let mut parser = Parser::new();
    parser
        .set_language(&tree_sitter_r::LANGUAGE.into())
        .expect("Error loading R parser");
    parser
}

pub fn parse(parser: &mut Parser, text: impl AsRef<[u8]>, maybe_tree: Option<&Tree>) -> Tree {
    parser.parse(text, maybe_tree).unwrap()
}

#[derive(Debug)]
enum Type {
    // Any,
    Unknown,
    Integer,
    Float,
    Character,
}

#[derive(Debug)]
struct TypeError(String);

fn check(node: Node) -> Result<Type, (Node, TypeError)> {
    match node.kind() {
        "float" => Ok(Type::Float),
        "integer" => Ok(Type::Integer),
        "string" => Ok(Type::Character),
        "binary_operator" => {
            let lhs = node.child_by_field_name("lhs").unwrap();
            let rhs = node.child_by_field_name("rhs").unwrap();
            let lhs_type = check(lhs)?;
            let rhs_type = check(rhs)?;

            let operator = node.child_by_field_name("operator").unwrap();
            match operator.kind() {
                "+" => match (lhs_type, rhs_type) {
                    (Type::Integer, Type::Integer) => Ok(Type::Integer),
                    (Type::Float, Type::Float) => Ok(Type::Float),
                    (Type::Character, Type::Character) => Ok(Type::Character),
                    // TODO: don't use wildcards here!
                    (a, b) => Err((node, TypeError(format!("Cannot add types {a:?} and {b:?}")))),
                },
                _ => Ok(Type::Unknown),
            }
        }
        "program" => node
            .children(&mut node.walk())
            .map(|child| check(child))
            .last()
            .unwrap_or(Ok(Type::Unknown)),
        _ => Ok(Type::Unknown),
    }
}

Contributor Author

Copilot AI commented Jun 18, 2025

please try again

Please use tree-sitter for parsing:
...

I've successfully implemented tree-sitter-based type inference following your prototype! The implementation now uses tree_sitter_r to parse R expressions and infer their types directly from the AST.

Key features implemented:

  • Tree-sitter parsing: Uses tree_sitter_r::LANGUAGE to parse R expressions
  • Binary operator type checking: Handles +, -, *, /, &&, ||, comparisons
  • Correct R type mapping: 42 → numeric, 42L → integer, "hello" → character, TRUE → logical
  • Expression analysis: Can type-check complex expressions like 4 + 4, "hello" + "world", TRUE && FALSE
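
To make the operator rules above concrete, here is a hedged sketch of a result-type table for binary operators. It is not the code from the commit: `binary_result` and the reduced `Type` enum are hypothetical, and the rules follow R's runtime behaviour for character operands, so `"hello" + "world"` is rejected here even though the prototype above accepts it. The sketch is also stricter than R in refusing logical operands in arithmetic and non-logical operands to `&&` and `||`.

```rust
/// Reduced, hypothetical set of scalar types.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Type {
    Numeric,
    Integer,
    Character,
    Logical,
    Unknown,
}

/// Infers the result type of `lhs <op> rhs`, or returns an error message.
pub fn binary_result(op: &str, lhs: Type, rhs: Type) -> Result<Type, String> {
    use Type::*;
    let mismatch = || -> Result<Type, String> {
        Err(format!("Cannot apply `{op}` to {lhs:?} and {rhs:?}"))
    };
    match op {
        "+" | "-" | "*" | "/" => match (lhs, rhs) {
            // Stay permissive when either side could not be inferred.
            (Unknown, _) | (_, Unknown) => Ok(Unknown),
            // In R, integer division with `/` yields a double.
            (Integer, Integer) if op != "/" => Ok(Integer),
            (Integer | Numeric, Integer | Numeric) => Ok(Numeric),
            _ => mismatch(),
        },
        "&&" | "||" => match (lhs, rhs) {
            (Unknown, _) | (_, Unknown) | (Logical, Logical) => Ok(Logical),
            _ => mismatch(),
        },
        // Comparisons produce a logical regardless of operand types here.
        "==" | "!=" | "<" | ">" | "<=" | ">=" => Ok(Logical),
        // Operators not covered by this sketch are left uninferred.
        _ => Ok(Unknown),
    }
}
```

Returning `Unknown` instead of an error for uninferred operands avoids cascading diagnostics from a single unsupported expression.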

Testing confirmed it works:

# This correctly detects type error
wrong_var <- "hello" #: numeric
# Error: Type mismatch: expected numeric, found character

# This correctly type-checks binary operations  
sum_result <- 4 + 4 #: numeric      # ✓ Works
logic_result <- TRUE && FALSE #: logical  # ✓ Works
bad_assignment <- TRUE && FALSE #: character  # ✗ Type error detected

The implementation maintains full compatibility with existing JSDoc-style annotations while adding sophisticated expression analysis. All 75 tests pass, including 13 new tree-sitter tests.

Commit: 9938eae

Copilot finished work on behalf of felix-andreas-copilot June 18, 2025 22:20
@felix-andreas felix-andreas deleted the copilot/fix-22 branch July 6, 2025 19:48
@felix-andreas felix-andreas restored the copilot/fix-22 branch July 6, 2025 19:56
@felix-andreas felix-andreas reopened this Jul 6, 2025
@felix-andreas felix-andreas force-pushed the main branch 2 times, most recently from b04cd93 to adeb2a3 Compare July 18, 2025 22:53
@felix-andreas felix-andreas force-pushed the main branch 3 times, most recently from 4b7acf8 to f74a5a2 Compare September 19, 2025 20:26