A Better R Programming Experience Thanks to Tree-sitter

Tree-sitter is changing how R developers interact with code through fast, incremental parsing. From better code search on GitHub to editor features in the Positron IDE, it underpins a growing ecosystem of tools for the R community.
Thursday, April 2, 2026. From rOpenSci. Except where otherwise noted, content on this site is licensed under the CC-BY license.
A little less than two years ago, building on work by Jim Hester and Kevin Ushey, Davis Vaughan completed a very impactful JavaScript file for the R community: an R grammar for the Tree-sitter parser generator. He even got a round of applause for it during a talk at the useR! 2024 conference! So, did he get cheered for… grammatical rules in a JavaScript file? 😅
No, the audience was excited about the improved developer experience for R that this file unlocked. In this post, we’ll explain what Tree-sitter is, and how tools built on Tree-sitter can benefit your R development workflow.
What is Tree-sitter?
Tree-sitter is a parser generator written in C, with bindings available in several languages, including Rust (and R!).
What does it mean to parse code? Basically, given a string of code like a <- mean(x, na.rm = TRUE), how do you know that mean is a function name, na.rm an argument name, and TRUE a logical? You have to parse that code into what’s called a parse tree.
R itself can obviously parse R code, thanks to its grammar. You can use parse() and getParseData() to see how R breaks down code into tokens like SYMBOL_FUNCTION_CALL or LEFT_ASSIGN.
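The token breakdown described above can be reproduced with a few lines of base R. This is a minimal sketch: the variable names (code, expr, parse_data, tokens) are illustrative and not taken from the original post.

```r
# Parse a single line of R code. keep.source = TRUE is needed so that
# getParseData() has source references to work with.
code <- "a <- mean(x, na.rm = TRUE)"
expr <- parse(text = code, keep.source = TRUE)

# getParseData() returns a data frame with one row per node of the parse tree.
parse_data <- utils::getParseData(expr)

# Keep only terminal nodes (the actual tokens) and show their type and text.
tokens <- parse_data[parse_data$terminal, c("token", "text")]
print(tokens, row.names = FALSE)
```

In the printed table, mean is tagged SYMBOL_FUNCTION_CALL, <- is tagged LEFT_ASSIGN, and na.rm is tagged SYMBOL_SUB, which is how R marks a named argument.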
Why Tree-sitter Matters
Tree-sitter performs the same kind of parsing, but faster, largely thanks to its support for incremental parsing: after an edit, it reuses the unchanged parts of the existing syntax tree instead of re-parsing the whole file. That is key to keeping the syntax tree up to date as you type in your editor! Tree-sitter is also language-agnostic, in that it can parse any code as long as there is a grammar for it.
To have Tree-sitter “learn” a new language, you need to give it a file containing the definition of that language's syntax, which is called a grammar. The tree-sitter-r repo, which provides a translation of the R grammar into the format expected by Tree-sitter, is the foundation of all the tools presented in this post.
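As a rough illustration of the R bindings mentioned earlier, the sketch below parses the same line of code with Tree-sitter. It assumes the treesitter and treesitter.r packages are installed (the latter supplies the R grammar). The requireNamespace() guard and the variable names are additions for this example, not part of the original post.

```r
# Assumption: the treesitter and treesitter.r packages are installed.
# The guard makes the script a no-op when they are not available.
if (requireNamespace("treesitter", quietly = TRUE) &&
    requireNamespace("treesitter.r", quietly = TRUE)) {
  # Load the R grammar and build a parser for it.
  language <- treesitter.r::language()
  parser <- treesitter::parser(language)

  # Parse a string of R code into a Tree-sitter syntax tree.
  tree <- treesitter::parser_parse(parser, "a <- mean(x, na.rm = TRUE)")
  root <- treesitter::tree_root_node(tree)

  # Print the tree as a nested S-expression for inspection.
  treesitter::node_show_s_expression(root)
}
```

The printed S-expression shows the nesting of nodes under the root, which can be compared with the flat token table that getParseData() produces for the same code.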
Practical Applications
The real reason the audience applauded Davis Vaughan is that he explained how the R grammar for Tree-sitter had been deployed to GitHub, giving us a much better experience when browsing R code. If we search for a function name in a repository, its definition is now highlighted in the search results.
Ark, the R kernel used in the Positron IDE, also relies on Tree-sitter. Thanks to it, you get:
- Autocompletion and help on hover: Smarter suggestions based on code structure.
- Selection expansion: Easily selecting logical blocks of code, such as steps in a pipeline.
Conclusion
The integration of Tree-sitter into the R ecosystem is more than just a technical update; it is a leap forward in developer productivity. By providing a fast and accurate way for machines to "understand" R code structure, Tree-sitter is laying the groundwork for a new generation of intelligent programming tools.
Source: Hacker News