Abstract Syntax Trees
An abstract syntax tree (AST) is a data structure used to represent the contents of a file, where each node in the tree denotes a construct in the source code. For example, a node might represent a method declaration, with child nodes that provide information on accessibility, return type, method name, parameters, and the method body, which itself has child nodes representing the method instructions and any nested code blocks. Equally, a node might represent whitespace, a comment, or a string literal.
The PSI defines common base interfaces for creating an AST. These interfaces are shared across all languages that ReSharper supports, which makes it possible to share information across files and languages. ReSharper provides utility methods for walking and manipulating the tree, including adding, deleting, and moving nodes.
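As a rough illustration of walking and manipulating such a tree, the following Python sketch models a small tree for a method declaration. It is a toy model under stated assumptions, not the ReSharper API: the `Node` class and its `add`, `remove`, `descendants`, and `get_text` methods are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass(eq=False)  # identity equality, so remove() detaches exactly this node
class Node:
    """Hypothetical tree node: leaves carry source text, composites carry children."""
    kind: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def add(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return self  # returning self allows chained add() calls

    def remove(self) -> None:
        """Detach this node from its parent."""
        if self.parent is not None:
            self.parent.children.remove(self)
            self.parent = None

    def descendants(self) -> Iterator["Node"]:
        """Depth-first, pre-order walk over this node and everything below it."""
        yield self
        for child in self.children:
            yield from child.descendants()

    def get_text(self) -> str:
        """Leaves return their own text; composites concatenate their children."""
        if not self.children:
            return self.text
        return "".join(c.get_text() for c in self.children)


# Build a tree for: void Run() { /* todo */ }
body = Node("Block")
body.add(Node("LBrace", "{")).add(Node("Whitespace", " "))
body.add(Node("Comment", "/* todo */")).add(Node("Whitespace", " "))
body.add(Node("RBrace", "}"))

method = Node("MethodDeclaration")
method.add(Node("ReturnType", "void")).add(Node("Whitespace", " "))
method.add(Node("Identifier", "Run")).add(Node("ParameterList", "()"))
method.add(Node("Whitespace", " ")).add(body)

before = method.get_text()

# Walk the tree, collect the comment nodes, then delete them.
comments = [n for n in method.descendants() if n.kind == "Comment"]
for comment in comments:
    comment.remove()

after = method.get_text()
```

Because whitespace and comments are nodes too, concatenating the leaves reproduces the source text exactly, and removing a node changes the text that the tree represents.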
Each node in the tree implements the ITreeNode interface, and the root file node also implements the IFile interface. Each supported file type is parsed and an AST is constructed. The AST maps directly to the underlying text buffer, and the PSI is responsible for keeping the two in sync: it updates the text buffer when the tree is manipulated, and updates the tree when the text file changes. ReSharper supports error-resilient and incremental parsers. Error resilience means that a valid AST is generated even if the code is invalid, with special error nodes indicating where the syntax is incorrect. Incremental parsing means that, whenever possible, ReSharper re-parses only the code blocks that have changed, rather than the whole file. For example, when a C# method body changes, only the method body is re-parsed, and not the whole containing file.
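The two parser properties can be sketched with a deliberately tiny language of `name = number;` statements. This is a minimal illustration, not how ReSharper's parsers are implemented: the `Node`, `parse`, and `reparse` names are hypothetical, and a real incremental parser works on nested blocks rather than a flat list of statements.

```python
import re
from typing import List, NamedTuple

# A valid statement in the toy language: optional whitespace, name = digits ;
ASSIGNMENT = re.compile(r"\s*[A-Za-z_]\w*\s*=\s*\d+\s*;")


class Node(NamedTuple):
    kind: str  # "Assignment" or "Error"
    text: str  # exact source text covered by the node


def split_statements(source: str) -> List[str]:
    """Split after each ';', keeping the delimiter so no text is lost."""
    return re.findall(r"[^;]*;|[^;]+$", source)


def parse_statement(text: str) -> Node:
    """Invalid input becomes an Error node instead of aborting the parse."""
    kind = "Assignment" if ASSIGNMENT.fullmatch(text) else "Error"
    return Node(kind, text)


def parse(source: str) -> List[Node]:
    return [parse_statement(part) for part in split_statements(source)]


def reparse(nodes: List[Node], index: int, new_text: str) -> List[Node]:
    """Re-parse only the statement at `index`; all other nodes are reused."""
    return nodes[:index] + parse(new_text) + nodes[index + 1:]


source = "x = 1; y = ; z = 3;"
tree = parse(source)                 # the middle statement is invalid
fixed = reparse(tree, 1, " y = 2;")  # the user repairs only that statement
```

Here `tree` still covers every character of `source`, with an `Error` node marking the broken statement, and `fixed` contains the very same node objects as `tree` for the statements that did not change.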
ReSharper's ASTs are based primarily on file contents, but ReSharper can also work with secondary and injected ASTs, commonly referred to as secondary and injected PSIs.
A secondary PSI is a PSI syntax tree built from a secondary file, which is an in-memory "code-behind" file generated by ReSharper from the primary code file. Areas of the secondary PSI are mapped to the original file. This is how ReSharper handles .aspx and Razor files. It maintains a primary PSI tree of the main file, but also generates an in-memory representation of the file that is generated when the .aspx or Razor template is compiled. (This generated file is referred to by ReSharper as a code-behind file, but shouldn't be confused with ASP.NET code-behind files.) A second PSI tree is created for the generated C# file, and the areas of C# in the .aspx or Razor file are mapped to the equivalent areas in the generated file. Because the generated file has the full context of the class it belongs to, code completion and navigation become possible simply by mapping between the areas of the two files.
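The mapping between areas of the two files can be pictured as a list of offset ranges. The sketch below is only an illustration of that idea: the `MappedRange` type, the `generate` function, the `@expression` template syntax, and the shape of the generated class are all assumptions made for this example, not what ReSharper or the Razor compiler actually produce.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class MappedRange:
    primary_start: int    # offset of the island in the template
    generated_start: int  # offset of the same text in the generated file
    length: int


def generate(template: str) -> Tuple[str, List[MappedRange]]:
    """Build a toy 'code-behind' file and record where each island ended up."""
    header = "class Page { void Render() { "
    parts = [header]
    ranges: List[MappedRange] = []
    pos = len(header)
    for match in re.finditer(r"@([A-Za-z_][\w.]*)", template):
        expr = match.group(1)
        prefix = "Write("
        ranges.append(MappedRange(match.start(1), pos + len(prefix), len(expr)))
        piece = prefix + expr + "); "
        parts.append(piece)
        pos += len(piece)
    parts.append("} }")
    return "".join(parts), ranges


def to_generated(ranges: List[MappedRange], offset: int) -> Optional[int]:
    """Translate a template offset into the generated file, if it is mapped."""
    for r in ranges:
        if r.primary_start <= offset < r.primary_start + r.length:
            return r.generated_start + (offset - r.primary_start)
    return None  # markup outside any island has no counterpart


def to_primary(ranges: List[MappedRange], offset: int) -> Optional[int]:
    """Translate a generated-file offset back into the template."""
    for r in ranges:
        if r.generated_start <= offset < r.generated_start + r.length:
            return r.primary_start + (offset - r.generated_start)
    return None


template = "<p>@Model.Name</p>"
generated, ranges = generate(template)

caret = template.index("Name")              # caret position in the template
generated_caret = to_generated(ranges, caret)
```

A feature such as code completion would translate the caret position into the generated file, do its work there with the full class context available, and translate any resulting offsets back with `to_primary`.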
An injected PSI serves a different purpose. Instead of providing extra context for a whole file, or for "islands" of content within a file, it creates a new AST by parsing the contents of a single node of a host AST. For example, ReSharper 9.0 introduces support for regular expressions by creating an AST for the regular expression represented by a string literal in a C# file. The intent of injected PSIs is to handle "embedded DSLs", that is, languages written inside a string literal of another language, such as regular expressions, SQL, or even AngularJS expressions.
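To make the idea concrete, the following sketch takes the contents of a single string literal in a line of host code and parses it into its own small tree, keeping every offset relative to the host document. It is a simplified model and not ReSharper's implementation: `RegexNode` and `parse_regex` are hypothetical names, the parser understands only literal characters, groups, and the `*`, `+`, and `?` quantifiers, and string escapes are ignored.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RegexNode:
    kind: str   # "Sequence", "Group", "Quantifier" or "Char"
    start: int  # start offset in the host document
    end: int    # end offset (exclusive) in the host document
    children: List["RegexNode"] = field(default_factory=list)


def parse_regex(pattern: str, base: int) -> RegexNode:
    """Parse a tiny regex subset; `base` is the host offset of pattern[0]."""
    pos = 0

    def sequence(inside_group: bool) -> List[RegexNode]:
        nonlocal pos
        items: List[RegexNode] = []
        while pos < len(pattern):
            ch = pattern[pos]
            if ch == ")" and inside_group:
                break  # the caller consumes the closing parenthesis
            if ch == "(":
                start = pos
                pos += 1
                inner = sequence(True)
                if pos < len(pattern):
                    pos += 1  # consume ")"; an unclosed group ends at the end
                items.append(RegexNode("Group", base + start, base + pos, inner))
            elif ch in "*+?" and items:
                pos += 1
                target = items.pop()  # the quantifier wraps the previous item
                items.append(
                    RegexNode("Quantifier", target.start, base + pos, [target]))
            else:
                items.append(RegexNode("Char", base + pos, base + pos + 1))
                pos += 1
        return items

    return RegexNode("Sequence", base, base + len(pattern), sequence(False))


# Host document: one line of C#-like code containing a string literal.
host = 'var r = new Regex("(ab)+c");'
open_quote = host.index('"')
close_quote = host.index('"', open_quote + 1)

# The injected tree covers only the text between the quotes.
injected = parse_regex(host[open_quote + 1:close_quote], open_quote + 1)
quantifier = injected.children[0]
```

Because each injected node stores host-document offsets, a slice such as `host[quantifier.start:quantifier.end]` yields the text of that node, which is the kind of mapping that highlighting or navigation inside the literal would rely on.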