dts-tree-sitter generates TypeScript .d.ts files for interacting with the AST from a given tree-sitter grammar.
npm i @asgerf/dts-tree-sitter
npx @asgerf/dts-tree-sitter INPUT > OUTPUT.d.ts
Alternative if you prefer to run without npx:
node ./node_modules/@asgerf/dts-tree-sitter/build/src/index.js INPUT > OUTPUT.d.ts
where INPUT is used to locate a node-types.json file in one of the following locations:
- ${INPUT}
- ${INPUT}/node-types.json
- ${INPUT}/src/node-types.json
- node_modules/${INPUT}/src/node-types.json
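The lookup described above can be pictured with a small sketch. This is not the tool's actual implementation: `candidatePaths` and `locateNodeTypes` are hypothetical names, and trying the locations in the listed order is an assumption. The existence check is passed in as a function so the sketch does not depend on the file system.

```typescript
// Hypothetical helper: lists the four locations named above for a given INPUT.
function candidatePaths(input: string): string[] {
  return [
    input,
    `${input}/node-types.json`,
    `${input}/src/node-types.json`,
    `node_modules/${input}/src/node-types.json`,
  ];
}

// Returns the first candidate accepted by `exists`, or undefined if none match.
// Assumption: candidates are tried in the order listed above.
function locateNodeTypes(
  input: string,
  exists: (path: string) => boolean
): string | undefined {
  return candidatePaths(input).find(exists);
}

// Usage with a fake file system holding a single file.
const fakeFiles = new Set([
  "node_modules/tree-sitter-javascript/src/node-types.json",
]);
const found = locateNodeTypes("tree-sitter-javascript", (p) => fakeFiles.has(p));
console.log(found);
```

With this fake file system, only the last candidate exists, which corresponds to passing the name of an installed npm package as INPUT.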
The tree-sitter-javascript grammar can be compiled like this:
npm i tree-sitter-javascript
npx @asgerf/dts-tree-sitter tree-sitter-javascript > generated.d.ts
Alternative if you prefer to run without npx:
node ./node_modules/@asgerf/dts-tree-sitter/build/src/index.js tree-sitter-javascript > generated.d.ts
In the resulting .d.ts file, two of the node types look like this:
export interface ClassDeclarationNode extends SyntaxNodeBase {
  type: SyntaxType.ClassDeclaration;
  bodyNode: ClassBodyNode;
  decoratorNodes?: DecoratorNode[];
  nameNode: IdentifierNode;
}
export interface ClassBodyNode extends SyntaxNodeBase {
  type: SyntaxType.ClassBody;
  memberNodes?: (MethodDefinitionNode | PublicFieldDefinitionNode)[];
}
This can be used like this (see full example):
import * as g from "./generated";
function getMemberNames(node: g.ClassDeclarationNode) {
  const result: string[] = [];
  // memberNodes is declared optional, so fall back to an empty list.
  for (const member of node.bodyNode.memberNodes ?? []) {
    if (member.type === g.SyntaxType.MethodDefinition) {
      result.push(member.nameNode.text);
    } else {
      result.push(member.propertyNode.text);
    }
  }
  return result;
}
Observe TypeScript do its magic: the type check in the if narrows the type of member to MethodDefinitionNode in the 'then' branch, and to PublicFieldDefinitionNode in the 'else' branch.
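The narrowing works because `type` acts as a discriminant of a union. The following self-contained sketch shows the mechanism with simplified stand-ins for the generated declarations; the interfaces here are trimmed-down assumptions, not the real generator output, and the enum string values are illustrative.

```typescript
// Simplified stand-ins for the generated declarations (assumed shapes).
enum SyntaxType {
  MethodDefinition = "method_definition",
  PublicFieldDefinition = "public_field_definition",
}

interface TextNode {
  text: string;
}

interface MethodDefinitionNode {
  type: SyntaxType.MethodDefinition;
  nameNode: TextNode;
}

interface PublicFieldDefinitionNode {
  type: SyntaxType.PublicFieldDefinition;
  propertyNode: TextNode;
}

type Member = MethodDefinitionNode | PublicFieldDefinitionNode;

function memberName(member: Member): string {
  if (member.type === SyntaxType.MethodDefinition) {
    // Here `member` is a MethodDefinitionNode, so `nameNode` is available.
    return member.nameNode.text;
  }
  // Only PublicFieldDefinitionNode remains, so `propertyNode` is available.
  return member.propertyNode.text;
}

const members: Member[] = [
  { type: SyntaxType.MethodDefinition, nameNode: { text: "render" } },
  { type: SyntaxType.PublicFieldDefinition, propertyNode: { text: "state" } },
];
const names = members.map(memberName);
console.log(names);
```

Accessing `member.propertyNode` inside the first branch would be rejected by the compiler, which is the safety the generated types are meant to provide.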
Tree-sitter's TreeCursor allows fast traversal of an AST, and has two properties with correlated types: nodeType and currentNode.
Once you've checked nodeType, it's annoying to have to cast currentNode to the corresponding type right afterwards:
if (cursor.nodeType === g.SyntaxType.Function) {
  let node = cursor.currentNode as g.FunctionNode; // annoying cast
}
There's another way, which is handy in large switches: cast the cursor itself to a TypedTreeCursor before switching on nodeType.
Then the guarded use of currentNode has the expected type. For example:
function printDeclaredNames() {
  let cursor = tree.walk();
  do {
    const c = cursor as g.TypedTreeCursor;
    switch (c.nodeType) {
      case g.SyntaxType.ClassDeclaration:
      case g.SyntaxType.FunctionDeclaration:
      case g.SyntaxType.VariableDeclarator: {
        let node = c.currentNode;
        console.log(node.nameNode.text);
        break;
      }
    }
  } while (gotoPreorderSucc(cursor));
}
- node gets the type ClassDeclarationNode | FunctionDeclarationNode | VariableDeclaratorNode.
- This allows safe access to node.nameNode, since each of those types has a name field.
- We don't pay the cost of invoking currentNode for other types of nodes.
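The example above calls `gotoPreorderSucc`, which is not defined in this section. A minimal sketch of such a helper follows, assuming it moves the cursor to the next node in preorder and returns false once the traversal is complete. It relies only on the cursor methods `gotoFirstChild`, `gotoNextSibling` and `gotoParent`; `CursorLike`, `FakeNode` and `FakeCursor` are hypothetical names used to keep the sketch runnable without a parser.

```typescript
// Minimal subset of the cursor API used by this sketch.
interface CursorLike {
  gotoFirstChild(): boolean;
  gotoNextSibling(): boolean;
  gotoParent(): boolean;
}

// Moves to the preorder successor; returns false when no node is left.
function gotoPreorderSucc(cursor: CursorLike): boolean {
  if (cursor.gotoFirstChild()) return true;
  // No children: climb until some ancestor (or the node itself) has a next sibling.
  while (!cursor.gotoNextSibling()) {
    if (!cursor.gotoParent()) return false;
  }
  return true;
}

// Hypothetical in-memory tree and cursor, used only to exercise the helper.
interface FakeNode {
  name: string;
  children: FakeNode[];
}

class FakeCursor implements CursorLike {
  // Path from the root to the current node, with each node's sibling index.
  private path: { node: FakeNode; index: number }[];
  constructor(root: FakeNode) {
    this.path = [{ node: root, index: 0 }];
  }
  get name(): string {
    return this.path[this.path.length - 1].node.name;
  }
  gotoFirstChild(): boolean {
    const top = this.path[this.path.length - 1];
    if (top.node.children.length === 0) return false;
    this.path.push({ node: top.node.children[0], index: 0 });
    return true;
  }
  gotoNextSibling(): boolean {
    if (this.path.length < 2) return false;
    const top = this.path[this.path.length - 1];
    const parent = this.path[this.path.length - 2].node;
    const next = top.index + 1;
    if (next >= parent.children.length) return false;
    this.path[this.path.length - 1] = { node: parent.children[next], index: next };
    return true;
  }
  gotoParent(): boolean {
    if (this.path.length < 2) return false;
    this.path.pop();
    return true;
  }
}

const leaf = (name: string): FakeNode => ({ name, children: [] });
const root: FakeNode = {
  name: "program",
  children: [{ name: "class", children: [leaf("name"), leaf("body")] }, leaf("func")],
};

const fake = new FakeCursor(root);
const visited: string[] = [];
do {
  visited.push(fake.name);
} while (gotoPreorderSucc(fake));
console.log(visited.join(","));
```

The do/while shape mirrors the example above: the current node is handled first, then the cursor advances, so the root is visited exactly once.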
This happens if you compare types from the general tree-sitter.d.ts file with those from the generated .d.ts file.
Every type from tree-sitter.d.ts has a stronger version in the generated file; make sure you don't mix and match.
This can happen if the grammar contains rules and literals with the same name. For example this grammar rule,
func: $ => seq('func', $.name, $.body)
will produce a named node with type func, while the 'func' literal will produce an unnamed node with type func as well.
This means a check like node.type === 'func' is not an exact type check, and the type of node will only be restricted to FuncNode | UnnamedNode<'func'>. This is not a bug in the generated .d.ts file: there really are two kinds of nodes you need to handle after that check.
Some possible solutions are:
- Change the grammar to avoid rules with the same name as a keyword.
- Write the check as node.isNamed && node.type === 'func'.
- Change the declared type of node from SyntaxNode to NamedNode.
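The second option can be illustrated with a self-contained sketch. The interfaces below are simplified stand-ins, assuming the generated types give `isNamed` a literal `true` or `false` type so that it can serve as a discriminant; `NameNode` and `describe` are hypothetical names used for illustration.

```typescript
// Simplified stand-ins for the generated types (assumed shapes).
interface FuncNode {
  type: "func";
  isNamed: true;
  text: string;
  nameNode: { text: string };
}

interface NameNode {
  type: "name";
  isNamed: true;
  text: string;
}

interface UnnamedNode<T extends string> {
  type: T;
  isNamed: false;
  text: string;
}

type SyntaxNode = FuncNode | NameNode | UnnamedNode<"func">;

function describe(node: SyntaxNode): string {
  if (node.isNamed && node.type === "func") {
    // Both checks together leave only FuncNode, so nameNode is accessible.
    return `function ${node.nameNode.text}`;
  }
  if (node.type === "func") {
    // The named case was handled above; this is the 'func' keyword token.
    return `keyword ${node.text}`;
  }
  return `other ${node.type}`;
}

const named = describe({
  type: "func",
  isNamed: true,
  text: "func f {}",
  nameNode: { text: "f" },
});
const keyword = describe({ type: "func", isNamed: false, text: "func" });
console.log(named, "|", keyword);
```

Checking `node.type === "func"` alone would leave both `FuncNode` and `UnnamedNode<"func">` in play, so `node.nameNode` would not type-check without the extra `isNamed` test.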