Change match order to avoid inefficient comparisons #114

@octogonz

I ran into #111 today, where ESLint crashes because esquery does not recognize some TypeScript node types. It raises a separate question about query strategy, though:

In my case, the selector `Program > :first-child` is meant to match the very first node in the AST. I was surprised to find that the crash happens while this code is looking for the `:first-child` of a `TSTypeAnnotation` very deep in the tree:

esquery/esquery.js, lines 301 to 317 in e27e73d:

     * @param {external:AST} node
     * @param {external:AST[]} ancestry
     * @param {IndexFunction} idxFn
     * @returns {boolean}
     */
    function nthChild(node, ancestry, idxFn) {
        const [parent] = ancestry;
        if (!parent) { return false; }
        const keys = estraverse.VisitorKeys[parent.type];
        for (const key of keys) {
            const listProp = parent[key];
            if (Array.isArray(listProp)) {
                const idx = listProp.indexOf(node);
                if (idx >= 0 && idx === idxFn(listProp.length)) { return true; }
            }
        }
        return false;
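
To make the failure mode concrete, here is a minimal sketch of the same loop run against a parent whose type is missing from the key table. The `VisitorKeys` object below is a hypothetical stand-in for `estraverse.VisitorKeys`, passed in as an extra parameter so the snippet runs on its own; the node objects are made up for illustration.

```javascript
// Hypothetical stand-in for estraverse.VisitorKeys: only ESTree types are listed.
const VisitorKeys = {
    Program: ['body'],
    ExpressionStatement: ['expression']
};

// Same logic as the nthChild excerpt above, with the key table passed in.
function nthChild(node, ancestry, idxFn, visitorKeys) {
    const [parent] = ancestry;
    if (!parent) { return false; }
    const keys = visitorKeys[parent.type]; // undefined for unknown types
    for (const key of keys) {              // throws TypeError when keys is undefined
        const listProp = parent[key];
        if (Array.isArray(listProp)) {
            const idx = listProp.indexOf(node);
            if (idx >= 0 && idx === idxFn(listProp.length)) { return true; }
        }
    }
    return false;
}

const firstChild = () => 0; // index function for :first-child

// Known parent type: the lookup works.
const stmt = { type: 'ExpressionStatement' };
const program = { type: 'Program', body: [stmt] };
const known = nthChild(stmt, [program], firstChild, VisitorKeys);

// Unknown parent type: the for...of loop throws before any comparison happens.
const typeRef = { type: 'TSTypeReference' };
const annotation = { type: 'TSTypeAnnotation', typeAnnotation: typeRef };
let crash = null;
try {
    nthChild(typeRef, [annotation], firstChild, VisitorKeys);
} catch (err) {
    crash = err;
}
```

Here `known` is `true` and `crash` is a `TypeError`, because `keys` is `undefined` for the `TSTypeAnnotation` parent.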

Why should this comparison be performed at all? I expected `Program` to trivially fail to match `TSTypeAnnotation`, so that `:first-child` would never even be tested.

It seems that the query evaluation for `>` (`'child'`) tests the right side `:first-child` BEFORE testing the left side `Program`:

esquery/esquery.js, lines 131 to 135 in e27e73d:

    case 'child':
        if (matches(node, selector.right, ancestry)) {
            return matches(ancestry[0], selector.left, ancestry.slice(1));
        }
        return false;

Is there a good reason for that?

Logically it is like: "Is there a tire? --> Is the tire on a wheel? --> Is the wheel on a vehicle? --> Is the vehicle a motorcycle?"

Wouldn't it be better to first check for a motorcycle, before we go inspecting every tire?

The error goes away if I reorder the tests like this:

    case 'child':
        if (matches(ancestry[0], selector.left, ancestry.slice(1))) {
            return matches(node, selector.right, ancestry);
        }
        return false;

Should we make this change? Is there any downside?
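
To illustrate the difference in work done, here is a toy sketch, not esquery's actual implementation. It assumes a simplified tree where every node keeps its children in a hypothetical `children` array, supports only three selector kinds, and counts how often the `:first-child` test runs under each ordering. The names `makeMatcher` and `query` are invented for this example.

```javascript
// Toy matcher: supports only 'identifier' (node type), 'first-child', and 'child'.
function makeMatcher(leftFirst) {
    const counts = { firstChildChecks: 0 };
    function matches(node, selector, ancestry) {
        if (!node) { return false; }
        switch (selector.type) {
            case 'identifier':
                return node.type === selector.value;
            case 'first-child': {
                counts.firstChildChecks++;
                const [parent] = ancestry;
                return Boolean(parent) && Array.isArray(parent.children) &&
                    parent.children[0] === node;
            }
            case 'child':
                if (leftFirst) {
                    // Proposed order: test the parent first, then the node.
                    return matches(ancestry[0], selector.left, ancestry.slice(1)) &&
                        matches(node, selector.right, ancestry);
                }
                // Current order: test the node first, then the parent.
                return matches(node, selector.right, ancestry) &&
                    matches(ancestry[0], selector.left, ancestry.slice(1));
            default:
                throw new Error(`Unsupported selector type: ${selector.type}`);
        }
    }
    return { matches, counts };
}

// Depth-first walk that collects matching nodes and reports the check count.
function query(root, selector, leftFirst) {
    const { matches, counts } = makeMatcher(leftFirst);
    const results = [];
    (function visit(node, ancestry) {
        if (matches(node, selector, ancestry)) { results.push(node); }
        for (const child of node.children || []) {
            visit(child, [node, ...ancestry]);
        }
    })(root, []);
    return { results, checks: counts.firstChildChecks };
}

// Six nodes in total; only two of them are direct children of Program.
const leaf = { type: 'Identifier' };
const inner = { type: 'BlockStatement', children: [leaf] };
const first = { type: 'FunctionDeclaration', children: [inner, { type: 'Identifier' }] };
const tree = { type: 'Program', children: [first, { type: 'EmptyStatement' }] };

// Equivalent of `Program > :first-child`.
const selector = {
    type: 'child',
    left: { type: 'identifier', value: 'Program' },
    right: { type: 'first-child' }
};

const rightFirst = query(tree, selector, false); // checks: 6
const leftFirst = query(tree, selector, true);   // checks: 2
```

Both orderings return the same single node in this sketch, since the two tests are combined as a plain conjunction. The left-first version simply skips the `:first-child` test for every node whose parent is not a `Program`. Which order is cheaper in general depends on the selector, so this only shows the case described above.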
