Major Performance Difference: SQL vs. Cypher for Aggregation/Ordering

I am executing a performance benchmarking for Apache AGE by using [goodreads dataset](https://cseweb.ucsd.edu/~jmcauley/datasets/goodreads.html). I've found a significant performance gap between a direct SQL query and its AGE/Cypher equivalent, specifically with aggregation, grouping, and ordering on a large dataset.

Here is my graph design:

<img width="757" height="443" alt="Image" src="https://github.com/user-attachments/assets/7c4d3628-e6eb-42b4-9bab-b4000f6ce643" />

Fast Direct PostgreSQL Query (3 seconds):
`select count(*), u.id from "User" u, "HAS_INTERACTION" h, "Book" b where u.id = h.start_id and b.id = h.end_id GROUP BY u.id ORDER BY 1 DESC LIMIT 10;`

Slow AGE/Cypher Query (50+ seconds):
`SELECT * FROM cypher('goodreads_graph', $$ MATCH (u:User)-[:HAS_INTERACTION]->() RETURN u.user_id, count(*) ORDER BY count(*) DESC LIMIT 10 $$) AS (user_id agtype, interaction_count agtype);`

I expect AGE/Cypher to be much closer in performance to direct SQL. Currently, the Cypher query is over 15x slower, which is a major issue.

I believe my sample Cypher query is similar to, and aligns with, [the sorting on aggregate functions sample](https://age.apache.org/age-manual/master/intro/aggregation.html#sorting-on-aggregate-functions) in the official AGE documentation. Is this a bug that will be addressed in an upcoming release?

I'm looking for guidance on how to optimize this Cypher query to achieve performance more comparable to the direct PostgreSQL query. Any recommendations on AGE-specific best practices, indexing for aggregations, or relevant configuration tuning would be greatly appreciated. 

For reference, when the ORDER BY clause is removed from the AGE/Cypher query, its execution time significantly improves.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Major Performance Difference: SQL vs. Cypher for Aggregation/Ordering #2194

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Major Performance Difference: SQL vs. Cypher for Aggregation/Ordering #2194

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions