[Improvement] The execution time of spark connector is 4 times that of native spark3.3.2 when running tpcds sql99 #7048

@yangyuxia

Description

What would you like to be improved?

In a comparative test on a 1000 GB TPC-DS dataset, execution through the Gravitino Spark connector took 4 times as long as on native Spark 3.3.2.
Comparing the physical execution plans of the two shows that the Gravitino Spark connector plan does not apply the predicate pushdown or dynamic partition pruning optimizations.

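spark-gravitino-tpcds性能对比测试.xlsx (Spark vs. Gravitino TPC-DS performance comparison test)
tpcds-query1-gravitino执行计划.txt (TPC-DS query 1 execution plan, Gravitino)
tpcds-query1-spark执行计划.txt (TPC-DS query 1 execution plan, Spark)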
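
For anyone who wants to repeat the plan comparison on other queries, here is a minimal sketch that scans two saved physical plans for signs of predicate pushdown and dynamic partition pruning. It assumes the plans are plain-text `EXPLAIN` output and that the optimizations show up as `PushedFilters: [...]`, `PartitionFilters: [...]`, and `dynamicpruning` entries, which is typical for Spark 3.x but may differ by scan type. The two sample strings and the function names are hypothetical illustrations, not content from the attached files.

```python
import re

# Markers assumed from typical Spark 3.x EXPLAIN output; adjust for your plans.
# The negative lookahead skips empty lists such as "PushedFilters: []".
MARKERS = {
    "pushed_filters": re.compile(r"PushedFilters: \[(?!\])"),
    "partition_filters": re.compile(r"PartitionFilters: \[(?!\])"),
    "dynamic_pruning": re.compile(r"dynamicpruning", re.IGNORECASE),
}


def summarize_plan(plan_text):
    """Count how often each optimization marker appears in a plan."""
    return {name: len(p.findall(plan_text)) for name, p in MARKERS.items()}


def missing_in_connector(native_plan, connector_plan):
    """List markers present in the native plan but absent from the connector plan."""
    native = summarize_plan(native_plan)
    connector = summarize_plan(connector_plan)
    return [n for n in MARKERS if native[n] > 0 and connector[n] == 0]


# Hypothetical plan fragments for illustration only.
NATIVE_SAMPLE = (
    "FileScan parquet tpcds.store_returns[sr_customer_sk#1,sr_returned_date_sk#2] "
    "PartitionFilters: [isnotnull(sr_returned_date_sk#2), "
    "dynamicpruningexpression(sr_returned_date_sk#2 IN dynamicpruning#9)], "
    "PushedFilters: [IsNotNull(sr_customer_sk)]"
)
CONNECTOR_SAMPLE = (
    "BatchScan tpcds.store_returns[sr_customer_sk#1,sr_returned_date_sk#2] "
    "PartitionFilters: [], PushedFilters: []"
)

if __name__ == "__main__":
    print(missing_in_connector(NATIVE_SAMPLE, CONNECTOR_SAMPLE))
```

To use it on real plans, read each saved plan file into a string and pass the two strings to `missing_in_connector`. A text scan like this only flags candidates; the plans still need to be read by hand to confirm where the optimization was dropped.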

How should we improve?

No response


    Labels

    improvement (Improvements on everything)
