SQuaLity is a tool that unifies test suites from different Database Management Systems (DBMS). It performs cross-compatibility testing by running test cases from one DBMS on another, helping identify compatibility issues between different SQL dialects.
Requirements:
- Python >= 3.10
- Run `pip3 install -r requirements.txt` to install dependencies
- DBMS server setup
The following commands will clone SQuaLity, install the required packages, and run a single test case (executing an SQL Logic Test (SLT) test case on DuckDB using the DuckDB Python connector). SQuaLity logs the execution status in `logs/debug.log` and outputs results to `output/duckdb_sqlite_debug_results.csv`.
```bash
git clone git@github.com:suyZhong/SQuaLity.git
cd SQuaLity
pip3 install -r requirements.txt
./demo.sh
```
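If the demo completes, a quick sanity check is to look at the end of the log and at the results file. The snippet below is a minimal sketch using pandas; it only prints the row count and column names, since the exact CSV schema is not documented here.

```python
from pathlib import Path

import pandas as pd

# Print the last few lines of the debug log to confirm the run finished.
log_lines = Path("logs/debug.log").read_text().splitlines()
print("\n".join(log_lines[-5:]))

# Load the demo results and report how many test records were produced.
results = pd.read_csv("output/duckdb_sqlite_debug_results.csv")
print(f"{len(results)} result rows, columns: {list(results.columns)}")
```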
The test suites are stored in the `$DBMS_suites` folders and are not included in this artifact. The following commands download the latest original test suites from their official repositories:

```bash
cd SQuaLity
./scripts/install_test.sh
```

The original test cases are distributed under different licenses. For more information, please refer to the following:
- SQLite is in the public domain and does not require a license.
- DuckDB is licensed under the MIT License.
- PostgreSQL is licensed under the PostgreSQL License.
- MySQL is licensed under version 2 of the GNU General Public License (GPLv2).
For reproducibility purposes, the test suites used in our paper should be downloaded from this link. However, you can also experiment with the latest versions of the test suites to explore new findings.
We use Python scripts to analyze the test suites.
First, extract test cases from the test suites into our unified format:
```bash
python3 scripts/extract_testcases.py -s all
```

Then analyze the test suites:

```bash
python3 scripts/analyze_test_cases.py -m $MODE -o $OUTPUT_DIR
```
The `MODE` parameter specifies the analysis mode:
- `length`: Count the lines of code (LOC) for each test case (RQ1)
- `dist`: Count the distribution of overall SQL statements (RQ2)
- `select`: Count the distribution of SELECT statements (RQ2)
- `join`: Count the distribution of JOIN statements (RQ2)
Example: `python3 scripts/analyze_test_cases.py -m length -o output`
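To reproduce RQ1 and RQ2 in one pass, the analysis script can be invoked once per mode. A minimal sketch (the modes and flags are exactly those listed above; `output` is just an example output directory):

```python
import subprocess

MODES = ["length", "dist", "select", "join"]  # RQ1 and RQ2 analysis modes

for mode in MODES:
    # Equivalent to: python3 scripts/analyze_test_cases.py -m <mode> -o output
    subprocess.run(
        ["python3", "scripts/analyze_test_cases.py", "-m", mode, "-o", "output"],
        check=True,
    )
```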
Note: For RQ2, the analysis might take a long time due to SQLite's large test suite. We plan to parallelize the analysis in future versions.
The following command runs SQuaLity on a specific DBMS using a specific test suite. Results are stored in `output/$DBMS_$SUITE_results.csv`.
```bash
python3 main.py --dbms $DBMS --s $SUITE [-f DB_NAME] --dump_all --filter --log INFO
```

Example 1: Run SQuaLity on DuckDB using the PostgreSQL test suite:

```bash
python3 main.py --dbms duckdb --s postgresql -f output/testpgdb --dump_all --filter --log INFO
```

Example 2: Run SQuaLity on MySQL using the same test suite (this requires setting up a MySQL server and configuring the connection in `./config/config.json`):

```bash
python3 main.py --dbms mysql --s postgresql -f output/testpgdb --dump_all --filter --log INFO
```
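Before pointing SQuaLity at MySQL, it can help to verify that the server is reachable with the credentials you configured in `./config/config.json`. The sketch below uses the `mysql-connector-python` package; the host, port, user, and password values are placeholders, not SQuaLity defaults.

```python
import mysql.connector  # pip3 install mysql-connector-python

# Placeholder credentials -- use the same values as in ./config/config.json.
conn = mysql.connector.connect(
    host="127.0.0.1", port=3306, user="root", password="password"
)
cursor = conn.cursor()
cursor.execute("SELECT VERSION()")
print("Connected to MySQL", cursor.fetchone()[0])
cursor.close()
conn.close()
```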
After running the test suites, results are stored in the `output` directory and logs are stored in the `logs` directory:

```
.
├── logs
│   ├── $DBMS_$SUITE-$date.log
│   ├── $DBMS_$SUITE-$date.log.out
├── output
│   ├── $DBMS_$SUITE_filter_logs.csv
│   ├── $DBMS_$SUITE_filter_results.csv
...
```
The `*.log` files contain detailed information about the test cases that succeeded or failed. The `*.log.out` files contain execution summaries of the test cases. The `*results.csv` files contain the execution results for each test case. The `*logs.csv` files contain the SQL statements used to build the schemas for failing test cases.
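For a quick overview of a run, the per-test-case results can be aggregated with pandas. Because the exact column names of the results CSV are not specified here, the sketch below just lists the columns and tallies the values of each non-numeric column; whichever column encodes the execution status will show the pass/fail breakdown. The file name is an example following the `output/$DBMS_$SUITE_filter_results.csv` pattern.

```python
import pandas as pd

# Example path following the output/$DBMS_$SUITE_filter_results.csv pattern.
results = pd.read_csv("output/duckdb_postgresql_filter_results.csv")
print(results.columns.tolist())

# Tally the distinct values of every non-numeric column.
for col in results.select_dtypes(include="object"):
    print(results[col].value_counts().head(), "\n")
```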
We use Jupyter notebooks to analyze the test suite results.
```
.
├── scripts
│   ├── RQ3-SampleDuckDB.ipynb
│   ├── RQ3-SamplePostgres.ipynb
│   ├── RQ4-BugAnalysisDuckDBTest.ipynb
│   ├── RQ4-BugAnalysisPGTest.ipynb
│   ├── RQ4-BugAnalysisSLT.ipynb
...
```

Execute the Jupyter notebooks to analyze the test suite results. Manual analysis is required to understand the compatibility issues between different SQL dialects. The analysis generally includes the following steps:
- Load the test suite results.
- Filter the failed test cases using regular expressions.
- Sample the remaining failed test cases; sampled test cases are stored in `output/$DBMS_$SUITE_sample_100.csv` (a minimal sketch of the filtering and sampling steps follows this list).
- Manually analyze the sampled failed test cases to identify compatibility issues between different SQL dialects:
  - For each sampled test case, analyze the error reason and update the `ERROR_REASON` column in the CSV file.
  - Summarize the compatibility issues in `data/$SUITE_suite_errors.csv`.
- Export analysis statistics.
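A minimal pandas sketch of the filtering and sampling steps above. The `STATUS` and `ERROR_MSG` column names and the noise pattern are assumptions for illustration; consult the notebooks for the actual schema and regular expressions.

```python
import re

import pandas as pd

# Example path following the output/$DBMS_$SUITE_filter_results.csv pattern.
results = pd.read_csv("output/duckdb_postgresql_filter_results.csv")

# Hypothetical column names -- adjust to the actual results schema.
failed = results[results["STATUS"] == "FAIL"]

# Drop failures whose error message matches a known, uninteresting pattern.
noise = re.compile(r"syntax error|not supported", re.IGNORECASE)
failed = failed[~failed["ERROR_MSG"].fillna("").str.contains(noise)]

# Sample up to 100 of the remaining failures for manual inspection
# (example name following the output/$DBMS_$SUITE_sample_100.csv pattern).
sample = failed.sample(n=min(100, len(failed)), random_state=0)
sample.to_csv("output/duckdb_postgresql_sample_100.csv", index=False)
```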