@lichuang lichuang commented Jan 4, 2026

This PR adds six new test cases to improve test coverage for the write functionality in Lance:

Changes

  • test_max_rows_per_file - Tests the max_rows_per_file parameter behavior

    • Validates that data is correctly split into multiple fragments based on row count limits
    • Verifies row distribution: 12,000 rows with limit of 5,000 creates 3 fragments [5000, 5000, 2000]
  • test_max_rows_per_group - Tests max_rows_per_group parameter across different Lance file versions

    • V1 (Legacy): Row group chunking affects fragment distribution
    • V2 (Stable): Ignores row group size, splits only at file boundaries
    • Demonstrates the behavioral differences between V1 and V2 implementations
  • test_empty_stream_write: Verifies graceful handling of empty input streams to prevent unexpected panics or cryptic errors.

  • test_schema_mismatch_on_append: Ensures clear error messages and data integrity when attempting to append data with incompatible schemas.

  • test_disk_full_error: Validates proper error propagation for storage-related failures to help users quickly identify and debug disk space issues.

  • test_write_interruption_recovery: Tests the complete transaction flow for interrupted writes, ensuring dataset consistency, data integrity, and successful retry capability.
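The row distribution expected by test_max_rows_per_file can be checked with a small standalone sketch. It only models the arithmetic of splitting rows at a per-file limit. `fragment_row_counts` is a hypothetical helper written for illustration and is not part of the Lance API.

```rust
/// Hypothetical helper (not a Lance function): returns the row count of each
/// fragment when `total_rows` rows are written with a per-file row limit.
fn fragment_row_counts(total_rows: usize, max_rows_per_file: usize) -> Vec<usize> {
    assert!(max_rows_per_file > 0, "max_rows_per_file must be positive");
    let mut counts = Vec::new();
    let mut remaining = total_rows;
    while remaining > 0 {
        // Each fragment takes as many rows as the limit allows.
        let rows = remaining.min(max_rows_per_file);
        counts.push(rows);
        remaining -= rows;
    }
    counts
}

#[test]
fn twelve_thousand_rows_with_limit_five_thousand() {
    // Matches the distribution described for test_max_rows_per_file.
    assert_eq!(fragment_row_counts(12_000, 5_000), vec![5_000, 5_000, 2_000]);
}
```

Under this model, zero input rows produce no fragments at all, rather than a single empty fragment.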

Motivation

These tests improve confidence in the write functionality by covering important parameters and features that were previously untested or under-tested. They help prevent regressions and ensure correct behavior across different Lance file versions.

@github-actions github-actions bot added the chore label Jan 4, 2026
@lichuang lichuang force-pushed the more-test-write branch 2 times, most recently from 1304652 to 8335288 January 4, 2026 09:13

@wjones127 wjones127 left a comment

Also changed the type in the PR title to test.

Comment on lines 1366 to 1367
#[tokio::test]
async fn test_stable_row_ids() {
I think this is already covered in test_new_row_ids and test_row_ids_append, right? If you think there's something missing in those test cases, can you add them there?

@wjones127 wjones127 changed the title chore: add more test cases to improve test coverage for the write functionality in Lance test: add more test cases to improve test coverage for the write functionality in Lance Jan 5, 2026
@wjones127 wjones127 self-assigned this Jan 5, 2026
Comment on lines +2381 to +2402
// Empty stream should be handled gracefully
// It should create an empty dataset or return an appropriate result
match result {
    Ok((fragments, _)) => {
        // If successful, verify it creates an empty result
        assert!(
            fragments.is_empty(),
            "Empty stream should create no fragments"
        );
    }
    Err(e) => {
        // If it returns an error, verify it's an appropriate error type
        let error_msg = e.to_string();
        assert!(
            error_msg.contains("empty")
                || error_msg.contains("no data")
                || error_msg.contains("batch"),
            "Error should mention empty or no data: {}",
            error_msg
        );
    }
}
We should test for one behavior, as it would be a breaking change to have one behavior and then change to another.

An empty stream should output empty fragments.
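A minimal sketch of what a single-behavior assertion could look like, assuming the write path returns `Ok` with no fragments for an empty stream. `write_fragments_stub` is a hypothetical stand-in written for illustration, not the function under test in this PR; it only packs batch row counts into fragments.

```rust
/// Hypothetical stand-in for the write path (not the Lance API): packs a
/// stream of batch row counts into fragments of at most `max_rows_per_file`.
fn write_fragments_stub(
    batch_rows: impl IntoIterator<Item = usize>,
    max_rows_per_file: usize,
) -> Result<Vec<usize>, String> {
    if max_rows_per_file == 0 {
        return Err("max_rows_per_file must be positive".to_string());
    }
    let mut fragments = Vec::new();
    let mut current = 0usize; // rows in the fragment being filled
    for mut rows in batch_rows {
        while rows > 0 {
            let take = rows.min(max_rows_per_file - current);
            current += take;
            rows -= take;
            if current == max_rows_per_file {
                fragments.push(current);
                current = 0;
            }
        }
    }
    if current > 0 {
        fragments.push(current);
    }
    Ok(fragments)
}

#[test]
fn empty_stream_creates_no_fragments() {
    // Exactly one accepted outcome: success with zero fragments.
    let fragments = write_fragments_stub(std::iter::empty(), 5_000)
        .expect("empty stream should not be an error");
    assert!(fragments.is_empty(), "Empty stream should create no fragments");
}
```

With only one accepted outcome, a later change from "empty result" to "error" would fail the test instead of passing silently through the other match arm.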

codecov bot commented Jan 6, 2026

Codecov Report

❌ Patch coverage is 88.25137% with 43 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| rust/lance/src/dataset/write.rs | 88.25% | 34 Missing and 9 partials ⚠️ |
