Skip to content

Conversation

@Lucas61000
Copy link

Changes Made

  1. Added support​ for STRUCT<field_0: type_0, field_1: type_1> syntax
  2. ​All other syntax variations​ will fail with error messages

Related Issues

#4448

Checklist

  • Documented in API Docs (if applicable)
  • Documented in User Guide (if applicable)
  • If adding a new documentation page, doc is added to docs/mkdocs.yml navigation
  • Documentation builds and is formatted properly

@github-actions github-actions bot added the fix label Oct 27, 2025
Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Overview

Greptile Summary

Added support for parsing STRUCT<field: type> syntax with angle brackets and colons in SQL type definitions. The implementation validates colon usage and explicitly rejects the STRUCT(...) parentheses syntax with a helpful error message.

Key changes:

  • Preprocessing step tokenizes input to detect STRUCT syntax before passing to sqlparser
  • Validates that colons appear consistently (either 0 or exactly one per field)
  • Replaces colons with spaces so sqlparser can parse the modified syntax
  • Added tests for both STRUCT<a bool, b int> and STRUCT<a: bool, b: int> syntaxes

Issues found:

  • Critical: Line 75 uses global string replacement (s.replace(':', " ")) which affects ALL colons in the input, not just those within the STRUCT definition. This will corrupt inputs containing colons outside the STRUCT context.
  • The validation at lines 112-115 doesn't prevent keywords like STRUCT from appearing before a colon, which could allow invalid nested struct syntax to pass validation.

Confidence Score: 2/5

  • This PR has a critical bug that will corrupt inputs with colons outside STRUCT definitions
  • While the PR successfully addresses the original issue of supporting STRUCT syntax with angle brackets and colons, it contains a critical logic error on line 75 where s.replace(':', " ") performs a global replacement of all colons in the entire input string rather than only those within the STRUCT definition. This will break any SQL type strings containing colons elsewhere (e.g., nested contexts, or if combined with other syntax). The validation logic also has gaps for nested struct edge cases.
  • src/daft-sql/src/schema.rs requires immediate attention to fix the global colon replacement bug

Important Files Changed

File Analysis

Filename Score Overview
src/daft-sql/src/schema.rs 3/5 Added STRUCT type parsing with angle brackets and colon syntax, validates colon usage, rejects parentheses syntax; found critical issue with global colon replacement

Sequence Diagram

sequenceDiagram
    participant User
    participant try_parse_dtype
    participant check_and_modify_struct
    participant validate_colons
    participant Tokenizer
    participant Parser
    participant sql_dtype_to_dtype

    User->>try_parse_dtype: "STRUCT<a: INT, b: STRING>"
    try_parse_dtype->>check_and_modify_struct: Check for STRUCT syntax
    check_and_modify_struct->>Tokenizer: Tokenize input
    Tokenizer-->>check_and_modify_struct: Token list
    
    alt STRUCT with parentheses
        check_and_modify_struct-->>try_parse_dtype: Error: Use angle brackets
    else STRUCT with angle brackets
        check_and_modify_struct->>check_and_modify_struct: Find matching brackets (depth tracking)
        check_and_modify_struct->>validate_colons: Validate inner tokens
        validate_colons->>validate_colons: Count colons and commas
        validate_colons->>validate_colons: Verify colon positions
        validate_colons-->>check_and_modify_struct: OK or Error
        check_and_modify_struct-->>try_parse_dtype: Modified string (colons replaced with spaces)
    else No STRUCT keyword
        check_and_modify_struct-->>try_parse_dtype: None (no modification)
    end
    
    try_parse_dtype->>Tokenizer: Tokenize final string
    Tokenizer-->>try_parse_dtype: Tokens
    try_parse_dtype->>Parser: Parse with sqlparser
    Parser-->>try_parse_dtype: DataType AST
    try_parse_dtype->>sql_dtype_to_dtype: Convert to Daft DataType
    sql_dtype_to_dtype-->>try_parse_dtype: DataType::Struct(fields)
    try_parse_dtype-->>User: Parsed DataType
Loading

1 file reviewed, 2 comments

Edit Code Review Agent Settings | Greptile

@codecov
Copy link

codecov bot commented Oct 27, 2025

Codecov Report

❌ Patch coverage is 74.35897% with 20 lines in your changes missing coverage. Please review.
✅ Project coverage is 71.74%. Comparing base (779074e) to head (f719176).
⚠️ Report is 14 commits behind head on main.

Files with missing lines Patch % Lines
src/daft-sql/src/schema.rs 74.35% 20 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #5449      +/-   ##
==========================================
+ Coverage   71.58%   71.74%   +0.16%     
==========================================
  Files         993      998       +5     
  Lines      125932   126956    +1024     
==========================================
+ Hits        90145    91085     +940     
- Misses      35787    35871      +84     
Files with missing lines Coverage Δ
src/daft-sql/src/schema.rs 74.43% <74.35%> (-0.22%) ⬇️

... and 66 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Lucas61000 and others added 2 commits October 29, 2025 11:14
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant