@dahbka-lis dahbka-lis commented Oct 14, 2025

Rationale for this change

There is a bug when creating union types with empty type_codes. If fields.size() == 128 (kMaxTypeCode + 1) and type_codes is empty, the static_cast<int8_t> of the field count yields -128, so internal::Iota generates an empty vector of type codes. The expected vector is [0, 1, 2, ..., 127], where 127 is kMaxTypeCode.
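
The wraparound described above can be reproduced in isolation. The following is a minimal sketch, not Arrow's code: IotaRange and CodesViaNarrowedStop are hypothetical names, and the range-style helper is only assumed to behave like the call that the description refers to. It also assumes the usual two's-complement narrowing, which is guaranteed from C++20 onward.

```cpp
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical stand-in for a range-style iota: values in [start, stop).
// An empty vector is returned when stop is not greater than start.
std::vector<int8_t> IotaRange(int8_t start, int8_t stop) {
  if (start >= stop) return {};
  std::vector<int8_t> out(static_cast<std::size_t>(stop - start));
  std::iota(out.begin(), out.end(), start);
  return out;
}

// Mimics the failing path: the field count is narrowed to int8_t
// before it is used as the exclusive upper bound.
std::vector<int8_t> CodesViaNarrowedStop(std::size_t num_fields) {
  // For num_fields == 128 the narrowed value is negative, so the
  // range [0, stop) is empty.
  return IotaRange(0, static_cast<int8_t>(num_fields));
}
```

With 127 fields the narrowed bound is still positive and the helper returns 127 codes; with 128 fields it returns nothing, which matches the empty vector reported in this PR.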

What changes are included in this PR?

  • Added a new internal::Iota function that generates a vector of length values starting from start.
  • Changed the internal::Iota calls that create dense_union and sparse_union types to use the new parameters.
  • Added a test that detects this error.
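
A length-based helper avoids the narrowed upper bound altogether. The sketch below is only an illustration of the idea in the list above: IotaLength is a hypothetical name, and its parameter order and template shape are assumptions rather than the signature added in this PR.

```cpp
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical sketch: returns `length` consecutive values beginning at `start`.
// The length is a std::size_t, so it is never narrowed to the element type.
template <typename T>
std::vector<T> IotaLength(std::size_t length, T start) {
  std::vector<T> out(length);
  std::iota(out.begin(), out.end(), start);
  return out;
}
```

Calling IotaLength<int8_t>(128, 0) under these assumptions gives 128 codes running from 0 to 127, the vector that the description expects for 128 fields.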

Are these changes tested?

Yes, there is a new test.

Are there any user-facing changes?

No.

This PR contains a "Critical Fix".
(b) a bug that caused incorrect or invalid data to be produced

If fields.size() == 128 (kMaxTypeCode + 1), internal::Iota returns an empty vector, which fails validation here:
https://github.com/apache/arrow/blob/main/cpp/src/arrow/type.cc#L1232

@github-actions

Thanks for opening a pull request!

If this is not a minor PR, could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose

Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project.

Then could you also rename the pull request title in the following format?

GH-${GITHUB_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}

or

MINOR: [${COMPONENT}] ${SUMMARY}

See also:

@dahbka-lis dahbka-lis changed the title [C++] Fix creating union types without type_codes for fields.size() == 128 MINOR: [C++] Fix creating union types without type_codes for fields.size() == 128 Oct 14, 2025
Author

dahbka-lis commented Oct 18, 2025

Hi, @kou! Do you know someone who can review this PR? It looks like a bug rather than a "feature".

Member

@raulcd raulcd left a comment

Thanks for the PR!
This doesn't fit our definition of MINOR. Could you please open an issue for this change? See our definition of MINOR:
https://github.com/apache/arrow/blob/main/CONTRIBUTING.md#minor-fixes
Thanks!

@dahbka-lis dahbka-lis changed the title MINOR: [C++] Fix creating union types without type_codes for fields.size() == 128 GH-47859: [C++] Fix creating union types without type_codes for fields.size() == 128 Oct 18, 2025
@github-actions

⚠️ GitHub issue #47859 has been automatically assigned in GitHub to PR creator.
