Skip to content

Conversation

@radoering
Copy link
Member

@radoering radoering commented Nov 19, 2025

Especially, since we do support other build backends than poetry-core this is more correct.

Pull Request Check List

This resolves an issue noticed in python-poetry/poetry-core#894 (comment)

  • Added tests for changed code.
  • Updated documentation for changed code.

Summary by Sourcery

Revise the package upload logic to derive metadata from distribution files (wheel and sdist) instead of the project, improving compatibility with alternative build backends.

New Features:

  • Parse metadata directly from wheel and sdist artifacts for uploads

Bug Fixes:

  • Use artifact metadata instead of project metadata to support non-poetry-core build backends

Enhancements:

  • Refactor post_data into a classmethod and streamline upload payload construction by iterating over artifact metadata
  • Introduce a _get_metadata helper to extract and normalize metadata from .whl and .tar.gz files
  • Convert _get_type to a staticmethod with explicit return type annotations

Tests:

  • Add tests for Uploader.post_data with both wheel and sdist distributions

@sourcery-ai
Copy link

sourcery-ai bot commented Nov 19, 2025

Reviewer's Guide

The PR refactors the uploader to source package metadata directly from the distribution artifacts (wheel or sdist) instead of the project model, simplifying and unifying the upload payload construction.

Class diagram for updated Uploader metadata handling

classDiagram
    class Uploader {
        +upload(session, url, dry_run, skip_existing)
        +post_data(file: Path) dict[str, Any]
        +_upload_file(file: Path)
        +_prepare_data(data: dict[str, Any]) list[tuple[str, str]]
        +_get_type(file: Path) Literal["bdist_wheel", "sdist"]
        +_get_metadata(file: Path) RawMetadata
        +_is_file_exists_error(response: requests.Response) bool
        -_package
    }
    Uploader ..> HashManager
    Uploader ..> RawMetadata
    Uploader ..> Path
    Uploader ..> requests.Response
Loading

Flow diagram for artifact metadata extraction in upload process

flowchart TD
    A["Start upload"] --> B["Select file (wheel or sdist)"]
    B --> C["Determine file type"]
    C --> D["Extract metadata from artifact"]
    D --> E["Construct upload payload from artifact metadata"]
    E --> F["Send payload to upload API"]
    F --> G["Finish upload"]
Loading

File-Level Changes

Change Details Files
post_data refactored to use artifact metadata
  • Converted post_data to a classmethod and removed Metadata.from_package
  • Initialized data dict with digests, filetype, and pyversion only
  • Iterated over keys from _get_metadata to populate remaining fields
src/poetry/publishing/uploader.py
Introduced _get_metadata to parse METADATA/PKG-INFO
  • Added staticmethod to extract RawMetadata from wheel dist-info
  • Added support for reading PKG-INFO from sdist tar.gz
  • Raised errors when metadata files are missing or unsupported formats
src/poetry/publishing/uploader.py
Simplified upload helper methods
  • Changed _get_type to a staticmethod returning a Literal
  • Merged action and protocol_version into a single data.update call
src/poetry/publishing/uploader.py
Added and updated tests for post_data
  • Created tests verifying post_data output for wheel artifacts
  • Created tests verifying post_data output for sdist artifacts
tests/publishing/test_uploader.py

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes and they look great!

Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location> `src/poetry/publishing/uploader.py:139-155` </location>
<code_context>
-        # Metadata 2.1
-        if meta.description_content_type:
-            data["description_content_type"] = meta.description_content_type
+        for key, value in cls._get_metadata(file).items():
+            # strip trailing 's' to match API field names
+            # see https://docs.pypi.org/api/upload/
+            if key in {"platforms", "supported_platforms", "license_files"}:
+                key = key[:-1]

-        # TODO: Provides extra
+            # revert some special cases from packaging.metadata.parse_email()
+
+            # "keywords" is not "multiple use" but a comma-separated string
+            if key == "keywords":
+                value = ", ".join(value)
+
+            # "project_urls" is not a dict
+            if key == "project_urls":
+                value = [f"{k}, {v}" for k, v in value.items()]
+
+            data[key] = value

         return data
</code_context>

<issue_to_address>
**suggestion:** Stripping trailing 's' from metadata keys may cause mismatches with expected API fields.

Centralize the mapping of plural to singular keys or use explicit key handling to prevent future mismatches if API field names change or new plural keys are added.

```suggestion
        # Centralized mapping for plural-to-singular metadata keys
        PLURAL_TO_SINGULAR_KEYS = {
            "platforms": "platform",
            "supported_platforms": "supported_platform",
            "license_files": "license_file",
        }

        for key, value in cls._get_metadata(file).items():
            # Use explicit mapping for plural keys
            if key in PLURAL_TO_SINGULAR_KEYS:
                key = PLURAL_TO_SINGULAR_KEYS[key]

            # revert some special cases from packaging.metadata.parse_email()

            # "keywords" is not "multiple use" but a comma-separated string
            if key == "keywords":
                value = ", ".join(value)

            # "project_urls" is not a dict
            if key == "project_urls":
                value = [f"{k}, {v}" for k, v in value.items()]

            data[key] = value
```
</issue_to_address>

### Comment 2
<location> `src/poetry/publishing/uploader.py:148-149` </location>
<code_context>
+            # revert some special cases from packaging.metadata.parse_email()
+
+            # "keywords" is not "multiple use" but a comma-separated string
+            if key == "keywords":
+                value = ", ".join(value)
+
+            # "project_urls" is not a dict
</code_context>

<issue_to_address>
**nitpick:** Joining keywords with a comma and space may not match all consumers' expectations.

Verify that the API documentation specifies this format for keywords, or adjust to use only a comma if needed.
</issue_to_address>

### Comment 3
<location> `src/poetry/publishing/uploader.py:152-153` </location>
<code_context>
+                value = ", ".join(value)
+
+            # "project_urls" is not a dict
+            if key == "project_urls":
+                value = [f"{k}, {v}" for k, v in value.items()]
+
+            data[key] = value
</code_context>

<issue_to_address>
**issue (bug_risk):** Formatting project_urls as a list of comma-separated strings may not match API expectations.

Verify the required format for project_urls in the upload API to ensure compatibility.
</issue_to_address>

### Comment 4
<location> `src/poetry/publishing/uploader.py:295` </location>
<code_context>

-    def _get_type(self, file: Path) -> str:
+    @staticmethod
+    def _get_type(file: Path) -> Literal["bdist_wheel", "sdist"]:
         exts = file.suffixes
         if exts[-1] == ".whl":
</code_context>

<issue_to_address>
**issue:** Using file.suffixes without checking for empty lists may raise IndexError.

Accessing exts[-1] and exts[-2:] without checking if exts is non-empty can cause IndexError. Please add a check to handle cases where file.suffixes is empty.
</issue_to_address>

### Comment 5
<location> `src/poetry/publishing/uploader.py:305-314` </location>
<code_context>
+    def _get_metadata(file: Path) -> RawMetadata:
</code_context>

<issue_to_address>
**suggestion:** The _get_metadata method does not handle .zip sdists, which are valid in some cases.

Support for .zip sdists should be added for compatibility, or the limitation should be documented if intentional.
</issue_to_address>

### Comment 6
<location> `tests/publishing/test_uploader.py:162-171` </location>
<code_context>
+def test_uploader_post_data_wheel(fixture_dir: FixtureDirGetter) -> None:
</code_context>

<issue_to_address>
**suggestion (testing):** Consider adding tests for wheels with missing or malformed METADATA files.

Tests should also cover scenarios where the wheel lacks a METADATA file or has a malformed METADATA, verifying that Uploader.post_data raises the appropriate exceptions.
</issue_to_address>

### Comment 7
<location> `tests/publishing/test_uploader.py:203-200` </location>
<code_context>
+def test_uploader_post_data_sdist(fixture_dir: FixtureDirGetter) -> None:
</code_context>

<issue_to_address>
**suggestion (testing):** Add tests for sdist files with missing or malformed PKG-INFO.

Please add tests for sdists missing PKG-INFO or with malformed PKG-INFO to ensure Uploader.post_data and _get_metadata handle these cases as expected.

Suggested implementation:

```python
def test_uploader_post_data_sdist(fixture_dir: FixtureDirGetter) -> None:
    file = fixture_dir("simple_project") / "dist" / "simple_project-1.2.3.tar.gz"
    assert Uploader.post_data(file) == {
        "md5_digest": "e611cbb8f31258243d90f7681dfda68a",
        "sha256_digest": "c4a72becabca29ec2a64bf8c820bbe204d2268f53e102501ea5605bc1c1675d1",
        "blake2_256_digest": "d3df22f4944f6acd02105e7e2df61ef63c7b0f4337a12df549ebc2805a13c2be",
        "filetype": "sdist",
        "pyversion": "source",
        "metadata_version": "2.1",
        "name": "simple-project",
        "version": "1.2.3",
    }

import io
import tarfile
import tempfile
import pytest

def _make_sdist_without_pkg_info():
    tmp = tempfile.NamedTemporaryFile(suffix=".tar.gz", delete=False)
    with tarfile.open(tmp.name, "w:gz") as tar:
        # Add a dummy file, but no PKG-INFO
        info = tarfile.TarInfo(name="simple_project-1.2.3/dummy.txt")
        data = b"dummy"
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return tmp.name

def _make_sdist_with_malformed_pkg_info():
    tmp = tempfile.NamedTemporaryFile(suffix=".tar.gz", delete=False)
    with tarfile.open(tmp.name, "w:gz") as tar:
        # Add malformed PKG-INFO
        info = tarfile.TarInfo(name="simple_project-1.2.3/PKG-INFO")
        data = b"This is not a valid PKG-INFO"
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return tmp.name

def test_uploader_post_data_sdist_missing_pkg_info():
    sdist_path = _make_sdist_without_pkg_info()
    with pytest.raises(Exception):
        Uploader.post_data(sdist_path)

def test_uploader_post_data_sdist_malformed_pkg_info():
    sdist_path = _make_sdist_with_malformed_pkg_info()
    with pytest.raises(Exception):
        Uploader.post_data(sdist_path)

```

- If Uploader.post_data does not currently raise an Exception for these cases, you may need to adjust the assertion to check for None or partial data, or update Uploader to raise a specific error.
- You may want to use more specific exception types if Uploader.post_data raises them (e.g., ValueError).
- If you have a _get_metadata function exposed, you can add similar tests for it.
</issue_to_address>

### Comment 8
<location> `src/poetry/publishing/uploader.py:324-325` </location>
<code_context>
    @staticmethod
    def _get_metadata(file: Path) -> RawMetadata:
        if file.suffix == ".whl":
            with zipfile.ZipFile(file) as z:
                for name in z.namelist():
                    parts = Path(name).parts
                    if (
                        len(parts) == 2
                        and parts[1] == "METADATA"
                        and parts[0].endswith(".dist-info")
                    ):
                        with z.open(name) as mf:
                            return parse_email(mf.read().decode("utf-8"))[0]
            raise FileNotFoundError("METADATA not found in wheel")

        elif file.suffixes[-2:] == [".tar", ".gz"]:
            with tarfile.open(file, "r:gz") as tar:
                for member in tar.getmembers():
                    parts = Path(member.name).parts
                    if len(parts) == 2 and parts[1] == "PKG-INFO":
                        pf = tar.extractfile(member)
                        if pf:
                            return parse_email(pf.read().decode("utf-8"))[0]
            raise FileNotFoundError("PKG-INFO not found in sdist")

        raise ValueError(f"Unsupported file type: {file}")

</code_context>

<issue_to_address>
**suggestion (code-quality):** Use named expression to simplify assignment and conditional ([`use-named-expression`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/use-named-expression/))

```suggestion
                        if pf := tar.extractfile(member):
```
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@radoering radoering force-pushed the publish-artifact-metadata branch from 106dbf0 to e102e6e Compare November 19, 2025 17:33
Especially, since we do support other build backends than poetry-core this is more correct.
@radoering radoering force-pushed the publish-artifact-metadata branch from e102e6e to c1b5f21 Compare November 21, 2025 05:25
@radoering radoering enabled auto-merge (squash) November 21, 2025 05:25
@radoering radoering merged commit 92190c5 into python-poetry:main Nov 21, 2025
63 checks passed
@github-actions
Copy link

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 22, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants