@nyamsprod nyamsprod commented Dec 12, 2025

@kocsismate, @TimWolla since a reply would have meant another wall of text, I chose to create this POC instead, in response to https://news-web.php.net/php.internals/129595

TL;DR:

  • We introduce a UriPathType Enum
  • We introduce a UriPathSegments class (Naming is not that important for now)

The UriPathSegments class has 2 properties:

  • $type (UriPathType::Absolute or UriPathType::Relative)
  • $segments (a list of decoded path segments)

Unless I am mistaken or forgetting something, both URL and URI encode and decode the path the same way, so having 2 classes would be pointless in this case.

The path is always reconstructed safely using both properties (we stay away from how C# does its representation).
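To make the two-property model concrete, here is a minimal userland sketch of splitting a raw path into a type plus decoded segments and rebuilding it afterwards. It is not the POC's implementation: `SketchPathType`, `sketchSplitPath()` and `sketchBuildPath()` are hypothetical names, treating an empty remainder as "no segments" is an assumption, and `rawurlencode()` is used as a deliberately conservative encoder (it escapes more characters than a path strictly requires).

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for the proposed enum; not the POC's code.
enum SketchPathType
{
    case Absolute;
    case Relative;
}

/**
 * Splits a raw (encoded) path into a type and a list of decoded segments.
 *
 * @return array{0: SketchPathType, 1: list<string>}
 */
function sketchSplitPath(string $rawPath): array
{
    $type = str_starts_with($rawPath, '/')
        ? SketchPathType::Absolute
        : SketchPathType::Relative;

    // Drop the leading "/" of an absolute path before splitting.
    $rest = $type === SketchPathType::Absolute ? substr($rawPath, 1) : $rawPath;

    // Assumption: an empty remainder means "no segments", not one empty segment.
    $segments = $rest === '' ? [] : array_map(rawurldecode(...), explode('/', $rest));

    return [$type, $segments];
}

/**
 * Rebuilds an encoded path from a type and a list of decoded segments.
 *
 * @param list<string> $segments
 */
function sketchBuildPath(SketchPathType $type, array $segments): string
{
    // Encoding each segment on its own keeps a decoded "/" from becoming a separator.
    $path = implode('/', array_map(rawurlencode(...), $segments));

    return $type === SketchPathType::Absolute ? '/' . $path : $path;
}

[$type, $segments] = sketchSplitPath('/a%2Fb/c%20d');
echo sketchBuildPath($type, $segments), PHP_EOL;
```

Because the segments are stored decoded and re-encoded one by one, a segment containing a literal `/` cannot change the number of segments on output. The round trip is not guaranteed to be byte-identical, since the sketch always re-encodes with `rawurlencode()`.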

The class could be improved with more methods, but for now I added the following:

namespace Uri {
    /**
     * @implements IteratorAggregate<int, string>
     */
    final class UriPathSegments implements Countable, IteratorAggregate
    {
        /**
         * Returns a new instance from a UriPathType and a list of decoded segments
         *
         * @param list<string> $segments
         */
        public static function fromSegments(UriPathType $type, array $segments): static;

        public function getType(): UriPathType;
    
        /**
         * Returns the decoded segment value or null
         * if no segment exists for the submitted index
         */
        public function get(int $index): ?string;
    
        /**
         * Returns the decoded segments
         *
         * @return list<string>
         */
        public function getAll(): array;
    
        /**
         * The Iterator version of getAll
         *
         * @return Iterator<int, string>
         */
        public function getIterator(): Iterator;
    
        /**
         * Returns the number of segments
         */
        public function count(): int;
    
        /**
         * Returns the first decoded segment or null if no segment exists
         */
        public function getFirst(): ?string;
    
        /**
         * Returns the last decoded segment or null if no segment exists
         */
        public function getLast(): ?string;
    
        /**
         * Tells whether the given decoded segment exists in the current path
         */
        public function has(string $segment): bool;
    
        /**
         * Returns the index of the first occurrence of the given decoded segment
         * if it exists or null if the segment is not found.
         */
        public function getIndexOf(string $segment): ?int;
    
        /**
         * Returns the index of the last occurrence of the given decoded segment
         * if it exists or null if the segment is not found.
         */
        public function getLastIndexOf(string $segment): ?int;
    
        /**
         * Returns the raw encoded path.
         */
        public function toRawString(): string;
    
        /**
         * Returns the path normalized using the remove dot segments algorithm.
         */
        public function toString(): string;
    
        /**
         * Returns a new instance with a new type
         */
        public function withType(UriPathType $type): static;
    
        /**
         * Returns a new instance with new segments
         *
         * @param list<string> $segments
         */
        public function withSegments(array $segments): static;
    
        public function __debugInfo(): array;
    
        /**
         * @return array{0: array{path: string}, 1: array{}}
         */
        public function __serialize(): array;
    
        /**
         * @param array{0: array{path: string}, 1: array{}} $data
         *
         * @throws \Exception|InvalidUriException
         */
        public function __unserialize(array $data): void;
    }
}
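Since `toString()` is documented as normalizing with the remove dot segments algorithm, here is a rough segment-based sketch of that normalization, in the spirit of RFC 3986 section 5.2.4. The function name `sketchRemoveDotSegments()` is hypothetical and this is not the POC's code. It is intended for absolute paths; relative paths are handled on a best-effort basis and may not match the RFC's buffer-based algorithm character for character.

```php
<?php

declare(strict_types=1);

/**
 * Removes "." and ".." segments from an encoded path.
 * Hypothetical helper, written for illustration only.
 */
function sketchRemoveDotSegments(string $path): string
{
    if ($path === '') {
        return '';
    }

    $absolute = str_starts_with($path, '/');
    $segments = explode('/', $absolute ? substr($path, 1) : $path);
    $last = count($segments) - 1;
    $output = [];

    foreach ($segments as $index => $segment) {
        if ($segment !== '.' && $segment !== '..') {
            $output[] = $segment;
            continue;
        }

        if ($segment === '..') {
            // Going up discards the previous segment, if there is one.
            array_pop($output);
        }

        if ($index === $last) {
            // A trailing dot segment leaves a trailing slash behind.
            $output[] = '';
        }
    }

    return ($absolute ? '/' : '') . implode('/', $output);
}

echo sketchRemoveDotSegments('/a/b/c/./../../g'), PHP_EOL; // prints "/a/g"
```

The input above is the example RFC 3986 uses for this algorithm. Under this sketch a raw string and a normalized string only differ when the path actually contains dot segments.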

Of note: for an instance created from Url::getPath there should be no difference between the toRawString and toString representations, but there can be one for a path created from Url::getRawPath.

IMHO this version improves DX and avoids adding with* and get* methods on the Uri/Url classes, because not all URIs/URLs use segments. Good examples include URIs/URLs with the data or the urn scheme.

@nyamsprod nyamsprod self-assigned this Dec 12, 2025
@nyamsprod nyamsprod marked this pull request as draft December 12, 2025 15:37
nyamsprod commented Dec 12, 2025

For reference, I use the following script to test the output:

use Uri\Rfc3986\Uri;
use Uri\UriPathSegments;

$uriLists = [
    "https://example.com",
    "https://example.com/",
    "https://example.com/foo",
    "https://example.com/foo/",
    "foo/",
    "foo",
    "/foo",
    "/",
    '/a/b/c/./../../g',
];

foreach ($uriLists as $uri) {
    // Keep the parsed URI in a variable: it is reused in the dump below.
    $uriObject = new Uri($uri);
    $pathSegments = new UriPathSegments($uriObject->getRawPath());
    dump([
        'uri' => $uri,
        'uri raw path' => $uriObject->getRawPath(),
        'uri path' => $uriObject->getPath(),
        'path type' => $pathSegments->getType(),
        'path segments' => $pathSegments->getAll(),
        'path raw string' => $pathSegments->toRawString(),
        'path string' => $pathSegments->toString(),
    ]);
}
