feat(enqueueLinks): add "allowedSubdomains" option for subdomain filtering in "same-domain" strategy #3098
+420
−16
Overview

This PR introduces a new `enqueueLinks` option called `allowedSubdomains`, which takes a string array of user-defined subdomains and gives users simpler, more precise control over subdomain access. It also includes new documentation and tests to verify that the option behaves consistently.

- `allowedSubdomains` - The new `enqueueLinks` option, which filters subdomains by the user's choice.

By default, `allowedSubdomains` is set to `['*']` if not specified.

Note: This option can only be used with the EnqueueStrategy `same-domain`, because that strategy by design allows any subdomain under the same domain.

Implementation
The enhanced `same-domain` strategy has several modifications that allow users to add specific subdomains into `enqueueStrategyPatterns`:

- Falls back to the default `same-domain` behavior if `allowedSubdomains` is set to either `['*']` or `[]`, preserving backwards compatibility.
- Filters by `allowedSubdomains` when at least one subdomain is found.
- Adds the URL origin (`options.baseUrl`) into `enqueueStrategyPatterns`.
- Iterates over each `allowedSubdomain` and sets the hostname of the new `filteredSubdomainUrl`.
- Adds `filteredSubdomainUrl` into `enqueueStrategyPatterns` while avoiding a duplicate of the URL origin.
- [...] `enqueueStrategyPatterns`.

The main difference from the former `same-domain` algorithm is that the asterisk normally placed in front of the domain is replaced.

Example
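The steps listed under Implementation can be sketched roughly as follows. This is not the PR's code: the helper name `buildSameDomainPatterns`, the `GlobPattern` type, and the exact glob shapes are assumptions made for illustration, and the sketch assumes the base URL's hostname is already the registrable domain (a real implementation would need a public-suffix lookup to find it).

```typescript
// Hypothetical sketch; names and glob shapes are assumptions, not the PR's code.
type GlobPattern = { glob: string };

function buildSameDomainPatterns(
    baseUrl: string,
    allowedSubdomains: string[] = ['*'],
): GlobPattern[] {
    const url = new URL(baseUrl);
    // Assumption: the hostname of `baseUrl` is already the registrable domain.
    const domain = url.hostname;

    // `[]` or any list containing '*' keeps the wildcard (default) behavior.
    if (allowedSubdomains.length === 0 || allowedSubdomains.includes('*')) {
        return [
            { glob: `${url.origin.replace(domain, `*.${domain}`)}/**` },
            { glob: `${url.origin}/**` },
        ];
    }

    // Always allow the URL origin itself, then add one pattern per subdomain.
    const patterns: GlobPattern[] = [{ glob: `${url.origin}/**` }];
    for (const allowedSubdomain of allowedSubdomains) {
        const filteredSubdomainUrl = new URL(url.origin);
        // An empty string stands for the apex domain itself.
        filteredSubdomainUrl.hostname = allowedSubdomain
            ? `${allowedSubdomain}.${domain}`
            : domain;
        const glob = `${filteredSubdomainUrl.origin}/**`;
        // Skip duplicates, including a duplicate of the URL origin.
        if (!patterns.some((pattern) => pattern.glob === glob)) {
            patterns.push({ glob });
        }
    }
    return patterns;
}
```

Under these assumptions, calling the sketch with `https://example.com` and `['www', 'blog']` yields globs for `https://example.com/**`, `https://www.example.com/**`, and `https://blog.example.com/**`, whereas omitting the option yields `https://*.example.com/**` plus the origin pattern.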
Assume that `allowedSubdomains: ['www', 'blog']` and the base URL is `https://example.com`.

Before (without `allowedSubdomains`):

After (with `allowedSubdomains`):

Use Cases
The following cases depend on how `allowedSubdomains` is checked:

- With `allowedSubdomains: ['']`, subdomain filtering still applies, because the empty string means that no subdomain other than the apex (the original URL) should be accepted.
- With `allowedSubdomains: []`, requests are handled with the default behavior, because the user never specified whether subdomains should be filtered.
- With `allowedSubdomains: ['*']` or `[sub1, sub2, ..., '*']` (any list that includes the asterisk), requests are always handled with the default behavior, because the asterisk is equivalent to accepting any subdomain.

Documentation Updates
This PR includes documentation that:

- Describes the `allowedSubdomains` option with a simple definition and use case.

Testing Improvements
This PR also includes new tests:

- Test cases in `enqueue_links.test.ts` to validate the behavior of the `allowedSubdomains` option with various configurations.
- A new fixture (`HTML_WITH_SUBDOMAINS`) to facilitate testing of subdomain filtering.

Contributors
Closes #3099
Alternative solution to #2513