Skip to content

Conversation

@hoffmaen
Copy link
Contributor

@hoffmaen hoffmaen commented Nov 11, 2025

Summary

This Pull-Request implements #505.

  • Implement hash-based routing in Gorouter
    • Gorouter routes requests to endpoints based on the value of a configured request header
    • Gorouter distributes requests to different predefined application instances, when the current request count to an app instance exceeds the balance factor
    • Gorouter routes requests to the "next instance" in the lookup table on retries
  • Follow decisions documented in this comment

Backward Compatibility

Breaking Change? Yes/No

a18e and others added 3 commits November 10, 2025 15:26
This commit provides the basic implementation for hash-based routing. It does not consider the balance factor yet.

Co-authored-by: Clemens Hoffmann <[email protected]>
Co-authored-by: Tamara Boehm <[email protected]>
Co-authored-by: Soha Alboghdady <[email protected]>
Comment on lines 130 to 143
if reqInfo.RoutePool.LoadBalancingAlgorithm == config.LOAD_BALANCE_HB {
if reqInfo.RoutePool.HashRoutingProperties == nil {
rt.logger.Error("hash-routing-properties-nil", slog.String("host", reqInfo.RoutePool.Host()))

} else {
headerName := reqInfo.RoutePool.HashRoutingProperties.Header
headerValue := request.Header.Get(headerName)
if headerValue != "" {
iter.(*route.HashBased).HeaderValue = headerValue
} else {
iter = reqInfo.RoutePool.FallBackToDefaultLoadBalancing(rt.config.LoadBalance, rt.logger, stickyEndpointID, mustBeSticky, rt.config.LoadBalanceAZPreference, rt.config.Zone)
}
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer if we could keep the round tripper clear of any algorithm specific logic. Could we instead modify the function which returns the endpoint iterator to also accept the request as a whole and pass that into the hash-specific implementation? That way we also keep the door open on extending for other properties to hash on as well as remove the need for ugly casts like iter.(*route.HashBased).HeaderValue which can cause nasty bugs.

Comment on lines 3 to 11
/******************************************************************************
* Original github.com/kkdai/maglev/maglev.go
*
* Copyright (c) 2019 Evan Lin (github.com/kkdai)
*
* This program and the accompanying materials are made available under
* the terms of the Apache License, Version 2.0 which is available at
* http://www.apache.org/licenses/LICENSE-2.0.
******************************************************************************/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use // for comments, Apache-2.0 also mandates that you clearly state if you modified the file which it seems like you did.

// new one.
e.Lock()
defer e.Unlock()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please avoid such unrelated changes.

@b1tamara b1tamara force-pushed the hash-based-routing-with-overload branch from e9cbe0c to 7b456f8 Compare December 3, 2025 09:32
@hoffmaen hoffmaen force-pushed the hash-based-routing-with-overload branch from 9f21f6b to 513c575 Compare December 9, 2025 08:24
Copy link
Contributor

@peanball peanball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First set of comments. I have not looked at the implementation of maglev and the hash based instance determination.

I've focussed on the integration with the existing Gorouter code, route lookup, LB algo handling, etc.

switch defaultLBAlgo {
case config.LOAD_BALANCE_LC:
logger.Debug("endpoint-iterator-with-least-connection-lb-algo")
return NewLeastConnection(logger, p, initial, mustBeSticky, azPreference == config.AZ_PREF_LOCAL, az)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you pass in azPreference as boolean, like for the other LB algos, you don't need to repeat the condition in-line 3 times.

}

func (p *EndpointPool) Endpoints(logger *slog.Logger, initial string, mustBeSticky bool, azPreference string, az string) EndpointIterator {
func (p *EndpointPool) Endpoints(logger *slog.Logger, initial string, mustBeSticky bool, azPreference string, az string, globalLB string, request *http.Request) EndpointIterator {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're passing in the full request to then dynamically fetch the hash based header. Why not pass an optional reference to *HashRoutingProperties? Parsing then can be done at the call site where you have the request anyway.

Comment on lines +484 to +487
if p.HashRoutingProperties == nil || request.Header.Get(p.HashRoutingProperties.Header) == "" {
logger.Error("hash-routing-properties-missing", slog.String("host", p.Host()))
return p.FallBackToDefaultLoadBalancing(globalLB, logger, initial, mustBeSticky, azPreference, az)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for the sake of clarity you could pull this out into func validateHashBasedInputs(poolProperties *HashBasedProperties, requestProperties *HashRoutingValues, logger *slog.Logger) error
This does not return an error if everything is fine, but logs the error.

Logic then reads:

  • if not validated, return fallback (Early exit)
  • otherwise continue doing hash based stuff.

No change to the logic as it is.

Alternatively you could pull out the whole logic and just have it return the proper LB algo implementation (Fallback or hash-based) in the switch block.

logger.Debug("endpoint-iterator-with-round-robin-lb-algo")
return NewRoundRobin(logger, p, initial, mustBeSticky, azPreference == config.AZ_PREF_LOCAL, az)
}
return NewRoundRobin(logger, p, initial, mustBeSticky, azPreference == config.AZ_PREF_LOCAL, az)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could also move this to the default: to make it clear in the switch. There is also no debug log for this branch.

}

func (p *EndpointPool) FallBackToDefaultLoadBalancing(defaultLBAlgo string, logger *slog.Logger, initial string, mustBeSticky bool, azPreference string, az string) EndpointIterator {
logger.Info("hash-based-routing-header-not-found",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The information could all be in a single log:

  • hash based was intended
  • routing header not found, and the info which one
  • the selected fallback (i.e. defaultLBAlgo)

This log could also be in a deferred closure, in case you need to set any data in the switch.

logger.Error("hash-routing-properties-missing", slog.String("host", p.Host()))
return p.FallBackToDefaultLoadBalancing(globalLB, logger, initial, mustBeSticky, azPreference, az)
}
headerValue := request.Header.Get(p.HashRoutingProperties.Header)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this would also remove looking up the header twice.

As @maxmoehl mentioned, if we change/extend the logic for hash based routing, it would be good if the data and the decision somewhat central (in a data structure and a function), and not referenced with individual fields in 20 different locations.

Comment on lines +52 to +55
endpoint := h.findEndpointIfStickySession()
if endpoint == nil && h.mustBeSticky {
return nil
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this not be turned around? if h.mustBeSticky is false we can skip the whole findEndpointIfStickySession() call?

The rest of the non-sticky handling remains as is.


p := r.Lookup("foo.com")
Expect(p.Endpoints(logger.Logger, "", false, azPreference, az).Next(0).ModificationTag).To(Equal(modTag))
Expect(p.Endpoints(logger.Logger, "", false, azPreference, az, r.DefaultLoadBalancingAlgorithm, nil).Next(0).ModificationTag).To(Equal(modTag))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just looking at these changes that added 2 parameters, and may add more in the future, we might consider having an optional "*RoutingOptions" struct or something that allows passing extra info if needed. This might include the default LB algo, hash based routing config, and extracted values for now.

For other LB algos this could then contain other information. Could be an interface that is then implemented by different structs.

Arguably, the AZ affinity configuration falls into the same category.

if err != nil {
return nil, fmt.Errorf("could not find reqInfo in context")
}
stickyEndpointID, mustBeSticky := GetStickySession(request, stickySessionCookieNames, authNegotiateSticky)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similar to GetStickySession() there could be a GetRoutingOptions that extracts info from the request. See other comments where I complain about passing the request around.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Development

Successfully merging this pull request may close these issues.

5 participants