Multinomial logistic intercepts applied on incorrect scale #11763

@david-cortes

Description

It appears the multinomial logistic objective is calculating the base score on the scale of the response variable (probabilities) and applying it as-is, instead of transforming it to the 'margin' scale.

This should predict the class proportions for every observation:

import numpy as np
import xgboost as xgb

y = np.r_[
    np.zeros(25),
    np.ones(25),
    np.repeat(2, 50)
]
rng = np.random.default_rng(seed=123)
X = rng.standard_normal(size=(100, 10))
# Extreme regularization prevents any tree splits, so the
# predictions come solely from the intercepts (base_score).
model = xgb.XGBClassifier(
    n_estimators=1,
    gamma=1e10,
    min_child_weight=1e10,
).fit(X, y)
model.predict_proba(X[:5])
array([[0.30450436, 0.30450436, 0.39099133],
       [0.30450436, 0.30450436, 0.39099133],
       [0.30450436, 0.30450436, 0.39099133],
       [0.30450436, 0.30450436, 0.39099133],
       [0.30450436, 0.30450436, 0.39099133]], dtype=float32)
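For reference, the proportions an intercept-only model should predict for every row are just the empirical class frequencies of y:

```python
import numpy as np

y = np.r_[np.zeros(25), np.ones(25), np.repeat(2, 50)]
# Empirical class frequencies: the probabilities an intercept-only
# multinomial model should predict for every observation.
proportions = np.bincount(y.astype(int)) / y.size
print(proportions)  # → [0.25 0.25 0.5]
```

The output above clearly does not match these proportions.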

The intercepts themselves are calculated correctly when viewed on the scale of the response variable:

import json
json.loads(
    model.get_booster().save_config()
)["learner"]["learner_model_param"]["base_score"]
'[2.5E-1,2.5E-1,5E-1]'

… but the probabilities are calculated by applying softmax to those numbers directly, without first converting them to the 'margin' scale:

from scipy.special import softmax
softmax(
    json.loads(
        json.loads(
            model.get_booster().save_config()
        )["learner"]["learner_model_param"]["base_score"]
    )
)
array([0.30450434, 0.30450434, 0.39099132])
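For comparison, a minimal sketch of what the correct behavior would look like: log-transforming the stored probabilities to put them on the margin scale before applying softmax recovers the intended class proportions (since softmax(log(p)) = p for any probability vector p).

```python
import numpy as np
from scipy.special import softmax

# base_score as stored in the model config, on the probability scale
base_score = np.array([0.25, 0.25, 0.5])

# Applying softmax directly reproduces the biased predictions:
print(softmax(base_score))          # ≈ [0.3045 0.3045 0.3910]

# Converting to the margin scale (log) first gives the class proportions:
print(softmax(np.log(base_score)))  # → [0.25 0.25 0.5]
```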

And as a result, the predictions are biased.
