This repository contains the 5th place model from the Ad Click-Through Rate (CTR) Prediction Competition with Toss.
We use Poetry to manage the repository's dependencies.
Use Poetry version 2.1.1.
$ poetry --version
Poetry (version 2.1.1)
The Python version should be 3.11.x.
$ python --version
Python 3.11.11
If your Python version is lower than 3.11, try installing the required version using pyenv.
Create a virtual environment.
$ poetry env activate
If your global Python version is not 3.11, run the following command.
$ poetry env use python3.11
You can check the virtual environment path and its Python executable path using the following command.
$ poetry env info
After setting up the Python version, run the following command, which installs all required packages from poetry.lock.
$ poetry install
Set up automatic linting using the following commands:
# This command will ensure linting runs automatically every time you commit code.
poetry run pre-commit install
If you want to add a package to pyproject.toml, use the following command.
$ poetry add "package==1.0.0"
Then update poetry.lock so that all repository members share the same environment.
$ poetry lock
LightGBM DART
DART introduces dropout into gradient boosting, randomly dropping trees during training to prevent overfitting and improve generalization.
We found DART particularly effective for this task because:
- It handles sparse and high-cardinality categorical features efficiently.
- It achieves stable validation performance across folds.
- It generalizes well under data drift and imbalanced conditions.
While we also experimented with XGBoost, CatBoost, and Deep Cross Network (DCN), DART consistently served as the backbone model and delivered the highest overall reliability. Final submissions were built around DART and refined through ensemble blending with complementary models.
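As a rough illustration of the dropout idea, the parameters below show how DART is typically enabled in LightGBM. The values for `drop_rate`, `skip_drop`, `max_drop`, and `learning_rate` are placeholder assumptions, not the settings used for the submission, and the helper names `dart_params` and `train_dart` are hypothetical.

```python
def dart_params(drop_rate: float = 0.1, skip_drop: float = 0.5, max_drop: int = 50) -> dict:
    """Build a LightGBM parameter dict that enables DART boosting.

    All values are illustrative assumptions, not the competition settings.
    """
    if not 0.0 <= drop_rate <= 1.0:
        raise ValueError("drop_rate must be in [0, 1]")
    if not 0.0 <= skip_drop <= 1.0:
        raise ValueError("skip_drop must be in [0, 1]")
    return {
        "objective": "binary",    # CTR prediction is binary classification
        "boosting_type": "dart",  # switch from the default GBDT to DART
        "drop_rate": drop_rate,   # fraction of earlier trees dropped per iteration
        "skip_drop": skip_drop,   # probability of skipping dropout in an iteration
        "max_drop": max_drop,     # upper bound on trees dropped per iteration
        "learning_rate": 0.05,    # placeholder value
    }


def train_dart(X, y, num_boost_round: int = 200):
    """Train a DART booster on features X and binary labels y (needs lightgbm)."""
    import lightgbm as lgb  # imported lazily so dart_params() works without it

    train_set = lgb.Dataset(X, label=y)
    return lgb.train(dart_params(), train_set, num_boost_round=num_boost_round)
```

A lower `skip_drop` applies dropout in more iterations, which regularizes more strongly at the cost of slower convergence.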
Seq-aware DCN
DCN with a sequence feature encoded by multi-head attention (MHA).
Seq-aware DCN V2
DCN V2 with a sequence feature encoded by multi-head attention (MHA).
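The architecture is not spelled out here, so the following is only a sketch of the two cross-layer forms that distinguish DCN from DCN V2, written with NumPy. It assumes the sequence feature has already been encoded by multi-head attention and pooled into a fixed-length vector (`seq_vec`, a hypothetical name) that is concatenated with the other feature embeddings. Function names and shapes are illustrative.

```python
import numpy as np


def cross_layer_v1(x0, xl, w, b):
    """DCN cross layer: x_{l+1} = x0 * (xl . w) + b + xl, with w a weight vector."""
    return x0 * (xl @ w)[:, None] + b + xl


def cross_layer_v2(x0, xl, W, b):
    """DCN V2 cross layer: x_{l+1} = x0 * (xl W^T + b) + xl, with W a full matrix."""
    return x0 * (xl @ W.T + b) + xl


rng = np.random.default_rng(0)
batch, d_feat, d_seq = 4, 6, 2

feat_emb = rng.normal(size=(batch, d_feat))  # embeddings of non-sequence features
seq_vec = rng.normal(size=(batch, d_seq))    # assumed: pooled MHA output of the sequence
x0 = np.concatenate([feat_emb, seq_vec], axis=1)
d = x0.shape[1]

out_v1 = cross_layer_v1(x0, x0, rng.normal(size=d), np.zeros(d))
out_v2 = cross_layer_v2(x0, x0, rng.normal(size=(d, d)), np.zeros(d))
print(out_v1.shape, out_v2.shape)
```

The V1 layer scales `x0` by a single scalar per example, while the V2 layer uses a full weight matrix and so can weight each feature interaction separately.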
| Model | CV Score | Public LB | Private LB | Chosen for Ensemble |
|---|---|---|---|---|
| Sigmoid Ensemble | - | 0.35126 | 0.35073 | FINAL |
| LightGBM | 0.35501 | 0.35024 | 0.34960 | O |
| XGBoost | 0.35489 | 0.34788 | 0.34757 | O |
| dcn_v2_seq | 0.35375 | 0.34471 | 0.34452 | O |
| dcn_seq | 0.35345 | 0.34645 | 0.34602 | O |
| CatBoost | 0.34348 | 0.34804 | 0.34790 | X |
| dcn_v2 | 0.35395 | 0.34512 | NA | X |
| dcn | 0.34709 | 0.346708 | X | |
| ffm_seq | NA | NA | X | |
| ffm | 0.34579 | 0.34565 | X | |
| xdeepfm_seq | NA | NA | X | |
| xdeepfm | 0.34861 | 0.34321 | NA | X |
| deepfm_seq | 0.35219 | 0.34497 | NA | X |
| deepfm | NA | NA | X | |
| fm_seq | NA | NA | X | |
| fm | NA | NA | X | |
| fibinet | NA | NA | X | |
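The table does not define "Sigmoid Ensemble". One common reading, assumed here, is to blend model outputs in logit space and map the result back through a sigmoid. The sketch below illustrates that reading only; the function names, the equal weights, and the sample predictions are hypothetical, and the repository's actual blending may differ.

```python
import math


def logit(p: float, eps: float = 1e-7) -> float:
    """Inverse sigmoid, clipped to avoid infinities at 0 and 1."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_ensemble(preds, weights=None):
    """Blend per-model probability lists: weighted mean of logits, then sigmoid."""
    n_models = len(preds)
    if weights is None:
        weights = [1.0] * n_models  # equal weights by default
    if len(weights) != n_models:
        raise ValueError("need one weight per model")
    total = sum(weights)
    blended = []
    for row in zip(*preds):  # one row = all models' predictions for one sample
        z = sum(w * logit(p) for w, p in zip(weights, row)) / total
        blended.append(sigmoid(z))
    return blended


# Hypothetical predictions from two models for three impressions.
model_a = [0.10, 0.50, 0.90]
model_b = [0.20, 0.50, 0.80]
print(sigmoid_ensemble([model_a, model_b]))
```

Averaging in logit space gives more influence to confident predictions near 0 or 1 than a plain arithmetic mean of probabilities would.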
| Model | Config Path |
|---|---|
| LightGBM | config/models/lightgbm.yaml |
| XGBoost | config/models/xgboost.yaml |
| CatBoost | config/models/catboost.yaml |
| All FM Models | config/models/fm.yaml |
Place the following files inside the input/toss-next-challenge/ directory:
└── input
    └── toss-next-challenge
        ├── sample_submission.csv
        ├── test.parquet
        └── train.parquet
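Before training, it can help to confirm that this layout is in place. The check below is a sketch and not part of the repository; the helper name `missing_input_files` is hypothetical, and the default path assumes the script is run from the repository root.

```python
from pathlib import Path

REQUIRED_FILES = ("sample_submission.csv", "test.parquet", "train.parquet")


def missing_input_files(data_dir="input/toss-next-challenge"):
    """Return the required file names that are not present in data_dir."""
    root = Path(data_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_input_files()
    if missing:
        print("Missing input files:", ", ".join(missing))
    else:
        print("All input files found.")
```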
- train
  $ sh scripts/train.sh
- inference
  $ sh scripts/inference.sh
- tree4-dcn2-mha-concatmod-sigmoid-ensemble.csv is the final submission.
- Please use this CSV file for evaluation.


