toss-next-challenge-solution

This repository contains the 5th-place model for the Ad Click-Through Rate (CTR) Prediction Competition with Toss.

Setting up environment

We use Poetry to manage this repository's dependencies.

Use Poetry version 2.1.1.

$ poetry --version
Poetry (version 2.1.1)

Python version should be 3.11.x.

$ python --version
Python 3.11.11

If your Python version is lower than 3.11, try installing the required version using pyenv.
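If you prefer to verify the interpreter from inside Python (for example at the top of a script), a minimal sketch could look like the following. The helper name is_supported_python is hypothetical and is not part of this repository.

```python
import sys

REQUIRED = (3, 11)  # (major, minor) required by this README


def is_supported_python(version_info=None):
    """Return True when the given (or running) interpreter is Python 3.11.x."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == REQUIRED


if __name__ == "__main__":
    if not is_supported_python():
        found = ".".join(map(str, sys.version_info[:3]))
        raise SystemExit(f"Python 3.11.x is required, found {found}")
    print("Python version OK")
```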

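Activate the virtual environment. In Poetry 2.x, this command prints the activation command for your shell; run the printed command to activate the environment.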

$ poetry env activate

If your global Python version is not 3.11, run the following command.

$ poetry env use python3.11

You can check the virtual environment's path and the path of its Python executable with the following command.

$ poetry env info

After setting up the Python version, run the following command to install all required packages from poetry.lock.

$ poetry install

Setting up git hook

Set up automatic linting using the following command:

# This command ensures that linting runs automatically every time you commit code.
$ poetry run pre-commit install

Note

If you want to add a package to pyproject.toml, use the following command.

$ poetry add "package==1.0.0"

Then update poetry.lock so that all repository members share the same environment.

$ poetry lock

Architecture of Our Solution

Ensemble Architecture

Boosting

LightGBM DART

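DART (Dropouts meet Multiple Additive Regression Trees) introduces dropout into gradient boosting: at each iteration it randomly drops a subset of the existing trees before fitting the next one, which helps prevent overfitting and improves generalization.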
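To make the dropout idea concrete, here is a toy sketch of the weight bookkeeping in one DART iteration, following the normalization described in the original DART paper: when k trees are dropped, the new tree is scaled by 1/(k+1) and the dropped trees by k/(k+1). It ignores the learning rate and does not fit any trees; the function name dart_step is hypothetical, and this is not LightGBM's implementation.

```python
import random


def dart_step(tree_weights, drop_rate, rng):
    """Apply one DART iteration to a list of per-tree weights.

    Returns (updated_weights, dropped_indices). The weight of the newly
    added tree is the last element of updated_weights.
    """
    # Each existing tree is dropped independently with probability drop_rate.
    dropped = [i for i in range(len(tree_weights)) if rng.random() < drop_rate]
    k = len(dropped)
    if k == 0:
        # Nothing dropped: identical to an ordinary boosting step.
        return tree_weights + [1.0], dropped
    dropped_set = set(dropped)
    shrink = k / (k + 1)
    # Dropped trees are shrunk; trees that were kept are left unchanged.
    updated = [w * shrink if i in dropped_set else w
               for i, w in enumerate(tree_weights)]
    # The new tree was fit without the dropped trees, so it is scaled down.
    updated.append(1.0 / (k + 1))
    return updated, dropped


if __name__ == "__main__":
    rng = random.Random(0)
    weights = []
    for _ in range(5):
        weights, dropped = dart_step(weights, drop_rate=0.3, rng=rng)
        print(dropped, [round(w, 3) for w in weights])
```

The rescaling keeps the combined contribution of the dropped trees and the new tree at roughly the scale the dropped trees had before, so the ensemble's output does not overshoot.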

We found DART particularly effective for this task because:

- It handles sparse and high-cardinality categorical features efficiently.
- It achieves stable validation performance across folds.
- It generalizes well under data drift and imbalanced conditions.

While we also experimented with XGBoost, CatBoost, and Deep Cross Network (DCN), DART consistently served as the backbone model and delivered the highest overall reliability. The final submissions were built around DART and refined by blending it with complementary models in an ensemble.
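For readers who have not used DART in LightGBM, the sketch below shows how the mode is typically switched on through the parameter dictionary. The keys are real LightGBM parameters, and drop_rate, max_drop, and skip_drop are set to LightGBM's documented defaults; the remaining values and the helper make_dart_params are illustrative assumptions, not the settings in config/models/lightgbm.yaml.

```python
def make_dart_params(**overrides):
    """Build an illustrative LightGBM parameter dict with DART enabled."""
    params = {
        "objective": "binary",       # CTR prediction is binary classification
        "metric": "binary_logloss",
        "boosting_type": "dart",     # enable DART instead of plain GBDT
        "drop_rate": 0.1,            # fraction of trees dropped per iteration
        "max_drop": 50,              # upper bound on trees dropped per iteration
        "skip_drop": 0.5,            # probability of skipping dropout entirely
        "learning_rate": 0.05,       # assumed value, for illustration only
        "num_leaves": 63,            # assumed value, for illustration only
    }
    params.update(overrides)
    return params


if __name__ == "__main__":
    # The dict would then be passed to lightgbm.train(params, train_set).
    print(make_dart_params(drop_rate=0.2))
```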

Deep Cross Network Architecture

Seq-aware DCN

DCN with a sequence feature encoded by multi-head attention (MHA).

Seq-aware DCN V2

DCN V2 with a sequence feature encoded by multi-head attention (MHA).
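As a reminder of what distinguishes the two variants, the sketch below implements a single cross layer from each in plain Python, using the formulas from the DCN and DCN V2 papers. It assumes that the MHA-encoded sequence feature is simply concatenated with the other feature embeddings to form the input x0; how the models in src actually combine them may differ, and seq_vec here is a hand-written stand-in rather than the output of a real attention encoder.

```python
def cross_layer_v1(x0, xl, w, b):
    """DCN cross layer: x_{l+1} = x0 * (w . xl) + b + xl (w is a vector)."""
    s = sum(wi * xi for wi, xi in zip(w, xl))
    return [x0i * s + bi + xli for x0i, bi, xli in zip(x0, b, xl)]


def cross_layer_v2(x0, xl, W, b):
    """DCN V2 cross layer: x_{l+1} = x0 * (W xl + b) + xl (W is a matrix,
    the product with x0 is element-wise)."""
    proj = [sum(wij * xj for wij, xj in zip(row, xl)) + bi
            for row, bi in zip(W, b)]
    return [x0i * pi + xli for x0i, pi, xli in zip(x0, proj, xl)]


if __name__ == "__main__":
    tabular = [1.0]   # stand-in for embedded tabular features
    seq_vec = [2.0]   # stand-in for the MHA-encoded sequence feature
    x0 = tabular + seq_vec
    print(cross_layer_v1(x0, x0, w=[0.5, 0.5], b=[0.0, 0.0]))
    print(cross_layer_v2(x0, x0, W=[[1.0, 0.0], [0.0, 1.0]], b=[0.0, 0.0]))
```

The only difference is the learned weight: a vector in DCN, which yields one scalar per layer, versus a full matrix in DCN V2, which gives each output dimension its own projection.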

Implemented models

| Model | CV Score | Public LB | Private LB | Chosen for Ensemble |
| --- | --- | --- | --- | --- |
| Sigmoid Ensemble | - | 0.35126 | 0.35073 | FINAL |
| LightGBM | 0.35501 | 0.35024 | 0.34960 | O |
| XGBoost | 0.35489 | 0.34788 | 0.34757 | O |
| dcn_v2_seq | 0.35375 | 0.34471 | 0.34452 | O |
| dcn_seq | 0.35345 | 0.34645 | 0.34602 | O |
| CatBoost | 0.34348 | 0.34804 | 0.34790 | X |
| dcn_v2 | 0.35395 | 0.34512 | NA | X |
| dcn | | 0.34709 | 0.346708 | X |
| ffm_seq | | NA | NA | X |
| ffm | | 0.34579 | 0.34565 | X |
| xdeepfm_seq | | NA | NA | X |
| xdeepfm | 0.34861 | 0.34321 | NA | X |
| deepfm_seq | 0.35219 | 0.34497 | NA | X |
| deepfm | | NA | NA | X |
| fm_seq | | NA | NA | X |
| fm | | NA | NA | X |
| fibinet | | NA | NA | X |
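The table does not spell out how the "Sigmoid Ensemble" is computed. One common reading of the term is to convert each model's predicted probability to a logit, take a weighted average of the logits, and map the result back through the sigmoid. The sketch below implements that reading as an assumption; the repository's actual blending code and weights may differ, and the function names are hypothetical.

```python
import math


def logit(p, eps=1e-7):
    """Inverse sigmoid, with clipping to avoid log(0)."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_ensemble(predictions, weights=None):
    """Blend per-model probability lists by averaging in logit space.

    predictions: list of lists, one list of probabilities per model,
                 all of the same length.
    weights:     optional per-model weights (uniform when omitted).
    """
    n_models = len(predictions)
    if weights is None:
        weights = [1.0] * n_models
    total = sum(weights)
    blended = []
    for row in zip(*predictions):
        z = sum(w * logit(p) for w, p in zip(weights, row)) / total
        blended.append(sigmoid(z))
    return blended


if __name__ == "__main__":
    model_a = [0.10, 0.40, 0.90]  # made-up probabilities for illustration
    model_b = [0.20, 0.30, 0.80]
    print(sigmoid_ensemble([model_a, model_b], weights=[0.6, 0.4]))
```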

Experiment Configurations

| Model | Config Path |
| --- | --- |
| LightGBM | config/models/lightgbm.yaml |
| XGBoost | config/models/xgboost.yaml |
| CatBoost | config/models/catboost.yaml |
| All FM Models | config/models/fm.yaml |

How to Run Our Solution

1. Prepare the input data

Place the following files inside the input/toss-next-challenge/ directory:

├── input
   └── toss-next-challenge
       ├── sample_submission.csv
       ├── test.parquet
       └── train.parquet
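Before launching training, it can be handy to confirm that the three files are in place. The following is a small sketch of such a check; the helper missing_inputs is hypothetical and is not part of the repository's scripts.

```python
from pathlib import Path

# File names taken from the directory layout above.
REQUIRED_FILES = ("sample_submission.csv", "test.parquet", "train.parquet")


def missing_inputs(root="input/toss-next-challenge"):
    """Return the required file names that are not present under root."""
    root = Path(root)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_inputs()
    if missing:
        raise SystemExit("Missing input files: " + ", ".join(missing))
    print("All input files found")
```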

2. Run the following script:

train
```
$ sh scripts/train.sh
```
inference
```
$ sh scripts/inference.sh
```

3. The final submission file will be generated in the output folder

The final submission is tree4-dcn2-mha-concatmod-sigmoid-ensemble.csv.
Please use this CSV file for evaluation.
