Label-free Iterative Retriever Training for Multi-hop Question Answering with Relevance-Consistency Supervision

leeds1219/ReSCORE

Korea University    MIIL    Naver AI Lab    Naver Cloud    Richmond University


ReSCORE: Label-free Iterative Retriever Training for Multi-hop Question Answering with Relevance-Consistency Supervision

arXiv | Project

by Dosung Lee*, Wonjun Oh*, Boyoung Kim, Minyoung Kim, Joonsuk Park†, Paul Hongsuck Seo

Introduction

Directly mapping complex problems ($x$) to their final solutions ($y$) poses a significant challenge, often requiring an intermediate reasoning step—a latent variable ($z$)—to bridge the gap. However, explicit supervision for these intermediate thoughts is rarely available. Instead of relying on ground-truth reasoning labels, our approach leverages the model's confidence in the final answer ($y$) as an intrinsic reward signal.
Through this approach, the model learns to autonomously generate the most effective intermediate steps ($z$) that maximize downstream solvability.
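One way to write this idea down, as a hedged formalization rather than the paper's exact objective: the preferred intermediate step is the one under which the model is most confident in the final answer. The notation $p_{\mathrm{LM}}$ for the model's answer likelihood is an assumption introduced here for illustration.

```latex
z^{*} \;=\; \arg\max_{z} \; p_{\mathrm{LM}}(y \mid x, z)
```

In the multi-hop QA setting, one reading is that $x$ is the question, $z$ is the set of retrieved documents, and $y$ is the answer.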

This is our official implementation of ReSCORE: Label-free Iterative Retriever Training for Multi-hop Question Answering with Relevance-Consistency Supervision!

Multi-hop question answering (MHQA) requires reasoning across multiple documents, making dense retriever training challenging due to query variability. We propose ReSCORE, a method that trains dense retrievers without labeled data by leveraging LLMs to assess document relevance and consistency with answers.
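The following is a minimal sketch of how LLM-derived scores could supervise a retriever without document labels. It assumes the LLM yields one log-likelihood per candidate document (higher means the document is more relevant to the question and more consistent with the answer) and that the retriever is trained to match the resulting distribution with a KL-divergence loss. The function names, the KL loss, and all numbers are hypothetical illustrations, not code from this repository.

```python
import math

def softmax(scores, temperature=1.0):
    """Turn raw scores into a probability distribution (numerically stable)."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted) over two distributions of equal length."""
    return sum(
        t * (math.log(t + eps) - math.log(p + eps))
        for t, p in zip(target, predicted)
    )

# Hypothetical LLM log-likelihoods for three candidate documents.
llm_log_likelihoods = [-2.0, -5.0, -9.0]
# Hypothetical retriever similarity scores for the same documents.
retriever_scores = [0.3, 0.9, 0.1]

target = softmax(llm_log_likelihoods)   # pseudo-label distribution from the LLM
predicted = softmax(retriever_scores)   # retriever's current distribution
loss = kl_divergence(target, predicted) # training signal for the retriever

print(f"target={target}")
print(f"loss={loss:.4f}")
```

In this toy example the retriever ranks the second document highest while the LLM prefers the first, so the loss is positive. Minimizing it would push the retriever toward the LLM's ranking.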

For further details, please check out our Paper and our Project page.

Demo

Move app.py from the demo folder into the ReSCORE root directory, then run:

python app.py

"""
Examples)

Which company owns the manufacturer of Learjet 60?

In which county is Southern Maryland Electric Cooperative headquartered?

What is another notable work made by the author of Miss Sara Sampson?

What is the seat of the county where Van Hook Township is located?

The Unwinding author volunteered for which organisation?

...
"""

🔥TODO

  • Check Typo ...

Installation

pip install -r requirements.txt

You need permission to access the Llama-3.1-8B-Instruct model. Alternatively, you can modify the script to use your own LLM.

We conducted all experiments using Python 3.10.12 on an NVIDIA A100 HBM2 40GB PCIe GPU. The full environment is listed in Packages; please refer to it if any issues arise.

Data Preparation

# Download MHQA datasets
sh script/download/multihop_raw_data.sh

# Preprocess and build Retrieval DB
sh script/download/build.sh

Training

# Training
python -m source.run.train \
    --running_name {train} \
    --dataset {dataset}

Model Weights

| Model Weights | Link |
| --- | --- |
| Contriever-MSMARCO | 🔗 Click here |
| IQATR-Musique | 🔗 Click here |
| IQATR-HotpotQA | 🔗 Click here |
| IQATR-2WikiMultiHopQA | 🔗 Click here |

Inference

# Inference
python -m source.run.inference \
    --method {base_or_iqatr} \
    --running_name {inference} \
    --dataset {dataset}
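If you want to launch inference from Python (for example, to loop over datasets), the command above can be assembled programmatically. This is a small sketch under stated assumptions: the accepted `--method` values are guessed to be `base` and `iqatr` from the `{base_or_iqatr}` placeholder, and the dataset string `hotpotqa` is a hypothetical example, so check the repository's scripts for the exact names. The helper `build_inference_command` is not part of this repository.

```python
import shlex

# Assumed from the {base_or_iqatr} placeholder; verify against the repo.
ALLOWED_METHODS = {"base", "iqatr"}

def build_inference_command(method, running_name, dataset):
    """Return the inference command as an argument list for subprocess.run."""
    if method not in ALLOWED_METHODS:
        raise ValueError(f"unknown method: {method!r}")
    return [
        "python", "-m", "source.run.inference",
        "--method", method,
        "--running_name", running_name,
        "--dataset", dataset,
    ]

# "hotpotqa" is a hypothetical dataset identifier used for illustration.
cmd = build_inference_command("iqatr", "inference", "hotpotqa")
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` from the ReSCORE root directory would then start the run.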

Acknowledgement

This project includes code from Contriever, DPR, and IRCoT.

This research was supported by the following grants:

  • IITP (Institute of Information & Communications Technology Planning & Evaluation)

    • IITP-2025-RS-2020-II201819
    • IITP-2025-RS-2024-00436857
    • IITP-2025-RS-2024-00398115
    • IITP-2025-RS-2025-02263754
    • IITP-2025-RS-2025-02304828
  • NRF (National Research Foundation of Korea)

    • NRF-2021R1A6A1A03045425
  • KOCCA (Korea Creative Content Agency)

    • RS-2024-00345025

Funded by the Korea government (MSIT, MOE, and MSCT).

Citation

@inproceedings{lee-etal-2025-rescore,
    title = "{R}e{SCORE}: Label-free Iterative Retriever Training for Multi-hop Question Answering with Relevance-Consistency Supervision",
    author = "Lee, Dosung  and
      Oh, Wonjun  and
      Kim, Boyoung  and
      Kim, Minyoung  and
      Park, Joonsuk  and
      Seo, Paul Hongsuck",
    editor = "Che, Wanxiang  and
      Nabende, Joyce  and
      Shutova, Ekaterina  and
      Pilehvar, Mohammad Taher",
    booktitle = "Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = jul,
    year = "2025",
    address = "Vienna, Austria",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.acl-long.16/",
    doi = "10.18653/v1/2025.acl-long.16",
    pages = "341--359",
    ISBN = "979-8-89176-251-0",
    abstract = "Multi-hop question answering (MHQA) involves reasoning across multiple documents to answer complex questions. Dense retrievers typically outperform sparse methods like BM25 by leveraging semantic embeddings in many tasks; however, they require labeled query-document pairs for fine-tuning, which poses a significant challenge in MHQA due to the complexity of the reasoning steps. To overcome this limitation, we introduce Retriever Supervision with Consistency and Relevance (ReSCORE), a novel method for training dense retrievers for MHQA without the need for labeled documents. ReSCORE leverages large language models to measure document-question relevance with answer consistency and utilizes this information to train a retriever within an iterative question-answering framework. Evaluated on three MHQA benchmarks, our extensive experiments demonstrate the effectiveness of ReSCORE, with significant improvements in retrieval performance that consequently lead to state-of-the-art Exact Match and F1 scores for MHQA."
}
