R2REst: A Novel Deep Learning Framework for Estimating Respiration Rate from Respiratory Sounds
Authors: Soubhagya Ranjan Hota†, Arka Roy†, and Udit Satija († denotes equal contribution to the paper)
Non-invasive respiratory activity assessment, which includes extracting vitals such as respiration rate (RR), tidal volume, and peak expiratory rate from the airflow signal (AF) as well as detecting adventitious breathing events, is an emerging research area in continuous health monitoring. Recent studies have demonstrated a strong pathological correlation between AFs and respiratory sounds (RSs). In this research, we present a unified deep learning framework, namely R2REst, for RR estimation by synthesizing equivalent electrical impedance tomography (EIT)-based AFs from RSs. The proposed framework comprises four major stages: pre-processing, mel spectrogram generation, mel spectrogram-vision transformer-based AF prediction, and RR estimation by analyzing the frequency spectrum of the predicted AF signal. Experimental results on the RSs from the BRACETS dataset show that the proposed framework outperforms existing works on RR estimation, which utilize either RSs or other bio-acoustic modalities, by obtaining a mean square error (MSE) and mean absolute error (MAE) of 0.001, 0.003, and 0.010, 0.016 (in breaths per minute (BPM)) for tidal breathing followed by deep breathing (TBDB) and cough-speech (TBCS) induced cases.
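The last stage described above, RR estimation from the frequency spectrum of the predicted AF signal, can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `estimate_rr_bpm`, the Hann window, the zero-padding factor, and the 0.05 to 1.0 Hz search band (3 to 60 BPM) are all assumptions chosen for illustration, and the sampling rate `fs` must be supplied by the caller.

```python
import numpy as np


def estimate_rr_bpm(airflow, fs, f_min=0.05, f_max=1.0):
    """Estimate respiration rate in breaths per minute (BPM).

    Hypothetical helper: picks the dominant spectral peak of an airflow
    signal inside an assumed breathing band [f_min, f_max] in Hz.
    """
    x = np.asarray(airflow, dtype=float)
    x = x - x.mean()                    # remove the DC offset
    window = np.hanning(len(x))         # reduce spectral leakage
    n_fft = 8 * len(x)                  # zero-pad for a finer frequency grid
    spectrum = np.abs(np.fft.rfft(x * window, n=n_fft))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)

    # Restrict the peak search to the assumed breathing band.
    band = (freqs >= f_min) & (freqs <= f_max)
    peak_freq = freqs[band][np.argmax(spectrum[band])]
    return 60.0 * peak_freq             # Hz -> breaths per minute


# Synthetic check: a 0.25 Hz sinusoid stands in for a predicted AF signal.
fs = 50.0                               # assumed sampling rate in Hz
t = np.arange(3000) / fs                # 60 seconds of samples
synthetic_af = np.sin(2 * np.pi * 0.25 * t)
rr = estimate_rr_bpm(synthetic_af, fs)
print(f"Estimated RR: {rr:.2f} BPM")
```

A dominant frequency of 0.25 Hz corresponds to 0.25 × 60 = 15 BPM, so the synthetic signal should yield an estimate close to that value. On real signals, the search band would need to match the breathing rates expected in the recordings.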
S. R. Hota, A. Roy and U. Satija, "R2REst: A Unified Deep Learning Framework for Estimating Respiration Rate From Respiratory Sounds," in IEEE Signal Processing Letters, doi: 10.1109/LSP.2025.3578932.
@ARTICLE{11030861,
author={Hota, Soubhagya Ranjan and Roy, Arka and Satija, Udit},
journal={IEEE Signal Processing Letters},
title={R2REst: A Unified Deep Learning Framework for Estimating Respiration Rate From Respiratory Sounds},
year={2025},
volume={},
number={},
pages={1-5},
keywords={Spectrogram;Transformers;Estimation;Long short term memory;Electrical impedance tomography;Computer architecture;Monitoring;Recording;Mean square error methods;Deep learning;Respiratory sounds;respiration rate;mel spectrogram;vision transformer;estimation},
doi={10.1109/LSP.2025.3578932}}


