Physics-Informed Deep Prior for Reconstruction of Early-part Room Impulse Responses

Horikoshi, Koki; Sato, Gen; Tsunokuni, Izumi; Ikeda, Yusuke

doi:10.51094/jxiv.2823

Authors

Horikoshi, Koki Tokyo Denki University, Information Systems and Multimedia Design
Sato, Gen Tokyo Denki University, Doctoral Programs Graduate School of Advanced Science and Technology https://orcid.org/0009-0002-9431-6200
Tsunokuni, Izumi Tokyo Denki University, Department of Information Systems and Multimedia Design
Ikeda, Yusuke Tokyo Denki University, Department of Information Systems and Multimedia Design https://orcid.org/0000-0001-9092-0537

Keywords:

Sound Field Reconstruction, Spatial Audio, Convolutional Neural Networks, PINNs, Unsupervised Learning

Abstract

Accurate reconstruction of sound fields from sparse measurements remains a fundamental challenge in acoustics due to the spatial sampling (Nyquist) limit. Unsupervised deep learning approaches such as deep prior (DP) exploit the inductive bias of network architectures without requiring large-scale training data; however, they may violate physical consistency and degrade reconstruction accuracy. In this letter, we propose a physics-informed sound field reconstruction method, termed physics-informed deep prior (PIDP), which incorporates the acoustic wave equation into DP as a regularizer. By augmenting the network's structural bias with a physics-informed loss, PIDP encourages reconstructed early-part room impulse responses (RIRs) to follow the underlying laws of sound propagation. Simulation results show that PIDP consistently outperforms conventional DP across various SNR conditions, improving NMSE by 1.5–2.9 dB. In addition, learning-curve analysis indicates that reconstruction stability is closely related to the decay of the physics-informed loss, highlighting the importance of the loss-weight parameter.
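The idea of adding a wave-equation term to a data-fit loss, and of scoring the result by NMSE in dB, can be illustrated with a minimal sketch. This is not the authors' implementation: it replaces the network output with a plain grid of pressure values, uses a 1D second-order finite-difference residual as a stand-in for the physics-informed loss, and the names `nmse_db`, `wave_residual`, `pidp_style_loss`, `weight`, and the default sound speed `c=343.0` are assumptions made for illustration only.

```python
import math


def nmse_db(estimate, reference):
    """Normalized mean squared error in dB between two equal-length sequences."""
    err = sum((e - r) ** 2 for e, r in zip(estimate, reference))
    ref = sum(r ** 2 for r in reference)
    if err == 0.0:
        return float("-inf")  # perfect reconstruction
    return 10.0 * math.log10(err / ref)


def wave_residual(p, dx, dt, c=343.0):
    """Mean squared residual of the 1D wave equation p_tt - c^2 p_xx = 0.

    p[n][i] is the pressure at time step n and position index i.
    Only interior grid points are used (central differences).
    """
    total, count = 0.0, 0
    for n in range(1, len(p) - 1):
        for i in range(1, len(p[n]) - 1):
            p_tt = (p[n + 1][i] - 2.0 * p[n][i] + p[n - 1][i]) / dt ** 2
            p_xx = (p[n][i + 1] - 2.0 * p[n][i] + p[n][i - 1]) / dx ** 2
            total += (p_tt - c ** 2 * p_xx) ** 2
            count += 1
    return total / count


def pidp_style_loss(p, observed, dx, dt, weight, c=343.0):
    """Data-fit term at measured grid points plus a weighted physics term.

    observed maps (n, i) grid indices to measured pressure values
    (hypothetical data layout chosen for this sketch).
    """
    data = sum((p[n][i] - v) ** 2 for (n, i), v in observed.items()) / len(observed)
    return data + weight * wave_residual(p, dx, dt, c)
```

In this sketch, `weight` plays the role of the loss-weight parameter mentioned in the abstract: a larger value penalizes fields that fit the measurements but do not propagate like a wave, while a value of zero reduces the loss to the data-fit term alone.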

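Conflict of Interest Disclosure

The authors have no conflicts of interest (COI) to disclose with any company, corporate organization, or for-profit entity in relation to this paper.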
