Two-stage framework reconstructs sharp 4D scenes from blurry handheld videos


MoBluRF: A framework for creating sharp 4D reconstructions from blurry videos
Motion deblurring novel view synthesis results. We propose a novel motion deblurring NeRF for blurry monocular videos, called MoBluRF, which significantly outperforms previous SOTA NeRF methods, trained on the newly synthesized Blurry iPhone dataset. Credit: IEEE Transactions on Pattern Analysis and Machine Intelligence (2025). DOI: 10.1109/tpami.2025.3574644

Neural Radiance Fields (NeRF) is a fascinating technique that creates three-dimensional (3D) representations of a scene from a set of two-dimensional (2D) images, captured from different angles. It works by training a deep neural network to predict the color and density at any point in 3D space.

To do this, it casts imaginary light rays from the camera through each pixel in all input images, sampling points along those rays with their 3D coordinates and viewing direction. Using this information, NeRF reconstructs the scene in 3D and can render it from entirely new perspectives, a process known as novel view synthesis (NVS).

Beyond still images, a video can also be used, with each frame of the video treated as a static image. However, existing methods are highly sensitive to the quality of the videos.

Videos captured with a single camera, such as those from a phone or drone, inevitably suffer from motion blur caused by fast object motion or camera shake. This makes it difficult to create sharp, dynamic NVS. This is because most existing deblurring-based NVS methods are designed for static multi-view images, which fail to account for global camera and local object motion. In addition, blurry videos often lead to inaccurate camera pose estimations and loss of geometric precision.

To address these issues, a research team jointly led by Assistant Professor Jihyong Oh from the Graduate School of Advanced Imaging Science (GSIAM) at Chung-Ang University (CAU) in Korea, and Professor Munchurl Kim from Korea Advanced Institute of Science and Technology (KAIST), Korea, along with Mr. Minh-Quan Viet Bui, Mr. Jongmin Park, developed MoBluRF, a two-stage motion deblurring method for NeRFs.

“Our framework is capable of reconstructing sharp 4D scenes and enabling NVS from blurry monocular videos using motion decomposition, while avoiding mask supervision, significantly advancing the NeRF field,” explains Dr. Oh. Their study is published in IEEE Transactions on Pattern Analysis and Machine Intelligence.

MoBluRF consists of two main stages: Base Ray Initialization (BRI) and Motion Decomposition-based Deblurring (MDD). Existing deblurring-based NVS methods attempt to predict hidden sharp light rays in blurry images, called latent sharp rays, by transforming a ray called the base ray. However, directly using input rays in blurry images as base rays can lead to inaccurate prediction. BRI addresses this issue by roughly reconstructing dynamic 3D scenes from blurry videos and refining the initialization of “base rays” from imprecise camera rays.

Next, these base rays are used in the MDD stage to accurately predict latent sharp rays through an Incremental Latent Sharp-rays Prediction (ILSP) method. ILSP incrementally decomposes motion blur into global camera motion and local object motion components, greatly improving the deblurring accuracy. MoBluRF also introduces two novel loss functions, one that separates static and dynamic regions without motion masks, and another that improves geometric accuracy of dynamic objects, two areas where previous methods struggled.

Owing to this innovative design, MoBluRF outperforms state-of-the-art methods with significant margins in various datasets, both quantitatively and qualitatively. It is also robust against varying degrees of blur.

“By enabling deblurring and 3D reconstruction from casual handheld captures, our framework enables smartphones and other consumer devices to produce sharper and more immersive content,” remarks Dr. Oh. “It could also help create crisp 3D models of shaky footages from museums, improve scene understanding and safety for robots and drones, and reduce the need for specialized capture setups in virtual and augmented reality.”

MoBluRF marks a new direction for NeRFs, enabling high-quality 3D reconstructions from ordinary blurry videos recorded with everyday devices.

More information:
Minh-Quan Viet Bui et al, MoBluRF: Motion Deblurring Neural Radiance Fields for Blurry Monocular Video, IEEE Transactions on Pattern Analysis and Machine Intelligence (2025). DOI: 10.1109/tpami.2025.3574644

Provided by
Chung Ang University

Citation:
Two-stage framework reconstructs sharp 4D scenes from blurry handheld videos (2025, September 19)
retrieved 19 September 2025
from https://techxplore.com/news/2025-09-stage-framework-reconstructs-sharp-4d.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no
part may be reproduced without the written permission. The content is provided for information purposes only.



Leave a Comment