• Acta Optica Sinica
  • Vol. 44, Issue 7, 0733001 (2024)
Qi Chen1,2, Zhibao Qin1,2, Xiaoyu Cai1,2, Shijie Li1,2, Zijun Wang1,2, Junsheng Shi1,2,*, and Yonghang Tai1,2,*
Author Affiliations
  • 1School of Physics and Electronic Information, Yunnan Normal University, Kunming 650500, Yunnan, China
  • 2Yunnan Key Laboratory of Optoelectronic Information Technology, Kunming 650500, Yunnan, China
    DOI: 10.3788/AOS231537
    Qi Chen, Zhibao Qin, Xiaoyu Cai, Shijie Li, Zijun Wang, Junsheng Shi, Yonghang Tai. Dynamic Three-Dimensional Reconstruction of Soft Tissue in Neural Radiation Field for Robotic Surgery Simulators[J]. Acta Optica Sinica, 2024, 44(7): 0733001

    Abstract

    Objective

    Reconstructing soft tissue structures from the endoscope position is an important part of training with robotic surgery simulators. Traditional soft tissue reconstruction relies mainly on surface reconstruction algorithms applied to medical imaging datasets such as computed tomography and magnetic resonance imaging. These methods cannot recover the color information of soft tissue models and are poorly suited to complex surgical scenes. We therefore propose a method based on neural radiance fields (NeRF): combined with classic volume rendering, it segments robotic surgery simulator scenes from videos of deformable soft tissue captured by a monocular stereoscopic endoscope and performs three-dimensional reconstruction of biological soft tissue structures. By using a segmented arbitrary scene model (SASM) to separately model the time-varying and time-invariant objects in the videos, specific dynamic occlusions in surgical scenes can be removed.
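The classic volume rendering mentioned above composites densities and colors sampled along each camera ray into a single pixel color. A minimal NumPy sketch of that compositing step follows; it illustrates the standard NeRF rendering rule, not this paper's implementation, and all names and shapes are illustrative:

```python
import numpy as np

def composite_ray(sigmas, colors, deltas):
    """Classic volume rendering: alpha-composite samples along one ray.

    sigmas: (N,) densities predicted at N samples along the ray
    colors: (N, 3) RGB values predicted at the same samples
    deltas: (N,) distances between consecutive samples
    """
    alphas = 1.0 - np.exp(-sigmas * deltas)        # opacity of each sample
    trans = np.cumprod(1.0 - alphas + 1e-10)       # accumulated transmittance
    trans = np.concatenate([[1.0], trans[:-1]])    # shift so T_1 = 1
    weights = alphas * trans                       # per-sample contribution
    rgb = (weights[:, None] * colors).sum(axis=0)  # final pixel color
    return rgb, weights
```

A fully opaque first sample (large density) dominates the ray, so later samples contribute almost nothing, which is exactly how occluders in front of the soft tissue would mask it.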

    Methods

    Inspired by recent advances in neural radiance fields, we first constructed a self-supervised framework that extracts multi-view images from monocular stereoscopic endoscopic videos and uses the underlying 3D information in the images to build geometric constraints on objects, so as to accurately reconstruct soft tissue structures. The SASM was then used to segment and decouple the dynamic surgical instruments, the static abdominal scene, and the deformable soft tissue structures under the endoscope. In addition, the framework used a multilayer perceptron (MLP) to represent moving surgical instruments and deforming soft tissue structures in a dynamic neural radiance field, and it introduced a skew entropy loss to correctly predict surgical instruments, cavity scenes, and soft tissue structures in surgical scenes.
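The MLP representation described above maps a spatial point and a time value to a density and a color, so that deformation over time is captured by conditioning the field on t. A tiny NumPy sketch of such a time-conditioned field follows; the layer sizes, the positional-encoding width, and all names are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def positional_encoding(x, num_freqs=4):
    """Map raw coordinates to sin/cos features, as in NeRF-style fields."""
    freqs = 2.0 ** np.arange(num_freqs)
    angles = x[..., None] * freqs                      # (..., D, F)
    enc = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return enc.reshape(*x.shape[:-1], -1)              # (..., D * 2F)

class TinyField:
    """Illustrative stand-in for the MLP: (x, t) -> (density, color)."""
    def __init__(self, in_dim, hidden=32):
        self.W1 = rng.normal(0.0, 0.1, (in_dim, hidden))
        self.W2 = rng.normal(0.0, 0.1, (hidden, 4))    # 1 density + 3 color
    def __call__(self, x, t):
        feat = positional_encoding(np.concatenate([x, t], axis=-1))
        h = np.maximum(feat @ self.W1, 0.0)            # ReLU hidden layer
        out = h @ self.W2
        sigma = np.maximum(out[..., :1], 0.0)          # density >= 0
        rgb = 1.0 / (1.0 + np.exp(-out[..., 1:]))      # color in [0, 1]
        return sigma, rgb
```

With a 3D point plus one time coordinate and 4 frequencies, the encoded input has 4 × 2 × 4 = 32 dimensions, which is the `in_dim` this sketch expects.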

    Results and Discussions

    We employ an MLP to represent robotic surgery simulator scenes in the neural radiance field to accommodate the inherent geometric complexity and deformable soft tissue. Furthermore, we establish a hybrid framework combining the neural radiance field and the SASM for efficient characterization and segmentation of endoscopic scenes in a robotic surgery simulator. To address the dynamic nature of the scenes and enable accurate scene separation, we propose a self-supervised approach incorporating a novel loss function. For validation, we perform a comprehensive quantitative and qualitative evaluation on a dataset captured with a stereoscopic endoscope, including simulated robotic surgery scenes from different angles and distances. The results show that our method outperforms existing methods in synthesizing realistic robotic surgery simulator scenes, with an average improvement of 12.5% in peak signal-to-noise ratio (PSNR) and an average improvement of 8.43% in structural similarity (Table 1). It achieves high-fidelity reconstruction of biological soft tissue structures, including color, texture, and other details. Furthermore, our method shows significant efficacy in scene segmentation, enhancing overall scene understanding and accuracy.
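PSNR, the first metric reported above, is computed directly from the mean squared error between a rendered view and the reference frame. A minimal sketch for normalized images (peak value 1.0; the function name is illustrative):

```python
import numpy as np

def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio between a rendered and a reference image.

    pred, target: arrays of the same shape with values in [0, max_val].
    Higher is better; identical images give an infinite PSNR.
    """
    mse = np.mean((pred - target) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)
```

For example, a uniform error of 0.1 on a normalized image gives an MSE of 0.01 and hence a PSNR of 20 dB, which makes the reported 12.5% average improvement easy to interpret on a concrete scale.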

    Conclusions

    We propose a novel NeRF-based framework for self-supervised 3D dynamic surgical scene decoupling and biological soft tissue reconstruction from arbitrary multi-viewpoint monocular stereoscopic endoscopic videos. Our method decouples dynamic surgical instrument occlusions from deformable soft tissue structures, recovers a static volumetric representation of the abdominal background, and enables high-quality novel view synthesis. The key components of our framework are the SASM and the neural radiance field. The segmentation module of the SASM decomposes the surgical scene into dynamic, static, and deformable regions, and a spatiotemporal hybrid representation is then designed to model the decomposed neural radiance fields efficiently. Our method achieves excellent performance on various simulated robotic surgery scenes, including large-scale moving surgical instruments and 3D reconstruction of deformable soft tissue structures. We believe that our method can facilitate robotic surgery simulator scene understanding and hope that emerging NeRF-based 3D reconstruction technology can provide inspiration for such scene understanding and empower various downstream clinically oriented tasks.
