[MSR Thesis Talk] Neural Implicit Representations for Medical Ultrasound Volumes and 3D Anatomy-specific Reconstructions
Abstract:
Most Robotic Ultrasound Systems (RUSs) equipped with ultrasound-interpreting algorithms rely on 3D reconstructions of the entire scanned region or of specific anatomies. These reconstructions are typically created by compounding or stacking 2D tomographic ultrasound images using known poses of the ultrasound transducer, with the latter approach additionally requiring 2D or 3D segmentation. While fast, this class of methods has several shortcomings: it requires interpolation-based gap-filling or extensive compounding, and the resulting volumes still generate implausible novel views. Storing these volumes can also be memory-intensive.
These challenges can be overcome with neural implicit learning, which interpolates across unobserved gaps through a smooth learned function and provides a more memory-efficient representation of the volume. In this thesis, a neural implicit representation (NIR) based on the physics of ultrasound image formation is presented. With this NIR, a physically grounded version of the tissue reflectivity function (TRF) is learned by regression on observed intensities in ultrasound images. The NIR also learns a spatially varying point spread function (PSF) of the ultrasound imaging system to improve the photorealism of rendered images. The TRF learned through this method can reconcile contrasting observations from different viewing directions thanks to a differentiable rendering function that incorporates the angle of incidence between ultrasound rays and the tissue interfaces in the scanned volume. The result is a stable representation of the tissue volume that, when combined with a viewing direction, produces true-to-orientation ultrasound images.
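The angle-of-incidence idea above can be sketched in a few lines: reflected intensity is the TRF scaled by the cosine between the ray direction and the local interface normal, then blurred by the system PSF. This is a toy illustration of the rendering principle, not the thesis's actual model; all function and variable names here are hypothetical, and the PSF is assumed shift-invariant for simplicity.

```python
import numpy as np

def render_intensity(trf, normals, ray_dir, psf):
    """Toy view-dependent rendering along one ultrasound ray.

    trf     : (N,) learned tissue reflectivity samples along the ray
    normals : (N, 3) unit normals of tissue interfaces at those samples
    ray_dir : (3,) propagation direction of the ultrasound ray
    psf     : (K,) 1-D point spread function (assumed shift-invariant here)
    """
    ray_dir = ray_dir / np.linalg.norm(ray_dir)
    # Angle-of-incidence weighting: reflection is strongest when the
    # ray hits the interface head-on (|cos| = 1), weakest at grazing angles.
    cos_inc = np.abs(normals @ ray_dir)        # (N,)
    reflected = trf * cos_inc                  # view-dependent response
    # Blur with the imaging system's PSF to mimic the observed intensity.
    return np.convolve(reflected, psf, mode="same")
```

Because every operation is differentiable, a representation like this can be fit by gradient descent against observed image intensities, which is what makes the view-dependent TRF learnable in the first place.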
Given that many diagnostic and surgical applications, robotic or otherwise, require anatomy-specific 3D reconstructions, it is not sufficient to learn entire ultrasound volumes without discerning the required anatomies. To circumvent traditional 3D segmentation methods, which are computationally heavy, I show that the obtained TRF can be used to learn a neural implicit shape representation for anatomies that are largely homogeneous. This is formulated as a weakly supervised binary voxel occupancy function learned in parallel with the NIR. These contributions are substantiated on simulated, phantom-acquired, and live-subject-acquired ultrasound images capturing blood vessels. Finally, an application of the anatomy-specific reconstruction is discussed in the context of physical simulations for deformation modeling of soft tissue.
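For a largely homogeneous anatomy, the occupancy idea can be illustrated as a soft binary decision on the learned TRF: samples whose reflectivity sits past a threshold are treated as inside the anatomy. The sketch below is a minimal stand-in for the occupancy head, assuming a sigmoid around a threshold `mu`; in the thesis this function is trained jointly with the NIR under weak supervision, whereas here `mu` and the sharpness `beta` are fixed illustrative parameters.

```python
import numpy as np

def occupancy_from_trf(trf_values, mu, beta=10.0):
    """Toy occupancy head: map learned TRF samples to a soft binary
    occupancy probability via a sigmoid centered at threshold mu.

    trf_values : (N,) TRF samples at queried voxel centers
    mu         : scalar reflectivity threshold (illustrative, not learned here)
    beta       : sharpness of the soft threshold; beta -> inf gives a hard cut
    """
    return 1.0 / (1.0 + np.exp(-beta * (trf_values - mu)))
```

Keeping the decision soft (rather than a hard threshold) is what allows the occupancy function to be learned with gradients in parallel with the NIR.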
Committee:
Dr. Howie Choset (advisor)
Dr. John Galeotti (advisor)
Dr. Shubham Tulsiani
Yehonathan Litman