1 Introduction
Optical diffraction tomography (ODT) is an imaging technique for extracting the three-dimensional (3D) refractive index distribution of a sample, e.g., a biological cell, using multiple two-dimensional (2D) images acquired at different illumination angles. The refractive index of the sample provides useful morphological information, making ODT an attractive approach for biological applications.1
In the last several years, many different iterative methods have been proposed to reconstruct accurate refractive indices from ill-posed measurements.5
PINNs have recently attracted intense research attention for solving complex problems in physics.10,11 These networks use physical laws as the loss function instead of data-driven loss functions. In conventional supervised deep learning, a large dataset of labeled examples is used for training: by comparing the known ground truth with the predictions of a deep multilayer neural network, one can construct a loss function and tune the parameters of the network to solve complex physical problems. Such data-driven neural networks have been proposed for optical applications such as resolution enhancement,12 imaging through multimode fibers,13,14 phase retrieval,15 ODT,16 and digital holography.17,18 In these networks, the knowledge acquired by the network strongly depends on the statistical information provided in the dataset, and training such a network requires access to a large dataset. In contrast, PINNs directly minimize the physical residual of the partial differential equation (PDE) that governs the problem instead of extrapolating physical laws from a large number of examples. In the pioneering approach proposed by Lagaris et al.,19 the neural network maps independent variables, such as spatial and time coordinates, to a surrogate solution of a PDE. By applying the chain rule, e.g., through auto-differentiation integrated in many deep-learning packages, one can easily extract the derivatives of the output fields with respect to the input coordinates and consequently construct a physics-based loss.20 The correct prediction can therefore be retrieved by minimizing the loss with respect to the network weights. This approach has been used to solve nonlinear differential equations.21
Having the independent variables of the PDE as the input of the neural network limits the use of PINNs when fast inference is required: for the example of optical scattering, the neural network must be trained for each refractive index distribution separately. A different idea was proposed recently in Ref. 27 to solve Maxwell's equations for microlenses with different permittivity distributions. The calculation of the physical loss in this case is based on a finite difference scheme, and in contrast to the previous approach, which is trained for a single example, this model proved to be well-suited for cases in which fast inference is required. However, such a PINN was only demonstrated to work for homogeneous 2D samples.
In this paper, we extend this idea to inhomogeneous and 3D cases and present MaxwellNet, a network able to solve different forward scattering problems, such as light scattering from biological cells. In the first part of the work, we train MaxwellNet on 2D digital phantoms and show how this pretrained network can be fine-tuned to predict light scattering from more complex and experimentally relevant samples, in our case HCT116 cells. We benchmark the performance of MaxwellNet in solving scattering problems for 2D and 3D objects. Next, we demonstrate that such a PINN can be efficiently used to invert the scattering problem through an iterative scheme and improve the results of conventional ODT. We first demonstrate the reconstruction of the refractive index distribution from synthetic data, and then we validate the technique with experimental measurements of scattering from polystyrene microspheres.
2 Methodology
The main idea of our work, shown in Fig. 1, consists of two blocks. The first, MaxwellNet, is a neural network that takes as input the refractive index distribution $n(r)$ and predicts the scattered field ${U}^{s}$. Its structure is based on the U-Net architecture,28 and the training is performed on a large dataset of digital phantoms using a physics-defined loss function. This network is then used as a forward model in a second optimization task that compares the fields predicted by MaxwellNet for a candidate refractive index (RI) distribution with the ground truth projections, e.g., computed numerically or measured experimentally, and updates $n(r)$ until convergence.
Figure 1. Schematic description of MaxwellNet, with U-Net architecture, and its application for tomographic reconstruction. The input is a refractive index distribution and the output is the envelope of the scattered field. The output is modulated by the fast-oscillating term ${e}^{j{k}_{0}{n}_{0}z}$ to recover the scattered field.
2.1 Forward Model: MaxwellNet
In this section, we describe the implementation of a PINN that predicts the scattered field for a known input RI distribution. For the sake of simplicity, we first describe the method for the 2D case and show the extension to 3D afterward. In 2D, MaxwellNet takes as input the RI distribution as a discrete array of shape ${N}_{x}\times {N}_{z}\times 1$ and outputs an array of size ${N}_{x}\times {N}_{z}\times 2$, where the two channels correspond to the real and imaginary parts of the complex field. Among the available architectures, the choice of U-Net appears favorable, as we expect to embed the latent features of the RI distributions in a lower-dimensional space through consecutive 2D convolutions and then retrieve the complex electromagnetic field at the same spatial points through the decoding step. A similar architecture has also been shown to provide good accuracy for the computation of the field scattered by microlenses.27 We implement the present network in TensorFlow 2.6.0. For each step in the encoder, we use two Conv2D layers, each followed by batch normalization and the ELU activation function. A total of five levels are adopted to encode the information, each terminated with average pooling to reduce the dimension. The maximum number of channels in the latent space is 512. On the decoder side, we use transposed convolutional layers to recover an output of size ${N}_{x}\times {N}_{z}\times 2$ (or ${N}_{x}\times {N}_{y}\times {N}_{z}\times 2$ in the 3D case). We also use residual skip connections from the encoder branch. In common data-driven training, we would tune the weights of this network by minimizing the difference between predictions and ground truth data computed with numerical solvers, in turn requiring a large database of simulations and consequently a massive computational cost.
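The shape bookkeeping of such an encoder can be traced with a short sketch; the initial channel count of 32 used below is a hypothetical value, since the text only specifies five levels, average pooling, and 512 latent channels:

```python
def encoder_shapes(n=256, ch0=32, levels=5, ch_max=512):
    """Trace feature-map shapes through a U-Net-style encoder.

    ch0 (initial channel count) is an assumed value for illustration;
    the paper only states five levels, average pooling after each,
    and a 512-channel latent space.
    """
    shapes, size, ch = [], n, ch0
    for _ in range(levels):
        shapes.append((size, size, ch))  # two Conv2D + batch norm + ELU keep size
        size //= 2                       # average pooling halves each dimension
        ch = min(2 * ch, ch_max)         # channels double, capped at ch_max
    shapes.append((size, size, ch))      # latent representation
    return shapes

print(encoder_shapes()[-1])  # latent shape for a 256x256x1 input: (8, 8, 512)
```

With ${N}_{x}={N}_{z}=256$ and five pooling steps, the latent space is an $8\times 8\times 512$ tensor; the decoder mirrors this path with transposed convolutions back to $256\times 256\times 2$.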
Here, we do not provide input-output pairs; instead, we train the network by requiring that the Helmholtz equation be satisfied by the predicted field. To speed up the training and improve performance, we require the network to predict the slowly varying envelope ${U}_{\mathrm{env}}^{s}$ of the scattered field, obtained by demodulating the fast-oscillating component along the propagation direction, ${U}^{s}={U}_{\mathrm{env}}^{s}{e}^{j{k}_{0}{n}_{0}z}$. We define a physics-informed loss function to be minimized by updating the weights of the network:
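A minimal numpy sketch of such a physics-informed loss, written for the full scattered field rather than the envelope and assuming the standard scalar scattering form $\nabla^2 U^s + k_0^2 n^2 U^s = -k_0^2(n^2-n_0^2)U^{\mathrm{in}}$ (the paper's Eq. (1) may differ in normalization and boundary handling), is:

```python
import numpy as np

def physics_loss(u_s, u_in, n, n0, k0, dx):
    """Mean-squared residual of the scalar scattering Helmholtz equation.

    Assumed form: lap(U^s) + k0^2 n^2 U^s = -k0^2 (n^2 - n0^2) U^in.
    A second-order Laplacian with periodic wrap-around is used here for
    brevity; the paper uses a fourth-order scheme and a PML instead.
    """
    lap = (np.roll(u_s, 1, axis=0) + np.roll(u_s, -1, axis=0)
           + np.roll(u_s, 1, axis=1) + np.roll(u_s, -1, axis=1)
           - 4.0 * u_s) / dx**2
    residual = lap + k0**2 * n**2 * u_s + k0**2 * (n**2 - n0**2) * u_in
    return np.mean(np.abs(residual) ** 2)
```

With no scatterer ($n \equiv n_0$) the true scattered field is zero and the loss vanishes, which is a convenient sanity check of the residual.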
When MaxwellNet is trained for a class of samples, it can accurately calculate the field for unseen samples from the same class. Importantly, if we want to use MaxwellNet for a different set of RI distributions, we can fix some of the weights and adjust only part of the network for the new dataset instead of restarting the training from scratch. This process, referred to in the following as fine-tuning, is much faster than the original training of MaxwellNet. We elaborate on and discuss this feature in Sec. 3.
It should be mentioned that we train MaxwellNet based on the Helmholtz equation with a scalar field approximation, as described in Eq. (1). The scalar approximation allows a network with a two-channel output, representing the real and imaginary parts of the scalar field. One could instead consider the full-vectorial Helmholtz equation, which would require a larger network with a six-channel output to represent the real and imaginary parts of the three components of the field vector. However, the depolarization term can be neglected for samples with low refractive index gradients,31,32 allowing a MaxwellNet with fewer parameters and the scalar Helmholtz equation as the loss function.
2.2 Optical Diffraction Tomography Using MaxwellNet
Once MaxwellNet has been trained on a class of RI distributions, it can be used to rapidly backpropagate reconstruction errors with an approach similar to learning tomography.6 Let us assume that we measure $L$ projections ${U}_{i}^{m}$, with $i=1,\dots ,L$, from an unknown RI distribution $\overline{n}(x,z)$ at different rotation angles. From these data, we can reconstruct a first, inaccurate candidate $n(x,z)$ through Wolf's transform using the Rytov approximation. We then calculate the projections with MaxwellNet for the different illumination angles. To implement illumination angle rotation, we geometrically rotate $n(x,z)$ by the corresponding illumination angle and calculate the scattered field for the rotated refractive index. By feeding MaxwellNet with ${n}_{i}(x,z)={\mathcal{R}}_{i}\{n(x,z)\}$, where ${\mathcal{R}}_{i}$ is the image rotation operator corresponding to the $i$th projection, we predict the complex scattered fields ${U}_{i}^{s}$ for the same $L$ angles. Consequently, we can construct a data-driven loss function ${\mathcal{L}}_{D}$ given by the difference ${\Vert {U}_{i}^{s}-{U}_{i}^{m}\Vert}^{2}$ plus any additional regularizer, compute its gradient through auto-differentiation, update $n(x,z)$, and iterate until convergence:
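The structure of this loop can be sketched with a toy linear forward model standing in for MaxwellNet; in the real method the forward operator is the trained network and the gradient comes from auto-differentiation, so everything below is purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
L_ang = 5                                    # number of illumination angles
# Toy stand-in for "MaxwellNet applied to the rotated RI": one fixed
# linear operator per angle (hypothetical, for illustration only).
A = [rng.standard_normal((32, 64)) for _ in range(L_ang)]
n_true = rng.random(64)                      # unknown RI distribution
meas = [Ai @ n_true for Ai in A]             # "measured" projections U_i^m

Astack = np.vstack(A)
lr = 1.0 / np.linalg.norm(Astack, 2) ** 2    # step size from the Lipschitz bound
n = np.zeros(64)                             # crude initial guess (Rytov's role)
for _ in range(2000):
    # gradient of (1/2) * sum_i ||U_i^s - U_i^m||^2 with respect to n
    grad = sum(Ai.T @ (Ai @ n - ui) for Ai, ui in zip(A, meas))
    n -= lr * grad
```

For this overdetermined linear toy the loop recovers `n_true`; in the paper the same iteration runs with Adam, the regularizers of Eq. (3), and MaxwellNet as the forward model.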
Also in this case, we use an Adam optimizer to update the RI values. The regularizer in Eq. (3) consists of three parts: a total-variation (TV) term, a non-negativity term, and a physics-informed term, $\mathrm{Reg}\{n,{U}_{l}^{s}(n)\}={\lambda}_{\mathrm{TV}}{\mathcal{R}}_{\text{TV}}(n)+{\lambda}_{\text{NN}}{\mathcal{R}}_{\text{NN}}(n)+{\lambda}_{\text{Ph}}{\mathcal{L}}_{\text{Ph}}(n,{U}^{s})$. The TV regularizer smooths the RI reconstruction, and the non-negativity regularizer adds the prior information that $n(x,z)$ should be larger than the background refractive index:
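The TV and non-negativity terms can be sketched as follows; the exact TV flavor and penalty form used in the paper are not specified, so these are plausible anisotropic choices:

```python
import numpy as np

def tv_reg(n):
    """Anisotropic total variation: sum of absolute forward differences
    along both axes (one common choice; the paper's exact TV is assumed)."""
    return (np.abs(np.diff(n, axis=0)).sum()
            + np.abs(np.diff(n, axis=1)).sum())

def nonneg_reg(n, n0=1.33):
    """Quadratic penalty on refractive indices that dip below the
    background n0, encoding the prior n(x, z) >= n0."""
    return float((np.minimum(n - n0, 0.0) ** 2).sum())
```

A uniform distribution at the background index incurs zero penalty from both terms, while any dip below ${n}_{0}$ or any spatial fluctuation is penalized.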
Importantly, MaxwellNet is trained for a specific dataset and accurately predicts the scattered field only for RI distributions that are not too far from this set. To take this into account, we add the physics-informed loss to the regularizer. This further correction term steers the optimization toward RI values for which MaxwellNet can predict the scattered field correctly. In contrast to the TV and non-negativity constraints, which are used because of the ill-posedness of the ODT problem, the physics-informed regularizer is necessary in our methodology to ensure that the index distributions remain within the domain on which MaxwellNet has been trained.
A key advantage of MaxwellNet with respect to other forward models is that, differently from the beam propagation method (BPM), it accurately calculates field scattering, accounting for reflection, multiple scattering, and other electromagnetic effects.5
3 Results and Discussion
3.1 MaxwellNet Results
In this section, we evaluate the performance of MaxwellNet for the prediction of the scattered field from RI structures such as biological cells. First, we check the performance on a 2D sample, assuming that the system is invariant along the $y$ axis. The number of pixels in our model is ${N}_{x}={N}_{z}=256$ along the $x$ and $z$ directions, and the pixel size is $\text{d}x=100\text{\hspace{0.17em}}\mathrm{nm}$. We also use a perfectly matched layer (PML) with a thickness of $1.6\text{\hspace{0.17em}}\mu \mathrm{m}$ at the edges of the computational domain. We create a dataset of digital cell phantoms and divide it into training and testing sets. MaxwellNet has $\sim 5.9\times {10}^{6}$ trainable parameters, and we use the Adam optimizer with a learning rate of $1\times {10}^{-4}$ and batch training. The details about the dataset and the training and validation losses are discussed in Appendix B. We train and test MaxwellNet in TensorFlow 2.6.0 on a desktop computer (Intel Core i7-9700K CPU, 3.6 GHz, 64 GB RAM, GeForce RTX 2080 Ti GPU).
In Figs. 2(a) and 2(b), we show two random examples of the digital phantoms in the test set (not seen by the network during training). For each test case, the second and third rows present the prediction of the envelope of the scattered field by the network, compared with the result obtained by the finite element method (FEM) using COMSOL Multiphysics 5.4. The difference between MaxwellNet and COMSOL is very small, and we attribute it to discretization errors, as the two methods use different discretization schemes. To quantitatively evaluate the performance of MaxwellNet, we define its relative error with respect to COMSOL as
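A relative error of this kind, assuming the usual normalized squared L2 discrepancy (the precise normalization of Eq. (6) is an assumption here), can be computed as:

```python
import numpy as np

def relative_error(u_net, u_ref):
    """Squared L2 discrepancy of the network field against a reference
    solver, normalized by the reference field energy (assumed form)."""
    return np.linalg.norm(u_net - u_ref) ** 2 / np.linalg.norm(u_ref) ** 2
```

For identical fields the metric is 0, and a uniform 10% amplitude error gives 0.01, which helps calibrate the magnitudes reported below.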
Figure 2. Results of MaxwellNet and comparison with COMSOL. (a), (b) Two test cases from the digital phantom dataset and the prediction of the real and imaginary parts of the envelope of the scattered fields using MaxwellNet, COMSOL, and their difference. (c) Scattered field predictions from the network trained in (a) and (b) for an experimentally measured RI of HCT116 cancer cells, compared with COMSOL; the difference between the two is no longer negligible. (d) Comparison between MaxwellNet and COMSOL after fine-tuning the former on a set of HCT116 cells; the MaxwellNet predictions are much more accurate after fine-tuning.
It should be noted that once MaxwellNet is trained, the scattered field calculation is much faster than with numerical techniques such as FEM. We present a time comparison in Table 1. For the test phantoms in Fig. 2, MaxwellNet took 17 ms, compared with 13 s for COMSOL, a three-orders-of-magnitude acceleration.

Table 1. Computation time comparison.
Furthermore, performing physics-based rather than direct data-driven training holds promise for exploiting the advantages of transfer learning.33 Maxwell's equations are general, but a neural network that predicts the scattered field for any class of RI distribution in milliseconds with negligible physical loss is usually unfeasible. Most previous PINN studies for solving partial differential equations are trained on one example and work only for that specific example. In our case, the U-Net architecture proved expressive enough to predict the field for a class of samples. However, if we use MaxwellNet for inference on an RI distribution completely uncorrelated with the training set, the accuracy drops. To evaluate the extrapolation capability of MaxwellNet, we considered the model trained on the phantom samples in Fig. 2 and used it for inference on HCT116 cancer cells. The comparison between MaxwellNet and COMSOL is shown in Fig. 2(c). The input of the network is a 2D slice of the experimentally measured HCT116 cell in the plane of best focus. The discrepancy between MaxwellNet and COMSOL arises because the former did not see examples of such RI distributions during training. As a result, if we required accurate results for a new set of samples with different features, we would have to retrain MaxwellNet on the new dataset, which would take a long time, as shown in Table 1. However, it turns out that learning a physical law, such as Maxwell's equations, even on a finite dataset, is better suited than data-driven training for transfer learning to new batches. Indeed, we can take the MaxwellNet pretrained on digital phantoms and fine-tune part of the network for HCT cells, achieving good convergence in a few epochs. In this example, we create a dataset of 136 RI distributions of HCT116 cancer cells and divide them into training and validation sets. Some examples of the HCT116 refractive index dataset are shown in Appendix B.
A wide range of cells with different shapes is included in the dataset: single cancer cells, such as the one shown in Fig. 2(c), cells undergoing mitosis, and examples with multiple cells. In this case, we freeze the weights of the encoder part and fine-tune the decoder on the new dataset. As shown in Fig. 2(d), after this correction step the calculated field is much more accurate. As can be seen in Table 1, the fine-tuning process is two orders of magnitude faster than a complete training from scratch.
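In Keras-style frameworks, this freezing amounts to marking encoder layers as non-trainable before recompiling; the update rule itself reduces to skipping frozen parameters, as in this framework-free toy sketch (parameter names are illustrative):

```python
import numpy as np

# Toy parameter store: in TensorFlow this would instead be done by
# setting layer.trainable = False on the encoder layers before compiling.
params = {"encoder": np.ones(3), "decoder": np.ones(3)}
frozen = {"encoder"}

def sgd_step(params, grads, lr=0.1):
    """Apply a gradient step only to parameters that are not frozen."""
    for name, g in grads.items():
        if name not in frozen:
            params[name] -= lr * g

grads = {"encoder": np.full(3, 1.0), "decoder": np.full(3, 1.0)}
sgd_step(params, grads)
print(params["encoder"], params["decoder"])  # encoder unchanged, decoder updated
```

Since only the decoder branch receives updates, each fine-tuning epoch touches a fraction of the $\sim 5.9\times {10}^{6}$ weights, which is consistent with the speed-up reported in Table 1.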
The 2D case is helpful for demonstrating the method and rapidly evaluating its performance. Nevertheless, full 3D fields are required for many practical applications. We can straightforwardly recast MaxwellNet in 3D by using arrays of size ${N}_{x}\times {N}_{y}\times {N}_{z}\times 1$ as inputs and requiring an ${N}_{x}\times {N}_{y}\times {N}_{z}\times 2$ output, with the two channels corresponding to the real and imaginary parts of the envelope of the scattered field. In this case, the network consists of Conv3D, AveragePool3D, and Conv3DTranspose layers instead of their 2D counterparts. As a benchmark test, we created a dataset of 3D phantoms with 200 examples (180 for training and 20 for testing). The computational domain is defined by ${N}_{x}={N}_{y}={N}_{z}=64$, $\text{d}x=100\text{\hspace{0.17em}}\mathrm{nm}$, and a PML thickness of $0.8\text{\hspace{0.17em}}\mu \mathrm{m}$. To demonstrate the proof of concept of 3D MaxwellNet with limited computational resources, we used a lower number of pixels per dimension than in the 2D case, keeping the pixel size $\text{d}x$ the same for an accurate finite difference calculation. As a result, the computational domain size is limited, which can be improved with more powerful resources.
The 3D version of MaxwellNet has $\sim 1.72\times {10}^{7}$ parameters. We use the Adam optimizer ($\text{learning rate}=1\times {10}^{-4}$) and a batch size of 10. The field predicted for an unseen example and its comparison with COMSOL are shown in Fig. 3. MaxwellNet performs as well as COMSOL in the field calculation: the quantitative error defined in Eq. (6) is $3.4\times {10}^{-3}$ for the 3D example of Fig. 3, with some marginal differences due to the different discretization schemes. Moreover, as shown in Table 1, MaxwellNet is about 50,000 times faster than COMSOL in predicting fields (44.9 ms versus 41.2 min). This significant gain in computation time highlights the potential of MaxwellNet for field calculation in different applications. In the next subsection, we demonstrate how this method can be applied to improve ODT reconstruction fidelity.
Figure 3.Results of 3D MaxwellNet and its comparison with COMSOL. The RI distribution is shown in (a). The real part of the envelope of the scattered field calculated by 3D MaxwellNet is shown in (b), calculated by COMSOL in (c), and their difference in (d). The imaginary part of the envelope of the scattered field calculated by 3D MaxwellNet, COMSOL, and their difference are presented in (e)–(g), respectively.
3.2 Tomographic Reconstruction Results
To show the ability of MaxwellNet to serve different imaging applications, we implement an optimization task with MaxwellNet as the forward model for ODT, as explained in Sec. 2.2. In this example, we consider one of the digital phantoms in the test set of Fig. 2 and use 2D MaxwellNet as the forward model to compute the 1D scattered field along the transverse direction $x$ for $N=81$ different rotation angles $\theta $. We restrict ourselves to the range $\theta \in [-40\text{\hspace{0.17em}}\mathrm{deg},40\text{\hspace{0.17em}}\mathrm{deg}]$, consistent with the typical conditions of common tomographic setups. As shown in Fig. 4(a), the Rytov reconstruction obtained from these field projections is elongated along the $z$ axis and underestimated due to missing frequencies. We then minimize the loss function [Eq. (3)] to improve the RI reconstruction, choosing as regularizer parameters ${\lambda}_{\mathrm{TV}}=3.1\times {10}^{-7}$, ${\lambda}_{\mathrm{NN}}=1\times {10}^{-1}$, and ${\lambda}_{\mathrm{Ph}}=5\times {10}^{-2}$, and the Adam optimizer with an initial learning rate of $3\times {10}^{-4}$. We also scheduled the learning rate, halving it every 1000 epochs to speed up convergence. The resulting RI distribution after 3000 epochs is shown in Fig. 4. The reconstructed RI is no longer underestimated nor elongated along the $z$ axis, a significant improvement over the Rytov prediction. The missing details in the reconstructed RI, better visible in the 1D cutline in Fig. 4(b), can be attributed to information missing from the 1D fields that the optimization of the RI could not retrieve.
Figure 4.Tomographic reconstruction of RI using MaxwellNet. (a) The RI reconstruction was achieved by Rytov, MaxwellNet, and the ground truth. (b) 1D RI profile at
Next, we try a 3D digital phantom from the test set and use the 3D MaxwellNet as the forward model in our tomographic reconstruction method. Since generating synthetic data with COMSOL is time-consuming for multiple angles, we create synthetic scattered fields from the phantom with the Lippmann–Schwinger equation.9 Later, we show an experimental example in which we illuminate the sample with a circular illumination pattern at an angle of $\approx 10\text{\hspace{0.17em}}\mathrm{deg}$. Accordingly, in this numerical example, we rotate the sample over 181 angles (including one normal incidence), equivalent to an illumination rotation with a fixed illumination angle of 10 deg. We keep the experimental conditions $\lambda =1.030\text{\hspace{0.17em}}\mu \mathrm{m}$ and ${n}_{0}=1.33$. We then use these synthetic measurements in our optimization task along with TV, non-negativity, and physics-informed regularization. The reconstruction is achieved after 6000 epochs with ${\lambda}_{\mathrm{TV}}=1.2\times {10}^{-8}$, ${\lambda}_{\mathrm{NN}}=2\times {10}^{-1}$, and ${\lambda}_{\mathrm{Ph}}$ starting at $5\times {10}^{-1}$ and halved every 500 epochs. The reconstructions are shown in Fig. 5 in the $YX$, $YZ$, and $XZ$ planes. The first row shows the Rytov reconstruction, with significant underestimation and elongation along the $z$ axis due to the small illumination angle (10 deg). The details in the reconstruction achieved using MaxwellNet are slightly blurred in comparison with the ground truth as a result of the low resolution at $\lambda =1.030\text{\hspace{0.17em}}\mu \mathrm{m}$.
Figure 5.Tomographic RI reconstruction of 3D sample using MaxwellNet. The RI reconstruction is achieved by Rytov, MaxwellNet, learning tomography, and the ground truth in different rows at the
Additionally, we performed learning tomography6 on the synthetic measurements using 181 projections. The 3D tomographic reconstruction using learning tomography is shown in the third row of Fig. 5. In comparison with MaxwellNet, learning tomography shows some elongated artifacts, which may be due to the fact that reflection is neglected in its forward model. On the other hand, the learning tomography reconstruction is smoother than that of MaxwellNet, which is slightly pixelated. This occurs because the beam propagation method, the forward model of learning tomography, is smooth with respect to the voxels of the refractive index distribution, which is not the case for a deep neural network such as MaxwellNet. Nevertheless, the reconstructions are quantitatively comparable. Defining the reconstruction error as $\epsilon ({n}_{\mathrm{recon}},{n}_{\text{truth}})={\Vert {n}_{\mathrm{recon}}-{n}_{\text{truth}}\Vert}_{2}^{2}/{\Vert {n}_{\text{truth}}-{n}_{0}\Vert}_{2}^{2}$, we obtain 0.613 for Rytov, 0.146 for learning tomography, and 0.116 for MaxwellNet, as shown in Fig. 5. In terms of computation time with the desktop specifications mentioned earlier, we used 3000 epochs for the iterative optimization with MaxwellNet (570 ms per epoch) and 600 epochs for learning tomography (710 ms per epoch), i.e., roughly a fourfold longer computation time for MaxwellNet.
We also evaluated our methodology experimentally. As mentioned earlier, MaxwellNet accounts for reflection as a forward model, and therefore our reconstruction technique can be used for samples with high contrast. In our experimental analysis, we image a polystyrene microsphere immersed in water, for which we expect a refractive index contrast of $\sim 0.25$. Polystyrene microspheres (Polybead Polystyrene 2.0 Micron) are immersed in water and placed between two #1 glass coverslips. We use an off-axis holographic setup with an ytterbium-doped fiber laser (Amplitude Laser Satsuma) at $\lambda =1.030\text{\hspace{0.17em}}\mu \mathrm{m}$, changing the illumination angle with two galvo mirrors. Using a delay path, the optical lengths of the reference and signal arms are matched. We measure holograms for 181 illumination angles and extract the phase and amplitude of the complex scattered fields using Fourier holography. More details about the experimental setup are given in Appendix C. We then use the extracted scattered fields of the different projections in our optimization task to reconstruct the 3D RI distribution of the sample. The experimental projections are 2D complex fields imaged at the center of the sample using a microscope objective lens; we can propagate them in the background medium to calculate the scattered field in any other plane perpendicular to the $z$ axis after the sample. This 2D field can be compared with the output of MaxwellNet in that plane, as described in Eq. (3). Additionally, the experimental projections are based on illumination rotation, and we interpolate them to obtain the equivalent sample rotation projections. We iteratively optimize the loss function in Eq. (3) for 2000 epochs with regularization parameters ${\lambda}_{\mathrm{TV}}=3.8\times {10}^{-9}$, ${\lambda}_{\mathrm{NN}}=5\times {10}^{-1}$, and ${\lambda}_{\mathrm{Ph}}$ starting at $1.5\times {10}^{-1}$ and halved every 500 epochs. The reconstruction is shown in Fig. 6 using Rytov, MaxwellNet, and learning tomography. The underestimation and $z$-axis elongation of the Rytov reconstruction are remarkably improved. The learning tomography reconstruction in Fig. 6 has artifacts due to the high refractive index contrast of the polystyrene bead and the reflections that cannot be accounted for in the beam propagation method.
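Propagating a measured 2D field to another plane in the homogeneous background can be done with the standard angular spectrum method; the sketch below assumes this technique (the paper does not detail its implementation) and an equal pixel pitch in both transverse directions:

```python
import numpy as np

def angular_spectrum(u, dz, dp, wl, n0):
    """Propagate a 2D complex field u by dz through a homogeneous medium
    of index n0 (standard angular spectrum method; dp is the pixel pitch).
    Evanescent components acquire an imaginary kz and decay for dz > 0."""
    k = 2 * np.pi * n0 / wl
    kx = 2 * np.pi * np.fft.fftfreq(u.shape[0], dp)
    ky = 2 * np.pi * np.fft.fftfreq(u.shape[1], dp)
    KX, KY = np.meshgrid(kx, ky, indexing="ij")
    kz = np.sqrt((k**2 - KX**2 - KY**2).astype(complex))
    return np.fft.ifft2(np.fft.fft2(u) * np.exp(1j * kz * dz))
```

A tilted plane wave sampled exactly on the FFT grid is reproduced with the analytic phase delay ${e}^{j{k}_{z}\mathrm{\Delta}z}$, which is a convenient check of the propagation kernel.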
Figure 6. Tomographic RI reconstruction of a polystyrene microsphere immersed in water. The projections are measured with off-axis holography for different angles. The RI reconstructions achieved by Rytov, MaxwellNet, and learning tomography are presented at the
4 Conclusion
In summary, we proposed a PINN that rapidly calculates the scattered field from inhomogeneous RI distributions such as biological cells. Our network is trained by minimizing a loss function based on Maxwell's equations. We showed that the network can be trained on a set of samples and predict the scattered field for unseen examples of the same class. As our PINN is not a data-driven neural network, it can be trained on different examples under different conditions. Even though the network does not extrapolate efficiently to classes that are statistically very different from the training dataset, we showed that by freezing the encoder weights and fine-tuning the decoder branch, one can obtain a new predictive model in a few minutes. We believe this can be further exploited to change the wavelength, boundary conditions, or other physical parameters.
We used our PINN as the forward model in an optimization loop to retrieve the RI distribution from scattered fields obtained by illuminating the sample from different directions, i.e., ODT. This example shows the ability of MaxwellNet to serve as an accurate forward model in optimization loops for inverse design or inverse scattering problems.
5 Appendix A: Calculation of Physics-Informed Loss
During the training of MaxwellNet, we calculate at each epoch the loss function in Eq. (1) for the network output. To evaluate the Helmholtz equation residual, we must numerically compute the term $\frac{{\partial}^{2}{U}^{s}}{\partial {x}^{2}}+\frac{{\partial}^{2}{U}^{s}}{\partial {y}^{2}}+\frac{{\partial}^{2}{U}^{s}}{\partial {z}^{2}}$. In previous PINN papers for solving PDEs,19 these derivatives are obtained by auto-differentiation with respect to the input coordinates; since the input of our network is the RI distribution rather than the coordinates, we instead compute them with a finite difference scheme, which introduces a discretization error.
To minimize the discretization error, one can use a smaller pixel size, $\mathrm{\Delta}x$, or higher-order approximations. Here, we use the fourth-order finite difference scheme34 in which the convolutional kernels $[0,+1/24,-9/8,+9/8,-1/24]$ and $[+1/24,-9/8,+9/8,-1/24,0]$ are used for the calculation of the derivatives in Eq. (1).
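These staggered kernels compute a fourth-order-accurate first derivative at half-grid points, and applying the pair in succession yields the second derivative needed for the Laplacian; a quick numerical check on a sine wave:

```python
import numpy as np

def d1_staggered(f, dx):
    """Fourth-order first derivative evaluated between samples,
    using the kernel [+1/24, -9/8, +9/8, -1/24]."""
    return (f[:-3] / 24 - 9 * f[1:-2] / 8 + 9 * f[2:-1] / 8 - f[3:] / 24) / dx

x = np.linspace(0.0, 2 * np.pi, 201)
dx = x[1] - x[0]
f = np.sin(x)
d2 = d1_staggered(d1_staggered(f, dx), dx)  # back on integer grid points
err = np.max(np.abs(d2 + np.sin(x[3:-3])))  # exact result is f'' = -sin(x)
print(err)
```

The residual scales as $\mathcal{O}(\mathrm{\Delta}{x}^{4})$, so halving the pixel size reduces the discretization error by roughly a factor of 16.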
6 Appendix B: Training and Fine-Tuning of MaxwellNet
As mentioned in Sec. 3, we create a dataset of digital cell phantoms to train and validate MaxwellNet. The dataset for 2D MaxwellNet includes 3000 phantoms with elliptical shapes oriented in different directions. The size of these phantoms ranges from 5 to $10\text{\hspace{0.17em}}\mu \mathrm{m}$, their refractive index varies in the range (1.38, 1.45), and the background refractive index is ${n}_{0}=1.33$. Two examples of these phantoms are shown in Fig. 2. We divide this dataset into 2700 phantoms for training and 300 for testing. We use batch training with a batch size of 10 for 5000 epochs. This training took 30.5 h, and after 5000 epochs no significant decrease in the validation loss could be observed. The training and validation curves of the physical loss are shown in Fig. 7(a), showing that MaxwellNet performs very well on out-of-sample cases.
Figure 7. Training and fine-tuning of MaxwellNet. (a) Training (blue) and validation (orange) loss of MaxwellNet for the digital cell phantom dataset. (b) Fine-tuning of the pretrained MaxwellNet on a dataset of HCT116 cells for 1000 epochs. (c) Examples of the HCT116 dataset.
In Sec. 3, we discussed using a MaxwellNet trained on cell phantoms to predict the scattered field for real cells. A dataset of HCT116 cancer cells is used for this purpose. The 3D refractive index of these cells is reconstructed using the Rytov approximation, with projections acquired with an experimental setup utilizing a spatial light modulator, as described in Ref. 8. A 2D slice of the refractive index is then chosen in the plane of best focus. A total of 8 cells are used, which we rotate and shift to create a dataset of 136 inhomogeneous cells whose refractive index range is (1.33, 1.41). We use 122 of these images for training and 14 for validation. Some examples of the HCT116 refractive index dataset are shown in Fig. 7(c). We freeze the encoder of MaxwellNet and fine-tune its decoder on this new dataset. The training and validation losses are shown in Fig. 7(b).
For 3D MaxwellNet, a dataset of 200 phantoms is created. These 3D phantoms have a spherical shape with internal details, and their diameters range from 1.8 to $2.4\text{\hspace{0.17em}}\mu \mathrm{m}$. We randomly choose 180 phantoms for training and 20 for testing and train 3D MaxwellNet with a batch size of 10. The example in Figs. 3 and 5 is one of the phantoms in the testing dataset.
7 Appendix C: Experimental Setup for ODT
For ODT, we require complex scattered fields from multiple illumination angles. The off-axis holographic setup used to acquire them is shown in Fig. 8. It relies on an ytterbium-doped fiber laser at $\lambda = 1.030\ \mu\mathrm{m}$ whose power is controlled with a half-wave plate (HW) and a polarizing beam splitter. The optical beam is divided into the signal and reference arms using a beam splitter (BS1). In the signal arm, we use two galvo mirrors, GMV and GMH, to control the illumination angle in the vertical and horizontal directions. Using two 4F systems (L1–L4), we image these galvo mirrors onto the sample plane, so the position of the beam remains fixed while the illumination angle changes; this way, we can illuminate the sample with a plane wave at a controlled angle. The sample is then imaged onto the camera (Andor sCMOS Neo 5.5) using another 4F system consisting of a $60\times$ water-dipping objective (Obj1) and a tube lens (L5). The signal and reference arms are then combined with another beam splitter (BS2) to create the off-axis hologram on the camera. A motorized delay line adjusts the optical path length of the reference arm to match that of the signal arm.
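The complex field is recovered from each off-axis hologram by the standard Fourier-domain demodulation: isolate the signal sideband, recentre it to remove the reference tilt, and inverse-transform. The sketch below illustrates this on a synthetic hologram; the carrier frequency, filter bandwidth, and function name are assumptions, not details of the actual reconstruction code.

```python
import numpy as np

def demodulate_offaxis(hologram, carrier, bw=0.1):
    """Recover the complex field from an off-axis hologram.

    The signal sideband at spatial frequency `carrier` ((fy, fx), in
    cycles/pixel) is isolated with a circular filter of radius `bw`,
    recentred to remove the reference tilt, and inverse-transformed.
    """
    ny, nx = hologram.shape
    H = np.fft.fftshift(np.fft.fft2(hologram))
    fy = np.fft.fftshift(np.fft.fftfreq(ny))[:, None]
    fx = np.fft.fftshift(np.fft.fftfreq(nx))[None, :]
    mask = (fy - carrier[0]) ** 2 + (fx - carrier[1]) ** 2 <= bw ** 2
    side = np.roll(H * mask,
                   (-int(round(carrier[0] * ny)), -int(round(carrier[1] * nx))),
                   axis=(0, 1))
    return np.fft.ifft2(np.fft.ifftshift(side))

# Synthetic check: tilted plane-wave reference plus a weak phase object.
ny = nx = 128
y, x = np.mgrid[:ny, :nx]
phase = 0.5 * np.exp(-((x - 64) ** 2 + (y - 64) ** 2) / 200)  # object phase
field = np.exp(1j * phase)
carrier = (0.25, 0.25)                                        # tilt in cycles/pixel
ref = np.exp(-2j * np.pi * (carrier[0] * y + carrier[1] * x))
hologram = np.abs(field + ref) ** 2
rec = demodulate_offaxis(hologram, carrier)                   # angle(rec) ~ phase
```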
Figure 8. Experimental setup for multiple-illumination-angle off-axis holography. HW: half-wave plate; P: polarizer; BS: beam splitter; L: lens; Obj: microscope objective; and M: mirror.
Amirhossein Saba is a PhD student in photonics at École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland. He received his BS degree in electrical engineering with a minor in physics from Sharif University of Technology, Tehran, Iran, in 2018. His current research interests include computational imaging and polarization-sensitive and nonlinear imaging techniques.
Carlo Gigli received his MS degree in nanotechnologies for ICTs from Politecnico di Torino and Université Paris Diderot in 2017 and his PhD in physics from Université de Paris in 2021. During this period, his research interests included the design, fabrication, and characterization of dielectric resonators and metasurfaces for nonlinear optics. Since September 2021, he has been working as a postdoc in the Optics Laboratory at EPFL, where his main activities focus on optical computing, physics-informed neural networks, and nonlinear tomography.
Ahmed B. Ayoub received his BS degree in electrical engineering from Alexandria University, Egypt, in 2013 and his MS degree in physics from the American University in Cairo, Egypt, in 2017. He is pursuing his PhD in electrical engineering at the École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland. His current research interests include optical imaging and three-dimensional refractive index reconstruction for biological samples.
Demetri Psaltis received his BSc, MSc, and PhD degrees from Carnegie Mellon University, Pittsburgh, Pennsylvania. He is a professor of optics and director of the Optics Laboratory at the École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland. In 1980, he joined the faculty at the California Institute of Technology, Pasadena, California. He moved to EPFL in 2006. His research interests include imaging, holography, biophotonics, nonlinear optics, and optofluidics. He has authored or coauthored over 400 publications in these areas. He is a fellow of the Optical Society of America, the European Optical Society, and SPIE. He was the recipient of the International Commission for Optics Prize, the Humboldt Award, the Leith Medal, and the Gabor Prize.
References
[1] W. Choi et al. Tomographic phase microscopy. Nat. Methods, 4, 717–719 (2007).
[6] U. S. Kamilov et al. Learning approach to optical tomography. Optica, 2, 517–522 (2015).
[10] G. E. Karniadakis et al. Physics-informed machine learning. Nat. Rev. Phys., 3, 422–440 (2021).
[12] Y. Rivenson et al. Deep learning microscopy. Optica, 4, 1437–1443 (2017).
[13] N. Borhani et al. Learning to see through multimode fibers. Optica, 5, 960–966 (2018).
[22] S. M. H. Hashemi, D. Psaltis. Deep-learning PDEs with unlabeled data and hardwiring physics laws (2019).
[31] A. Ishimaru. Wave Propagation and Scattering in Random Media, Vol. 2 (1978).
[32] A. Saba et al. Polarization-sensitive optical diffraction tomography. Optica, 8, 402–408 (2021).