Upsampled PSF enables high accuracy 3D superresolution imaging with sparse sampling rate

Jianwei Chen; Wei Shi; Jianzheng Feng; Jianlin Wang; Sheng Liu; Yiming Li

doi:10.1364/PRJ.547778

Abstract

Single-molecule localization microscopy (SMLM) provides nanoscale imaging, but pixel integration of the acquired SMLM images limits the choice of sampling rate, which restricts the information content conveyed within each image. We propose an upsampled point spread function (PSF) inverse modeling method for large-pixel single-molecule localization, enabling precise three-dimensional superresolution imaging with a sparse sampling rate. Our approach could reduce data volume or expand the field of view by nearly an order of magnitude, while maintaining high localization accuracy and greatly improving the imaging throughput with the limited pixels available in existing cameras.

Single-molecule localization microscopy (SMLM) has emerged as one of the most powerful techniques for surpassing the diffraction limit of light microscopy, enabling the visualization of biological structures at the nanoscale [1,2]. With its high localization precision and single-molecule resolution, SMLM has been widely applied in cell biology [3], neuroscience [4], and the study of disease mechanisms [5]. However, achieving one superresolved SMLM image typically demands the acquisition of tens of thousands of single-molecule images. This is particularly challenging in reproducible biological research, where statistically robust results require large data sets [6]. The sheer volume of data generated not only requires significant computational power but also demands extensive storage management, creating a heavy burden in both data processing and resource allocation, especially when a large number of SMLM images are needed [7–11].

A fundamental aspect of SMLM involves the precise localization of diffraction-limited single-molecule spots, typically captured by cameras as pixelated images. Large-pixel imaging, which is achieved simply by lowering the microscope magnification, could substantially reduce the data volume for the same field of view (FOV). It has long been noted that the pixelation effect becomes especially problematic when the pixel size exceeds the ideal sampling rate ($\sim 100$ nm) for single-molecule localization under a high numerical aperture (NA) objective [12–14]. One of the earliest criteria for optimizing pixel size in SMLM was proposed by Thompson et al. in 2002 [12]. Based on a Gaussian model, they suggested that the optimal pixel size should approximately match the standard deviation of the point spread function (PSF) to maximize localization precision. Later, Chang et al. extended this work by modeling pixelated PSFs under large-pixel conditions and quantitatively evaluating the impact of different PSF types, including astigmatic PSF [6], DH-PSF [15,16], tetrapod PSF [17], and 4Pi PSF [18,19], on three-dimensional (3D) localization accuracy [14]. Through Cramér–Rao lower bound (CRLB) analysis, they systematically determined the optimal pixel size required to achieve the best 3D localization precision under varying signal-to-background ratio (SBR) conditions, providing a theoretical foundation for applying SMLM to large-pixel imaging scenarios.

In recent years, researchers have proposed various methods to optimize PSF modeling in SMLM to enhance localization accuracy. For example, the spline interpolation method reconstructs PSFs from experimentally measured bead data. It has been demonstrated to achieve the theoretical localization precision defined by the CRLB for arbitrary PSF shapes [20,21]. Additionally, in situ PSF modeling directly learns PSF characteristics from single-molecule blinking data, making it adaptable to different complex imaging conditions [22]. However, a universal algorithm that effectively addresses pixelation noise is still lacking, forcing researchers to compromise between pixel size and imaging quality. Therefore, developing a PSF inverse modeling approach that can accommodate different pixel sampling rates while reducing data volume without sacrificing localization accuracy remains a critical challenge in the field of SMLM.


Empirical observations suggest that varying the integration positions of bead data (Fig. 1, and Fig. 5 in Appendix G) can retain continuous PSF information that is otherwise lost in pixelated images. This observation led to the development of our inverse integration technique, which estimates a pre-integrated, upsampled PSF from pixelated data. By recovering high-frequency details that are normally blurred by pixelation, this upsampled PSF significantly enhances localization precision for undersampled data. As a result, our approach not only allows for optimal localization accuracy as defined by the CRLB, but also mitigates the computational and storage challenges associated with large-scale SMLM. By reducing the widely used sampling rate for high-NA objectives by a factor of 3, we could decrease the data volume by 89% or expand the FOV ninefold while maintaining high localization accuracy. Our method facilitates more efficient high-precision imaging, which normally requires fine sampling, addressing the pressing need for computationally accessible solutions in reproducible large-scale biological studies with superresolution.
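The 89% and ninefold figures follow from the pixel count scaling with the square of the lateral sampling factor. A minimal sketch of that arithmetic is given here; the function name `sampling_tradeoff` is ours and is not part of the published code.

```python
def sampling_tradeoff(bin_factor):
    """Return (fractional data reduction at fixed FOV, FOV gain at fixed pixel count)."""
    # Enlarging the pixel side by bin_factor divides the pixel count by bin_factor**2.
    pixels_ratio = 1.0 / bin_factor**2
    return 1.0 - pixels_ratio, float(bin_factor**2)

for b in (2, 3):
    reduction, fov_gain = sampling_tradeoff(b)
    print(f"bin={b}: data volume reduced by {reduction:.0%}, or FOV x{fov_gain:.0f}")
```

For a factor of 3 this gives a reduction of $1 - 1/9 \approx 89\%$, and for a factor of 2 it gives the 75% reduction used in the $2 \times 2$ binning experiment.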

Figure 1. Concept of upsampled PSF modeling. First, the z-stack data of beads are collected (top left), from which ROIs are extracted to construct an experimental PSF library at different positions (top center). Next, the initial global parameters (upsampled PSF represented by a 3D matrix) and local parameters (3D position, photon count, and background) are used to generate a fine PSF library. Then, the fine PSF library is pixel-integrated to create a coarse PSF library (top right). The loss function, based on MLE, is calculated between the coarse PSF library and the experimental PSF library (top center). Finally, through backpropagation (L-BFGS-B optimization algorithm), both global and local parameters are updated iteratively, ultimately yielding the final upsampled PSF. We also localize the experimental bead data with both the upsampled PSF (bottom center) and averaged coarse PSF, revealing that only the upsampled PSF could achieve unbiased and optimal localization results (bottom left).

The upsampled PSF inverse modeling process involves reconstructing the upsampled PSF from pixelated bead data, a common inverse problem. In microscopy, the intensity field from bead fluorescence is integrated on the camera chip to form an image, making forward modeling straightforward when parameters such as the emission position and the upsampled 3D PSF are known. However, the inverse process of determining the upsampled PSF is much more complex, requiring sufficient sampling of different bead positions and an optimization algorithm to retrieve the underlying shared continuous PSF model. Our upsampled PSF modeling process (Fig. 1) starts by extracting regions of interest (ROIs) from experimental bead z-stack data, generating a set of pixelated PSFs randomly distributed at different positions. An initial upsampled PSF model, along with parameters such as 3D position, photon counts, and background, is used to generate fine PSFs. The fine PSFs are normally upsampled by an integer factor relative to the undersampled coarse PSFs. Therefore, we bin the fine PSFs over the pixel size to create coarse PSFs, which are fitted to the experimental data based on Poisson maximum likelihood estimation (MLE) [23]. The optimization algorithm L-BFGS-B [24] is then employed to iteratively adjust the upsampled PSF model and related parameters (Appendix B).

After obtaining the upsampled PSF from the bead data, we developed a new localization algorithm that minimizes the localization bias for pixelated single-molecule data with large pixel sizes. For upsampled PSF localization, a spline-interpolated PSF was employed so that the method is applicable to PSFs of arbitrary shape. Unlike conventional spline PSF fitting, we introduced a new input constant, “bin,” to perform pixel integration, allowing for precise localization when the sampling rates of the upsampled PSF and the single-molecule data differ (Appendix C). Prior to localization, the upsampled PSF is preprocessed by convolving it with a $\mathrm{bin} \times \mathrm{bin}$ kernel with all elements equal to 1 [Appendix C and Fig. 6(a) in Appendix G]. This step is essential, since the experimental data and the upsampled PSF are integrated over different pixel sizes, leading to shape discrepancies. Such discrepancies can cause significant localization bias, preventing the achievement of the theoretical CRLB [Fig. 7(a) in Appendix G]. After convolution-based preprocessing, spline interpolation (Appendix C) is applied to the upsampled PSF, which is then used for bead or single-molecule localization. Compared with the conventional spline fitter, which does not consider the pixel integration effect, our upsampled spline-fitting algorithm achieved the CRLB at different pixel sizes and reduced the localization bias by a factor of 2 for 330 nm pixel size data [Fig. 7(b)].

We then systematically investigated the optimal pixel size for large-pixel imaging by calculating the theoretical CRLB for PSFs of varying pixel sizes using a vectorial PSF model [25] (Appendixes D and E). As shown in Fig. 8, the CRLB of a large-pixel PSF is spatially variant. To accurately reflect the localization precision of a large-pixel PSF, we uniformly selected 121 points within a single pixel, calculated the CRLB for each point, and averaged these values to obtain the final mean CRLB. As shown in Figs. 2(a)–2(c), compared to the traditional 110 nm pixel size PSF, the $\mathrm{CRLB}_{3D}^{1/2}$ [26] of a 220 nm pixel size PSF increased by only about 6.7% for the astigmatic PSF [6], while the $\mathrm{CRLB}_{3D}^{1/2}$ for a 330 nm pixel size PSF increased by just 18.9%. Therefore, increasing the traditional pixel size ($\sim 100$ nm) by a factor of 2 to 3 would greatly reduce the data volume while maintaining high localization precision.
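The position averaging described above can be illustrated with a much simpler model than the one used in the paper. The sketch below replaces the vectorial astigmatic PSF with a pixel-integrated 2D Gaussian, so its numbers are not expected to reproduce Fig. 2. The Gaussian width (130 nm), the ROI sizes, and the cell-centered $11 \times 11$ grid are our assumptions; the photon count and background density are taken from the Fig. 2 caption.

```python
import numpy as np
from scipy.special import erf

def pixel_profile(centers, x0, pixel, sigma):
    """Pixel-integrated 1D Gaussian and its derivative with respect to x0."""
    lo = (centers - pixel / 2 - x0) / (np.sqrt(2) * sigma)
    hi = (centers + pixel / 2 - x0) / (np.sqrt(2) * sigma)
    integral = 0.5 * (erf(hi) - erf(lo))
    derivative = (np.exp(-lo**2) - np.exp(-hi**2)) / (np.sqrt(2 * np.pi) * sigma)
    return integral, derivative

def mean_sqrt_crlb_x(pixel, n_pix, sigma=130.0, photons=2000.0,
                     bg_density=2e-3, grid=11):
    """Average sqrt(CRLB_x) in nm over grid x grid emitter positions in one pixel."""
    centers = (np.arange(n_pix) - n_pix // 2) * pixel       # pixel centers (nm)
    offsets = ((np.arange(grid) + 0.5) / grid - 0.5) * pixel
    bg = bg_density * pixel**2                              # background photons per pixel
    values = []
    for x0 in offsets:
        for y0 in offsets:
            ex, dex = pixel_profile(centers, x0, pixel, sigma)
            ey, dey = pixel_profile(centers, y0, pixel, sigma)
            mu = photons * np.outer(ey, ex) + bg
            grads = np.stack([
                photons * np.outer(ey, dex),   # d mu / d x0
                photons * np.outer(dey, ex),   # d mu / d y0
                np.outer(ey, ex),              # d mu / d photons
                np.ones_like(mu),              # d mu / d background
            ])
            fisher = np.einsum("iyx,jyx->ij", grads / mu, grads)
            values.append(np.sqrt(np.linalg.inv(fisher)[0, 0]))
    return float(np.mean(values))

# Both ROIs cover the same 2310 nm x 2310 nm area.
c110 = mean_sqrt_crlb_x(110.0, 21)
c330 = mean_sqrt_crlb_x(330.0, 7)
print(f"mean sqrt(CRLB_x): {c110:.2f} nm (110 nm pixels), {c330:.2f} nm (330 nm pixels)")
```

With `grid=11` the average runs over 121 positions, matching the count used in the text. A full reproduction would substitute the vectorial PSF model of Appendix D and include the axial coordinate.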

Figure 2. Comparison of the CRLB and RMSE for PSFs with different pixel sizes. (a)–(c) Comparison of the theoretical CRLB for $x$, $y$, and $z$ for PSFs with different pixel sizes. (d)–(f) Comparison of the RMSE in the $x$, $y$, and $z$ directions localized with PSFs of different pixel sizes. $R_{up}$ represents the percentage degradation in CRLB or RMSE averaged over the range from $-600$ to 600 nm, relative to the values obtained with a pixel size of 110 nm. Each molecule contributes a total of 2000 photons, with a background level of $2 \times 10^{-3}$ photons/nm$^{2}$. Localization accuracy at each $z$ position is computed from the RMSE of the localization results from 1000 single molecules.

To validate the accuracy of the upsampled PSF, we conducted a series of evaluations using simulated data. First, a fine sampled vector PSF model (Appendix D) of 22 nm pixel size was employed to accurately represent the continuous astigmatic PSF. These simulated data sets were then binned using a $15 \times 15$ pixel aggregation to produce pixelated astigmatic PSF data of 330 nm pixel size, mimicking the effects of camera pixelation. Next, from these 330 nm pixel size simulated data, we derived an upsampled astigmatic PSF with a pixel size of 110 nm. Finally, we utilized the estimated upsampled astigmatic PSF to relocalize the simulated data set. The localization results are shown in Fig. 3. Single-molecule localization using the upsampled astigmatic PSF with a 110 nm pixel size achieves unbiased and optimal localization accuracy in three dimensions (purple solid line). In contrast, single-molecule localization using the pixelated astigmatic PSF with a 330 nm pixel size results in a significant localization bias of approximately 18 nm along the $z$ axis (blue solid line).

Figure 3. Validation of the upsampled PSF modeling using simulated data from astigmatic PSF. (a) The 330 nm pixel size PSF (top) and 110 nm pixel size astigmatic PSF (bottom) estimated from 300 simulated data sets with a pixel size of 330 nm. (b)–(d) Comparison of the impact of the estimated 330 and 110 nm pixel size astigmatic PSF on the localization accuracy of the 330 nm pixel size simulated data along the $x$, $y$, and $z$ directions, respectively. Each molecule contributes a total of 10,000 photons, with a background level of $2 \times 10^{-3}$ photons/nm$^{2}$. The $x/y/z$ bias as a function of $z$ is defined as in Appendix F Eqs. (F1) and (F2). Solid lines and shaded areas in localization plots indicate mean and standard deviation of bias, respectively.

Figure 4. Nup96-AF647 in U2OS cells reconstructed using the upsampled PSF model. (a) Single-molecule data of Nup96-AF647 with a pixel size of 127 nm were $2 \times 2$ binned to produce undersampled data with a pixel size of 254 nm. Localization of the undersampled data using the upsampled PSF yielded a superresolved image. The red boxes indicate the single-molecule image with different pixel sizes. In the superresolution image, the white dashed box in the lower left corner corresponds to a magnified view of the region marked by the solid box. (b) $XZ$ view of the selected region (white dashed lines) in (a) reconstructed from single-molecule data using PSFs of different pixel sizes. (c) Superresolution image of 321 nm pixel size single-molecule data of Nup96-AF647 analyzed with 321 and 107 nm pixel size PSFs, respectively. The $XZ$ views of the selected region [white dashed lines in the left image of (c)] were compared.

To further validate the robustness of the upsampled PSF, we also performed tests using simulated data based on tetrapod PSFs. As shown in Fig. 9 (Appendix G), unbiased localization accuracy is achieved only when using the upsampled tetrapod PSF [17] with a pixel size of 110 nm. In contrast, localization using the pixelated tetrapod PSF with a pixel size of 330 nm results in a periodic bias of approximately 18 nm along the $z$ axis.

Next, we validated the accuracy of the upsampled PSF-based localization using experimental data. To this end, we set up a microscope with a pixel size of 321 nm for the images acquired by the camera to balance resolution against FOV and data volume (Appendix A). We collected many z-stack bead data sets of 321 nm pixel size, with beads randomly distributed at different positions. We then estimated a PSF model with 107 nm pixel size from these bead data as described before. The estimated upsampled PSF model was then applied to localize the z-stack bead data at various positions. We then plotted the retrieved $x$, $y$, and $z$ position bias in each frame as a function of $z$ position. A positional deviation in any direction would indicate a mismatch in the model. Similar to the simulated data, the localization results showed that localization based on the original 321 nm pixel size PSF (with bin factor 1, Appendix B) led to significant positional bias, while the upsampled 107 nm pixel size PSF (with bin factor 3, Appendix B) achieved unbiased and precise localization (Fig. 10 in Appendix G). The only difference from the simulated data is a slight variation in localization bias with large-pixel PSFs. Due to OTF rescaling [Eq. (D8)], $\sigma_{x}$ and $\sigma_{y}$ in experimental data ($\sim 83$ nm) are slightly larger than in simulations ($\sim 55$ nm), effectively smoothing the PSF and slightly reducing localization bias.

To further demonstrate the performance of the upsampled PSF-based localization on biological samples, we first applied $2 \times 2$ binning to conventional 127 nm pixel size SMLM data acquired from our reference standard, nucleoporin Nup96 [27]. As a result, the data volume was decreased by 75%, and the pixel size was increased to 254 nm [Fig. 4(a)]. Before downsampling, the spline PSF fitter could clearly resolve the double-ring structure of the nuclear pore complex (NPC), whose two rings are separated by only $\sim 50$ nm. After downsampling, the conventional spline PSF fitter could no longer resolve the ring structure with the 254 nm pixel PSF averaged from the downsampled bead data (254 nm pixel size) [Fig. 4(b)]. We then derived the 127 nm pixel size PSF from the downsampled bead data. Combined with the upsampled spline-fitting algorithm, the 127 nm pixel size PSF-based localization could clearly resolve the double-ring structure from the downsampled single-molecule data, indicating that upsampled PSF localization could notably improve the accuracy of sparsely sampled single-molecule data.

After showing the effectiveness of the upsampled PSF-based single-molecule localization, we further increased the pixel size of the acquired image using a microscope (Fig. 11 in Appendix G) with a pixel size of 321 nm. The upsampled PSF model (107 nm pixel size) was estimated from the bead data as described before. As shown in Fig. 4(c), the double-ring structure cannot be resolved when a conventional spline fitter is used, while the upsampled spline fitter combined with the upsampled PSF model could clearly resolve the two-ring structure. These findings strongly support the superiority of the upsampled PSF in large-pixel imaging, particularly for applications requiring high-content images with limited pixels and data size compared to the size of the sample (e.g., pathological samples, neurons).

In summary, we proposed an upsampled PSF inverse modeling method to overcome the limitations of large-pixel imaging by recovering high-frequency details and mitigating the trade-offs between pixel size and FOV. We demonstrated that an upsampled PSF model could correct the localization bias introduced by pixel integration noise, thus increasing the pixel size by $\sim 2$ to 3 times without losing much resolution. This method not only enhances data accessibility and management but also inspires a rethinking of sampling rates across various biological imaging applications. Accurate upsampled PSF modeling could improve techniques like deconvolution and image denoising, and its potential extends to other fields such as astronomy, where large FOV imaging is critical. By providing a means to achieve both high precision and reduced data burden, this approach could catalyze broader innovations in imaging technologies.

SMLM imaging was performed at room temperature using a custom-built microscope setup (Fig. 11 in Appendix G). The sample was illuminated via a laser box coupled to a single-mode fiber. The excitation laser was reflected by a dichroic mirror (Di01-R405/488/561/635, Semrock) and focused on the back focal plane of a high NA objective (NA 1.5, UPLAPO 100XOHR, Olympus) for superresolution imaging. Fluorescence emission was collected by the objective and filtered through a quad-band emission filter (NF03-405/488/561/635E, Semrock). Following the tube lens (SWTLU-C, $f = 180$ mm, Olympus), a bandpass filter (ET685/70m, Chroma) was inserted to remove residual laser light. Subsequently, the fluorescence passed through a 4f system composed of lenses L3 (AC254-150-A-ML, $f = 150$ mm, Thorlabs) and L4 (AC254-030-A-ML, $f = 30$ mm, Thorlabs). Finally, images were acquired using an sCMOS camera (ORCA-Flash4.0 V3, Hamamatsu) with a pixel size of 6.5 μm $\times$ 6.5 μm. Additionally, a plano–convex cylindrical lens (LJ1516L1-C, Thorlabs) was positioned in front of the camera to introduce astigmatism for 3D imaging.

Cell Culture. U2OS cells (Nup96-SNAP, catalog No. 300444, Cell Line Services) were cultured in DMEM supplemented with 10% (volume fraction) fetal bovine serum (FBS), $1 \times$ penicillin-streptomycin (PS), and $1 \times$ MEM nonessential amino acids (NEAAs; catalog No. 11140-050, Gibco). The cells were maintained at 37°C in a humidified incubator with 5% $\mathrm{CO}_{2}$ and were passaged every 2–3 days. Prior to seeding, 25 mm high-precision round glass coverslips (No. 1.5H, catalog No. CG15XH, Thorlabs) were thoroughly cleaned by sequential sonication in 1 mol/L potassium hydroxide (KOH), Milli-Q water, and ethanol, followed by 30 min of UV sterilization. For superresolution imaging, the U2OS cells were seeded onto the cleaned coverslips and cultured for 2 days until reaching 80%–90% confluency. Routine mycoplasma tests confirmed the absence of contamination throughout the study.

Fluorescence bead sample. 100 nm TetraSpeck beads (Thermo Fisher Scientific, catalog No. T7279) were used. A clean 25 mm coverslip was incubated with 40 μL of 100 mmol/L $\mathrm{MgCl}_{2}$ and 360 μL of 1:1000 diluted bead solution for 5 min. The coverslip was then thoroughly rinsed with Milli-Q water 3 times. After washing, the coverslip was placed in a custom sample holder, and 1 mL of Milli-Q water was added to maintain hydration.

Nup96-SNAP-AF647 in U2OS cells. To label Nup96, U2OS-Nup96-SNAP cells were prepared following established protocols. Cells were initially prefixed in 2.4% paraformaldehyde (PFA) for 30 s, permeabilized with 0.4% Triton X-100 for 3 min, and then fixed in 2.4% PFA for an additional 30 min. Following fixation, the cells were quenched with 0.1 mol/L $\mathrm{NH}_{4}\mathrm{Cl}$ for 5 min and washed twice with PBS. To minimize nonspecific binding, cells were blocked for 30 min using Image-iT FX Signal Enhancer (Invitrogen, catalog No. I36933). For labeling, cells were incubated for 2 h in a dye solution containing 1 μmol/L SNAP-tag ligand BG-AF647 (New England Biolabs, catalog No. S9136S), 1 mmol/L dithiothreitol (neoFroxx, catalog No. 1111GR005), and 0.5% bovine serum albumin (BSA) in PBS. After incubation, cells were washed 3 times with PBS for 5 min each to remove unbound dye. Finally, cells were postfixed with 4% PFA for 10 min, washed 3 times with PBS, and stored at 4°C until imaging.

Collection of bead data on single-objective SMLM systems. The fluorescence bead sample was prepared as described above. When $\mathrm{bin} = 2$, we typically consider that collecting data from more than 10 bead stacks is sufficient for robust upsampled PSF estimation; when $\mathrm{bin} = 3$, collecting data from over 30 bead stacks is considered adequate. Each bead stack was acquired by moving the sample stage from $-1$ to 1 μm, with a step size of 50 nm. One frame per $z$ position was collected, and each frame could include multiple beads, with approximately 16,000 photons and 80 background photons per bead. If the photon count per bead is low, the number of beads can be increased accordingly. Our method, based on Poisson distribution estimation, is adaptable to data with different photon counts (SNR). When using the demo on our GitHub to fit the upsampled PSF, a localization bias average below 5 nm indicates reliable results.

Imaging of Nup96-AF647 on single-objective SMLM systems. Nup96-SNAP-AF647 labeled U2OS cells were imaged with the Hamamatsu sCMOS camera over an FOV covering 56 pixels $\times$ 56 pixels. The camera was operated in rolling shutter readout mode with an exposure time of 20 ms. A total of 150,000 frames were acquired. The position of the cylindrical lens in front of the camera was adjusted so that $\sim 80$ nm of astigmatism aberration was introduced to the system.

The initial position for each bead is

$$ x_{init} = y_{init} = z_{init} = 0. \tag{B1} $$

The background value for each bead is estimated as

$$ \mathrm{bg}_{i} = \min \left( D_{i}(x, y, z) \otimes G(x, y, z, \sigma_{x} = \sigma_{y} = \sigma_{z} = 2) \right) / \mathrm{bin}^{2}, \tag{B2} $$

where $D_{i}(x, y, z)$ represents the cropped 3D image stack of bead $i$, while $G$ denotes a 3D Gaussian kernel with standard deviations $\sigma_{x}$, $\sigma_{y}$, and $\sigma_{z}$ in the respective $x$, $y$, and $z$ directions. The bin factor refers to the ratio between the pixel size of the data and the pixel size of the upsampled PSF. The symbol $\otimes$ denotes the convolution operation. The minimum value is taken over all voxels within the cropped and filtered region. The initial background for each bead is set to the median value of $\mathrm{bg}_{i}$ across all bead data,

$$ \mathrm{bg}_{init} = \underset{i}{\mathrm{median}} \left( \mathrm{bg}_{i} \right). \tag{B3} $$

The initial photon value for each bead is estimated as

$$ \mathrm{Np}_{init}^{i} = \underset{z}{\mathrm{avg}} \left( \sum_{x, y} D_{i}(x, y, z) - \mathrm{bg}_{i} \right). \tag{B4} $$

The initial PSF model is a 3D array with each element set to $1 / (N_{s}^{2} \cdot \mathrm{bin}^{2})$, with $N_{s}$ the ROI size of the bead data. For example, if the ROI size is $N_{s} \times N_{s} = 21 \times 21$ pixels and the sampling factor bin is 4, the initial value is $1 / (21^{2} \cdot 4^{2}) \approx 0.00014$. This assumes that the sum of each axial slice of the PSF model is 1.
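As a rough illustration of Eqs. (B2) and (B3) and of the uniform starting PSF, the sketch below builds these initial values for a few synthetic bead stacks. The function names, the synthetic Poisson data, and the use of `scipy.ndimage.gaussian_filter` with its default boundary handling are our assumptions, not the published implementation; the photon initialization of Eq. (B4) is omitted.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def initial_background(stack, bin_factor, sigma=2.0):
    """Eq. (B2): minimum of the Gaussian-smoothed stack, divided by bin**2."""
    smoothed = gaussian_filter(stack.astype(float), sigma=sigma)
    return smoothed.min() / bin_factor**2

def initial_psf(n_z, roi_size, bin_factor):
    """Uniform starting PSF on the fine grid; every axial slice sums to 1."""
    n_fine = roi_size * bin_factor
    value = 1.0 / (roi_size**2 * bin_factor**2)
    return np.full((n_z, n_fine, n_fine), value)

# Synthetic stand-in for cropped bead stacks: 41 slices of 21 x 21 pixels,
# pure background of about 80 photons per pixel (hypothetical test data).
rng = np.random.default_rng(0)
stacks = [rng.poisson(80.0, size=(41, 21, 21)) for _ in range(5)]

bg_each = np.array([initial_background(s, bin_factor=4) for s in stacks])
bg_init = np.median(bg_each)                      # Eq. (B3)
psf0 = initial_psf(n_z=41, roi_size=21, bin_factor=4)
print(f"bg_init = {bg_init:.2f}, initial PSF value = {psf0[0, 0, 0]:.5f}")
```

The printed PSF value reproduces the 0.00014 quoted above for a $21 \times 21$ ROI and a bin factor of 4.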

The upsampled PSF model at a given bead location ($x_{i}$, $y_{i}$, $z_{i}$) can be calculated as follows [22]:

$$ U_{i}^{up}(x - x_{i}, y - y_{i}, z - z_{i}) = F_{3D}^{-1} \left( F_{3D} \left( \mathrm{Np}_{i} \cdot \mathrm{PSF}_{up}(x, y, z) + \mathrm{bg}_{i} \right) \cdot g(q) \cdot e^{i \varphi_{shift}} \right), \tag{B5} $$

where the shifting phase is calculated from

$$ \varphi_{shift} = 2 \pi (q_{x} x_{i} + q_{y} y_{i} + q_{z} z_{i}), \tag{B6} $$

where $q_{x}$, $q_{y}$, and $q_{z}$ are the Cartesian coordinates of the frequency space as given by

$$ q_{x} = \frac{x}{L_{x}}, \quad q_{y} = \frac{y}{L_{y}}, \quad q_{z} = \frac{z}{L_{z}}, \tag{B7} $$

where $L_{x}$, $L_{y}$, and $L_{z}$ are the lengths of the upsampled PSF model in the $x$, $y$, and $z$ directions. Considering the bead size, we model the bead as a sphere, and $g(q)$ is the analytical Fourier transform of a solid sphere of radius $r_{0}$,

$$ g(q) = \frac{J_{3/2}(2 \pi q r_{0})}{(q r_{0})^{3/2}} r_{0}^{3}, \tag{B8} $$

where $J$ is the Bessel function of the first kind and $q$ is the radial coordinate in the frequency space, calculated from

$$ q = \sqrt{q_{x}^{2} + q_{y}^{2} + q_{z}^{2}}. \tag{B9} $$
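The two frequency-space operations in Eq. (B5), a phase-ramp shift and multiplication by the bead filter, can be sketched with NumPy as follows. This is an illustration under our own assumptions: shifts are expressed in voxels, the sign of the phase follows NumPy's FFT convention (which may differ from the convention in Eq. (B5)), and the sphere filter uses the closed form of $J_{3/2}$, scaled so that it equals 1 at $q = 0$.

```python
import numpy as np

def fourier_shift(volume, shift_zyx):
    """Shift a 3D array by (dz, dy, dx) voxels using a linear phase ramp."""
    freqs = np.meshgrid(*[np.fft.fftfreq(n) for n in volume.shape], indexing="ij")
    phase = sum(q * s for q, s in zip(freqs, shift_zyx))
    # With NumPy's convention, a shift by +s multiplies the spectrum by exp(-2*pi*i*q*s).
    spectrum = np.fft.fftn(volume) * np.exp(-2j * np.pi * phase)
    return np.fft.ifftn(spectrum).real

def sphere_filter(q, r0):
    """Fourier transform of a solid sphere of radius r0, scaled to 1 at q = 0.

    Uses J_{3/2}(x) / x**1.5 proportional to (sin x - x cos x) / x**3, x = 2*pi*q*r0.
    """
    x = 2.0 * np.pi * np.asarray(q, dtype=float) * r0
    safe = np.where(x > 1e-4, x, 1.0)            # avoid 0/0 near the origin
    value = 3.0 * (np.sin(safe) - safe * np.cos(safe)) / safe**3
    return np.where(x > 1e-4, value, 1.0)

rng = np.random.default_rng(1)
volume = rng.random((8, 8, 8))
shifted = fourier_shift(volume, (1, 2, 3))       # integer shift: equals a circular roll
subpixel = fourier_shift(volume, (0.0, 0.3, -0.2))
g = sphere_filter(np.array([0.0, 0.01]), r0=50.0)   # r0 = 50 nm for 100 nm beads
```

For integer shifts the result coincides with `np.roll`, which gives a simple check of the sign convention; non-integer shifts are what allow each bead to sit at an arbitrary sub-pixel position on the fine grid.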

To ensure that the variation of $U_{i}^{up}$ is not affected, $g(q)$ is normalized to its maximum value. To match $U_{i}^{up}$ with the pixel size of the data, we further apply a binning operation to $U_{i}^{up}$,

$$ U_{i}(k, l) = \sum_{m = 1}^{\mathrm{bin}} \sum_{n = 1}^{\mathrm{bin}} U_{i}^{up}(\mathrm{bin} \cdot k + m, \mathrm{bin} \cdot l + n), \tag{B10} $$

where $U_{i}$ is the binned (merged) forward model, which can be directly compared with the measured bead data $M_{i}$, and $k$ and $l$ denote the horizontal and vertical pixel indices of the forward model.
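The binning of Eq. (B10) amounts to summing non-overlapping $\mathrm{bin} \times \mathrm{bin}$ blocks of the fine model. A minimal NumPy sketch is shown below; the function name `bin_pixels` and the reshape-based approach are our choices, and zero-based indexing is used instead of the one-based indices of the equation.

```python
import numpy as np

def bin_pixels(fine, bin_factor):
    """Sum non-overlapping bin x bin blocks over the last two axes."""
    *lead, ny, nx = fine.shape
    if ny % bin_factor or nx % bin_factor:
        raise ValueError("lateral size must be a multiple of bin_factor")
    blocks = fine.reshape(*lead, ny // bin_factor, bin_factor,
                          nx // bin_factor, bin_factor)
    return blocks.sum(axis=(-3, -1))

fine = np.arange(36.0).reshape(6, 6)
coarse = bin_pixels(fine, 3)                          # 2 x 2 result
stack = bin_pixels(np.ones((5, 330, 330)), 15)        # e.g. 22 nm -> 330 nm pixels
```

Because the operation is a plain sum, the total signal is conserved: `coarse.sum()` equals `fine.sum()`. The second call mirrors the $15 \times 15$ aggregation used to turn the 22 nm simulated PSF into 330 nm pixel data.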

We utilized the L-BFGS-B algorithm from the SciPy optimization package for our optimization process. To account for the varying scales of gradients across different variable types, we applied a tailored scaling factor to each type. This adjustment ensured comparable update rates for all variables in each iteration. The original variables in the forward model were substituted with transformed variables, ensuring consistent and balanced updates during the optimization procedure. For example,

$$ \mathrm{Np}_{i} = \frac{\mathrm{Np}_{i}}{w_{Np}} w_{Np} = \mathrm{Np}_{w, i}\, w_{Np}, \quad \mathrm{bg}_{i} = \frac{\mathrm{bg}_{i}}{w_{bg}} w_{bg} = \mathrm{bg}_{w, i}\, w_{bg}, \quad \mathrm{PSF}_{up} = \frac{\mathrm{PSF}_{up}}{w_{PSF}\, \mathrm{bin}^{2}} w_{PSF} = \mathrm{PSF}_{w}\, w_{PSF}, \tag{B11} $$

where $\mathrm{PSF}_{w}$, $\mathrm{Np}_{w, i}$, and $\mathrm{bg}_{w, i}$ are scaled variables, with respect to which the gradient will be calculated, and $w_{PSF}$, $w_{Np}$, and $w_{bg}$ are scaling factors for the upsampled PSF model, photons, and background. The scaling factors can be determined based on the gradient values of their respective parameters, ensuring that all gradient values are within the same order of magnitude. Proper variable scaling is crucial in PSF learning, as it helps the optimization converge reliably and facilitates the efficient optimization of all variables.
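The effect of variable scaling can be illustrated on a toy problem: fitting only the photon count and background of a known one-dimensional profile with `scipy.optimize.minimize` and `method="L-BFGS-B"`. Everything specific here is an assumption made for illustration: the Gaussian line profile, the noise-free data, the scaling factors `w_np` and `w_bg`, and the use of finite-difference gradients instead of the automatic differentiation suggested by Fig. 1.

```python
import numpy as np
from scipy.optimize import minimize

# Known, normalized line profile standing in for a fixed PSF slice (hypothetical).
profile = np.exp(-0.5 * ((np.arange(21) - 10) / 2.0) ** 2)
profile /= profile.sum()
data = 16000.0 * profile + 80.0          # noise-free toy measurement

w_np, w_bg = 1.0e4, 1.0e2                # hypothetical scaling factors

def loss(scaled):
    """Poisson loss of the same form as Eq. (B13), as a function of scaled variables."""
    n_photons = scaled[0] * w_np         # Np = Np_w * w_Np
    background = scaled[1] * w_bg        # bg = bg_w * w_bg
    model = n_photons * profile + background
    return np.mean(model - data - data * np.log(model) + data * np.log(data))

result = minimize(loss, x0=np.array([1.0, 1.0]), method="L-BFGS-B",
                  bounds=[(1e-3, None), (1e-3, None)])
n_photons_fit = result.x[0] * w_np
background_fit = result.x[1] * w_bg
print(f"Np = {n_photons_fit:.1f}, bg = {background_fit:.2f}")
```

With these factors both optimized variables are of order 1, so one optimizer step changes the photon count and the background by comparable relative amounts. In the full problem the same substitution is applied to every PSF voxel, photon count, and background value.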

The loss function for upsampled PSF estimation is

$$ \mathrm{loss} = LL + a_{drift} f_{drift} + \beta \left( a_{PSF, \min} f_{PSF, \min} + a_{bg, \min} f_{bg, \min} + a_{Np, \min} f_{Np, \min} + a_{norm} f_{norm} \right), \tag{B12} $$

where $LL$ is the negative log-likelihood term, assuming that the converted pixel values follow a Poisson distribution,

$$ LL = \underset{i}{\mathrm{avg}} \left( U_{i} - D_{i} - D_{i} \log (U_{i}) + D_{i} \log (D_{i}) \right). \tag{B13} $$

Note that $D_{i}$ and $D_{i} \log (D_{i})$ are normalization factors for the likelihood probabilities, aimed at improving the stability of the fitting process.
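A minimal NumPy version of Eq. (B13) is sketched below. The function name is ours, and the handling of empty pixels ($D \log D$ taken as 0 where $D = 0$) is an assumption that the text does not spell out.

```python
import numpy as np

def poisson_loss(model, data):
    """Mean of U - D - D*log(U) + D*log(D); D*log(D) is taken as 0 where D = 0."""
    model = np.asarray(model, dtype=float)
    data = np.asarray(data, dtype=float)
    safe = np.where(data > 0, data, 1.0)
    d_log_d = np.where(data > 0, data * np.log(safe), 0.0)
    return float(np.mean(model - data - data * np.log(model) + d_log_d))

perfect = poisson_loss([3.0, 5.0], [3.0, 5.0])     # model equals data
worse = poisson_loss([4.0, 5.0], [3.0, 5.0])       # model off in one pixel
empty = poisson_loss([2.0], [0.0])                 # empty pixel contributes U
```

Thanks to the two normalization terms the loss is zero when the model matches the data exactly and positive otherwise, which keeps its magnitude comparable across data sets with different photon counts.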

When sample drift is considered, $f_{drift}$ is equal to the $L^{1}$ norm of all drift rates over the bead data; this term constrains the estimated drift rates to be close to zero, to avoid adding an arbitrary constant drift rate. We define the bead’s lateral position at each axial slice as

$$ x_{i, z} = g_{x, i}\, z, \quad y_{i, z} = g_{y, i}\, z, \tag{B14} $$

where $z = 1, 2, \dots, N_{z}$, and $g_{x, i}$ and $g_{y, i}$ are additional learning variables that define the drift rates along $x$ and $y$ for each bead stack. To apply the $z$-dependent lateral drift, the obtained forward model is shifted slice-wise through a 2D Fourier transform,

$$ U_{drift, i}(x - x_{i, z}, y - y_{i, z}, z) = F_{2D}^{-1} \left( F_{2D} \left( U_{i}(x, y, z) \right) e^{i 2 \pi (k_{x} x_{i, z} + k_{y} y_{i, z})} \right). \tag{B15} $$

The next three terms serve to constrain the values of the upsampled PSF model, photons, and background to be positive,

$$ f_{PSF, \min} = \sum_{x, y, z} \min (\mathrm{PSF}_{up}, 0)^{2}, \quad f_{Np, \min} = \sum_{i} \min (\mathrm{Np}_{i}, 0)^{2}, \quad f_{bg, \min} = \sum_{i} \min (\mathrm{bg}_{i}, 0)^{2}. \tag{B16} $$

Here min denotes an element-wise minimum, comparing each value in an array with zero. By default we set $a_{PSF, \min} = 100$, $a_{Np, \min} = 1$, and $a_{bg, \min} = 1$. The final term, $f_{norm}$, in the loss function is introduced to enforce the sum of each axial slice of the PSF model to remain constant. This constraint is based on the principle of energy conservation, ensuring that the total photon count of a single emitter remains invariant across different axial positions. Thus, we define

$$ f_{norm} = \underset{z}{\mathrm{avg}} \left( \sum_{x, y} U - \sum_{x, y, z} \frac{U}{N_{z}} \right)^{2}. \tag{B17} $$

This term is often utilized in conjunction with the estimation of $z$-dependent photon fluctuations to account for photobleaching and variations in illumination intensity during data acquisition. By default, we set $a_{norm} = 0$.
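The penalty terms of Eqs. (B16) and (B17) are simple to express with array operations. The sketch below uses our own function names and treats the array passed to the normalization penalty as a stack of axial slices with shape (z, y, x).

```python
import numpy as np

def negativity_penalty(values):
    """Eq. (B16): sum of min(v, 0)**2, zero for non-negative entries."""
    return float(np.sum(np.minimum(values, 0.0) ** 2))

def slice_norm_penalty(psf):
    """Eq. (B17): mean squared deviation of each slice sum from the mean slice sum."""
    slice_sums = psf.sum(axis=(1, 2))
    return float(np.mean((slice_sums - slice_sums.mean()) ** 2))

p_neg = negativity_penalty(np.array([1.0, -2.0, 0.0, -1.0]))   # (-2)^2 + (-1)^2
p_flat = slice_norm_penalty(np.ones((4, 3, 3)))                # all slices sum to 9
uneven = np.stack([np.ones((3, 3)), 2.0 * np.ones((3, 3))])    # slice sums 9 and 18
p_uneven = slice_norm_penalty(uneven)
```

Both penalties vanish for a physically valid model (non-negative values, equal energy in every slice) and grow quadratically with the violation, so they steer the optimizer without altering the likelihood at a valid solution.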

Before performing cubic spline interpolation on the estimated upsampled PSF, a convolution operation must first be applied. This is because the experimental data are obtained through integration over large pixel sizes, while the estimated upsampled PSF is based on integration over small pixel sizes. The difference between the two is determined by the bin factor, which represents the ratio between the large and small pixel sizes. To align the two, a convolution is performed on the upsampled PSF using a kernel of size corresponding to the bin factor, with all elements set to 1, and a stride of 1. The detailed convolution procedure is as follows:

$$ \mathrm{PSF}_{con}(k, l) = \mathrm{PSF}_{raw} \otimes \mathrm{ones}(\mathrm{bin}, \mathrm{bin}) = \sum_{i = 1}^{\mathrm{bin}} \sum_{j = 1}^{\mathrm{bin}} \mathrm{PSF}_{raw}(s k + i, s l + j), \tag{C1} $$

where $\mathrm{PSF}_{con}$ represents the convolved upsampled PSF, $\otimes$ denotes the convolution operation, and $\mathrm{ones}(\mathrm{bin}, \mathrm{bin})$ is a $\mathrm{bin} \times \mathrm{bin}$ matrix with all elements set to 1. $s$ denotes the convolution stride, fixed at 1 in this instance. The bin factor refers to the ratio between the pixel size of the data and the pixel size of the upsampled PSF.
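A sketch of the stride-1 convolution in Eq. (C1) using `scipy.signal.convolve` is given below. The function name and the choice of `mode="valid"` (no padding at the borders) are our assumptions; the text does not specify the boundary handling.

```python
import numpy as np
from scipy.signal import convolve

def convolve_upsampled_psf(psf_up, bin_factor):
    """Convolve every axial slice with a bin x bin kernel of ones (stride 1)."""
    kernel = np.ones((1, bin_factor, bin_factor))
    return convolve(psf_up, kernel, mode="valid")

rng = np.random.default_rng(2)
psf_up = rng.random((3, 9, 9))                 # hypothetical upsampled PSF (z, y, x)
psf_con = convolve_upsampled_psf(psf_up, 3)    # shape (3, 7, 7)

# Sampling the convolved PSF every `bin` pixels reproduces plain 3 x 3 binning.
binned = psf_up.reshape(3, 3, 3, 3, 3).sum(axis=(2, 4))
```

Each output value is the sum over one $\mathrm{bin} \times \mathrm{bin}$ window, so the convolved PSF holds the large-pixel integral for every possible sub-pixel offset on the fine grid. This is why it can be interpolated and compared directly with data recorded at the large pixel size.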

For the localization step using the

{PSF}_{con}

, the data were analyzed with the cubic spline-fitting method. The cubic spline interpolation of a given PSF model is [19,20]

f_{i, j, k} (x, y, z) = \sum_{m = 0}^{3} \sum_{n = 0}^{3} \sum_{p = 0}^{3} a_{i, j, k, m, n, p} {(\frac{x - x_{i}}{\frac{Δ x}{bin}})}^{m} {(\frac{y - y_{j}}{\frac{Δ y}{bin}})}^{n} {(\frac{z - z_{k}}{Δ z})}^{p},

(C2)

where

Δ x

and

Δ y

are the

x

and

y

pixel sizes of the camera,

Δ z

is the axial step size of the z-stack,

a_{i, j, k, m, n, p}

are the spline coefficients, and

x_{i}

y_{j}

z_{k}

are the start position of each voxel

(i, j, k)

in the upsampled PSF.

x

y

are the positions corresponding to the camera pixels, while

z

represents the position in the z-stack. After building the spline PSF model, MLE with Poisson statistics was used to localize beads or single molecules with the objective function given by [21]

χ_{mle}^{2} = 2 (\sum_{k, j} (U_{k, j} - M_{k, j}) - \sum_{k, j, M_{k, j} > 0} M_{k, j} \ln \frac{U_{k, j}}{M_{k, j}}),

(C3)

where $U_{k, j}$ and $M_{k, j}$ are the expected photon number and measured photon number in the pixel $(k, j)$, respectively. We used a modified Levenberg–Marquardt (L-M) algorithm [28] to minimize $χ_{mle}^{2}$ for the parameter estimation.
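As a minimal sketch of the objective in Eq. (C3), the function below evaluates $χ_{mle}^{2}$ for a given expected image and measured image. The name `chi2_mle` and the toy pixel values are assumptions for illustration; the minimization itself (the modified L-M algorithm) is not shown.

```python
import numpy as np

def chi2_mle(expected, measured):
    """Poisson MLE objective of Eq. (C3) for expected (U) and measured (M) photons."""
    expected = np.asarray(expected, dtype=float)
    measured = np.asarray(measured, dtype=float)
    # The logarithmic term is summed only over pixels with M > 0.
    positive = measured > 0
    log_term = np.sum(
        measured[positive] * np.log(expected[positive] / measured[positive])
    )
    return 2.0 * (np.sum(expected - measured) - log_term)

measured = np.array([[1.0, 3.0], [0.0, 5.0]])
chi2_perfect = chi2_mle(measured + (measured == 0) * 1e-12, measured)
chi2_offset = chi2_mle(np.full((2, 2), 2.0), measured)
```

The objective is zero when the model reproduces the data exactly and grows as the model departs from the data, which is what a least-squares-style optimizer such as L-M needs.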

To accurately model the image formation process in a microscope equipped with a high NA objective, a vectorial PSF model is employed, accounting for the refractive index mismatch at the sample medium-coverslip and coverslip-immersion medium interfaces. Given that fluorescent probes are typically flexibly attached to the target molecules and are capable of free rotation, we assume an isotropic emitter PSF model for our analysis. The vectorial PSF can be expressed as [25,26,29,30]

PSF (x - x_{i}, y - y_{i}, z - z_{i}) = \sum_{\substack{m = x, y \\ n = p_{x}, p_{y}, p_{z}}} | F_{czt} (h (k_{x}, k_{y}) w_{m n} e^{i 2 π (k_{x} (x - x_{i}) + k_{y} (y - y_{i}) - k_{z} (z - z_{i}))}) |^{2}, \quad k_{z} = \sqrt{k^{2} - k_{x}^{2} - k_{y}^{2}},

(D1)

where $h (k_{x}, k_{y})$ is the pupil function and can be written as

h (k_{x}, k_{y}) = T_{a} A (k_{x}, k_{y}) e^{i Φ (k_{x}, k_{y})},

(D2)

where $A$ and $Φ$ are the magnitude and phase components of the pupil function, and each of them is a 2D array. $T_{a}$ is the apodization factor of the objective lens and is equal to

T_{a} = \frac{\sqrt{\cos θ_{imm}}}{\cos θ_{med}},

(D3)

where $θ_{med}$ and $θ_{imm}$ represent the angles of the optical rays in the sample medium and the immersion medium, respectively, and are constrained by the NA of the objective lens. The wave vector $k$ has Cartesian components $k_{x}$, $k_{y}$, and $k_{z}$, with its magnitude $k = \frac{n_{imm}}{λ}$, where $n_{imm}$ is the refractive index of the immersion medium, and $λ$ is the central wavelength corresponding to the emission filter. The term $e^{- i 2 π k_{z} (z - z_{i})}$ accounts for the defocus phase in the propagation.

We employed the chirp Z-transform based on Bluestein’s algorithm (denoted as $F_{czt}$) to compute the 2D Fourier transform of the pupil function. The advantage of using Bluestein’s algorithm is that the pupil function size becomes independent of the camera’s pixel size, allowing a pupil size of $64 \times 64$ pixels to offer sufficient sampling for accurate representation.
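The decoupling of pupil sampling from image pixel size can be illustrated with a simplified sketch. Note the substitutions: it uses a scalar, aberration-free circular pupil instead of the vectorial model, and a direct matrix Fourier transform in place of Bluestein's chirp Z-transform (both evaluate the same transform on a freely chosen output grid; the chirp Z-transform does so more efficiently). The function name and all optical parameters are hypothetical.

```python
import numpy as np

def scalar_psf(pupil, k_axis, pixel_size_nm, n_pixels):
    """|Fourier transform of the pupil|^2 sampled on an arbitrary image grid."""
    # Image coordinates (nm), centred on the middle pixel.
    x_axis = (np.arange(n_pixels) - n_pixels // 2) * pixel_size_nm
    # Fourier kernel linking pupil frequencies (1/nm) to image positions (nm).
    kernel = np.exp(2j * np.pi * np.outer(x_axis, k_axis))
    field = kernel @ pupil @ kernel.T
    psf = np.abs(field) ** 2
    return psf / psf.sum()

# Hypothetical parameters, chosen only for illustration.
na, wavelength_nm, n_pupil = 1.35, 680.0, 64
k_max = na / wavelength_nm
k_axis = np.linspace(-k_max, k_max, n_pupil)
kx, ky = np.meshgrid(k_axis, k_axis)
pupil = (kx**2 + ky**2 <= k_max**2).astype(complex)

# The same 64 x 64 pupil serves two different image pixel sizes.
psf_coarse = scalar_psf(pupil, k_axis, pixel_size_nm=110.0, n_pixels=21)
psf_fine = scalar_psf(pupil, k_axis, pixel_size_nm=27.5, n_pixels=81)
```

With a standard FFT the image pixel size would be fixed by the extent of the pupil array, so changing the pixel size would require resizing or padding the pupil.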

In this context, $w_{m n}$ denotes the $m$-component of the electric field at the pupil plane generated by the $n$-component of the dipole moment of the fluorophore. The calculation of $w_{m n}$ is as follows:

w_{x n} = P_{n} \cos φ - S_{n} \sin φ, \quad w_{y n} = P_{n} \sin φ + S_{n} \cos φ,

(D4)

where $φ$ is the angular component in the polar coordinate of the frequency space. $P_{n}$ and $S_{n}$ are electric field components in p and s polarizations relative to the incident plane at the sample space,

P_{p_{x}} = T_{p} \cos θ_{1} \cos φ, P_{p_{y}} = T_{p} \cos θ_{1} \sin φ, P_{p_{z}} = - T_{p} \sin θ_{1}, S_{p_{x}} = - T_{s} \sin φ, S_{p_{y}} = T_{s} \cos φ, S_{p_{z}} = 0,

(D5)

where $p_{x}$, $p_{y}$, and $p_{z}$ are the Cartesian components of the dipole moments, and $T_{p}$ and $T_{s}$ are the total transmission coefficients of $p$- and $s$-polarized light. The fluorescence light starting from the dipole emitter propagates through the sample medium, the coverslip, and the immersion medium. In this three-layer system, we ignore the multiple reflections at the two interfaces; then $T_{p}$ and $T_{s}$ can be calculated from

T_{p} = τ_{P 12} τ_{P 23}, \quad T_{s} = τ_{S 12} τ_{S 23},

(D6)

where $τ_{P i j}$ and $τ_{S i j}$ are the Fresnel transmission coefficients of $p$- and $s$-polarized light traveling from medium $i$ to medium $j$,

τ_{P i j} = \frac{2 n_{i} \cos θ_{i}}{n_{i} \cos θ_{j} + n_{j} \cos θ_{i}}, τ_{S i j} = \frac{2 n_{i} \cos θ_{i}}{n_{i} \cos θ_{i} + n_{j} \cos θ_{j}},

(D7)

where $n_{i}$ and $θ_{i}$ are the refractive index and the light propagation angle in medium $i$, and the subscripts 1, 2, 3 denote the sample medium, the coverslip, and the immersion medium, respectively.
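The Fresnel coefficients of Eq. (D7) and the total transmission through the three-layer system can be sketched as follows. This is an illustrative sketch under assumptions: the function names and refractive indices are hypothetical, the propagation angles in the coverslip and immersion medium are obtained from Snell's law, and the light is assumed to cross the interfaces in the order 1 to 2 and then 2 to 3, following the propagation order stated above.

```python
import numpy as np

def fresnel_transmission(n_i, n_j, cos_i, cos_j):
    """Fresnel transmission coefficients (p, s) from medium i to medium j (Eq. D7)."""
    tau_p = 2 * n_i * cos_i / (n_i * cos_j + n_j * cos_i)
    tau_s = 2 * n_i * cos_i / (n_i * cos_i + n_j * cos_j)
    return tau_p, tau_s

def total_transmission(theta_1, n_1, n_2, n_3):
    """Total p and s transmission through sample (1), coverslip (2), immersion (3)."""
    sin_1 = np.sin(theta_1)
    cos_1 = np.cos(theta_1)
    # Snell's law: n_1 sin(theta_1) = n_2 sin(theta_2) = n_3 sin(theta_3).
    # The complex square root keeps the expression defined beyond the critical angle.
    cos_2 = np.sqrt(1 - (n_1 * sin_1 / n_2) ** 2 + 0j)
    cos_3 = np.sqrt(1 - (n_1 * sin_1 / n_3) ** 2 + 0j)
    tau_p12, tau_s12 = fresnel_transmission(n_1, n_2, cos_1, cos_2)
    tau_p23, tau_s23 = fresnel_transmission(n_2, n_3, cos_2, cos_3)
    # Multiple reflections between the two interfaces are ignored.
    return tau_p12 * tau_p23, tau_s12 * tau_s23

# Hypothetical refractive indices (aqueous sample, glass coverslip, immersion oil).
theta = np.linspace(0.0, 1.2, 5)
t_p, t_s = total_transmission(theta, n_1=1.33, n_2=1.52, n_3=1.518)
```

At normal incidence the two polarizations are indistinguishable, so `t_p[0]` and `t_s[0]` coincide; they separate as the angle increases.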

In order to realistically account for the effects of fluorescence dipole emission, pixelation, particle size, dispersion, and chromatic aberration, the PSF model is rescaled using the OTF as

{PSF}_{otf} = F_{3 D}^{- 1} (F_{3 D} (PSF) \cdot e^{- 2 {(σ_{x} q_{x} π)}^{2} - 2 {(σ_{y} q_{y} π)}^{2}}),

(D8)

where $σ_{x}$ and $σ_{y}$ are the standard deviations (in pixel units) of a 2D Gaussian kernel in real space. By including the photon count and background of each bead stack, the final forward model is

U = Np \cdot {PSF}_{otf} + bg,

(D9)

where $Np$ and $bg$ represent the photon count and background, respectively.
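A minimal sketch of the OTF rescaling and the forward model of Eq. (D9) is given below. It rests on stated assumptions: the Fourier-space factor is taken to be the real-valued transfer function of a Gaussian kernel, $e^{-2 (σ_{x} q_{x} π)^{2} - 2 (σ_{y} q_{y} π)^{2}}$, with $q_{x}$ and $q_{y}$ read as spatial frequencies in cycles per pixel, and the transform is applied slice by slice, which is equivalent to the 3D transform because the factor does not depend on the axial frequency. Function names and numbers are hypothetical.

```python
import numpy as np

def gaussian_otf_blur(psf, sigma_x, sigma_y):
    """Blur each x-y slice of a (z, y, x) PSF with a Gaussian applied in Fourier space."""
    q_y = np.fft.fftfreq(psf.shape[-2])[:, None]  # cycles per pixel
    q_x = np.fft.fftfreq(psf.shape[-1])[None, :]
    otf = np.exp(-2 * (sigma_x * q_x * np.pi) ** 2 - 2 * (sigma_y * q_y * np.pi) ** 2)
    spectrum = np.fft.fft2(psf, axes=(-2, -1)) * otf
    return np.fft.ifft2(spectrum, axes=(-2, -1)).real

def forward_model(psf, photons, background, sigma_x, sigma_y):
    """Expected photons per pixel: U = Np * PSF_otf + bg."""
    return photons * gaussian_otf_blur(psf, sigma_x, sigma_y) + background

# Toy single-slice PSF: all signal in the central pixel.
psf = np.zeros((1, 15, 15))
psf[0, 7, 7] = 1.0
psf_otf = gaussian_otf_blur(psf, sigma_x=1.0, sigma_y=1.0)
u = forward_model(psf, photons=2000.0, background=5.0, sigma_x=1.0, sigma_y=1.0)
```

Because the factor equals 1 at zero frequency, the blur redistributes the signal without changing the total photon count.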

The CRLB is calculated from the diagonal elements of the inverse of the Fisher information matrix $I (Θ)$, which measures the amount of information that an observation (PSF) carries about the estimated parameters $Θ$:

CRLB {(Θ)}_{i j} = diag (I {(Θ)}_{i j}^{- 1}),

(E1)

where diag refers to extracting the diagonal elements of a matrix. The Fisher information matrix is defined as

I {(Θ)}_{i j} = \sum_{k} \sum_{l} \frac{1}{U_{k, l}} \frac{\partial U_{k, l}}{\partial Θ_{i}} \frac{\partial U_{k, l}}{\partial Θ_{j}}, \quad U_{k, l} = Np \cdot {PSF}_{pixelated} (k, l) + bg,

(E2)

where $Θ$ is a set of parameters being estimated, and $k$ and $l$ denote the horizontal and vertical pixel indices of the pixelated PSF, respectively. The parameters $Θ$ for the pixelated PSF are $x$, $y$, $z$, photon count $Np$, and background $bg$ for each emitter. $U_{k, l}$ is the forward model as defined in Eq. (B10).

The pixelated PSF and its first-order derivatives with respect to the parameters are obtained by integrating the upsampled PSF and its corresponding derivatives. The pixelated PSF and the first-order derivatives of the pixelated forward model $U_{k, l}$ with respect to the parameters $Θ$ are given by

\begin{aligned}
{PSF}_{pixelated} (k, l) &= \sum_{i = 1}^{bin} \sum_{j = 1}^{bin} {PSF}_{upsampled} (bin \cdot k + i, bin \cdot l + j), \\
\frac{\partial U_{k, l}}{\partial Θ_{x, y, z}} &= Np \sum_{i = 1}^{bin} \sum_{j = 1}^{bin} \sum_{\substack{m = x, y \\ n = p_{x}, p_{y}, p_{z}}} Re (E_{m n}^{*} \frac{\partial E_{m n}}{\partial Θ_{x, y, z}}), \\
\frac{\partial U_{k, l}}{\partial Np} &= {PSF}_{pixelated} (k, l), \\
\frac{\partial U_{k, l}}{\partial bg} &= 1, \\
E_{m n} &= F_{czt} (h (k_{x}, k_{y}) w_{m n} e^{i 2 π (k_{z med} z_{i} - k_{z} z_{s})} e^{i 2 π (k_{x} x_{i} + k_{y} y_{i})}), \\
\frac{\partial E_{m n}}{\partial Θ_{x, y, z}} &= F_{czt} (i k_{x, y, z med} h (k_{x}, k_{y}) w_{m n} e^{i 2 π (k_{z med} z_{i} - k_{z} z_{s})} e^{i 2 π (k_{x} x_{i} + k_{y} y_{i})}),
\end{aligned}

(E3)

where bin factor refers to the ratio between the pixel size of the data and the pixel size of the upsampled PSF.
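The CRLB calculation for a pixel-integrated model can be sketched as follows. This is a simplified illustration, not the vectorial calculation above: a 2D Gaussian stands in for the upsampled PSF, only $x$, $y$, $Np$, and $bg$ are estimated, and the position derivatives are taken by finite differences rather than analytically. All function names and parameter values are hypothetical.

```python
import numpy as np

def bin_pixels(img, bin_factor):
    """Sum non-overlapping bin x bin blocks along the last two axes."""
    ny, nx = img.shape[-2] // bin_factor, img.shape[-1] // bin_factor
    blocks = img.reshape(*img.shape[:-2], ny, bin_factor, nx, bin_factor)
    return blocks.sum(axis=(-3, -1))

def crlb_from_derivatives(model, derivatives):
    """Diagonal of the inverse Fisher matrix for a Poisson model.

    model: (ny, nx) expected photons U; derivatives: (n_params, ny, nx) dU/dTheta.
    """
    fisher = np.einsum("ikl,jkl->ij", derivatives / model, derivatives)
    return np.diag(np.linalg.inv(fisher))

def gaussian_psf(x0, y0, size=27, sigma=1.3):
    """Stand-in upsampled PSF (units of upsampled pixels), normalized to 1."""
    yy, xx = np.mgrid[:size, :size]
    g = np.exp(-((xx - x0) ** 2 + (yy - y0) ** 2) / (2 * sigma**2))
    return g / g.sum()

bin_factor, photons, bg, eps = 3, 2000.0, 20.0, 1e-3
x0 = y0 = 13.0
psf = bin_pixels(gaussian_psf(x0, y0), bin_factor)
# Central finite differences of the upsampled PSF, then pixel integration.
d_x = bin_pixels(gaussian_psf(x0 + eps, y0) - gaussian_psf(x0 - eps, y0), bin_factor) / (2 * eps)
d_y = bin_pixels(gaussian_psf(x0, y0 + eps) - gaussian_psf(x0, y0 - eps), bin_factor) / (2 * eps)

model = photons * psf + bg
derivatives = np.stack([photons * d_x, photons * d_y, psf, np.ones_like(psf)])
crlb = crlb_from_derivatives(model, derivatives)  # variances for x, y, Np, bg
```

The square roots of the first two entries give the lateral precision bounds in units of upsampled pixels.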

Due to the significant variation in CRLB at different positions within the large-pixel vector PSF (Fig. 8 in Appendix G), 121 points were uniformly selected within a single pixel. The CRLB was calculated for each point, and the average of these values was taken to obtain the final mean CRLB,

{CRLB}_{mean}^{\frac{1}{2}} = \sum_{i = 1}^{num} {CRLB}^{\frac{1}{2}} (x_{i}, y_{i}) / num,

(E4)

where $x_{i}$ and $y_{i}$ represent different position coordinates within a single pixel, and $num$ is the number of positions sampled within a single pixel.
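The averaging in Eq. (E4) can be sketched as below. The exact placement of the 121 points is not specified above, so an 11 × 11 grid of cell centers tiling one pixel is assumed; `crlb_at` is a hypothetical callable that returns the CRLB (variance) of each parameter for an emitter at a given lateral offset, replaced here by a toy function.

```python
import numpy as np

def mean_sqrt_crlb(crlb_at, pixel_size, points_per_axis=11):
    """Average sqrt(CRLB) over a uniform grid of positions within one pixel (Eq. E4)."""
    # Offsets from the pixel center; 11 points per axis give 121 positions.
    offsets = (np.arange(points_per_axis) + 0.5) / points_per_axis * pixel_size
    offsets -= pixel_size / 2
    values = [np.sqrt(crlb_at(x, y)) for x in offsets for y in offsets]
    return np.mean(values, axis=0)

pixel_size = 330.0  # nm

def toy_crlb(x, y):
    """Toy CRLB that worsens away from the pixel center along x (illustration only)."""
    return np.array([4.0, 4.0, 9.0]) * (1 + abs(x) / pixel_size) ** 2

mean_precision = mean_sqrt_crlb(toy_crlb, pixel_size)
```

Averaging the square roots, rather than the variances, matches the form of Eq. (E4), in which $CRLB^{\frac{1}{2}}$ is the averaged quantity.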

The CRLB calculated in the current work is slightly better than that in previous work using a numerical approach, especially for larger pixel conditions. We found that blurring the theoretical PSF with a Gaussian function [Eq. (D8)] to mimic the experimental PSF reduces the impact of the pixel size on the theoretical CRLB, which was not considered by Huang et al. (Fig. 12 in Appendix G). We also showed that our upsampled spline PSF fitter could achieve a localization accuracy approaching the calculated CRLB (Fig. 8), showing that both the upsampled PSF modeling and the upsampled spline PSF fitter could achieve the optimal resolution.

The calculation of ${CRLB}_{3 D}$ is as follows [26]:

{CRLB}_{3 D} = \frac{1}{N_{z}} \sum_{i = 1}^{N_{z}} ({CRLB}_{mean, x} + {CRLB}_{mean, y} + {CRLB}_{mean, z}) .

(E5)

Similarly, the calculation of ${RMSE}_{3 D}$ is as follows:

{RMSE}_{3 D} = \frac{1}{N_{z}} \sum_{i = 1}^{N_{z}} {({RMSE}_{x}^{2} + {RMSE}_{y}^{2} + {RMSE}_{z}^{2})}^{\frac{1}{2}},

(E6)

where $N_{z}$ represents the number of z-positions within the axial range.
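Eq. (E6) translates directly into a few lines of NumPy. The function name `rmse_3d` and the toy values are assumptions for illustration; each input holds one RMSE value per axial position.

```python
import numpy as np

def rmse_3d(rmse_x, rmse_y, rmse_z):
    """Mean over z-positions of the combined 3D RMSE (Eq. E6)."""
    per_z = np.sqrt(np.square(rmse_x) + np.square(rmse_y) + np.square(rmse_z))
    return per_z.mean()

# Toy values (nm) at two axial positions.
combined = rmse_3d([3.0, 3.0], [4.0, 4.0], [12.0, 12.0])
```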

To evaluate the accuracy of the learned upsampled PSF model, we performed localization tests using the same bead data that were utilized during training. The accuracy was assessed by measuring the localization bias in the $x$, $y$, and $z$ dimensions, with the localization bias in $z$ calculated from

z_{bias, i, z} = z_{i, z} - z_{GT, i, z},

(F1)

where $z_{GT}$ represents the ground-truth position, which corresponds to the stage position. Similarly, the localization bias in $x / y$ is calculated from

x_{bias, i, z} = x_{i, z} - \underset{z}{median} x_{i, z}, y_{bias, i, z} = y_{i, z} - \underset{z}{median} y_{i, z},

(F2)

where $x_{i, z}$ and $y_{i, z}$ are estimated using MLE for the lateral localization of each 2D bead image, and the subscripts $i$ and $z$ refer to the indices of the beads and the axial slices within each bead stack, respectively.
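The bias definitions in Eqs. (F1) and (F2) can be sketched as follows, assuming the localizations are stored as arrays of shape (beads, axial slices). The function name and the toy numbers are hypothetical.

```python
import numpy as np

def localization_bias(x, y, z, z_stage):
    """Bias per bead and slice: z against the stage position, x/y against the
    per-bead median over the axial slices."""
    x_bias = x - np.median(x, axis=1, keepdims=True)
    y_bias = y - np.median(y, axis=1, keepdims=True)
    z_bias = z - z_stage
    return x_bias, y_bias, z_bias

# Two beads, three axial slices (values in nm, for illustration only).
z_stage = np.array([-100.0, 0.0, 100.0])
x = np.array([[10.0, 11.0, 13.0], [50.0, 49.0, 48.0]])
y = np.array([[20.0, 20.0, 20.0], [5.0, 6.0, 7.0]])
z = np.array([[-98.0, 1.0, 103.0], [-101.0, 0.0, 99.0]])
x_bias, y_bias, z_bias = localization_bias(x, y, z, z_stage)
```

Using the median over $z$ as the lateral reference is needed because the true lateral position of a bead is unknown, whereas the axial reference is given by the stage.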

Figs. 5–12 are figures supplementary to the main text. More details are shown in their figure captions.

Figure 5. Comparison of the shape and intensity distribution of 330 nm pixelated PSFs at different positions. (a) PSFs with a pixel size of 330 nm at different $x$ positions, showing that the shape of the PSF is spatially variant. $x_{c}$ is the $x$ position of the PSF, with 0 nm located at the center of the middle pixel. (b) Relative intensity distribution at the white dashed line in (a) for PSFs at different $x$ positions.

Figure 6. Validation of spline fitting for different sampling rates. (a) To match the pixel size of the upsampled PSF with the simulated or experimental data, a convolution operation is performed on the estimated raw upsampled PSF (Raw) to generate the pixel integrated upsampled PSF (Convolved), using a convolution kernel size of $bin \times bin$ with all values set to 1 [Eq. (B15)]. The convolved PSF is used for localizing experimental data. The bin factor represents the ratio between the pixel size of the simulated or experimental data and the pixel size of the upsampled PSF, with its value constrained to integer values. (b) The spline fitter returns the $z$-position results by fitting the same simulated data using upsampled PSFs at different sampling rates. (c) Localization results of 110, 55, and 27.5 nm pixel size PSF for 110 nm pixel size single-molecule data. The simulated data consist of 51 axial positions evenly distributed from $-600$ to 600 nm with 1000 single molecules at each axial position. Each molecule contributes a total of 2000 photons, with a background level of $2 \times 10^{- 3} photons / {nm}^{2}$.

Figure 7. CRLB and RMSE improved by using upsampled PSF model and novel spline PSF fitter. (a) Comparison of RMSE for 330 nm pixel data, with and without pixel integration [convolution in Fig. 6(a)]. RMSE_con shows results after convolution, while RMSE_raw shows results without convolution. Only with convolution (RMSE_con) does the localization achieve the theoretical CRLB limit. (b) Comparison of RMSE for localizing 330 nm data with 110 nm upsampled PSF and 330 nm PSF using our novel fitting algorithm. The results show that only the upsampled PSF achieves the theoretical CRLB limit. Simulation parameters match those in Fig. 6.

Figure 8. Comparison of localization accuracy at different lateral positions. Localization accuracy of 110 nm pixel size PSFs on 330 nm single-molecule data at different lateral positions, (0 nm, 0 nm), (0 nm, 115 nm), (115 nm, 0 nm), and (115 nm, 115 nm) for (a), (b), (c), and (d), respectively. Here, the center of the central pixel is set as coordinate (0 nm, 0 nm). We used a vector PSF model (Appendix D) to generate 1000 simulated single molecules at different $x$ and $y$ positions at each axial position. Twenty-five axial positions uniformly distributed from $-600$ to 600 nm were evaluated. Each molecule contributes a total of 2000 photons, with a background level of $2 \times 10^{- 3} photons / {nm}^{2}$. Localization accuracy at each $z$ position is computed from the RMSE of the localization results from 1000 single molecules.

Figure 9. Validation of the upsampled PSF modeling using simulated data from tetrapod PSF. (a) The 330 nm pixel size PSF (top) and 110 nm pixel size PSF (bottom) estimated from 300 simulated tetrapod PSF data sets with a pixel size of 330 nm. (b)–(d) Comparison of the impact of the estimated 330 and 110 nm pixel size tetrapod PSF on the localization accuracy of the 330 nm pixel size simulated data along the $x$, $y$, and $z$ directions, respectively. Each molecule contributes a total of 10,000 photons, with a background level of $2 \times 10^{- 3} photons / {nm}^{2}$. The $x / y / z$ bias as a function of $z$ is defined as in Appendix F, Eqs. (F1) and (F2). Solid lines and shaded areas in localization plots indicate mean and standard deviation of bias, respectively.

Figure 10. Validation of the upsampled PSF modeling using experimental data from astigmatic PSF. (a) 321 nm pixel size experimental bead data (top) and the estimated 321 and 107 nm pixel size astigmatic PSF (bottom). (b)–(d) Comparison of the impact of the estimated 321 and 107 nm pixel size astigmatic PSF on the localization accuracy of the 321 nm pixel size experimental data along the $x$, $y$, and $z$ directions, respectively. The $x / y / z$ bias as a function of $z$ is defined as in Appendix F, Eqs. (F1) and (F2). Solid lines and shaded areas in localization plots indicate mean and standard deviation of bias over 50 beads, respectively.

Figure 11. Detailed layout of the optical setup. M, mirror; DM, dichroic mirror; L, lens; TS, translation stage; FC, fiber coupler; fiber, single-mode fiber; BFP, back focal plane; FW, filter wheel; TBL, tube lens; AP, aperture; QPD, quadrant photodiode. The excitation lasers are first reflected by dichroic mirror DM1 and coupled into a single-mode fiber through the fiber coupler FC. Before being reflected by the main dichroic mirror DM2 to enter the objective for sample illumination, the beam is collimated and reshaped by a pair of lenses (L1 and L2) and a slit at AP1. In the imaging path, the fluorescence collected by the objective passes through the dichroic mirror DM2 and is filtered by the filter wheel FW. It is then focused using a tube lens TBL. Subsequently, the fluorescence passes through a 4f system composed of lenses L3 (focal length = 150 mm) and L4 (focal length = 30 mm), before being detected by the camera. Additionally, a beam excited by a 785 nm laser, reflected off the coverslip, is detected by the quadrant photodiode (QPD), providing feedback control to the Z-stage for focus locking.

Figure 12. Comparison of the impact of PSF Gaussian blurring on the theoretical CRLB. The $σ$ corresponds to the $σ_{x}$ and $σ_{y}$ terms in Appendix D, Eq. (D8). When $σ = 0$, Gaussian blurring is not applied. The pixel size of the data used is 330 nm. The simulation parameters are the same as those in Fig. 5.
