Channel estimation-based time-frequency neural network for post-equalization in underwater visible light communication

Haoyu Zhang; Li Yao; Chaoxu Chen; Yuan Wei; Chao Shen; Jianyang Shi; Junwen Zhang; Ziwei Li; Nan Chi

doi:10.3788/COL202422.060602


Abstract

This Letter proposes a post-equalizer for underwater visible light communication (UVLC) systems that combines channel estimation and joint time-frequency analysis, named channel-estimation-based bandpass variable-order time-frequency network (CBV-TFNet). By utilizing a bandpass variable-order loss function with communication prior knowledge, CBV-TFNet enhances communication performance and training stability. It enables lightweight implementation and faster convergence through a channel estimation-based mask. The superior performance of the proposed equalization method over Volterra and deep neural network (DNN)-based methods is verified experimentally. Using bit-power loading discrete multitone (DMT) modulation, the proposed method achieves a transmission bitrate of 4.956 Gbps through a 1.2 m underwater channel while using only 38.15% of the real multiplications required by the DNN equalizer. It also achieves a bitrate gain of 440 Mbps and a significantly larger dynamic range compared with the LMS-Volterra equalizer. The results highlight CBV-TFNet’s potential for future post-equalization in UVLC systems.

Keywords

channel estimation; neural network; post-equalizer; underwater visible light communication

1. Introduction

The Internet of Everything (IoE) era, envisioned by 6G technology, has brought about a growing volume of data collected from underwater objects and activities, which in turn has intensified the demand for more efficient and reliable underwater communication methods. Among the available options, underwater visible light communication (UVLC) stands out for its superior speed, low latency, and enhanced security compared with conventional acoustic and radio frequency communication methods. Consequently, UVLC has become a promising solution for future underwater wireless communication, attracting considerable attention worldwide^[1].

Among the many methods to achieve underwater wireless optical communication (UWOC), UVLC systems using blue-green LEDs have demonstrated potential for achieving high-speed and long-distance underwater transmission^[2-5]. However, one of the major challenges restricting the performance of UVLC is signal distortion. Distortions, caused by both linear and nonlinear components within the system, greatly degrade the overall system performance. Among these, nonlinear effects, primarily resulting from photonic conversion at the transmitter and receiver ends, pose a more significant challenge than linear distortion. As the modulation order and rate continue to increase, there is a growing demand for higher signal quality within the UVLC system.

Traditional post-equalization methods such as decision feedback equalizers (DFEs)^[6] and the commonly used Volterra series^[7] have limited compensation abilities when it comes to addressing nonlinear distortions in UVLC systems. In recent years, neural networks (NNs) have demonstrated significant potential in capturing and approximating complex functions, leading to the adoption of AI algorithms in UVLC. These include LSTM-NN equalizers^[8], the time-frequency network TFDNet^[9], SVM-based post-equalizers^[10], and dynamic pre-equalization NNAEM^[11]. Despite their excellent equalization performance, AI equalizers face challenges such as high computational complexity, slower convergence, and poor generalization. However, the emerging approach of integrating prior knowledge of communication and physics into NNs can help overcome these limitations.


In this Letter, we propose a novel channel-estimation-based bandpass variable-order time-frequency network (CBV-TFNet). Because nonlinear effects tend to be exposed and emphasized in the frequency domain, a joint time-frequency approach offers additional advantages to the model. In addition, a mask generated by channel estimation is applied before the network’s input layer. For training, we design a bandpass variable-order (BV) loss function to guide the neural network (NN) in effectively equalizing both the in-band and out-of-band regions of the signal spectrum. The experimental results demonstrate that CBV-TFNet achieves a communication rate gain of 172 Mbps compared to DNN-based post-equalizers, while significantly reducing the time complexity. Furthermore, our proposed method achieves a bitrate of 4.956 Gbps over a 1.2 m underwater channel, using bit-power loading discrete multitone (DMT) modulation^[12]. To the best of our knowledge, this represents the highest bitrate achieved in UVLC systems utilizing a single-wavelength LED.

2. Principle

Figure 1 illustrates the architecture of the CBV-TFNet post-equalizer. Let the transmitted time-domain signal be $x (n) \in R^{N}$ and the received signal be $y (n) \in R^{N}$ . In the proposed model, the short-time Fourier transform (STFT) is employed for feature extraction of the received signal, transforming the time-domain signal $y (n)$ into a two-dimensional (2-D) time-frequency spectrum $Y (ω) \in C^{M^{'} \times N^{'}}$ . The STFT proceeds as follows. First, the time-domain sequence is segmented by a sliding window. To limit spectral leakage caused by the edge effects of truncating the time-domain sequence, a Hamming window of length $W$ is used, and the window advances with hop size $H$ until all data points are processed. Consecutive windows share an overlapping portion of length $O$ , which satisfies $W = O + H$ . Notably, for the inverse STFT to reconstruct the signal without loss, the constant overlap-add (COLA) constraint^[13] must be met, which can be satisfied when $O \geq W / 2$ . Each windowed segment then undergoes a $D$ -point discrete Fourier transform (DFT). Because post-equalization benefits from high frequency resolution, $D$ is generally set equal to $W$ . The time-frequency signal can be expressed as $Y (ω) = [Y_{1} (ω), Y_{2} (ω), \dots, Y_{N^{'}} (ω)]$ , whose $t$ th column is

$Y_{t} (ω) = \sum_{n = t H}^{t H + W - 1} y (n) w (n - t H) \exp [- j ω (n - t H) / W],$ (1)

where $w (\cdot)$ represents the window function of length $W$ . Because the DFT yields a complex-valued 2-D time-frequency signal, it is converted into real values before being input to the DNN: the real and imaginary parts of each sliding-window spectrum are concatenated into a single column, so that the height and width of the 2-D signal are $2 W$ and $⌊ (N - W + H) / H ⌋$ , respectively. Finally, the 2-D spectrum is fed into the NN.
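The preprocessing described above can be sketched in NumPy as follows. This is a minimal illustration, not the authors’ implementation: the function name `stft_features`, the hop size `H = 8`, and the random test signal are assumptions made for the example, while `W = 72` is the window length listed for CBV-TFNet in Table 1.

```python
import numpy as np

def stft_features(y, W=72, H=8):
    """Stack Re/Im parts of each windowed W-point DFT into a real (2W, frames) matrix."""
    window = np.hamming(W)                      # Hamming window of length W
    n_frames = (len(y) - W + H) // H            # floor((N - W + H) / H)
    columns = []
    for t in range(n_frames):
        frame = y[t * H : t * H + W] * window   # windowed segment starting at t*H
        spectrum = np.fft.fft(frame, n=W)       # DFT with D = W points
        # Real and imaginary parts are concatenated into one column of height 2W.
        columns.append(np.concatenate([spectrum.real, spectrum.imag]))
    return np.stack(columns, axis=1)

# Hypothetical received signal, used only to show the resulting shape.
rng = np.random.default_rng(0)
y = rng.standard_normal(1024)
features = stft_features(y)
print(features.shape)  # (144, 120) for N = 1024, W = 72, H = 8
```

With these illustrative values, the height is $2 W = 144$ , which matches the input layer size listed for CBV-TFNet in Table 1.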

Figure 1. Architecture of CBV-TFNet post-equalizer.

The above process is the time-domain waveform preprocessing used in TFDNet^[9]. However, our experiments found that a DNN fits frequency-domain signals slightly less effectively than time-domain signals, potentially because the concatenated spectral data are longer, which reduces the data processing capacity of an NN of the same size. To mitigate this, two modifications are introduced, as shown in Fig. 1. First, a pilot signal is used to estimate the channel and generate a mask that preprocesses the spectrum sequence before it enters the network, thus sharing the equalization burden and simplifying the network structure. Second, our proposed BV loss function guides the NN’s attention during spectrum equalization, allowing it to focus on the passband, which contains more effective information, and yielding better performance than uniform weighting across the spectrum.

For the mask generation component, the transmitter generates a random sequence $x_{P} (n) \in R^{P}$ as the pilot signal, with a sampling rate identical to that of the transmitted signal. The LED transmits the pilot signal at the same working point, resulting in $y_{P} (n) \in R^{P}$ after traversing the underwater channel. The STFT is applied to both signals, and the inverse channel transmission matrix $H_{P} (ω) \in C^{M^{'} \times P^{'}}$ is obtained by element-wise division,

$H_{P} (ω) = X_{P} (ω) / Y_{P} (ω),$ (2)

and then averaged over the $P^{'}$ columns of the time dimension,

$M (ω) = (1 / P^{'}) \cdot \sum_{t = 1}^{P^{'}} H_{P, t} (ω) .$ (3)
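As a rough illustration of the pilot-based mask estimate in Eqs. (2) and (3), the sketch below divides the pilot spectra element-wise and averages over the time frames. The helper names `stft` and `estimate_mask`, the hop size `H = 8`, the small constant `eps` guarding against division by zero, and the toy channel that simply halves the pilot amplitude are all assumptions made for the example.

```python
import numpy as np

def stft(x, W=72, H=8):
    """Complex STFT with a Hamming window; returns an array of shape (W, frames)."""
    window = np.hamming(W)
    n_frames = (len(x) - W + H) // H
    frames = np.stack([x[t * H : t * H + W] * window for t in range(n_frames)], axis=1)
    return np.fft.fft(frames, axis=0)

def estimate_mask(x_pilot, y_pilot, W=72, H=8, eps=1e-12):
    """Average the element-wise ratio X_P / Y_P over time frames."""
    X = stft(x_pilot, W, H)
    Y = stft(y_pilot, W, H)
    H_P = X / (Y + eps)          # Eq. (2); eps is an added numerical guard
    return H_P.mean(axis=1)      # Eq. (3): mean over the time dimension

# Toy channel (assumption): pure attenuation by a factor of 0.5.
rng = np.random.default_rng(1)
x_pilot = rng.standard_normal(512)
y_pilot = 0.5 * x_pilot
mask = estimate_mask(x_pilot, y_pilot)
print(mask.shape, np.allclose(mask, 2.0))
```

For this attenuation-only toy channel, the mask is close to 2 in every frequency bin, that is, the inverse of the channel gain.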

NNs are effective at compensating nonlinear effects, and the out-of-band part of the signal spectrum is important to a nonlinear equalizer. To prevent the mask derived from the inverse channel transmission matrix from filtering out this out-of-band part, we apply a smoothing operation to the resulting sequence, namely mean filtering with window size $s$ . For each frequency bin $ω_{i}$ , $i = 0, \dots, M^{'} - 1$ ,

$M_{R} (ω_{i}) = smooth [M (ω), s]_{i} = \frac{1}{s} \sum_{j = i - (s - 1) / 2}^{i + (s - 1) / 2} M (ω_{j}) .$ (4)
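The mean filtering step can be sketched as a centered moving average. The function name `smooth_mask` is hypothetical, and the treatment of the band edges (repeating the first and last values) is an assumption, since the text does not specify how the boundary bins are handled.

```python
import numpy as np

def smooth_mask(M, s=5):
    """Centered moving average of odd width s over the frequency bins of M."""
    if s % 2 == 0:
        raise ValueError("s must be odd so that the window is centered")
    half = (s - 1) // 2
    # Assumption: edge bins are padded by repeating the boundary values.
    padded = np.pad(M, half, mode="edge")
    kernel = np.ones(s) / s
    return np.convolve(padded, kernel, mode="valid")

# A single sharp peak is spread over its neighbors.
M = np.array([0.0, 0.0, 3.0, 0.0, 0.0])
M_R = smooth_mask(M, s=3)
print(M_R)  # approximately [0, 1, 1, 1, 0]
```

A larger `s` flattens the mask further, which keeps more of the out-of-band spectrum available to the network.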

Each time before it is input into the network, the STFT spectrum of the received signal is multiplied element-wise by $M_{R} (ω)$ , so that the $t$ th column of the 2-D spectrum can be expressed as $M_{R} (ω) \cdot Y_{t} (ω)$ . For the loss function of the network, the BV loss function is defined as follows:

$BV (Θ) = \frac{1}{T} \sum_{t = 1}^{T} \begin{pmatrix} α_{Cut} & α_{Pass} & α_{Cut} \end{pmatrix} \cdot \begin{pmatrix} \sum_{ω = 0}^{ω_{Start}} ‖ {\hat{Y}}_{t} (ω) - X_{t} (ω) ‖_{2}^{2} \\ \sum_{ω = ω_{Start}}^{ω_{Stop}} ‖ {\hat{Y}}_{t} (ω) - X_{t} (ω) ‖_{2}^{2} \\ \sum_{ω = ω_{Stop}}^{ω_{Max}} ‖ {\hat{Y}}_{t} (ω) - X_{t} (ω) ‖_{2}^{2} \end{pmatrix} ,$ (5)

where $ω_{Max}$ denotes the upper edge of the STFT spectrum.

In this equation, $α_{Pass}$ represents the weight of the model mismatch value within the signal passband, while $α_{Cut} = 1 - α_{Pass}$ signifies the weight outside the passband. The STFT spectrum output by the NN and the STFT spectrum of the transmitted signal, which acts as the label, are each divided into in-band and out-of-band frequency-domain sequences, with the passband bandwidth $PB = (ω_{Stop} - ω_{Start}) / 2 π$ . By computing the L2 norms separately, the mismatch values for each part of the spectrum are obtained. Ultimately, the sum of these weighted mismatch values serves as the final loss.
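A minimal NumPy sketch of the band-weighted loss described here is given below. It is written for complex spectra of shape (bins, frames). The function name `bv_loss`, the use of bin indices `k_start` and `k_stop` in place of $ω_{Start}$ and $ω_{Stop}$ , and the choice to assign the boundary bins to the passband only are assumptions made for illustration; a training implementation would use the automatic differentiation framework of the model.

```python
import numpy as np

def bv_loss(Y_hat, X, k_start, k_stop, alpha_pass=0.9):
    """Weighted squared error: alpha_pass inside the passband, 1 - alpha_pass outside."""
    alpha_cut = 1.0 - alpha_pass
    sq_err = np.abs(Y_hat - X) ** 2                 # squared mismatch per bin and frame
    weights = np.full(sq_err.shape[0], alpha_cut)   # out-of-band weight by default
    weights[k_start : k_stop + 1] = alpha_pass      # passband bins (inclusive, assumed)
    per_frame = weights @ sq_err                    # weighted sum over frequency bins
    return float(per_frame.mean())                  # average over the T frames

# Toy example: unit error in every bin of a 4-bin, 2-frame spectrum.
Y_hat = np.ones((4, 2), dtype=complex)
X = np.zeros((4, 2), dtype=complex)
loss = bv_loss(Y_hat, X, k_start=1, k_stop=2, alpha_pass=0.9)
print(loss)  # close to 0.1 + 0.9 + 0.9 + 0.1 = 2.0
```

Setting `alpha_pass=1.0` ignores the out-of-band error entirely, which corresponds to the weight of 1 case discussed in the Results section.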

It is worth mentioning that the bandwidth in the loss function, as a hyperparameter of the model, may not align with the theoretical bandwidth in communication systems. Due to the influence of nonlinear effects in the channel, a portion of the signal that was originally within the bandwidth might leak outside the passband. Therefore, when performing post-equalization at the receiver, it is important to consider the out-of-band information to reconstruct the original signal as accurately as possible.

Upon completion of network training, the predicted spectrum $\hat{Y} (n, ω)$ is output, and the predicted time-domain signal $\hat{x} (n)$ can be obtained by performing the inverse STFT. Let ${\hat{x}}_{t} (n)$ denote the inverse Fourier transform of ${\hat{Y}}_{t} (ω)$ for $t H \leq n \leq t H + W - 1$ :

${\hat{x}}_{t} (n) = (1 / W) \sum_{f = 0}^{W - 1} {\hat{Y}}_{t} (ω_{f}) \exp [j ω_{f} (n - t H) / W] ,$ (6)

where $ω_{f} = 2 π f$ is the frequency of the $f$ th DFT bin.

Subsequently, each ${\hat{x}}_{t} (n)$ is windowed and overlap-added, and the result is divided by the overlap-added squared window function. This yields the recovered time-domain signal $\hat{x} (n)$ as follows:

$\hat{x} (n) = [\sum_{t} {\hat{x}}_{t} (n) w (n - t H)] / [\sum_{t} w^{2} (n - t H)] .$ (7)
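The inverse transform of Eqs. (6) and (7) can be checked with a short round-trip sketch: a signal is transformed with a Hamming-windowed STFT and then recovered by windowed overlap-add normalized by the summed squared window. The helper names, the hop size `H = 8`, and the random test signal are assumptions made for the example.

```python
import numpy as np

def stft(x, W, H):
    """Complex STFT with a Hamming window; returns an array of shape (W, frames)."""
    w = np.hamming(W)
    n_frames = (len(x) - W + H) // H
    frames = np.stack([x[t * H : t * H + W] * w for t in range(n_frames)], axis=1)
    return np.fft.fft(frames, axis=0)

def istft(S, W, H):
    """Inverse STFT: per-frame inverse DFT, then weighted overlap-add."""
    w = np.hamming(W)
    n_frames = S.shape[1]
    N = (n_frames - 1) * H + W
    numerator = np.zeros(N)
    denominator = np.zeros(N)
    frames = np.fft.ifft(S, axis=0).real            # Eq. (6), one column per frame
    for t in range(n_frames):
        numerator[t * H : t * H + W] += frames[:, t] * w   # sum of x_t(n) w(n - tH)
        denominator[t * H : t * H + W] += w ** 2           # sum of w^2(n - tH)
    return numerator / denominator                  # Eq. (7)

rng = np.random.default_rng(2)
x = rng.standard_normal(1024)
x_rec = istft(stft(x, W=72, H=8), W=72, H=8)
print(np.allclose(x_rec, x))
```

For an unmodified spectrum the reconstruction matches the input up to floating-point error; when the NN alters the spectrum, the same formula gives the least-squares time-domain estimate described in Ref. [13].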

3. Experimental Setup

Figure 2 presents the experimental setup of our UVLC system. The communication process comprises two stages: channel estimation and communication. First, QPSK-modulated original data undergo DMT modulation and digital pre-equalization (DZN). The processed data are loaded onto an arbitrary waveform generator (AWG, Keysight M8190A), amplified by an electrical amplifier (EA, ZHL-6A-S+), and coupled with the LED’s driving current through a bias tee (ZFBT-4R2GW-FT+). The emitted signal passes through a 1.2 m water tank, a lens, and an aperture in the optical path. The received signal is captured by a photodetector (PD), converted into a dual output by a transimpedance amplifier (TIA), and fed into an oscilloscope (OSC, MSO9404A) for offline data processing. Synchronization, CBV-TFNet post-equalization, and DMT demodulation are performed for signal recovery. A bit-power loading algorithm then optimizes the modulation level allocation based on signal-to-noise ratio (SNR) calculations. In the next stage, this process is repeated with the newly generated modulation scheme until the desired bit error rate (BER) threshold for forward error correction (FEC) is achieved. Multiple tests confirm the viable communication rate.

Figure 2. Experimental setup of UVLC system.

4. Results

The proposed CBV-TFNet employs critical hyperparameters, including the STFT window length, the number of hidden layers and nodes, and the passband bandwidth (PB) and weight of the BV loss function.

For the proposed BV loss function, it is pivotal to tune the passband bandwidth PB and the passband weight $α_{Pass}$ . We sweep each of these separately on data from a single working point and evaluate the communication system’s BER, as illustrated in Fig. 3. This working point is a set of operating parameters selected by sweeping the working points with the benchmark method, combined with theoretical derivation; it exhibits relatively strong nonlinear effects but remains near the optimal operating point.

Figure 3. (a) BER performance for different PBs and passband weights used by the loss function. Spectra of the original, received, and NN-equalized signals (b) using a PB of 723 MHz and a weight of 1.0 and (c) using a PB of 797 MHz and a weight of 0.9.

The bandwidth of the generated transmitted signal is 775 MHz. As Fig. 3(a) shows, for passband weights below 1 the BER generally decreases as the passband weight increases, except when the PB is 723 MHz. This highlights the efficacy of the BV loss function in guiding the NN based on prior knowledge. However, when the PB is 723 MHz or 797 MHz, the minimal BER is obtained at smaller weights, because nonlinear distortion causes spectral information from the original band to leak out of band. Consequently, it is crucial to equalize the in-band information while also giving some weight to the out-of-band region. The ratio of in-band to out-of-band information differs across bandwidths, and so do the weights corresponding to the minimal BER. As validation, the BER for a PB of 723 MHz at a passband weight of 1 is exceedingly high, and the equalized signal spectrum is illustrated in Fig. 3(b). When the PB is 797 MHz and the weight is 0.9, the BER performance improves significantly, and the equalized spectrum is depicted in Fig. 3(c). It should be noted that the optimal hyperparameters of the loss function are related to the degree of nonlinearity of the operating point. Although the selected values are not guaranteed to be globally optimal, the performance of the model is relatively robust and sufficient to support our work given its complexity. Therefore, the values found at this working point are chosen as the hyperparameters of the BV loss function.

As for the mask added before the NN input, it should be clarified that the mask proposed in this Letter is not intended to improve the model’s equalization performance per se, but to improve the convergence speed of model training. A drawback shared by CBV-TFNet and the TFDNet of previous work is that training takes many rounds: with TFDNet, each training run may require more than 700 rounds. By adding the mask, the number of training rounds can be significantly reduced. Figure 4(a) shows error band plots of the equalization performance across training rounds, with and without the mask and the BV loss function. The results highlight the significant impact of the BV loss function on the system’s bitrate, which improves by 100 Mbps. Additionally, the BV loss function narrows the error band substantially, leading to improved model stability. Incorporating the mask further enhances the convergence speed of the model. The right subfigures in Fig. 4 depict the STFT spectra of the received, transmitted, and masked received signals.

Figure 4. (a) Error band diagram of bitrate versus training epoch using different loss functions and model inputs. STFT spectra of (b) the received signal, (c) the original signal, and (d) the received signal after the mask.

The STFT window length and NN parameters of all methods, including the baselines, are set to the optimal values found through traversal and Bayesian optimization. The parameters and the number of real multiplications are shown in Table 1. After setting the hyperparameters, we apply the different post-equalization methods and examine the system’s operating points using bit-power loading DMT modulation. The bitrate is calculated and experimentally verified by the bit-power loading DMT algorithm, which assigns subcarrier modulation orders according to the estimated SNR of each subcarrier and ensures that the BER of the transmission stays below the 7% FEC threshold of 3.8 × 10⁻³. The rate contour plots of three post-equalizers, namely the Volterra-based, DNN-based, and TFDNet equalizers, are shown in Figs. 5(a)–5(c), reaching peak bitrates of 4.516 Gbps, 4.774 Gbps, and 4.855 Gbps, respectively. Figure 5(d) demonstrates CBV-TFNet’s superior nonlinear compensation performance, achieving a peak rate of 4.956 Gbps with the widest dynamic range.

Method	Window length	Nodes of NN structure	Number of real multiplications	Peak bitrate (Gbps)	Dynamic range (4.75 Gbps threshold)
LMS-Volterra	73	-	7,205/sym	4.516	0
DNN	73	(73, 256, 1)	37,705/sym	4.774	0.216
TFDNet	72	(144, 256, 128, 144)	16,276/sym	4.855	0.373
CBV-TFNet (this work)	72	(144, 200, 128, 144)	14,383/sym	4.956	0.491

Table 1. Hyperparameters and Communication Performance of Different Methods-Based Equalizers


Figure 5. Bitrate contour plots of different working points with the post-equalizer using (a) LMS-Volterra; (b) DNN; (c) TFDNet; (d) CBV-TFNet. Bit-power loading results in the communication test with the post-equalizer using (e) LMS-Volterra; (f) DNN; (g) TFDNet; (h) CBV-TFNet.

The lower four graphs illustrate the estimated SNR, QAM order, and power loading scheme at the maximum bitrate of each method. Evidently, the CBV-TFNet equalizer significantly improves high-frequency performance, leading to higher bitrates. When the number of real multiplications is used as an indicator of time complexity, CBV-TFNet outperforms the DNN equalizer, completing the same signal compensation task with only 38.15% of the real multiplications. This efficiency depends on the hop size of the STFT: a longer hop size reduces the number of multiplications without significantly sacrificing performance. In addition, the dynamic range of each method is quantified using 4.75 Gbps as the threshold bitrate, with the results listed in Table 1. The proposed method provides the largest dynamic range. Although the LMS-Volterra post-equalizer has lower computational complexity, its peak bitrate and dynamic range fall well short of those of the other methods.
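The complexity ratio and the bitrate gain quoted in this Letter follow directly from the entries of Table 1, as the short check below shows. The dictionary and variable names are illustrative only; the numbers are copied from the table.

```python
# Real multiplications per symbol and peak bitrates, copied from Table 1.
mults_per_symbol = {
    "LMS-Volterra": 7205,
    "DNN": 37705,
    "TFDNet": 16276,
    "CBV-TFNet": 14383,
}
peak_gbps = {
    "LMS-Volterra": 4.516,
    "DNN": 4.774,
    "TFDNet": 4.855,
    "CBV-TFNet": 4.956,
}

# Share of the DNN equalizer's multiplications used by CBV-TFNet.
ratio_vs_dnn = 100 * mults_per_symbol["CBV-TFNet"] / mults_per_symbol["DNN"]

# Peak bitrate gain over the LMS-Volterra equalizer, in Mbps.
gain_vs_volterra_mbps = 1000 * (peak_gbps["CBV-TFNet"] - peak_gbps["LMS-Volterra"])

print(f"{ratio_vs_dnn:.2f}%")                # 38.15%
print(f"{gain_vs_volterra_mbps:.0f} Mbps")   # 440 Mbps
```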

It is worth noting that, owing to the high computational complexity of all the post-equalization methods we implemented, their parallel hardware implementation poses a challenge. From an algorithmic point of view, other optimization strategies can reduce the computational complexity. One is NN pruning, which our previous work has shown can drastically reduce the number of connections without significantly degrading MLP performance^[14]. Another is knowledge distillation, which compresses the model and thereby also reduces its computational complexity^[15].

We also encourage future researchers to further explore and optimize the post-equalization method in terms of hardware parallel implementation. By combining new algorithmic design and hardware optimization techniques, we can better address the issue of computational complexity and improve the feasibility and efficiency of the post-equalization method in hardware implementation.

The excellent performance of the proposed method can be attributed to the features of the STFT, as reflected in Fig. 6. The STFT enables the model to establish a multi-input multi-output parallel structure that outputs one hop length of data at a time, thus reducing time complexity. Additionally, the STFT offers the NN a broader receptive field: each data point passes through network predictions involving nine overlapping windows, yielding a receptive field of up to 136 samples, nearly double that of a DNN equalizer. Consequently, CBV-TFNet exhibits superior performance in learning nonlinear effects and inter-symbol interference, displaying robust equalization capabilities.
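The receptive field figures can be reproduced with a few lines of arithmetic. The hop size `H = 8` is not stated in the text; it is inferred here as an assumption because it is the value for which a window of length 72 (Table 1) gives exactly nine overlapping windows per sample.

```python
W = 72          # STFT window length for CBV-TFNet (Table 1)
H = 8           # assumed hop size, inferred from "nine overlapping windows"
dnn_taps = 73   # input length of the DNN equalizer (Table 1)

# Each sample is covered by every window whose start lies within W samples of it.
windows_per_sample = W // H

# The earliest and latest of these windows start (windows_per_sample - 1) * H apart,
# so together they span that offset plus one full window, which equals 2W - H.
receptive_field = (windows_per_sample - 1) * H + W

ratio_to_dnn = receptive_field / dnn_taps

print(windows_per_sample)        # 9
print(receptive_field)           # 136
print(round(ratio_to_dnn, 2))    # 1.86
```

Under this assumption the receptive field is about 1.86 times the 73-sample input of the DNN equalizer, consistent with the statement that it is nearly double.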

Figure 6. Comparison of the consecutive-window input and output capability of DNN-based and CBV-TFNet equalizers.

5. Conclusion

In this Letter, we propose CBV-TFNet, a post-equalizer for UVLC systems based on a joint time-frequency NN that incorporates channel estimation. A novel BV loss function, leveraging communication prior knowledge, guides the NN’s focus toward the passband spectrum that carries the majority of the information. Simultaneously, a pilot signal-based mask accelerates model convergence and streamlines the structure. Using only 38.15% of the real multiplications required by the DNN equalizer, the system achieves a bitrate of 4.956 Gbps in a 1.2 m UVLC system, which is, as far as we know, the highest transmission rate for a single-wavelength LED in a UVLC system. Compared to the traditional LMS-Volterra post-equalizer, the proposed method achieves a bitrate gain of 440 Mbps and a significantly larger dynamic range. CBV-TFNet thus offers a promising post-equalization scheme for free-space optical communication, including UVLC.

References

[1] X. Sun, C. H. Kang, M. Kong et al. A review on practical considerations and solutions in underwater wireless optical communication. J. Lightwave Technol., 38, 421(2020).

[2] N. Chi, Y. Zhou, Y. Wei et al. Visible light communication in 6G: advances, challenges, and prospects. IEEE Veh. Technol. Mag., 15, 93(2020).

[3] Z. Y. Xu, W. Q. Niu, Y. Liu et al. 31.38 Gb/s GaN-based LED array visible light communication system enhanced with V-pit and sidewall quantum well structure. Opto-Electron. Sci., 2, 230005(2023).

[4] X. Li, C. Cheng, Z. Wei et al. Net 5.75 Gbps/2 m single-pixel blue mini-LED based underwater wireless communication system enabled by partial pre-emphasis and nonlinear pre-distortion. J. Lightwave Technol., 40, 6116(2022).

[5] X. Yang, Z. Tong, H. Zhang et al. 7-M/130-Mbps LED-to-LED underwater wireless optical communication based on arrays of series-connected LEDs and a coaxial lens group. J. Lightwave Technol., 40, 5901(2022).

[6] F.-M. Wu, C.-T. Lin, C.-C. Wei et al. 3.22-Gb/s WDM visible light communication of a single RGB LED employing carrier-less amplitude and phase modulation. Optical Fiber Communication Conference/National Fiber Optic Engineers Conference(2013).

[7] G. Stepniak, J. Siuzdak, P. Zwierko. Compensation of a VLC phosphorescent white LED nonlinearity by means of Volterra DFE. IEEE Photon. Technol. Lett., 25, 1597(2013).

[8] X. Lu, C. Lu, W. Yu et al. Memory-controlled deep LSTM neural network post-equalizer used in high-speed PAM VLC system. Opt. Express, 27, 7822(2019).

[9] H. Chen, Y. Zhao, F. Hu et al. Nonlinear resilient learning method based on joint time-frequency image analysis in underwater visible light communication. IEEE Photonics J., 12, 7901610(2020).

[10] W. Niu, J. Cai, Z. Luo et al. Support vector machine-based soft decision for consecutive-symbol-expanded 4-dimensional constellation in underwater visible light communication system. Photonics, 9, 804(2022).

[11] J. Shi, W. Niu, Z. Li et al. Optimal adaptive waveform design utilizing an end-to-end learning-based pre-equalization neural network in an UVLC system. J. Lightwave Technol., 41, 1626(2023).

[12] J. Shi, W. Xiao, Y. Ha et al. 3.76-Gbps yellow-light visible light communication system over 1.2 m free space transmission utilizing a Si-substrate LED and a cascaded pre-equalizer network. Opt. Express, 30, 33337(2022).

[13] D. Griffin, J. Lim. Signal estimation from modified short-time Fourier transform. IEEE Trans. Acoust. Speech Signal Process., 32, 236(1984).

[14] Y. Zhao, N. Chi. Partial pruning strategy for a dual-branch multilayer perceptron-based post-equalizer in underwater visible light communication systems. Opt. Express, 28, 15562(2020).

[15] F. Li, L. Yao, W. Niu et al. Feature decoupled knowledge distillation enabled lightweight image transmission through multimode fibers. Opt. Express, 32, 4201(2024).
