Carlo M. Valensise1, Ivana Grecco2, Davide Pierangeli1,2,3,*, and Claudio Conti1,2,3
Author Affiliations
1 Enrico Fermi Research Center (CREF), 00184 Rome, Italy
2 Physics Department, Sapienza University of Rome, 00185 Rome, Italy
3 Institute for Complex Systems, National Research Council (ISC-CNR), 00185 Rome, Italy
Fig. 1. Three-dimensional PELM for language processing. (A) The text database entry is a paragraph of variable length. Text pre-processing: a sparse representation of the input paragraph is mapped into a Hadamard matrix with phase values in [0,π]. (B) The mask is encoded into the optical wavefront by a phase-only SLM. Free-space propagation of the optical field maps the input data into a 3D intensity distribution (speckle-like volume). (C) Sampling the propagating laser beam in multiple far-field planes enables upscaling the feature space. Intensities picked from all the spatial modes form the output layer H3D that undergoes training via ridge regression. By using three planes (j=3), we get a network capacity C > 10^10. (D) The example shows a binary text classification problem for large-scale rating.
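The pipeline in Fig. 1 (phase encoding, propagation, intensity sampling in several planes, ridge-regression readout) can be illustrated with a minimal numerical sketch. This is not the authors' simulation: the propagation model (a quadratic defocus phase followed by an FFT), the parameter `alpha`, the function names, and the random binary inputs with random labels are all assumptions made for illustration. Because the labels are random, only the fit to the training set is meaningful here.

```python
import numpy as np

def pelm_features(X, n_planes=3, n_modes=64, alpha=0.02, seed=0):
    """Toy 3D-PELM map: binary rows of X -> intensities sampled in n_planes planes."""
    n_samples, L = X.shape
    side = int(round(np.sqrt(L)))
    if side * side != L:
        raise ValueError("input length must be a perfect square")
    rng = np.random.default_rng(seed)
    # Phase-only encoding: bit 0 -> phase 0, bit 1 -> phase pi.
    field = np.exp(1j * np.pi * X.reshape(n_samples, side, side))
    coords = np.arange(side) - side / 2
    r2 = coords[:, None] ** 2 + coords[None, :] ** 2
    feats = []
    for j in range(n_planes):
        # Assumed toy propagation: quadratic (defocus) phase, then a 2D FFT.
        defocused = field * np.exp(1j * alpha * (j + 1) * r2)
        intensity = np.abs(np.fft.fft2(defocused, norm="ortho")) ** 2
        # A fixed random subset of pixels stands in for the sampled spatial modes.
        idx = rng.choice(side * side, size=n_modes, replace=False)
        feats.append(intensity.reshape(n_samples, -1)[:, idx])
    return np.concatenate(feats, axis=1)

def ridge_fit(H, y, lam=1e-3):
    """Closed-form ridge readout: solve (H^T H + lam I) beta = H^T y."""
    M = H.shape[1]
    return np.linalg.solve(H.T @ H + lam * np.eye(M), H.T @ y)

# Hypothetical stand-in data: 40 random binary "paragraphs" of length 256.
rng = np.random.default_rng(1)
X = rng.integers(0, 2, size=(40, 256))
y = rng.choice([-1.0, 1.0], size=40)       # random labels, illustration only

H = pelm_features(X)                        # 3 planes x 64 modes = 192 features
beta = ridge_fit(H, y)
train_acc = float(np.mean(np.sign(H @ beta) == y))
print(H.shape, train_acc)
```

With 192 features and 40 training points, this toy readout operates in the over-parameterized regime, so it is expected to fit the training labels closely; it says nothing about test accuracy on real text.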
Fig. 2. Photonic sentiment analysis. (A), (B) Training and test accuracy of the 3D-PELM on the IMDb dataset as a function of the number of output channels. The shaded area corresponds to the over-parameterized region. The configuration in (B) allows us to reach very high accuracy in the over-parameterized region with a dataset limited to Ntrain=1186 training points. In (A), the same accuracy is reached in the under-parameterized region with Ntrain=12,278. Black horizontal lines correspond to the maximum test accuracy achieved (0.77). (C) IMDb classification accuracy as a function of the number of features M and the training dataset size Ntrain. The boundary between the under- and over-parameterized regions (interpolation threshold), Ntrain=M, is characterized by a sharp accuracy drop (cyan contour line).
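To make the regimes of Fig. 2 concrete, the sketch below sweeps the number of features M across the interpolation threshold Ntrain=M for a generic random-feature model. It is an illustration under stated assumptions, not the experiment: the Gaussian inputs, the squared random projection with a bias term, the minimum-norm least-squares readout (the small-regularization limit of ridge regression), and the name `sweep_features` are all hypothetical choices. A toy model like this may or may not show the accuracy drop at the threshold.

```python
import numpy as np

def sweep_features(n_train=100, n_test=200, L=50,
                   Ms=(20, 50, 100, 200, 800), seed=0):
    """Test accuracy of a random-feature readout for several feature counts M."""
    rng = np.random.default_rng(seed)
    n = n_train + n_test
    X = rng.normal(size=(n, L))
    y = np.sign(X @ rng.normal(size=L))     # synthetic binary labels
    results = []
    for M in Ms:
        W = rng.normal(size=(L, M)) / np.sqrt(L)
        b = rng.normal(size=M)
        H = (X @ W + b) ** 2                # intensity-like nonlinearity
        # Minimum-norm least squares works in both regimes.
        beta, *_ = np.linalg.lstsq(H[:n_train], y[:n_train], rcond=None)
        acc = float(np.mean(np.sign(H[n_train:] @ beta) == y[n_train:]))
        if M < n_train:
            regime = "under"
        elif M > n_train:
            regime = "over"
        else:
            regime = "threshold"
        results.append((M, regime, acc))
    return results

results = sweep_features()
for M, regime, acc in results:
    print(f"M={M:4d}  {regime:9s}  test accuracy={acc:.2f}")
```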
Fig. 3. Performances at ultralarge scale. (A)–(C) Test accuracy as a function of M for different input sizes L. In all cases, the 3D-PELM performance saturates in the over-parameterized region, reaching a plateau. A linear fit of the data preceding the plateau shows that the onset of the saturation is faster for datasets with a larger input space. The corresponding angular coefficient m is inset in each panel. (D) Test accuracy varying the training set size for M=0.8×10^5 and M=1.2×10^5.
Fig. 4. Analysis of the IMDb accuracy. (A), (B) The comparison reports the accuracy for the experimental device (3D-PELM device), the simulated device (3D-PELM numerics), the random projection method with ridge regression (RP), the support vector machine (SVM), and a convolutional neural network (CNN) in both the under-parameterized (M=1×10^3) and over-parameterized (M=4×10^4) regimes, for (A) Ntrain=6700 and (B) Ntrain=1500. 8-bit numerical results, when applicable, refer to the over-parameterized regime.
Working Principle | Features M | Input size L | | Machine Learning Task | Ref.
---|---|---|---|---|---
Time-multiplexed cavity | 1400 | 7129 | | Regression | [39]
Amplitude modulation | 16,384 | 2000 | | Human action recognition | [27]
Frequency multiplexing | 200 | 640 | | Time series recovery | [41]
Optical multiple scattering | 50,000 | 64 | | Chaotic series prediction | [38]
Amplitude Fourier filtering | 1024 | 43,263 | | Image classification | [30]
Multimode fiber | 240 | 240 | | Classification, regression | [35]
Free-space propagation | 6400 | 784 | | Classification, regression | [34]
3D optical field | 120,000 | 131,044 | | Natural language processing | 3D-PELM
Table 1. Maximum Network Capacity of Current Photonic Neuromorphic Computing Hardware for Supervised Learning
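As a quick arithmetic check on Table 1, the sketch below multiplies the two numeric entries of each row. It assumes that the network capacity C is the product of those two entries (number of output features times input size); this definition is not stated in this excerpt, and the dictionary name `table_rows` is hypothetical. Under that assumption, the 3D optical field row gives a value consistent with the C > 10^10 quoted in the caption of Fig. 1.

```python
# Rows of Table 1: working principle -> the two numeric entries of the row.
table_rows = {
    "Time-multiplexed cavity": (1400, 7129),
    "Amplitude modulation": (16384, 2000),
    "Frequency multiplexing": (200, 640),
    "Optical multiple scattering": (50000, 64),
    "Amplitude Fourier filtering": (1024, 43263),
    "Multimode fiber": (240, 240),
    "Free-space propagation": (6400, 784),
    "3D optical field": (120000, 131044),
}

# Assumed definition: capacity = product of the two entries.
capacities = {name: a * b for name, (a, b) in table_rows.items()}
largest = max(capacities, key=capacities.get)

for name, c in sorted(capacities.items(), key=lambda kv: kv[1]):
    print(f"{name:30s} {c:.2e}")
```

With these numbers, the 3D optical field row evaluates to 120,000 × 131,044 = 15,725,280,000, about 1.6×10^10, and it is the largest product in the table.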