• Advanced Photonics
  • Vol. 1, Issue 4, 046001 (2019)
Jingxi Li1,2,3,†, Deniz Mengu1,2,3, Yi Luo1,2,3, Yair Rivenson1,2,3, and Aydogan Ozcan1,2,3,*
Author Affiliations
  • 1University of California at Los Angeles, Department of Electrical and Computer Engineering, Los Angeles, California, United States
  • 2University of California at Los Angeles, Department of Bioengineering, Los Angeles, California, United States
  • 3University of California at Los Angeles, California NanoSystems Institute, Los Angeles, California, United States
DOI: 10.1117/1.AP.1.4.046001
Jingxi Li, Deniz Mengu, Yi Luo, Yair Rivenson, Aydogan Ozcan. Class-specific differential detection in diffractive optical neural networks improves inference accuracy[J]. Advanced Photonics, 2019, 1(4): 046001
Fig. 1. Illustration of different diffractive neural network design strategies. (a) The standard design refers to D([M,0],[1,L,P]), where M is the number of classes in the target dataset (in this specific design also equal to the number of detectors per diffractive neural network), L is the number of diffractive layers per optical network, and P is the number of neurons per diffractive layer. In the examples shown in this figure, L=5 and P=40k, i.e., 0.2 million neurons in total. (b) The differential design shown on the left refers to D([M,M],[1,L,P]), whereas the one on the right refers to D([M][M],[2,L,P]), as it uses two different jointly optimized diffractive networks, separating the positive and the negative detectors by placing them at different output planes without optical coupling between the two. (c) The class-specific design shown here refers to D([M/N,0],[N,L,P]), where N>1 is the number of class subsets (in this example, the N=2 case is shown). (d) The class-specific differential design shown here refers to D([M/N,M/N],[N,L,P]), where the N=2 case is illustrated. In general, there is also a version of the class-specific differential design in which each diffractive neural network has only positive or only negative detectors at the corresponding output plane; this special case is denoted by D([M/N][M/N],[2N,L,P]), where 2N>2 is the number of jointly designed diffractive neural networks. The N=1 case, i.e., D([M][M],[2,L,P]), is included in the right panel of (b); we do not consider it a class-specific design since there is no class separation at the output/detector planes.
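To make the D([detectors],[networks]) notation concrete, below is a minimal sketch (our own helper, not from the paper; the class and method names are hypothetical) that expands a design tuple into its total detector and neuron counts, reproducing the 0.2-million-neuron figure quoted in the caption.

```python
# Hypothetical helper illustrating the D([d+, d-], [K, L, P]) notation of Fig. 1.
from dataclasses import dataclass

@dataclass
class DiffractiveDesign:
    pos_detectors: int       # positive detectors per output plane (M, M/N, or 0)
    neg_detectors: int       # negative detectors per output plane
    networks: int            # K: number of jointly optimized diffractive networks
    layers: int              # L: diffractive layers per network
    neurons_per_layer: int   # P: neurons per diffractive layer

    def total_detectors(self) -> int:
        return self.networks * (self.pos_detectors + self.neg_detectors)

    def total_neurons(self) -> int:
        return self.networks * self.layers * self.neurons_per_layer

# Standard design D([10,0],[1,5,40k]): 10 detectors, 5 * 40k = 0.2M neurons.
standard = DiffractiveDesign(10, 0, 1, 5, 40_000)
assert standard.total_detectors() == 10
assert standard.total_neurons() == 200_000
```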
Fig. 2. Operation principles of a differential diffractive optical neural network. (a) Setup of the differential design, D([M,M],[1,L,P]). In the example shown in this figure, M=10, L=5, P=40k. (b) A correctly classified test object from the MNIST dataset is shown. Subparts of (b) illustrate the following: (i) target object placed at the input plane and illuminated by a uniform plane wave, (ii) normalized intensity distribution observed at the output plane of the diffractive optical neural network, (iii) normalized optical signals detected by the positive (red) and the negative (blue) detectors, (iv) differential class scores computed according to Eq. (1) using the values in (iii). (c) and (d) are the same as in (b), except for the Fashion-MNIST and CIFAR-10 datasets, respectively. Note that while the input object in (b) is modeled as an amplitude-encoded object, the gray levels shown in (c) and (d) represent phase-encoded, perfectly transparent input objects. Since diffractive optical neural networks operate using coherent illumination, phase and/or amplitude channels of the input plane can be used to represent information.
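Eq. (1) itself is not reproduced in this excerpt; assuming it takes the normalized-difference form commonly used for differential detection, step (iv) of the caption could be sketched as follows (the function name and the exact normalization are our assumptions).

```python
import numpy as np

def differential_class_scores(i_pos: np.ndarray, i_neg: np.ndarray) -> np.ndarray:
    """Differential class scores from positive/negative detector intensities.

    Assumed normalized-difference form: (i_pos - i_neg) / (i_pos + i_neg),
    which maps each class score into [-1, 1]; the predicted class is the argmax.
    """
    return (i_pos - i_neg) / (i_pos + i_neg)

# Toy example for M = 10 classes: random non-negative detector readings.
rng = np.random.default_rng(0)
i_pos, i_neg = rng.random(10), rng.random(10)
scores = differential_class_scores(i_pos, i_neg)
predicted_class = int(np.argmax(scores))
```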
Fig. 3. Operation principles of a diffractive optical neural network using a differential detection scheme, where the positive and the negative detectors are split into two jointly optimized networks based on their sign. (a) Setup of the differential design, D([M][M],[2,L,P]). In the example shown in this figure, M=10, L=5, P=40k. (b) A correctly classified test object from the MNIST dataset is shown. Subparts of (b) illustrate the following: (i) target object placed at the input plane and illuminated by a uniform plane wave, (ii) normalized intensity distributions observed at the output planes of the two diffractive optical neural networks, (iii) normalized optical signals detected by the positive (red) and the negative (blue) detectors, (iv) differential class scores computed according to Eq. (1) using the values in (iii). (c) and (d) are the same as in (b), except for the Fashion-MNIST and CIFAR-10 datasets, respectively. Note that while the input object in (b) is modeled as an amplitude-encoded object, the gray levels shown in (c) and (d) represent phase-encoded, perfectly transparent input objects.
Fig. 4. Operation principles of a diffractive optical neural network using a class-specific detection scheme, where the individual class detectors are split into separate networks based on their classes. Unlike Figs. 2 and 3, there are no negative detectors in this design. (a) Setup of the class-specific design, D([M/2,0],[2,L,P]). In the example shown in this figure, M=10, L=5, P=40k. (b) A correctly classified test object from the MNIST dataset is shown. Subparts of (b) illustrate the following: (i) target object placed at the input plane and illuminated by a uniform plane wave, (ii) normalized intensity distributions observed at the two output planes of the diffractive optical neural networks, (iii) normalized optical signals detected by the detectors. (c) and (d) are the same as in (b), except for the Fashion-MNIST and CIFAR-10 datasets, respectively. Note that while the input object in (b) is modeled as an amplitude-encoded object, the gray levels shown in (c) and (d) represent phase-encoded, perfectly transparent input objects.
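A minimal sketch of the class-specific readout just described, assuming the decision rule is simply the maximum detector signal across the concatenated outputs of the N networks (the function name and decision rule are our assumptions, consistent with but not verbatim from the caption).

```python
import numpy as np

def class_specific_prediction(signals_per_network: list) -> int:
    """Inference sketch for a class-specific design D([M/N,0],[N,L,P]).

    Each of the N jointly optimized networks carries M/N class detectors at
    its own output plane; here the per-network detector signals are
    concatenated in class order and the maximum picks the predicted class.
    """
    return int(np.argmax(np.concatenate(signals_per_network)))

# Example: N = 2 networks, each with M/N = 5 detectors (M = 10 classes).
net1 = np.array([0.1, 0.7, 0.2, 0.1, 0.3])  # classes 0-4
net2 = np.array([0.2, 0.1, 0.9, 0.1, 0.2])  # classes 5-9
assert class_specific_prediction([net1, net2]) == 7
```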
Fig. 5. Performance comparison of different diffractive neural network systems as a function of N, the number of class subsets. M=10 classes exist for each dataset: MNIST, Fashion-MNIST, and grayscale CIFAR-10. In our notation, N=M=10 refers to a jointly optimized diffractive neural network system that specializes in each one of the classes separately. These results confirm that class-specific differential diffractive neural networks, D([M/N][M/N],[2N,L,P]) for N>1, outperform the other diffractive neural network designs. For each data point, the training of the corresponding diffractive optical neural network model was repeated six times with random initial phase modulation variables and random batch sequences; each data point therefore reflects the mean blind testing accuracy of these six trained networks, with the corresponding standard deviation.
Fig. 6. Comparison of the classification accuracies of ensemble models formed by 1, 2, and 3 independently optimized diffractive neural networks that optically project their diffracted light onto the same output/detector plane. Blue and orange curves represent the D([10,0],[1,5,40k]) and D([10,10],[1,5,40k]) designs, respectively. (a) MNIST, (b) Fashion-MNIST, and (c) grayscale CIFAR-10. To avoid perturbing the inference results of each diffractive network through constructive/destructive interference of light, incoherent summation of the optical signals of the individual networks at the common output plane is assumed here; this can be achieved by making the relative optical path length differences between the individual diffractive networks larger than the temporal coherence length of the illumination source.
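A minimal sketch of the incoherent ensemble readout described in this caption: because the per-network fields are mutually incoherent, their intensities, rather than their complex amplitudes, add at the shared detector plane (the function name and array shapes are our assumptions).

```python
import numpy as np

def ensemble_detector_signals(intensity_maps: list) -> np.ndarray:
    """Incoherent ensemble readout at a shared output plane.

    When the path-length differences between the individual diffractive
    networks exceed the temporal coherence length of the source, the
    ensemble output is simply the sum of the per-network intensity
    distributions, with no interference cross-terms.
    """
    return np.sum(np.stack(intensity_maps), axis=0)

# Example: 3 independently optimized networks, 10 class detectors each.
rng = np.random.default_rng(1)
maps = [rng.random(10) for _ in range(3)]
combined = ensemble_detector_signals(maps)
predicted_class = int(np.argmax(combined))
```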
| Architecture | MNIST (%) | Fashion-MNIST (%) | CIFAR-10, grayscale (%) |
|---|---|---|---|
| D([10,0],[1,5,40k]) | 97.51±0.03 | 89.85±0.18 | 45.20±0.35 |
| D([10,10],[1,5,40k]) | 98.54±0.03 | 90.54±0.16 | 48.51±0.30 |
| D([10][10],[2,5,40k]) | 98.49±0.03 | 90.94±0.16 | 49.10±0.30 |
Table 1. Blind testing classification accuracies of nondifferential (top row) and differential diffractive optical networks, without any class specificity or division. M=10 classes exist for each dataset: MNIST, Fashion-MNIST, and grayscale CIFAR-10. For each data point, the training of the corresponding diffractive optical neural network model was independently repeated six times with random initial phase modulation variables and random batch sequences; each data point therefore reflects the mean blind testing accuracy of these six trained networks, with the corresponding standard deviation.
| Type | Architecture | MNIST (%) | Fashion-MNIST (%) | CIFAR-10, grayscale (%) |
|---|---|---|---|---|
| Class-specific nondifferential, D([M/N,0],[N,L,P]), N>1 | D([5,0],[2,5,40k]) | 97.53±0.08 | 90.19±0.14 | 46.37±0.35 |
| | D([2,0],[5,5,40k]) | 97.57±0.07 | 90.14±0.16 | 47.05±0.16 |
| | D([1,0],[10,5,40k]) | 97.61±0.08 | 90.34±0.08 | 48.02±0.70 |
| Class-specific differential, D([M/N,M/N],[N,L,P]), N>1 | D([5,5],[2,5,40k]) | 98.50±0.09 | 90.89±0.24 | 49.09±0.24 |
| | D([2,2],[5,5,40k]) | 98.57±0.06 | 91.08±0.25 | 49.68±0.17 |
| | D([1,1],[10,5,40k]) | 98.59±0.03 | 91.37±0.19 | 50.09±0.23 |
| Class-specific differential, D([M/N][M/N],[2N,L,P]), N>1 | D([5][5],[4,5,40k]) | 98.51±0.08 | 91.04±0.22 | 49.82±0.38 |
| | D([2][2],[10,5,40k]) | 98.58±0.06 | 91.36±0.13 | 50.47±0.63 |
| | D([1][1],[20,5,40k]) | 98.52±0.05 | 91.48±0.03 | 50.82±0.26 |
Table 2. Blind testing classification accuracies of different class division architectures combined with nondifferential and differential diffractive neural network designs. For each data point, the training of the corresponding diffractive optical neural network model was independently repeated six times with random initial phase modulation variables and random batch sequences; each data point therefore reflects the mean blind testing accuracy of these six trained networks, with the corresponding standard deviation.
| Type | Network architecture | MNIST (%) | Fashion-MNIST (%) | CIFAR-10 (%) |
|---|---|---|---|---|
| Optical (diffractive) | Standard design, D([10,0],[1,5,40k]) | 97.51±0.03 | 89.85±0.18 | 45.20±0.35 |
| | Differential design, D([10,10],[1,5,40k]) | 98.54±0.03 | 90.54±0.16 | 48.51±0.30 |
| | Ensemble of 3 differential designs, D([10,10],[1,5,40k]) | 98.59 | 91.06 | 51.44 |
| | Class-specific differential design, D([1,1],[10,5,40k]) | 98.59±0.03 | 91.37±0.19 | 50.24±0.17 |
| | Class-specific differential design, D([1][1],[20,5,40k]) | 98.52±0.05 | 91.48±0.03 | 50.82±0.26 |
| Hybrid (optical + electronic) | Ref. 27 | 98.97 | 90.45 | |
| | Ref. 26 | | | 51.00±1.40 |
| Electronic | SVM (Ref. 29) | 91.90 | 83.20 | 37.13 |
| | LeNet (Ref. 30) | 98.77 | 90.27 | 55.21 |
| | AlexNet (Ref. 2) | 99.20 | 89.90 | 72.14 |
| | ResNet (Ref. 31) | 99.51 | 93.23 | 88.78 |
    Table 3. Comparison of blind testing accuracies of different types of neural networks, including optical, hybrid, and electronic.