Author Affiliations
1 The Institute for Advanced Studies, Wuhan University, Wuhan 430072, China
2 Shanghai Institute of Applied Physics, Chinese Academy of Sciences, Shanghai 201800, China
3 University of Chinese Academy of Sciences, Beijing 100049, China
4 Shanghai Advanced Research Institute, Chinese Academy of Sciences, Shanghai 201204, China
Fig. 1. Comparison of LN83 diffraction pattern before (a) and after (b) gray value equalization
Fig. 2. Diffraction pattern of protein crystal after gray value equalization
Fig. 3. LN83 diffraction pattern image augmentation results: (a) Original image, (b) Flipped left to right, (c) Rotated 90° counterclockwise, (d) Rotated 25° counterclockwise and moved 10 pixels to the right, (e) Rotated 110° clockwise and moved 5 pixels to the right and 5 pixels down, (f) Rotated 60° clockwise
Fig. 4. Flow chart of convolutional neural network for training and prediction
Fig. 5. Accuracy and running rate of the validation set and test set for different networks: (a) Validation set accuracy, (b) Test set accuracy, (c) Validation set running rate, (d) Test set running rate
Fig. 6. t-SNE dimensionality reduction results of six convolutional neural networks (circles are "maybe" samples, crosses are "miss" samples, and pentagrams are "hit" samples): (a) MobileNets, (b) ResNet, (c) Inception-v1, (d) Inception-v3, (e) Vgg16, (f) AlexNet
Fig. 7. Running rate of LN83 on GPU and CPU
Fig. 8. Reliability distribution of MobileNets hit/maybe samples (a) and miss samples (b)
Fig. 9. Samples selected by MobileNets: (a) Hit, (b) Maybe, (c) Miss
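The augmentation operations listed for Fig. 3 (left-right flip, 90° counterclockwise rotation, pixel shifts) can be sketched in a few lines. This is a minimal illustration on a small nested-list image, not the authors' pipeline: the function names `flip_left_right`, `rotate_90_ccw`, and `shift`, and the zero fill value for vacated pixels, are assumptions made here. Rotations by arbitrary angles such as 25° or 110° need interpolation and are left out; a library routine such as `scipy.ndimage.rotate` would be one option for those.

```python
def flip_left_right(image):
    """Mirror each row, as in Fig. 3(b)."""
    return [row[::-1] for row in image]


def rotate_90_ccw(image):
    """Rotate 90 degrees counterclockwise, as in Fig. 3(c).

    Transposing and then reversing the row order gives a
    counterclockwise quarter turn.
    """
    return [list(col) for col in zip(*image)][::-1]


def shift(image, dx=0, dy=0, fill=0):
    """Move the content dx pixels right and dy pixels down.

    Pixels shifted outside the frame are dropped; vacated pixels
    take the value `fill` (an assumption, not from the paper).
    """
    height, width = len(image), len(image[0])
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                out[ny][nx] = image[y][x]
    return out


# Toy 2 x 3 "diffraction pattern" used only for illustration.
pattern = [[1, 2, 3],
           [4, 5, 6]]

flipped = flip_left_right(pattern)
rotated = rotate_90_ccw(pattern)
shifted = shift(pattern, dx=1)
```

Each function returns a new image and leaves the input untouched, so several augmented copies can be derived from one original pattern.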
| Dataset | Protein | Incident energy / keV | Instrument | Detector |
|---|---|---|---|---|
| LN83 | Hydrogenase | 9.498 | MFX | Rayonix |
| LN84 | Photosystem II | 9.516 | MFX | Rayonix |
| LO19 | Cyclophilin A | 9.442 | MFX | Rayonix |
| L498 | Thermolysin | 9.773 | CXI | CSPAD |
Table 1. Experimental data
| Data type | Number of Bragg points (X) | Effective information content |
|---|---|---|
| Hit | X ≥ 10 | More valid information |
| Maybe | 4 ≤ X < 10 | Less valid information |
| Miss | X ≤ 3 | No valid information |
Table 2. Data classification
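The thresholds in Table 2 amount to a simple labeling rule on the Bragg point count. A minimal sketch follows; the function name `classify_pattern` and the lowercase string labels are hypothetical and chosen here for illustration, and the sketch assumes the Bragg points have already been counted by some upstream peak-finding step.

```python
def classify_pattern(num_bragg_points: int) -> str:
    """Label a diffraction pattern from its Bragg point count X (Table 2).

    hit:   X >= 10
    maybe: 4 <= X < 10
    miss:  X <= 3
    """
    if num_bragg_points < 0:
        raise ValueError("Bragg point count cannot be negative")
    if num_bragg_points >= 10:
        return "hit"
    if num_bragg_points >= 4:
        return "maybe"
    return "miss"


# Example counts, including the boundary values of each class.
labels = [classify_pattern(x) for x in (0, 3, 4, 9, 10, 25)]
```

Because the count is an integer, the three ranges cover every possible value without overlap.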
| Net | Depth / layers | Characteristics |
|---|---|---|
| AlexNet | 8 | Few layers; uses the ReLU activation function |
| Vgg16 | 16 | Small convolution kernels to speed up convergence |
| Inception-V1 | 22 | Parallel computing; removes the fully connected layer |
| Inception-V3 | 46 | Parallel computing; splits convolutions to reduce data size |
| ResNet101 | 101 | Uses a residual network to optimize the learning objective |
| MobileNets-V1 | 28 | Separable convolutions; introduces global hyperparameters |
Table 3. Six convolutional neural networks
| Sample | Validation set accuracy / % | Test set accuracy |
|---|---|---|
| L498 Thermolysin | 62.2 | 7/10 |
| LN84 Photosystem II | 82.3 | 8/10 |
| LN83 Hydrogenase | 81.8 | 8/10 |
| LO19 Cyclophilin A | 78.0 | 9/10 |
Table 4. Validation set and test set accuracy of each sample based on MobileNets
| Net | Label | LN83 Hydrogenase: Hit | LN83 Hydrogenase: Maybe | LN83 Hydrogenase: Miss |
|---|---|---|---|---|
| MobileNets | Hit | 0.919 | 0.070 | 0.011 |
| MobileNets | Maybe | 0.168 | 0.701 | 0.131 |
| MobileNets | Miss | 0.014 | 0.043 | 0.943 |
| Inception-v1 | Hit | 0.935 | 0.043 | 0.022 |
| Inception-v1 | Maybe | 0.350 | 0.416 | 0.234 |
| Inception-v1 | Miss | 0.008 | 0.028 | 0.964 |
| Inception-v3 | Hit | 0.958 | 0.029 | 0.013 |
| Inception-v3 | Maybe | 0.547 | 0.343 | 0.109 |
| Inception-v3 | Miss | 0.058 | 0.202 | 0.740 |
| Vgg16 | Hit | 0.893 | 0.086 | 0.021 |
| Vgg16 | Maybe | 0.073 | 0.876 | 0.051 |
| Vgg16 | Miss | 0.020 | 0.141 | 0.840 |
| ResNet | Hit | 0.854 | 0.084 | 0.063 |
| ResNet | Maybe | 0.015 | 0.518 | 0.467 |
| ResNet | Miss | 0.001 | 0.004 | 0.995 |
| AlexNet | Hit | 0.907 | 0.014 | 0.079 |
| AlexNet | Maybe | 0.927 | 0.022 | 0.051 |
| AlexNet | Miss | 0.509 | 0.016 | 0.475 |
Table 5. Validation set and test set accuracy of different networks on LN83
| Net | Hit/maybe | Miss |
|---|---|---|
| MobileNets | 0.970 | 0.943 |
| Inception-V1 | 0.944 | 0.964 |
| Inception-V3 | 0.972 | 0.740 |
| Vgg16 | 0.974 | 0.840 |
| ResNet | 0.873 | 0.955 |
| AlexNet | 0.925 | 0.475 |
Table 6. Two-class classification accuracy for the LN83 sample