Fig. 1. Traditional teacher-student network and proposed method. (a) Traditional teacher-student network; (b) proposed method
Fig. 2. Structural diagram of proposed model
Fig. 3. Feature activation maps obtained from feature maps of different convolution layers in the teacher network
Fig. 4. Flow chart of the guided cropping module
Fig. 5. Sample images of the ten action classes in the Kaggle dataset
Fig. 6. Sample images of the ten action classes in the AUC dataset
Fig. 7. Visualization results on the Kaggle dataset
Model | Random cropping | Image resolution /PPI | Accuracy /% |
---|---|---|---|
ResNet18 | No | 224 | 89.30 |
ResNet18 | Yes | 224 | 92.56 |
ResNet50 | No | 224 | 90.17 |
ResNet50 | Yes | 224 | 94.93 |
Table 1. Comparative experimental results with and without data augmentation (random cropping) on the Kaggle dataset
Model | Parameter quantity | FLOPs | Image resolution /PPI | Accuracy /% |
---|---|---|---|---|
ResNet18 | 11,181,642 | 1.82G | 224 | 92.56 |
ResNet18 | 11,181,642 | 7.28G | 448 | 94.21 |
ResNet50 | 23,528,522 | 4.12G | 224 | 94.93 |
ResNet50 | 23,528,522 | 16.47G | 448 | 96.48 |
Table 2. Experimental results for different input resolutions on the Kaggle dataset
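The figures in Table 2 are consistent with convolutional cost growing with the number of input pixels: doubling the side length from 224 to 448 roughly quadruples the FLOPs while the parameter count stays fixed. The short check below is only a sketch of that arithmetic using the values reported in the table; the variable names are illustrative and not taken from the paper.

```python
# FLOPs (in G) and accuracy (%) copied from Table 2; keys are (model, resolution).
flops_g = {
    ("ResNet18", 224): 1.82,
    ("ResNet18", 448): 7.28,
    ("ResNet50", 224): 4.12,
    ("ResNet50", 448): 16.47,
}
accuracy = {
    ("ResNet18", 224): 92.56,
    ("ResNet18", 448): 94.21,
    ("ResNet50", 224): 94.93,
    ("ResNet50", 448): 96.48,
}

# Doubling the side length gives (448 / 224) ** 2 = 4 times as many pixels.
pixel_ratio = (448 / 224) ** 2

flops_ratio = {}
accuracy_gain = {}
for model in ("ResNet18", "ResNet50"):
    flops_ratio[model] = flops_g[(model, 448)] / flops_g[(model, 224)]
    accuracy_gain[model] = accuracy[(model, 448)] - accuracy[(model, 224)]
    print(f"{model}: FLOPs x{flops_ratio[model]:.2f}, "
          f"accuracy +{accuracy_gain[model]:.2f} points")
```

In both rows the cost ratio sits close to the pixel ratio of 4, while the accuracy gain is between one and two percentage points.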
Model | Parameter quantity | FLOPs | Image resolution /PPI | Accuracy /% |
---|---|---|---|---|
ResNet18 | 11,181,642 | 1.82G | 224 | 92.56 |
ResNet34 | 21,289,802 | 3.67G | 224 | 94.67 |
ResNet50 | 23,528,522 | 4.12G | 224 | 94.93 |
ResNet101 | 42,520,650 | 7.84G | 224 | 95.69 |
S-Net(ResNet18) | 22,363,284 | 9.10G | 224 | 92.78 |
S-Net(ResNet50) | 34,710,164 | 11.40G | 224 | 96.29 |
Table 3. Comparative experimental results of different models on the Kaggle dataset
Model | Parameter quantity | FLOPs | Image resolution /PPI | Accuracy /% |
---|---|---|---|---|
S-Net(ResNet18) | 22,363,284 | 9.10G | 224 | 92.78 |
S-Net(ResNet50) | 34,710,164 | 11.40G | 224 | 96.29 |
ResNet18+ResNet50(ensemble) | 34,710,164 | 5.94G | 224(T-Net)+224(S-Net) | 95.95 |
ResNet18+ResNet50(ensemble) | 34,710,164 | 11.40G | 448(T-Net)+224(S-Net) | 96.10 |
T-Net(ResNet18)+S-Net(ResNet18) | 22,363,284 | 9.10G | 448(T-Net)+224(S-Net) | 95.99 |
T-Net(ResNet18)+S-Net(ResNet50) | 34,710,164 | 5.94G | 224(T-Net)+224(S-Net) | 96.56 |
T-Net(ResNet18)+S-Net(ResNet50) | 34,710,164 | 11.40G | 448(T-Net)+224(S-Net) | 97.92 |
Table 4. Comparative experimental results of joint model discrimination on the Kaggle dataset
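The parameter and FLOPs columns for the combined models match the sums of the single-network values reported for ResNet18 and ResNet50 at the corresponding resolutions. The sketch below reproduces that bookkeeping; `combined_cost` is a hypothetical helper written for this check, and treating the combined cost as a plain sum of the two branches is an inference from the reported numbers, not a statement of how the authors computed them.

```python
# Single-network figures as reported in the tables (parameters; FLOPs in G).
params = {"ResNet18": 11_181_642, "ResNet50": 23_528_522}
flops_g = {
    ("ResNet18", 224): 1.82,
    ("ResNet18", 448): 7.28,
    ("ResNet50", 224): 4.12,
}


def combined_cost(t_net, t_res, s_net, s_res):
    """Hypothetical helper: add the T-Net and S-Net parameters and FLOPs."""
    total_params = params[t_net] + params[s_net]
    total_flops = round(flops_g[(t_net, t_res)] + flops_g[(s_net, s_res)], 2)
    return total_params, total_flops


# T-Net(ResNet18) at 448 with S-Net(ResNet50) at 224.
large = combined_cost("ResNet18", 448, "ResNet50", 224)
# T-Net(ResNet18) at 224 with S-Net(ResNet50) at 224.
small = combined_cost("ResNet18", 224, "ResNet50", 224)
# T-Net(ResNet18) at 448 with S-Net(ResNet18) at 224.
same_backbone = combined_cost("ResNet18", 448, "ResNet18", 224)

for name, (p, f) in [("448+224, R18+R50", large),
                     ("224+224, R18+R50", small),
                     ("448+224, R18+R18", same_backbone)]:
    print(f"{name}: {p:,} parameters, {f:.2f}G FLOPs")
```

The three results correspond to the 34,710,164 / 11.40G, 34,710,164 / 5.94G, and 22,363,284 / 9.10G entries in the table.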
Model | Source | Accuracy /% |
---|---|---|
AlexNet[16] | Original | 93.65 |
AlexNet[16] | Skin segmented | 93.60 |
AlexNet[16] | Face | 84.28 |
AlexNet[16] | Hands | 89.52 |
AlexNet[16] | Face+hands | 86.68 |
Inception V3[16] | Original | 95.17 |
Inception V3[16] | Skin segmented | 94.57 |
Inception V3[16] | Face | 88.82 |
Inception V3[16] | Hands | 91.62 |
Inception V3[16] | Face+hands | 90.88 |
ResNet50 | Original | 94.87 |
S-Net(ResNet50) | Original | 95.20 |
T-Net(ResNet18)+S-Net(ResNet50) | Original | 95.71 |
Table 5. Model comparison results on the AUC dataset