Author Affiliations
College of Physics, Electronics and Electrical Engineering, Ningxia University, Yinchuan 750021, Ningxia, China
Fig. 1. Block diagram of system structure
Fig. 2. Network structure of OpenPose
Fig. 3. Depthwise separable convolution decomposition process. (a) Standard convolution; (b) depthwise convolution; (c) pointwise convolution
Fig. 4. Skeleton diagram
Fig. 5. Examples of skeleton diagrams corresponding to various behaviors
Fig. 6. Training result graph
Fig. 7. Human behavior recognition test device. (a) Test device; (b) Jetson Xavier NX development board
Fig. 8. Example results of successful tests
Convolution type | Convolution kernel size | Stride | Dilation | Padding |
---|---|---|---|---|
conv | 3×3×32 | 2 | 1 | 0 |
conv dw_1 | 3×3×64 | 1 | 1 | 0 |
conv dw_2 | 3×3×128 | 2 | 1 | 0 |
conv dw_3 | 3×3×128 | 1 | 1 | 0 |
conv dw_4 | 3×3×256 | 2 | 1 | 0 |
conv dw_5 | 3×3×256 | 1 | 1 | 0 |
conv dw_6 | 3×3×512 | 1 | 1 | 0 |
conv dw_7 | 3×3×512 | 1 | 2 | 2 |
conv dw_8 | 3×3×512 | 1 | 1 | 0 |
conv dw_9 | 3×3×512 | 1 | 1 | 0 |
conv dw_10 | 3×3×512 | 1 | 1 | 0 |
conv dw_11 | 3×3×512 | 1 | 1 | 0 |
Table 1. Adjusted feature extraction network structure
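The "dw" layers in Table 1 are depthwise separable convolutions, which Fig. 3 splits into a depthwise step and a pointwise step. The following minimal sketch compares the weight counts of the two forms. It assumes a 3×3 kernel, ignores biases, and uses an illustrative 32-to-64 channel transition; the input channel count of each layer is not listed in Table 1, so that pairing and the function names are assumptions, not the authors' code.

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a k x k depthwise convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in  # one k x k filter per input channel
    pointwise = c_in * c_out  # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Assumed example: 3 x 3 kernel, 32 input channels, 64 output channels.
k, c_in, c_out = 3, 32, 64
std = standard_conv_params(k, c_in, c_out)
sep = depthwise_separable_params(k, c_in, c_out)
ratio = sep / std  # algebraically equal to 1 / c_out + 1 / k**2
print(std, sep, round(ratio, 4))
```

Under these assumptions the separable form needs roughly one eighth of the weights of the standard convolution, which is the kind of saving that motivates replacing the original backbone with a lighter one.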
11 types of human behavior data set (17454 samples)

Category | Number of samples | Number of samples in training set | Number of samples in validation set |
---|---|---|---|
Stand | 1644 | | |
Squat | 1288 | | |
Run | 1608 | | |
Bend | 1428 | | |
Fall | 1008 | | |
Operate the PC | 2001 | | |
Leg press | 2060 | | |
Walk | 1389 | | |
Wave | 782 | | |
Kick | 2300 | | |
Hug | 1946 | | |
All 11 categories | 17454 | 13963 | 3491 |
Table 2. Number of samples of various behaviors in dataset
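As a quick consistency check on Table 2, the sketch below sums the per-category counts and compares them with the stated total and with the training and validation set sizes. The variable names are illustrative, and the resulting training fraction is derived here from the table values rather than stated in the source.

```python
# Per-category sample counts copied from Table 2.
samples = {
    "Stand": 1644, "Squat": 1288, "Run": 1608, "Bend": 1428,
    "Fall": 1008, "Operate the PC": 2001, "Leg press": 2060,
    "Walk": 1389, "Wave": 782, "Kick": 2300, "Hug": 1946,
}
train_size, val_size = 13963, 3491  # set sizes from Table 2

total = sum(samples.values())
train_fraction = train_size / total

# The category counts and the two subsets should both add up to the total.
print(total, train_size + val_size, round(train_fraction, 3))
```

The two sums agree, and the training set holds about 80% of the samples.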
Software and hardware platform | Parameter |
---|---|
Embedded development board | NVIDIA Jetson Xavier NX |
Operating system | Ubuntu 18.04 |
Deep learning framework | TensorFlow |
CPU | 6-core NVIDIA Carmel ARM®v8.2 64-bit CPU |
GPU | NVIDIA Volta™ architecture with 384 NVIDIA® CUDA® cores and 48 Tensor Cores |
CUDA | 10.2 |
cuDNN | 8.0 |
Programming language | Python 3.6 |
Table 3. Software and hardware used in experiment
Parameter name | Parameter value |
---|---|
Input_size | 224×224 |
Epoch | 160 |
Batch_size | 64 |
Learning_rate | 0.0001 |
Loss function | Cross entropy loss |
Optimizer | Adam |
Table 4. Experimental parameter description
Category | Stand | Squat | Run | Bend | Fall | Operate the PC | Leg press | Walk | Wave | Kick | Hug |
---|---|---|---|---|---|---|---|---|---|---|---|
Stand | 100 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Squat | 0 | 100 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Run | 1.3 | 0 | 94.5 | 0 | 0 | 0 | 0 | 4.2 | 0 | 0 | 0 |
Bend | 0 | 16.7 | 0 | 83.3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Fall | 0 | 2.5 | 0 | 0 | 97.5 | 0 | 0 | 0 | 0 | 0 | 0 |
Operate the PC | 0 | 0 | 0 | 0 | 0 | 100 | 0 | 0 | 0 | 0 | 0 |
Leg press | 0 | 3.3 | 0 | 0 | 5.7 | 0 | 91.0 | 0 | 0 | 0 | 0 |
Walk | 1.5 | 0 | 1.2 | 0 | 0 | 0 | 0 | 97.3 | 0 | 0 | 0 |
Wave | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 100 | 0 | 0 |
Kick | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 100 | 0 |
Hug | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 6.7 | 0 | 93.3 |
Table 5. Recognition confusion matrix of 11 types of human behavior
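The overall recognition rate can be related to Table 5 with a short calculation. The sketch below assumes that each row of the matrix describes one behavior class and is normalized to percentages, and it takes every off-diagonal entry of the Kick row as 0, which is an assumption that makes that row sum to 100. The variable names are illustrative.

```python
labels = ["Stand", "Squat", "Run", "Bend", "Fall", "Operate the PC",
          "Leg press", "Walk", "Wave", "Kick", "Hug"]

# Confusion matrix from Table 5, in percent (one row per behavior class).
matrix = [
    [100, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 100, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1.3, 0, 94.5, 0, 0, 0, 0, 4.2, 0, 0, 0],
    [0, 16.7, 0, 83.3, 0, 0, 0, 0, 0, 0, 0],
    [0, 2.5, 0, 0, 97.5, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 100, 0, 0, 0, 0, 0],
    [0, 3.3, 0, 0, 5.7, 0, 91.0, 0, 0, 0, 0],
    [1.5, 0, 1.2, 0, 0, 0, 0, 97.3, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 100, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 100, 0],  # off-diagonal zeros assumed
    [0, 0, 0, 0, 0, 0, 0, 0, 6.7, 0, 93.3],
]

# Per-class recognition rate is the diagonal entry of each row.
per_class = {name: matrix[i][i] for i, name in enumerate(labels)}
mean_rate = sum(per_class.values()) / len(labels)
worst = min(per_class, key=per_class.get)

print(round(mean_rate, 2), worst)
```

The unweighted mean of the diagonal is about 96.08%, which agrees with the recognition accuracy reported for the lightweight model in Table 6. The weakest class is Bend, which is mostly confused with Squat.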
Model | Feature extraction network | File size /MB | Recognition accuracy /% | Detection speed /(frame·s⁻¹) |
---|---|---|---|---|
OpenPose | VGG19 | 200 | 96.24 | 3.98 |
Lightweight OpenPose | MobileNet | 7.5 | 96.08 | 11.04 |
Table 6. Comparison of different models
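The trade-off in Table 6 is easier to read as ratios. The following sketch derives the size reduction, speedup, and accuracy difference from the table values; the dictionary keys are illustrative names and the ratios are computed here, not quoted from the source.

```python
# Values copied from Table 6.
openpose = {"size_mb": 200, "accuracy": 96.24, "fps": 3.98}
lightweight = {"size_mb": 7.5, "accuracy": 96.08, "fps": 11.04}

size_ratio = openpose["size_mb"] / lightweight["size_mb"]       # how many times smaller
speedup = lightweight["fps"] / openpose["fps"]                  # how many times faster
accuracy_drop = openpose["accuracy"] - lightweight["accuracy"]  # percentage points

print(round(size_ratio, 1), round(speedup, 2), round(accuracy_drop, 2))
```

With these numbers the lightweight model is roughly 27 times smaller and about 2.8 times faster, at a cost of 0.16 percentage points of accuracy.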
Method | Type of behavior | Recognition rate /% |
---|---|---|
Reference [4] | clap, walk, dribble, play golf | 86.25 |
Reference [13] | walk, jog, go up and down, sit, stand | 91.60 |
Reference [14] | walk, run, go up and down, stand still, sit-stand, stand-sit, stand-squat, squat-stand | 95.05 |
Reference [15] | walk, run, jump, go up and down stairs | 85.00 |
Proposed method | stand, walk, run, squat, bend, kick, hug, fall, wave, leg press, operate the PC | 96.08 |
Table 7. Related research comparison