• Laser & Optoelectronics Progress
  • Vol. 57, Issue 8, 081021 (2020)
Chen Jiao*, Tao Zhang, and Jianhong Sun
Author Affiliations
  • School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China
    DOI: 10.3788/LOP57.081021
    Chen Jiao, Tao Zhang, Jianhong Sun. Convolutional Neural Network Based Indoor Microphone Array Sound Source Localization[J]. Laser & Optoelectronics Progress, 2020, 57(8): 081021
    Fig. 1. Space cluster classification for SSL
    Fig. 2. Flow chart of localization method
    Fig. 3. Comparison of classification accuracy of different algorithms in different environments
    Network layer     | Network parameter
    Input layer       | Dimension: 112×2688
    Convolution layer | Number of convolution kernels: 8; kernel_size: 5×5; stride: 1; pad: 0
    Pooling layer     | Pooling: max pooling; kernel_size: 3×3; stride: 1; pad: 0; dropout: 50%
    Convolution layer | Number of convolution kernels: 16; kernel_size: 5×5; stride: 1; pad: 0
    Pooling layer     | Pooling: max pooling; kernel_size: 3×3; stride: 1; pad: 0; dropout: 50%
    Convolution layer | Number of convolution kernels: 32; kernel_size: 5×5; stride: 1; pad: 0
    Pooling layer     | Pooling: max pooling; kernel_size: 3×3; stride: 1; pad: 0; dropout: 50%
    Convolution layer | Number of convolution kernels: 64; kernel_size: 5×5; stride: 1; pad: 0
    Pooling layer     | Pooling: max pooling; kernel_size: 3×3; stride: 1; pad: 0; dropout: 50%
    Connection layer  | Number of neurons: 1024; activation function: ReLU; dropout: 50%
    Output layer      | Activation function: softmax; learning rate: 0.001; iterations: 1000; batch_size: 64
    Table 1. CNN structure
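
    The layer parameters in Table 1 determine the feature-map size at each stage. The following is a minimal sketch, not the authors' code, that traces those sizes with the standard output-size formula for convolution and pooling windows. It assumes a single-channel 112×2688 input and takes stride 1 and pad 0 exactly as listed; the names `out_size` and `trace_shapes` are illustrative only.

```python
def out_size(size, kernel, stride=1, pad=0):
    """Output length along one axis for a convolution or pooling window."""
    return (size + 2 * pad - kernel) // stride + 1


def trace_shapes(height=112, width=2688, channels=1):
    """Follow the four conv/pool pairs of Table 1 (channels=1 is an assumption)."""
    shapes = [("input", channels, height, width)]
    for n_kernels in (8, 16, 32, 64):
        # 5x5 convolution, stride 1, pad 0
        height, width = out_size(height, 5), out_size(width, 5)
        channels = n_kernels
        shapes.append((f"conv{n_kernels}", channels, height, width))
        # 3x3 max pooling, stride 1, pad 0
        height, width = out_size(height, 3), out_size(width, 3)
        shapes.append((f"pool{n_kernels}", channels, height, width))
    return shapes


shapes = trace_shapes()
for name, c, h, w in shapes:
    print(f"{name:>7}: {c} x {h} x {w}")

# Number of features that would feed the 1024-neuron connection layer
_, c, h, w = shapes[-1]
flattened = c * h * w
print("flattened features:", flattened)
```

    With stride-1 pooling each conv/pool pair only trims 6 samples per axis, so under these assumptions the last pooling layer still outputs 64 maps of 88×2664.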
    Signal     | Reverberation time /ms | Accuracy of TDOA /% | Accuracy of SVM /% | Accuracy of PNN /% | Accuracy of BP /% | Accuracy of CNN /%
    Clean voice | 0   | 62.61 | 78.58 | 96.87 | 96.32 | 96.67
    Clean voice | 300 | 61.64 | 75.28 | 96.30 | 95.53 | 95.03
    Clean voice | 600 | 60.02 | 74.86 | 93.25 | 92.84 | 93.80
    SNR: 10 dB  | 0   | 42.92 | 44.93 | 90.16 | 94.08 | 95.83
    SNR: 10 dB  | 300 | 46.31 | 49.82 | 89.44 | 90.36 | 93.87
    SNR: 10 dB  | 600 | 36.53 | 46.35 | 88.73 | 87.69 | 92.79
    SNR: 0 dB   | 0   | 38.61 | 45.57 | 89.89 | 88.81 | 94.32
    SNR: 0 dB   | 300 | 34.61 | 44.03 | 89.09 | 86.37 | 93.49
    SNR: 0 dB   | 300 | 24.87 | 44.60 | 88.61 | 85.20 | 90.78
    Table 2. Classification accuracy of different algorithms in different environments
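
    As a reading aid for Table 2, the sketch below averages each algorithm's accuracy over the nine listed environments. The numbers are copied from the table rows in order; the unweighted mean is an illustrative summary chosen here, not a figure reported by the authors, and the variable names are hypothetical.

```python
from statistics import mean

# Accuracy (%) per algorithm, one entry per row of Table 2, in table order
accuracy = {
    "TDOA": [62.61, 61.64, 60.02, 42.92, 46.31, 36.53, 38.61, 34.61, 24.87],
    "SVM":  [78.58, 75.28, 74.86, 44.93, 49.82, 46.35, 45.57, 44.03, 44.60],
    "PNN":  [96.87, 96.30, 93.25, 90.16, 89.44, 88.73, 89.89, 89.09, 88.61],
    "BP":   [96.32, 95.53, 92.84, 94.08, 90.36, 87.69, 88.81, 86.37, 85.20],
    "CNN":  [96.67, 95.03, 93.80, 95.83, 93.87, 92.79, 94.32, 93.49, 90.78],
}

# Unweighted mean over all environments (a summary assumed for illustration)
mean_accuracy = {name: mean(values) for name, values in accuracy.items()}

for name, value in sorted(mean_accuracy.items(), key=lambda item: item[1]):
    print(f"{name:>4}: {value:.2f} %")

best = max(mean_accuracy, key=mean_accuracy.get)
print("highest mean accuracy:", best)
```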
    Signal     | Reverberation time /ms | Iterations | Localization time /s | Iterations | Localization time of PNN /s | Iterations | Localization time of BP /s | Iterations | Localization time of CNN /s
    Clean voice | 0   | 4496  | 7.4 | 4671 | 6.8 | 10504 | 10.3 | 18723 | 11.2
    Clean voice | 300 | 4215  | 7.8 | 5387 | 7.1 | 13661 | 9.8  | 19025 | 10.8
    Clean voice | 600 | 4615  | 8.1 | 6062 | 6.1 | 22572 | 10.3 | 29781 | 10.9
    SNR: 10 dB  | 0   | 4853  | 9.3 | 5283 | 7.4 | 21973 | 9.7  | 33671 | 12.4
    SNR: 10 dB  | 300 | 49.82 | 8.2 | 6231 | 7.3 | 34603 | 9.7  | 42101 | 11.8
    SNR: 10 dB  | 600 | 5430  | 9.1 | 5735 | 7.1 | 38524 | 10.9 | 48776 | 11.2
    SNR: 0 dB   | 0   | 4551  | 9.2 | 5089 | 6.9 | 35871 | 10.6 | 54786 | 11.2
    SNR: 0 dB   | 300 | 4876  | 9.4 | 6086 | 8.3 | 41654 | 10.8 | 57623 | 10.5
    SNR: 0 dB   | 300 | 5576  | 9.3 | 6487 | 8.1 | 47726 | 10.8 | 58341 | 11.9
    Table 3. Real-time localization time of different algorithms