• Laser & Optoelectronics Progress
  • Vol. 57, Issue 18, 181702 (2020)
Kailong Ren, Yi Wang*, Xiaodong Chen, and Huaiyu Cai
Author Affiliations
  • School of Precision Instruments and Optoelectronics Engineering, Tianjin University, Tianjin 300072, China
  • show less
    DOI: 10.3788/LOP57.181702 Cite this Article Set citation alerts
    Kailong Ren, Yi Wang, Xiaodong Chen, Huaiyu Cai. Speaker-Dependent Speech Recognition Algorithm for Laparoscopic Supporter Control[J]. Laser & Optoelectronics Progress, 2020, 57(18): 181702 Copy Citation Text show less

    Abstract

    A long short-term memory (LSTM) recurrent neural network based on an i-vector feature is presented for speech control of laparoscopic supporter to realize short-term isolated word command recognition from the speech of a specific doctor using small training samples. In this model, LSTM recurrent neural network is used as the basic model, Mel-frequency cepstrum coefficient (MFCC) is used as the input characteristic parameter, i-vector feature is used as the deep input information of LSTM recurrent neural network, and the deep feature information behind LSTM layer in the neural network is spliced to achieve the purpose of parameter fusion, so as to realize the accurate recognition of the voice instructions of the specific surgeon and the rejection recognition of the voice instructions of the non surgeon. This approach offers a secure and intelligent speech recognition scheme for laparoscopic surgeries. Further, a self-built speech database is used as a training library to verify speech recognition performance of the proposed algorithm as well as its rejection performance for the speech not included in the training library. Experiments show that compared with dynamic time warping(DTW)and Gaussian mixture model-Hidden Markov model (GMM-HMM), the proposed model exhibits a 99.6% correct recognition rate for voice commands of specific people recorded in the training library while maintaining a false acceptance rate of 0%, with an average false acceptance rate of 2.5% for voices not included in the training library. The proposed model meets the requirements of accuracy and safety expected by laparoscopic supporter control standards.
    Kailong Ren, Yi Wang, Xiaodong Chen, Huaiyu Cai. Speaker-Dependent Speech Recognition Algorithm for Laparoscopic Supporter Control[J]. Laser & Optoelectronics Progress, 2020, 57(18): 181702
    Download Citation