Research on Speech Segmentation and Clustering Based on Mixed Features

LIU Jing-tian; JIANG Nan


Electro-Optic Technology Application
Vol. 34, Issue 5, 37 (2019)


Author Affiliations

[in Chinese]


LIU Jing-tian, JIANG Nan. Research on Speech Segmentation and Clustering Based on Mixed Features[J]. Electro-Optic Technology Application, 2019, 34(5): 37

Abstract

This paper studies the problem of extracting a target speaker's speech from multi-speaker recordings. To improve the accuracy of multi-speaker speech segmentation and clustering, a segmentation and clustering algorithm based on hybrid features that combine Mel frequency cepstral coefficients (MFCC) and Gammatone frequency cepstral coefficients (GFCC) is proposed, which effectively avoids problems such as the poor robustness of segmentation and clustering on noisy speech. For speech with superimposed pink noise and factory noise, the conventional algorithm and the improved segmentation and clustering algorithm are compared. The results show that the proposed mixed-feature segmentation and clustering algorithm extracts the target speaker's speech more accurately.
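The abstract does not say how the MFCC and GFCC features are combined. As a minimal sketch, assuming the hybrid feature is a simple per-frame concatenation of the two vectors, the idea might look like the code below. It also shows the two frequency warpings that usually distinguish the features: the mel scale for MFCC and the ERB-rate scale commonly used to place Gammatone filters for GFCC. All function names are hypothetical, the constants are the widely used textbook formulas rather than values taken from the paper, and the full cepstral pipelines (filtering, log compression, DCT) are omitted.

```python
import math


def hz_to_mel(f_hz):
    """Common mel-scale formula (assumed; the paper's variant is not given)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def hz_to_erb_rate(f_hz):
    """Glasberg-Moore ERB-rate scale, often used to space Gammatone filters."""
    return 21.4 * math.log10(1.0 + 0.00437 * f_hz)


def erb_rate_to_hz(erb):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (erb / 21.4) - 1.0) / 0.00437


def center_frequencies(f_lo, f_hi, n_bands, to_scale, from_scale):
    """Return n_bands frequencies in Hz, evenly spaced on a warped scale.

    The first and last values coincide with f_lo and f_hi.
    """
    if n_bands < 2:
        raise ValueError("n_bands must be at least 2")
    lo, hi = to_scale(f_lo), to_scale(f_hi)
    step = (hi - lo) / (n_bands - 1)
    return [from_scale(lo + i * step) for i in range(n_bands)]


def combine_features(mfcc_frames, gfcc_frames):
    """Concatenate MFCC and GFCC vectors frame by frame.

    Assumption: both sequences were computed with the same framing,
    so frame i of one aligns with frame i of the other.
    """
    if len(mfcc_frames) != len(gfcc_frames):
        raise ValueError("MFCC and GFCC must have the same number of frames")
    return [list(m) + list(g) for m, g in zip(mfcc_frames, gfcc_frames)]


if __name__ == "__main__":
    mel_cf = center_frequencies(50.0, 4000.0, 8, hz_to_mel, mel_to_hz)
    erb_cf = center_frequencies(50.0, 4000.0, 8, hz_to_erb_rate, erb_rate_to_hz)
    print("mel-spaced centers (Hz):", [round(f) for f in mel_cf])
    print("ERB-spaced centers (Hz):", [round(f) for f in erb_cf])

    # Toy per-frame vectors standing in for real cepstral coefficients.
    hybrid = combine_features([[1.0, 2.0], [3.0, 4.0]], [[5.0], [6.0]])
    print("hybrid frames:", hybrid)
```

In this sketch the hybrid vector simply grows to the sum of the two feature dimensions; whether the authors weight, normalize, or reduce the combined features before segmentation and clustering is not stated in the abstract.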

Keywords

Gammatone frequency cepstral coefficient (GFCC); Mel frequency cepstral coefficient (MFCC); robustness; speech segmentation and clustering
