School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
Abstract:
With the development of deep learning, speaker verification has made great progress. Compared with other biometric identification technologies, it offers remote operation, low cost, and easy human-computer interaction, and therefore has broad application prospects in public security, criminal investigation, and financial services. This paper provides a systematic overview of the development of deep learning-based speaker verification techniques. First, the development history and current research status of deep learning-based speaker representation models are introduced from four aspects: model input and structure, pooling layers, supervised loss functions, and self-supervised learning and pre-trained models. Then, the challenges faced by speaker verification are discussed, in particular cross-domain mismatch caused by noise interference, channel mismatch, and far-field speech, and the corresponding domain adaptation and domain generalization methods are outlined. Finally, directions for further research are presented.
LI Jianchen, HAN Jiqing. State of the Art and Prospects of Deep Learning-Based Speaker Verification[J].,2024,39(5):1062-1084.