CBC-Based Synthetic Speech Detection

OAI: oai:igi-global.com:223942
DOI: 10.4018/IJDCF.2019040105
Published by: IGI Global

Abstract

Previous studies of synthetic speech detection (SSD) have mostly used features based on a linear power spectrum. In contrast to these conventional methods, this article proposes a new feature extraction method for SSD based on the octave power spectrum, which is obtained from the constant-Q transform (CQT). Combining the CQT, a block transform (BT), and the discrete cosine transform (DCT) yields a new feature, constant-Q block coefficients (CBC). Here, the CQT transforms speech from the time domain into the frequency domain, the BT segments the octave power spectrum into many blocks, and the DCT extracts the principal information of each block. Experimental results on the ASVspoof 2015 corpus show that CBC is superior, in terms of equal error rate (EER), to other front-end features that have been benchmarked on the ASVspoof 2015 evaluation set.
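
The three stages named in the abstract (CQT, block segmentation, DCT per block) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the block layout, the number of retained coefficients, or the CQT settings, so the function names, `fmin`, `bins_per_octave`, `n_bins`, `hop`, `block_size`, and `n_keep` below are all illustrative assumptions, and the CQT is a slow direct evaluation rather than an optimized one.

```python
import numpy as np


def naive_cqt_log_power(signal, sr, fmin=55.0, bins_per_octave=12,
                        n_bins=48, hop=256):
    """Direct (slow) constant-Q transform; returns log power (n_bins, n_frames).

    All parameter defaults are assumptions chosen for illustration.
    """
    # Constant ratio of centre frequency to bandwidth.
    q = 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)
    # Geometrically spaced centre frequencies (octave scale).
    freqs = fmin * 2.0 ** (np.arange(n_bins) / bins_per_octave)
    # Window length shrinks as frequency grows, keeping Q constant.
    lengths = np.ceil(q * sr / freqs).astype(int)
    pad = int(lengths.max()) // 2 + 1
    padded = np.pad(np.asarray(signal, dtype=float), pad)
    n_frames = 1 + (len(signal) - 1) // hop
    power = np.empty((n_bins, n_frames))
    for k, (f, n) in enumerate(zip(freqs, lengths)):
        kernel = np.hanning(n) * np.exp(-2j * np.pi * f * np.arange(n) / sr) / n
        for t in range(n_frames):
            start = pad + t * hop - n // 2  # window centred on sample t * hop
            power[k, t] = np.abs(np.dot(padded[start:start + n], kernel)) ** 2
    return np.log(power + 1e-12)


def dct2_matrix(n):
    """Unnormalized DCT-II basis, shape (n, n)."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    return np.cos(np.pi / n * (m + 0.5) * k)


def block_dct_features(log_power, block_size=12, n_keep=4):
    """Split the frequency axis into blocks and keep the first DCT terms of each.

    Assumption: non-overlapping blocks along frequency, one octave per block.
    """
    n_bins, n_frames = log_power.shape
    n_blocks = n_bins // block_size
    blocks = log_power[: n_blocks * block_size].reshape(
        n_blocks, block_size, n_frames)
    basis = dct2_matrix(block_size)[:n_keep]          # (n_keep, block_size)
    coeffs = np.einsum("kb,nbt->nkt", basis, blocks)  # (n_blocks, n_keep, frames)
    return coeffs.reshape(n_blocks * n_keep, n_frames)
```

Under these assumptions, calling `block_dct_features(naive_cqt_log_power(x, sr))` on a waveform `x` gives one feature vector per frame, with `n_keep` coefficients for each octave-wide block. Keeping only the low-order DCT terms is one common way to retain the coarse shape of each block while discarding fine detail.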