Trainable windows for SincNet architecture
Published in: | EURASIP Journal on Audio, Speech, and Music Processing, 2023-01, Vol. 2023 (1), p. 3-9, Article 3 |
Main Authors: | , , , |
Format: | Article |
Language: | English |
Summary: | SincNet architecture has shown significant benefits over traditional Convolutional Neural Networks (CNN), especially for speaker recognition applications. SincNet comprises parameterized Sinc functions as filters in the first layer, followed by convolutional layers. Although SincNet is compact in nature and offers a top-level understanding of the extracted features, the effect of the window function used in SincNet has not yet been thoroughly addressed. Hamming and Hann are popularly used as the default time-localized windows to reduce spectral leakage. Hence, a comprehensive investigation of 28 different windowing functions on the SincNet architecture for the speaker recognition task using the TIMIT dataset was performed in this work. Additionally, “trainable” window functions were configured with tunable parameters to characterize the performance. The paper benchmarks the effect of the time-localized windowing function in terms of the bandwidth, side-lobe suppression, and spectral leakage for the filter banks employed in the first layer of the SincNet architecture. Trainable Gaussian and Cosine-Sum functions exhibited relative improvements of 41.46% and 82.11% in the sentence-level classification error rate over the Hamming window when employed in the SincNet architecture. |
ISSN: | 1687-4714, 1687-4722 |
DOI: | 10.1186/s13636-023-00271-0 |
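
As a rough illustration of the "trainable window" idea described in the summary, the sketch below (in PyTorch, not the authors' implementation) shows a SincNet-style first layer whose band-pass filters are parameterized sinc functions multiplied by a trainable Gaussian window instead of a fixed Hamming or Hann window. The class name, filter count, kernel length, and the per-filter sigma parameterization are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of a SincNet-style first layer with a trainable Gaussian window.
# All hyperparameters (80 filters, 251-tap kernels, 16 kHz audio) are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TrainableGaussianSincConv(nn.Module):
    def __init__(self, out_channels=80, kernel_size=251, sample_rate=16000):
        super().__init__()
        self.kernel_size = kernel_size
        self.sample_rate = sample_rate
        # Learnable low cut-off frequencies and bandwidths (Hz), one per filter.
        low_hz = torch.linspace(30, sample_rate / 2 - 200, out_channels)
        self.low_hz = nn.Parameter(low_hz.unsqueeze(1))
        self.band_hz = nn.Parameter(torch.full((out_channels, 1), 100.0))
        # Learnable Gaussian window width (one sigma per filter) -- this is the
        # "trainable window" part; a fixed Hamming window would replace it.
        self.sigma = nn.Parameter(torch.full((out_channels, 1), 0.4))
        # Symmetric time axis in samples, centered on the middle tap.
        n = (kernel_size - 1) / 2
        self.register_buffer("t", torch.arange(-n, n + 1).unsqueeze(0))

    def forward(self, x):
        low = torch.abs(self.low_hz)
        high = torch.clamp(low + torch.abs(self.band_hz), max=self.sample_rate / 2)
        # Ideal band-pass impulse response: difference of two sinc low-passes.
        t_sec = self.t / self.sample_rate
        band_pass = (
            2 * high * torch.sinc(2 * high * t_sec)
            - 2 * low * torch.sinc(2 * low * t_sec)
        )
        # Trainable Gaussian window over the normalized tap positions in [-1, 1].
        n_norm = self.t / ((self.kernel_size - 1) / 2)
        window = torch.exp(-0.5 * (n_norm / self.sigma) ** 2)
        filters = (band_pass * window).unsqueeze(1)  # (out_channels, 1, kernel_size)
        return F.conv1d(x, filters, padding=self.kernel_size // 2)


# Example: apply the layer to one second of 16 kHz waveform.
layer = TrainableGaussianSincConv()
waveform = torch.randn(4, 1, 16000)   # (batch, 1, samples)
features = layer(waveform)            # (4, 80, 16000)
```

Because the window width is a parameter per filter, gradients from the recognition loss can widen or narrow each filter's effective window, trading main-lobe bandwidth against side-lobe suppression, which is the trade-off the abstract refers to; a trainable Cosine-Sum window would follow the same pattern with learnable cosine-term coefficients instead of sigma.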