To Wake-up or Not to Wake-up: Keyword False Alarm Reduction by Successive Refinement
Keyword spotting systems continuously process audio streams to detect keywords. One of the most challenging problems for such systems is the false alarm, where the system registers a keyword even though none was uttered. In this paper, we propose a simple yet elegant solution that follows from the law of total probability. We show that existing deep keyword spotting mechanisms can be improved by successive refinement, where the system first classifies whether the input audio is speech or not, then whether the input is keyword-like or not, and finally which keyword was uttered. Across multiple models ranging from 13K to 340K parameters, we show that successive refinement reduces false alarms by a factor of 3 on both a held-out test dataset and out-of-domain (unseen) data. Further, our proposed approach is “plug-and-play” and can be applied to any baseline keyword spotting method.
Authors: Yashas Malur Saidutta, Rakshith Srinivasa, Chinghua Lee, Chouchang Yang, Yilin Shen, Hongxia Jin
Published: International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
Date: Jun 4, 2023
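
The successive refinement described in the abstract can be illustrated with a minimal sketch. It assumes the three stages output P(speech), P(keyword-like | speech), and P(keyword k | keyword-like), and that they are combined by simple multiplication, which is one reading of the law of total probability; the abstract does not give the exact formulation. The function names, the 0.5 threshold, and the example probabilities are hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Sequence


def refine_keyword_probs(
    p_speech: float,
    p_keyword_like: float,
    keyword_probs: Sequence[float],
) -> List[float]:
    """Combine the three stage outputs into per-keyword probabilities.

    Assumed factorization (not confirmed by the abstract):
        P(k) = P(speech) * P(keyword-like | speech) * P(k | keyword-like)
    """
    gate = p_speech * p_keyword_like
    return [gate * p for p in keyword_probs]


def detect_keyword(
    p_speech: float,
    p_keyword_like: float,
    keyword_probs: Sequence[float],
    threshold: float = 0.5,  # hypothetical operating point
) -> Optional[int]:
    """Return the index of the detected keyword, or None if nothing fires."""
    refined = refine_keyword_probs(p_speech, p_keyword_like, keyword_probs)
    if not refined:
        return None
    best = max(range(len(refined)), key=refined.__getitem__)
    return best if refined[best] >= threshold else None


# Clear speech that sounds like a keyword: the detector fires on keyword 0.
speech_hit = detect_keyword(0.99, 0.95, [0.9, 0.1])

# Non-speech noise that the final classifier alone would have accepted
# (0.95 for keyword 0): the low speech probability suppresses the alarm.
noise_hit = detect_keyword(0.2, 0.9, [0.95, 0.05])
```

In this sketch a confident final-stage score is not enough to trigger a detection on its own; the earlier stages scale it down when the audio is unlikely to be speech or unlikely to be keyword-like, which is the intuition behind the reduced false alarm rate.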