Hyperparameter-free Continuous Learning for NLU Domain Classification
Domain classification is a fundamental task in natural language understanding (NLU) that often requires fast adaptation to newly emerging domains.
This constraint makes it impractical to retrain on all previous domains, even when the old data remain accessible to the new model.
Most existing continual learning approaches are designed for the scenario in which no old data are observable.
However, these methods may suffer from low accuracy and fluctuating performance when the old and new data distributions differ significantly, and they often require extensive hyperparameter tuning.
The key problem in many practical cases, such as domain classification, is not the absence of old data but the inefficiency of retraining the model on the whole old dataset.
Is it possible to utilize a small amount of old data to achieve high accuracy and stable performance, while introducing no extra hyperparameters?
In this paper, we propose a hyperparameter-free continual learning model for text data that stably produces high performance under various environments.
Specifically, we utilize Fisher information to select exemplars that "record" the key information of the original model.
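The abstract does not give the selection rule itself, but the idea of Fisher-information-based exemplar selection can be sketched as follows: score each old example by how much gradient signal it carries under the trained model, then keep the top-scoring ones. This is a minimal illustration using a logistic model in NumPy; the function names (`fisher_scores`, `select_exemplars`) and the squared-gradient-norm proxy are assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np

def fisher_scores(X, y, w):
    """Per-example Fisher information proxy: squared norm of the
    log-likelihood gradient under a logistic model with weights w.
    (Illustrative assumption; the paper may use a different estimator.)"""
    p = 1.0 / (1.0 + np.exp(-X @ w))      # predicted probabilities
    grads = (y - p)[:, None] * X          # per-example log-likelihood gradient
    return np.sum(grads ** 2, axis=1)     # squared gradient norm per example

def select_exemplars(X, y, w, k):
    """Keep the k old examples carrying the most Fisher information."""
    return np.argsort(-fisher_scores(X, y, w))[:k]

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = (X[:, 0] > 0).astype(float)
w = np.zeros(5)                            # stand-in for a trained model
idx = select_exemplars(X, y, w, k=10)
print(len(idx))  # 10
```

The selected indices would then form the small "memory" of old data replayed when the model is retrained on a new domain.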
Also, a novel scheme called dynamical weight consolidation is proposed to enable hyperparameter-free learning during the retraining process.
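To make the "hyperparameter-free" aspect concrete: consolidation schemes in the EWC family add a quadratic penalty that pulls parameters toward their old values, scaled by a hand-tuned weight. One way to remove that hyperparameter is to set the weight dynamically from the relative magnitudes of the two loss terms. The sketch below illustrates this self-normalizing idea; the specific ratio-based rule is an assumption for illustration, not necessarily the paper's exact formula.

```python
import numpy as np

def dwc_loss(task_loss, theta, theta_old, fisher, eps=1e-8):
    """Task loss plus a Fisher-weighted consolidation penalty whose
    weight is chosen dynamically (no hand-tuned lambda).
    Illustrative sketch: lambda is the ratio of the two loss terms,
    so their magnitudes are balanced automatically."""
    penalty = np.sum(fisher * (theta - theta_old) ** 2)
    lam = task_loss / (penalty + eps)   # self-normalizing weight
    return task_loss + lam * penalty

# Toy check: with nonzero drift the penalty roughly matches the task loss.
theta_old = np.zeros(3)
theta = np.ones(3)
fisher = np.ones(3)
total = dwc_loss(1.0, theta, theta_old, fisher)
print(round(total, 4))  # 2.0
```

When the new parameters have not drifted at all, the penalty vanishes and the total reduces to the plain task loss, so no tuning is needed to balance the two objectives across datasets.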
Extensive experiments demonstrate that baselines exhibit fluctuating performance, which makes them unreliable in practice.
In contrast, our proposed model significantly and consistently outperforms the best state-of-the-art method by up to 20% in average accuracy, and each of its components contributes effectively to overall performance.
Author: Ting Hua, Yilin Shen, Changsheng Zhao, Yen-Chang Hsu, Hongxia Jin
Published: Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL)
Date: Jun 8, 2021