
Peng SHEN, Ph.D. (Engineering)
Senior Researcher, Advanced Speech Technology Laboratory, Advanced Speech Translation Research and Development Promotion Center, Universal Communication Research Institute
Speech recognition, language identification, speaker identification, event detection

Xugang LU, Ph.D. (Engineering)
Senior Researcher, Advanced Speech Technology Laboratory, Advanced Speech Translation Research and Development Promotion Center, Universal Communication Research Institute
Speech recognition, machine learning
