Sessions
Speaker Recognition 1 (SR1)
Session Chair: Hemant Patil
Time: 21:30 ~ 21:45, June 28th Beijing Time (UTC+8) / 09:30 ~ 09:45, June 28th New York Time (UTC-4)
Magnitude-Aware Probabilistic Speaker Embeddings
Nikita Kuzmin, Igor Fedorov and Alexey Sholokhov

Time: 21:45 ~ 22:00, June 28th Beijing Time (UTC+8) / 09:45 ~ 10:00, June 28th New York Time (UTC-4)
Analyzing Speaker Verification Embedding Extractors and Back-Ends Under Language and Channel Mismatch
Anna Silnova, Themos Stafylakis, Ladislav Mošner, Oldřich Plchot, Johan Rohdin, Pavel Matějka, Lukáš Burget, Ondřej Glembek and Niko Brummer

Time: 22:00 ~ 22:15, June 28th Beijing Time (UTC+8) / 10:00 ~ 10:15, June 28th New York Time (UTC-4)
Progressive Contrastive Learning for Self-Supervised Text-Independent Speaker Verification
Junyi Peng, Chunlei Zhang, Jan "Honza" Černocký and Dong Yu

Time: 22:15 ~ 22:30, June 28th Beijing Time (UTC+8) / 10:15 ~ 10:30, June 28th New York Time (UTC-4)
Impostor Score Statistics as Quality Measures for the Calibration of Speaker Verification Systems
Sandro Cumani and Salvatore Sarni

Time: 22:30 ~ 22:45, June 28th Beijing Time (UTC+8) / 10:30 ~ 10:45, June 28th New York Time (UTC-4)
Hybrid Neural Network-Based Deep Embedding Extractors for Text-Independent Speaker Verification
Jahangir Alam, Woo Hyun Kang and Abderrahim Fathan

Time: 22:45 ~ 23:00, June 28th Beijing Time (UTC+8) / 10:45 ~ 11:00, June 28th New York Time (UTC-4)
Learning Noise Robust ResNet-Based Speaker Embedding for Speaker Recognition
Mohammad Mohammadamini, Driss Matrouf, Jean-François Bonastre, Sandipana Dowerah, Romain Serizel and Denis Jouvet
Spoofing and Countermeasure 1 (SC1)
Session Chair: Johan Rohdin
Time: 23:05 ~ 23:20, June 28th Beijing Time (UTC+8) / 11:05 ~ 11:20, June 28th New York Time (UTC-4)
Teager Energy-Based Detection of One-Point and Two-Point Replay Attacks: Towards Cross-Database Generalization
Anand Therattil, Priyanka Gupta, Piyushkumar K. Chodingala and Hemant A. Patil

Time: 23:20 ~ 23:35, June 28th Beijing Time (UTC+8) / 11:20 ~ 11:35, June 28th New York Time (UTC-4)
Investigation on Mixup Strategies for End-to-End Voice Spoof Detection System
Woo Hyun Kang, Jahangir Alam and Abderrahim Fathan

Time: 23:35 ~ 23:50, June 28th Beijing Time (UTC+8) / 11:35 ~ 11:50, June 28th New York Time (UTC-4)
Speaker-Targeted Synthetic Speech Detection
Diego Castán, Md Hafizur Rahman, Sarah Bakst, Chris Cobo-Kroenke, Mitchell McLaren, Martin Graciarena and Aaron Lawson

Time: 23:50, June 28th ~ 00:05, June 29th Beijing Time (UTC+8) / 11:50 ~ 12:05, June 28th New York Time (UTC-4)
Explainable Deepfake and Spoofing Detection: An Attack Analysis Using SHapley Additive exPlanations
Wanying Ge, Massimiliano Todisco and Nicholas Evans

Time: 00:05 ~ 00:20, June 29th Beijing Time (UTC+8) / 12:05 ~ 12:20, June 28th New York Time (UTC-4)
A Probabilistic Fusion Framework for Spoofing Aware Speaker Verification
You Zhang, Ge Zhu and Zhiyao Duan

Time: 00:20 ~ 00:35, June 29th Beijing Time (UTC+8) / 12:20 ~ 12:35, June 28th New York Time (UTC-4)
Spoofing-Aware Speaker Verification with Unsupervised Domain Adaptation
Xuechen Liu, Md Sahidullah and Tomi Kinnunen
Spoofing and Countermeasure 2 (SC2)
Session Chair: Tanel Alumäe

Time: 21:05 ~ 21:20, June 29th Beijing Time (UTC+8) / 09:05 ~ 09:20, June 29th New York Time (UTC-4)
Tackling Spoofing-Aware Speaker Verification with Multi-Model Fusion
Haibin Wu, Jiawen Kang, Lingwei Meng, Yang Zhang, Xixin Wu, Zhiyong Wu, Hung-Yi Lee and Helen Meng

Time: 21:20 ~ 21:35, June 29th Beijing Time (UTC+8) / 09:20 ~ 09:35, June 29th New York Time (UTC-4)
Investigating Self-Supervised Front Ends for Speech Spoofing Countermeasures
Xin Wang and Junichi Yamagishi

Time: 21:35 ~ 21:50, June 29th Beijing Time (UTC+8) / 09:35 ~ 09:50, June 29th New York Time (UTC-4)
A Novel Feature Based on Graph Signal Processing for Detection of Physical Access Attacks
Longting Xu, Mianxin Tian, Xing Guo, Zhiyong Shan, Jie Jia, Yiyuan Peng, Jichen Yang and Rohan Kumar Das

Time: 21:50 ~ 22:05, June 29th Beijing Time (UTC+8) / 09:50 ~ 10:05, June 29th New York Time (UTC-4)
Automatic Speaker Verification Spoofing and Deepfake Detection Using Wav2vec 2.0 and Data Augmentation
Hemlata Tak, Massimiliano Todisco, Xin Wang, Jee-weon Jung, Junichi Yamagishi and Nicholas Evans

Time: 22:05 ~ 22:20, June 29th Beijing Time (UTC+8) / 10:05 ~ 10:20, June 29th New York Time (UTC-4)
A Multi-Resolution Front-End for End-to-End Speech Anti-Spoofing
Wei Liu, Meng Sun, Xiongwei Zhang, Hugo Van Hamme and Thomas Fang Zheng

Time: 22:20 ~ 22:35, June 29th Beijing Time (UTC+8) / 10:20 ~ 10:35, June 29th New York Time (UTC-4)
Robust Cross-SubBand Countermeasure Against Replay Attacks
Jingze Lu, Yuxiang Zhang, Wenchao Wang and Pengyuan Zhang
Speaker Diarization (SD)
Session Chair: Dong Wang

Time: 22:40 ~ 22:55, June 29th Beijing Time (UTC+8) / 10:40 ~ 10:55, June 29th New York Time (UTC-4)
Improving the Naturalness of Simulated Conversations for End-to-End Neural Diarization
Natsuo Yamashita, Shota Horiguchi and Takeshi Homma

Time: 22:55 ~ 23:10, June 29th Beijing Time (UTC+8) / 10:55 ~ 11:10, June 29th New York Time (UTC-4)
Collar-Aware Training for Streaming Speaker Change Detection in Broadcast Speech
Joonas Kalda and Tanel Alumäe

Time: 23:10 ~ 23:25, June 29th Beijing Time (UTC+8) / 11:10 ~ 11:25, June 29th New York Time (UTC-4)
BIT Submission for the Conversational Speaker Diarization Challenge
Chenguang Hu, Qingran Zhan, Miao Liu and Xiang Xie

Time: 23:25 ~ 23:40, June 29th Beijing Time (UTC+8) / 11:25 ~ 11:40, June 29th New York Time (UTC-4)
DP-Means: An Efficient Bayesian Nonparametric Model for Speaker Diarization
Yijun Gong and Xiao-Lei Zhang

Time: 23:40 ~ 23:55, June 29th Beijing Time (UTC+8) / 11:40 ~ 11:55, June 29th New York Time (UTC-4)
Low-Latency Online Speaker Diarization with Graph-Based Label Generation
Yucong Zhang, Qinjian Lin, Weiqing Wang, Lin Yang, Xuyang Wang, Junjie Wang and Ming Li

Time: 23:55, June 29th ~ 00:10, June 30th Beijing Time (UTC+8) / 11:55 ~ 12:10, June 29th New York Time (UTC-4)
A Quick and Effective Speaker Diarization System
Zuoer Chen and Liang He
Speaker Recognition 2 (SR2)
Session Chair: Askar Hamdulla

Time: 00:15 ~ 00:30, June 30th Beijing Time (UTC+8) / 12:15 ~ 12:30, June 29th New York Time (UTC-4)
Domain Generalized Speaker Embedding Learning via Mutual Information Minimization
Woo Hyun Kang, Jahangir Alam and Abderrahim Fathan

Time: 00:30 ~ 00:45, June 30th Beijing Time (UTC+8) / 12:30 ~ 12:45, June 29th New York Time (UTC-4)
Baselines and Protocols for Household Speaker Recognition
Alexey Sholokhov, Xuechen Liu, Md Sahidullah and Tomi Kinnunen

Time: 00:45 ~ 01:00, June 30th Beijing Time (UTC+8) / 12:45 ~ 13:00, June 29th New York Time (UTC-4)
Speaker Recognition on Mono-Channel Telephony Recordings
Yosef Solewicz, Noa Cohen, Johan Rohdin, Srikanth Madikeri and Jan "Honza" Černocký

Time: 01:00 ~ 01:15, June 30th Beijing Time (UTC+8) / 13:00 ~ 13:15, June 29th New York Time (UTC-4)
Parameter-Free Attentive Scoring for Speaker Verification
Jason Pelecanos, Quan Wang, Yiling Huang and Ignacio Lopez Moreno

Time: 01:15 ~ 01:30, June 30th Beijing Time (UTC+8) / 13:15 ~ 13:30, June 29th New York Time (UTC-4)
Time-Varying Score Reliability Prediction in Speaker Identification
Sarah Bakst, Chris Cobo-Kroenke, Aaron Lawson, Mitchell McLaren and Allen Stauffer

Time: 01:30 ~ 01:45, June 30th Beijing Time (UTC+8) / 13:30 ~ 13:45, June 29th New York Time (UTC-4)
Advances in Cross-Lingual and Cross-Source Audio-Visual Speaker Recognition: The JHU-MIT System for NIST SRE21
Jesús Villalba, Bengt J. Borgstrom, Saurabh Kataria, Magdalena Rybicka, Carlos Castillo, Jaejin Cho, L. Paola García-Perera, Pedro A. Torres-Carrasquillo and Najim Dehak
Speaker and Language Recognition (SLR)
Session Chair: Sayaka Shiota

Time: 20:00 ~ 20:15, June 30th Beijing Time (UTC+8) / 08:00 ~ 08:15, June 30th New York Time (UTC-4)
Few-Shot Speaker Identification Using Depthwise Separable Convolutional Network with Channel Attention
Yanxiong Li, Wucheng Wang, Hao Chen, Wenchang Cao, Wei Li and Qianhua He

Time: 20:15 ~ 20:30, June 30th Beijing Time (UTC+8) / 08:15 ~ 08:30, June 30th New York Time (UTC-4)
Deep Representation Decomposition for Rate-Invariant Speaker Verification
Fuchuan Tong, Siqi Zheng, Haodong Zhou, Xingjia Xie, Qingyang Hong and Lin Li

Time: 20:30 ~ 20:45, June 30th Beijing Time (UTC+8) / 08:30 ~ 08:45, June 30th New York Time (UTC-4)
A Study of Multimodal Person Verification Using Audio-Visual-Thermal Data
Madina Abdrakhmanova, Saniya Abushakimova, Yerbolat Khassanov and Huseyin Atakan Varol

Time: 20:45 ~ 21:00, June 30th Beijing Time (UTC+8) / 08:45 ~ 09:00, June 30th New York Time (UTC-4)
Pretraining Approaches for Spoken Language Recognition: TalTech Submission to the OLR 2021 Challenge
Tanel Alumäe and Kunnar Kukk

Time: 21:00 ~ 21:15, June 30th Beijing Time (UTC+8) / 09:00 ~ 09:15, June 30th New York Time (UTC-4)
Enhancing Language Identification Using Dual-Mode Model with Knowledge Distillation
Hexin Liu, Leibny Paola Garcia Perera, Andy W. H. Khong, Justin Dauwels, Suzy J. Styles and Sanjeev Khudanpur

Time: 21:15 ~ 21:30, June 30th Beijing Time (UTC+8) / 09:15 ~ 09:30, June 30th New York Time (UTC-4)
Attentive Temporal Pooling for Conformer-Based Streaming Language Identification in Long-Form Speech
Quan Wang, Yang Yu, Jason Pelecanos, Yiling Huang and Ignacio Lopez Moreno
Voice Synthesis, Anonymization and Separation (VSAS)
Session Chair: Xiao-Lei Zhang

Time: 21:35 ~ 21:50, June 30th Beijing Time (UTC+8) / 09:35 ~ 09:50, June 30th New York Time (UTC-4)
BreizhCorpus: A Large Breton Language Speech Corpus and Its Use for Text-to-Speech Synthesis
David Guennec, Hassan Hajipoor, Gwénolé Lecorvé, Pascal Lintanf, Damien Lolive, Antoine Perquin and Gaëlle Vidal

Time: 21:50 ~ 22:05, June 30th Beijing Time (UTC+8) / 09:50 ~ 10:05, June 30th New York Time (UTC-4)
CycleFlow: Purify Information Factors by Cycle Loss
Haoran Sun, Chen Chen, Lantian Li and Dong Wang

Time: 22:05 ~ 22:20, June 30th Beijing Time (UTC+8) / 10:05 ~ 10:20, June 30th New York Time (UTC-4)
Language-Independent Speaker Anonymization Approach Using Self-Supervised Pre-Trained Models
Xiaoxiao Miao, Xin Wang, Erica Cooper, Junichi Yamagishi and Natalia Tomashenko

Time: 22:20 ~ 22:35, June 30th Beijing Time (UTC+8) / 10:20 ~ 10:35, June 30th New York Time (UTC-4)
Robustness of Signal Processing-Based Pseudonymization Method Against Decryption Attack
Hiroto Kai, Shinnosuke Takamichi, Sayaka Shiota and Hitoshi Kiya

Time: 22:35 ~ 22:50, June 30th Beijing Time (UTC+8) / 10:35 ~ 10:50, June 30th New York Time (UTC-4)
Closing the Gap Between Single-User and Multi-User VoiceFilter-Lite
Rajeev Rikhye, Quan Wang, Qiao Liang, Yanzhang He and Ian McGraw

Time: 22:50 ~ 23:05, June 30th Beijing Time (UTC+8) / 10:50 ~ 11:05, June 30th New York Time (UTC-4)
Single-Channel Target Speaker Separation Using Joint Training with Target Speaker's Pitch Information
Jincheng He, Yuanyuan Bao, Na Xu, Hongfeng Li, Shicong Li, Linzhang Wang, Fei Xiang and Ming Li
Evaluation and Benchmarking (EB)
Session Chair: Hao Huang

Time: 23:10 ~ 23:25, June 30th Beijing Time (UTC+8) / 11:10 ~ 11:25, June 30th New York Time (UTC-4)
C-P Map: A Novel Evaluation Toolkit for Speaker Verification
Lantian Li, Di Wang, Wenqiang Du and Dong Wang

Time: 23:25 ~ 23:40, June 30th Beijing Time (UTC+8) / 11:25 ~ 11:40, June 30th New York Time (UTC-4)
The NIST CTS Speaker Recognition Challenge
Seyed Omid Sadjadi, Craig Greenberg, Elliot Singer, Lisa Mason and Douglas Reynolds

Time: 23:40 ~ 23:55, June 30th Beijing Time (UTC+8) / 11:40 ~ 11:55, June 30th New York Time (UTC-4)
The 2021 NIST Speaker Recognition Evaluation
Seyed Omid Sadjadi, Craig Greenberg, Elliot Singer, Lisa Mason and Douglas Reynolds

Time: 23:55, June 30th ~ 00:10, July 1st Beijing Time (UTC+8) / 11:55 ~ 12:10, June 30th New York Time (UTC-4)
Baseline Systems for the First Spoofing-Aware Speaker Verification Challenge: Score and Embedding Fusion
Hye-jin Shim, Hemlata Tak, Xuechen Liu, Hee-Soo Heo, Jee-weon Jung, Joon Son Chung, Soo-Whan Chung, Ha-Jin Yu, Bong-Jin Lee, Massimiliano Todisco, Héctor Delgado, Kong Aik Lee, Md Sahidullah, Tomi Kinnunen and Nicholas Evans

Time: 00:10 ~ 00:25, July 1st Beijing Time (UTC+8) / 12:10 ~ 12:25, June 30th New York Time (UTC-4)
Advances in Speaker Recognition for Multilingual Conversational Telephone Speech: The JHU-MIT System for NIST SRE20 CTS Challenge
Jesús Villalba, Bengt J. Borgstrom, Saurabh Kataria, Jaejin Cho, Pedro A. Torres-Carrasquillo and Najim Dehak

Time: 00:25 ~ 00:40, July 1st Beijing Time (UTC+8) / 12:25 ~ 12:40, June 30th New York Time (UTC-4)
Development of ABC Systems for the 2021 Edition of NIST Speaker Recognition Evaluation
Jahangir Alam, Radek Beneš, Marián Beszédeš, Lukáš Burget, Mohamed Dahmane, Abderrahim Fathan, Hamed Ghodrati, Ondřej Glembek, Woo Hyun Kang, Pavel Matějka, Ladislav Mošner, Oldřich Plchot, Johan Rohdin, Anna Silnova and Themos Stafylakis

Time: 00:40 ~ 00:55, July 1st Beijing Time (UTC+8) / 12:40 ~ 12:55, June 30th New York Time (UTC-4)
STC Speaker Recognition System for the NIST SRE 2021
Galina Lavrentyeva, Sergey Novoselov, Vladimir Volokhov, Anastasia Avdeeva, Aleksei Gusev, Alisa Vinogradova, Igor Korsunov, Alexandr Kozlov, Timur Pekhovsky, Andrey Shulipa, Evgeny Smirnov and Vasily Galyuk
Special Session: CNSRC 2022 (SS)
Session Chair: Lantian Li

Time: 21:05 ~ 21:20, July 1st Beijing Time (UTC+8) / 09:05 ~ 09:20, July 1st New York Time (UTC-4)
Introduction by the Organizers

Time: 21:20 ~ 21:35, July 1st Beijing Time (UTC+8) / 09:20 ~ 09:35, July 1st New York Time (UTC-4)
The Volkswagen-Mobvoi System for CN-Celeb Speaker Recognition Challenge 2022
Yingwei Tan and Xuefeng Ding

Time: 21:35 ~ 21:50, July 1st Beijing Time (UTC+8) / 09:35 ~ 09:50, July 1st New York Time (UTC-4)
Cross-Scene Speaker Verification Based on Dynamic Convolution for the CNSRC 2022 Challenge
Jialin Zhang, Qinghua Ren, Youcai Qin, Zikai Wan and Qirong Mao

Time: 21:50 ~ 22:05, July 1st Beijing Time (UTC+8) / 09:50 ~ 10:05, July 1st New York Time (UTC-4)
Investigation on Deep Speaker Embedding Extraction Methods for Multi-Genre Speaker Verification
Woo Hyun Kang and Jahangir Alam

Time: 22:05 ~ 22:20, July 1st Beijing Time (UTC+8) / 10:05 ~ 10:20, July 1st New York Time (UTC-4)
Combination of Multiple Embeddings for Speaker Retrieval
Xinmei Su, Qingran Zhan, Chenguang Hu and Xiang Xie
Speech Application (SA)
Session Chairs: Jahangir Alam and Woo Hyun Kang

Time: 22:25 ~ 22:40, July 1st Beijing Time (UTC+8) / 10:25 ~ 10:40, July 1st New York Time (UTC-4)
An Empirical Study of Weakly Supervised Audio Tagging Embeddings for General Audio Representations
Heinrich Dinkel, Zhiyong Yan, Yongqing Wang, Junbo Zhang and Yujun Wang

Time: 22:40 ~ 22:55, July 1st Beijing Time (UTC+8) / 10:40 ~ 10:55, July 1st New York Time (UTC-4)
Formant Dynamics of Chinese Compound Vowels with Implications for Forensic Speaker Identification
Jintao Kang, Aijun Li and Jingyang Li

Time: 22:55 ~ 23:10, July 1st Beijing Time (UTC+8) / 10:55 ~ 11:10, July 1st New York Time (UTC-4)
Generating TTS Based Adversarial Samples for Training Wake-Up Word Detection Systems Against Confusing Words
Haoxu Wang, Yan Jia, Zeqing Zhao, Xuyang Wang, Junjie Wang and Ming Li

Time: 23:10 ~ 23:25, July 1st Beijing Time (UTC+8) / 11:10 ~ 11:25, July 1st New York Time (UTC-4)
Multimodal Emotion Recognition Using Transfer Learning from Speaker Recognition and BERT-Based Models
Sarala Padi, Seyed Omid Sadjadi, Dinesh Manocha and Ram D. Sriram

Time: 23:25 ~ 23:40, July 1st Beijing Time (UTC+8) / 11:25 ~ 11:40, July 1st New York Time (UTC-4)
Self-Adaptive Multilingual ASR Rescoring with Language Identification and Unified Language Model
Zhuo Gong, Daisuke Saito, Longfei Yang, Takahiro Shinozaki, Sheng Li, Hisashi Kawai and Nobuaki Minematsu

Time: 23:40 ~ 23:55, July 1st Beijing Time (UTC+8) / 11:40 ~ 11:55, July 1st New York Time (UTC-4)
Gamified Speaker Comparison by Listening
Sandip Ghimire, Tomi Kinnunen and Rosa González Hautamäki