The 2021 Speaker Recognition Evaluation (SRE21) is the next in an ongoing series of speaker recognition evaluations conducted by the US National Institute of Standards and Technology (NIST) since 1996. The objectives of the evaluation series are (1) to effectively measure system-calibrated performance of the current state of technology, (2) to provide a common framework that enables the research community to explore promising new ideas in speaker recognition, and (3) to support the community in their development of advanced technology incorporating these ideas. The evaluations are intended to be of interest to all researchers working on the general problem of text-independent speaker recognition. To this end, the evaluations are designed to focus on core technology issues and to be simple and accessible to those wishing to participate.

SRE21 will be organized similarly to SRE19, focusing on speaker detection over conversational telephone speech (CTS) and audio from video (AfV). It will introduce the following new features, thanks to a new multimodal and multilingual (i.e., with multilingual subjects) corpus collected outside North America:

  • trials (target and non-target) with enrollment and test segments originating from different source types (i.e., CTS and AfV)
  • trials (target and non-target) with enrollment and test segments spoken in different languages (i.e., cross-lingual trials)

As in SRE19, in addition to the audio-only track, SRE21 will feature a visual-only track and an audio-visual track involving automatic person detection using audio, image, and video material. System submission is required for the audio-only and audio-visual tracks, and optional for the visual-only track.
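
The two new trial conditions listed above (cross-source and cross-lingual) can be illustrated with a minimal sketch. This is not the official SRE21 trial format: the `Segment` and `Trial` classes, their field names, and the language labels `lang_a` and `lang_b` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A speech segment (hypothetical structure, not the NIST format)."""
    source_type: str  # "CTS" or "AfV"
    language: str     # placeholder language label


@dataclass(frozen=True)
class Trial:
    """One detection trial: an enrollment segment paired with a test segment."""
    enrollment: Segment
    test: Segment
    is_target: bool  # True if both segments come from the same speaker

    @property
    def cross_source(self) -> bool:
        # Enrollment and test come from different source types (CTS vs. AfV).
        return self.enrollment.source_type != self.test.source_type

    @property
    def cross_lingual(self) -> bool:
        # Enrollment and test are spoken in different languages.
        return self.enrollment.language != self.test.language


# Example: a target trial that is both cross-source and cross-lingual.
trial = Trial(
    enrollment=Segment(source_type="CTS", language="lang_a"),
    test=Segment(source_type="AfV", language="lang_b"),
    is_target=True,
)
print(trial.cross_source, trial.cross_lingual)  # prints: True True
```

Both conditions apply to target and non-target trials alike, so in this sketch `is_target` is kept independent of the two properties.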

July 12, 2021: Evaluation Plan Published

July-September 2021: Registration Period

July 2021: Training/Dev Data Available

August 2021: Evaluation Data Available

October 2021: System Output/Description Due to NIST

October 2021: Preliminary Official Results Released

December 2021: Workshop