Intel announces two of the largest datasets for speech recognition

Intel has introduced two datasets, People’s Speech and the Multilingual Spoken Words Corpus (MSWC), aimed at recognizing and transcribing spoken language. Both datasets are among the largest in their class and include audio recordings in 59 languages.

The People’s Speech dataset targets automatic speech recognition, while MSWC targets keyword spotting. Both projects were launched in 2018 with the aim of compiling the world's 50 most widely spoken languages into a single dataset. Intel collaborated with Alibaba, Oracle, Google, Baidu and other companies to collect the data.

People’s Speech contains more than 87,000 hours of speech. It is currently one of the largest English speech datasets licensed for both academic and commercial use.

MSWC contains audio recordings of more than 300,000 keywords in dozens of languages. The dataset covers languages spoken by more than 5 billion people.

Both datasets will be available for download in the near future.