Under Bhashini, IISc to open-source 16,000 hours of speech data
Under Bhashini, the electronics and information technology ministry’s flagship effort in artificial intelligence (AI) for Indic languages, the Indian Institute of Science’s AI and Robotics Technology Park (IISc-ARTPARK) plans to open-source 16,000 hours of spontaneous speech from 80 districts as part of Project Vaani, in collaboration with American technology company Google.
IISc-ARTPARK is curating datasets of 150,000 hours of natural speech and text from around one million people across 773 districts of India. The first phase of the project, launched at the end of 2022, is nearing completion.