Speech for Social Good Workshop, 2022

Accepted as a Satellite Event at Interspeech 2022

Although speech is the most general form of human communication, the appeal of speech technologies has been inhibited by their limited availability and accessibility. Globally, speech technologies, including recognition and generation, are concentrated in regions of relative economic stability, which excludes low-resource languages from mainstream research. Even where they are available, these systems exhibit gender and accent biases and are mostly unusable for individuals with speech impediments. Further, although modelling and data curation receive considerable attention in academic circles, deployment and its challenges in real-world scenarios are rarely discussed. As speech technologies continue to be employed in an increasing number of application areas despite these shortcomings, we aim to highlight research that will make them more inclusive and useful for all. We believe this workshop will provide a platform to highlight and address the social problems of existing speech systems.

Though we highlight a few topics of interest below, we invite submissions of any applications of speech technologies to social good:

  1. Bias in Speech Technologies - uncovering and mitigating gender, racial, and other biases in recognition or generation systems,
  2. Impaired Speech - developing better ASR, learning aids, and diagnostic tools for individuals with speech impediments or other medical conditions,
  3. Support for Seniors - synthesis and dialog systems that address the unique challenges brought on by age,
  4. Improving Speech Intelligibility - better synthesis systems for hearing-impaired individuals,
  5. Low-Resource Languages - developing data-efficient systems for low-resource languages, low-resource data collection and curation, challenges in building such systems and in collecting such data, case studies in low-resource languages,
  6. Affective Computing - ASR systems for emotion recognition, synthesis/dialog systems for positive human-computer interaction, identifying signs of depression and other mental health conditions,
  7. Challenges of Speech for Social Good - research and survey papers highlighting the challenges of designing such systems, privacy and ethical concerns, identifying and preventing potential misuse of speech systems,
  8. Social Good, Justice, Equity - surveys on demographic usage of speech technologies, including identification of the most typical user types and of the groups currently most disadvantaged by these systems, and how speech systems can be used to challenge existing power structures,
  9. Recommendations on the Future of Speech Technologies - papers that address the current state of affairs in speech technologies and recommend critical considerations for researchers designing future speech technologies,
  10. Case Studies - case studies of real-world speech system deployments, social and ethical issues in the use and deployment of speech systems, etc.

Invited Speakers

We are excited to announce the following speakers at S4SG 2022:

Submission Process

Submissions should follow the Interspeech 2022 submission format: “the paper length should be up to four pages in two columns. An additional page can be used for references only. Paper submissions must conform to the format defined in the paper preparation guidelines as instructed in the author's kit on the conference webpage. Submissions may also be accompanied by additional files such as multimedia files. Authors must declare that their contributions are original and that they have not submitted their papers elsewhere for publication.” You can find more information about the submission format and the author kit at this link.
You can submit original research work or a summary of research activities done in the direction of Speech for Social Good. You can present original completed work, a case study, a negative result, an opinion piece, or an interesting application nugget.
The review process will be double-blind. Submissions that identify authors or their affiliations will be desk-rejected.
Accepted papers will be presented during the workshop as either oral presentations or posters, as determined by the program committee. Each submission should discuss the ethical and societal implications of the work.
We encourage authors to also include a discussion of what “positive impact” means to them or to the field of NLP.

Submission Link

We will be using the SoftConf conference management system. Make a new submission.

Venue

The ISCA-endorsed workshop will be a Satellite Event at Interspeech 2022 and will be held virtually because of the travel restrictions caused by the ongoing COVID-19 pandemic.

Important Dates

Abstract Submission Deadline: May 7, 2022 (originally May 2, 2022)
Paper Submission Deadline: May 14, 2022 (originally May 9, 2022)
Notification of Acceptance: June 24, 2022 (originally June 13, 2022)
Camera-ready Papers Due: July 1, 2022 (originally June 20, 2022)
Workshop: September 24-25, 2022

Program

All times are in Korea Standard Time (KST). Please make sure to convert to your own time zone. Here is a handy tool you can use for conversions.
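For attendees who prefer to script the conversion, the following is a minimal Python sketch rather than an official tool. It assumes only that KST is a fixed UTC+9 offset with no daylight saving time; the helper name `kst_to_zone` is hypothetical, and the three start times are the session start times listed in the program.

```python
from datetime import datetime, timedelta, timezone

# Korea Standard Time is a fixed UTC+9 offset with no daylight saving time.
KST = timezone(timedelta(hours=9), "KST")


def kst_to_zone(year, month, day, hour, minute=0, target=timezone.utc):
    """Convert a wall-clock time given in KST to the target time zone.

    Hypothetical helper for illustration; `target` defaults to UTC.
    """
    kst_time = datetime(year, month, day, hour, minute, tzinfo=KST)
    return kst_time.astimezone(target)


# Session start times as listed in the program (all in KST).
session_1 = kst_to_zone(2022, 9, 24, 11)  # Saturday, 11AM KST
session_2 = kst_to_zone(2022, 9, 24, 22)  # Saturday, 10PM KST
session_3 = kst_to_zone(2022, 9, 25, 11)  # Sunday, 11AM KST

for label, start in [
    ("Session 1", session_1),
    ("Session 2", session_2),
    ("Session 3", session_3),
]:
    print(label, start.strftime("%Y-%m-%d %H:%M %Z"))
```

To see the times in a specific region instead of UTC, pass a time zone object as `target`, for example `zoneinfo.ZoneInfo("America/New_York")` on Python 3.9 or later.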
Saturday, September 24th
Session 1 - Low-Resource/Multilingual NLP
Time Title
11AM-12PM Keynote : "What does it really take to build speech recognition systems for the next billion users?", Pratyush Kumar (Slides, Video)
12:10PM - 12:30PM "Annotated Speech Corpus for Low Resource Indian Languages: Awadhi, Bhojpuri, Braj and Magahi", Ritesh Kumar, Siddharth Singh, Shyam Ratan, Mohit Raj, Sonal Sinha, Sumitra Mishra, Bornini Lahiri, Vivek Seshadri, Kalika Bali, Atul Kr. Ojha (Paper, Slides, Video)
12:30PM - 12:50PM "Towards an Automatic Speech Recognizer for the Choctaw language", Jacqueline Brixey, David Traum (Paper, Slides, Video)
12:50PM - 1:10PM "Building TTS systems for low resource languages under resource constraints", Perez Ogayo, Graham Neubig, Alan W Black (Paper, Slides, Video)
1:10PM - 1:30PM Breakout Room

Session 2 - Panel Discussion and Misc. Topics
Time Title
10PM-11PM Panel : "Data Collection, Bias, and Ethical Concerns in Speech Processing" (Video)
Odette Scharenborg, Emily Ahn, Gopala Anumanchipalli, and Sakriani Sakti
11:10PM - 11:30PM "Can Smartphones be a cost-effective alternative to LENA for Early Childhood Language Intervention?", Satwik Dutta, Jacob C. Reyna, Jay F. Buzhardt, Dwight Irvin, John H.L. Hansen (Paper, Slides, Video)
11:30PM - 11:50PM "Comparing data augmentation and training techniques to reduce bias against non-native accents in hybrid speech recognition systems", Yixuan Zhang, Yuanyuan Zhang, Tanvina Patel, Odette Scharenborg (Paper, Slides, Video)
11:50PM - 12:10AM "Text Normalization for Speech Systems for All Languages", Athiya Deviyani, Alan W Black (Paper, Slides, Video)
12:10AM - 12:30AM Breakout Room

Sunday, September 25th
Session 3 - Medical Applications
Time Title
11AM-12PM Keynote : "Multimodal Dialog Technologies for Neurological and Mental Health", Vikram Ramanarayanan (Video)
12:10PM - 12:30PM "Assessing ASR Model Quality on Disordered Speech using BERTScore", Jimmy Tobin, Qisheng Li, Subhashini Venugopalan, Katie Seaver, Richard Cave, Katrin Tomanek (Paper, Slides, Video)
12:30PM - 12:50PM "Cross-Teager Cepstral Coefficients For Dysarthric Severity Level Classification", Anand Therattil, Aastha Kachhi, Hemant Patil (Paper, Slides, Video)
12:50PM - 1:10PM "Highly Intelligible Speech Synthesis for Spinal Muscular Atrophy Patients Based on Model Adaptation", Takuma Yoshimoto, Ryoichi Takashima, Chiho Sasaki, Tetsuya Takiguchi (Paper, Slides, Video)
1:10PM - 1:30PM Breakout Room
Registration : Please fill out this Google Form to receive an invitation to join the Zoom meeting.

Organizers

Anurag Katakkar, NVIDIA
Alan W Black, Carnegie Mellon University
Sunayana Sitaram, Microsoft Research India
Shrimai Prabhumoye, NVIDIA
Sai Krishna Rallabandi, Carnegie Mellon University
Sakriani Sakti, JAIST
Vikram Ramanarayanan, Modality AI
Anirudh Koul, Pinterest