DiSS 2025

Keynote Speakers

Petra Wagner

Bielefeld University

Gopala Anumanchipalli

University of California, Berkeley


The dual nature of lengthening

Speakers: Petra Wagner and Simon Betz

Within disfluency research, lengthening has received far less attention than other phenomena. While lengthening can act as a hesitation signal which, alongside fillers and silences, may help with dialogue management, it differs from other hesitations in that it often passes unnoticed. It therefore cannot fulfill the same function as fillers, which often convey a clear message (e.g., “Please don’t interrupt me”). If lengthening is perceived, however, it can affect listeners’ interpretation of meaning and meta-information. This two-faced nature, the dichotomy of elusiveness and communicative impact, demands closer investigation.

In this talk, we will give an overview of our own research on lengthening. We will outline formal aspects of lengthening, its multimodal production and perception, and its pragmatic function, and we will discuss consequences for its empirical analysis as well as implications for its potential use in speech-technological applications.

Petra Wagner (M.A. linguistics, Bielefeld; Ph.D. phonetics and communication sciences, Bonn) is professor of phonetics at the Faculty of Linguistics and Literary Studies and the Center for Cognitive Interaction Technology at Bielefeld University, Germany, where she leads the Bielefeld Phonetics Workgroup. She was named an ISCA Fellow in 2024 and received the Riksbankens Jubileumsfond Humboldt Award for outstanding German researchers in the humanities in 2018. Prof. Wagner’s expertise spans phonetics, prosody, multimodal communication, speech synthesis and evaluation, and conversational speech analysis. She is particularly interested in (a) applying speech technology to tackle fundamental questions within speech science and (b) using speech science to better evaluate existing speech-technological applications.

Simon Betz holds an MA in Linguistics, English Studies, and Medieval History (WWU Münster, 2013) and earned his PhD in Linguistics from Bielefeld University in 2019 under the supervision of Petra Wagner. His dissertation, “Hesitations in Spoken Dialogue Systems,” received the Best Dissertation Award. Since 2020, he has been a Postdoctoral Researcher in Petra Wagner’s Phonetics Workgroup at Bielefeld University. A prominent figure in disfluency research, Simon has authored numerous foundational and experimental publications, co-initiated a multi-author project to strengthen research infrastructure, and participated in a COST Action proposal for a disfluency research network. He also served as the main organiser of DiSS 2023 in Bielefeld.


SSDM: Scalable Speech Disfluency Modeling

Speaker: Gopala Krishna Anumanchipalli

Speech disfluency modeling is a core component of spoken language learning and speech therapy. However, it faces three challenges: current state-of-the-art solutions suffer from poor scalability; there is no large-scale disfluency corpus; and there is no effective learning framework. In this work, we propose SSDM: Scalable Speech Disfluency Modeling, which (1) adopts articulatory gestures as scalable forced alignment; (2) introduces a connectionist subsequence aligner (CSA) to achieve disfluency alignment; (3) introduces a large-scale simulated disfluency corpus called Libri-Dys; and (4) develops an end-to-end system by leveraging the power of large language models (LLMs). We expect SSDM to serve as a standard in the area of disfluency modeling. A demo is available at https://berkeley-speech-group.github.io/SSDM/.
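To make the role of the subsequence aligner concrete, the following is a toy sketch in Python of subsequence alignment between an intended phone sequence and a dysfluent realization. It is not the CSA itself, which is a differentiable, connectionist aligner over learned representations; the function name and example phone sequences are purely illustrative. The sketch only shows why aligning the intended sequence as a subsequence of the realization is useful for disfluency modeling: repetitions and insertions fall out as unaligned material.

    # Toy sketch only: dynamic-programming subsequence alignment between an
    # intended phone sequence and a dysfluent realization. It is not the CSA
    # described in the talk; it merely illustrates the alignment idea.

    def subsequence_align(reference, observed):
        """For each reference phone, return the index of the observed phone it
        aligns to (or None), using a longest-common-subsequence style DP."""
        n, m = len(reference), len(observed)
        dp = [[0] * (m + 1) for _ in range(n + 1)]
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                match = dp[i - 1][j - 1] + (reference[i - 1] == observed[j - 1])
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1], match)
        # Backtrace to recover which observed positions carry the reference phones.
        aligned = [None] * n
        i, j = n, m
        while i > 0 and j > 0:
            if reference[i - 1] == observed[j - 1] and dp[i][j] == dp[i - 1][j - 1] + 1:
                aligned[i - 1] = j - 1
                i, j = i - 1, j - 1
            elif dp[i][j] == dp[i - 1][j]:
                i -= 1
            else:
                j -= 1
        return aligned

    if __name__ == "__main__":
        intended = ["p", "l", "iy", "z"]                 # hypothetical target: "please"
        realized = ["p", "p", "p", "l", "l", "iy", "z"]  # hypothetical stuttered onset
        aligned = subsequence_align(intended, realized)
        extras = sorted(set(range(len(realized))) - set(a for a in aligned if a is not None))
        print("aligned observed indices:", aligned)      # [2, 4, 5, 6]
        print("unaligned (disfluent) indices:", extras)  # repeated /p/ and /l/: [0, 1, 3]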

Gopala Anumanchipalli received a B.Tech. and an M.S. in Computer Science from IIIT Hyderabad in 2008, a Ph.D. in Language and Information Technologies from Carnegie Mellon University, and a Ph.D. in Electrical and Computer Engineering from IST, Lisbon. After postdoctoral training and an appointment as a Full Researcher in the Department of Neurosurgery at UCSF, he joined the faculty of the Department of Electrical Engineering and Computer Sciences at UC Berkeley in Spring 2021; he continues to hold an adjunct position in the Department of Neurosurgery at UC San Francisco. He works at the intersection of speech processing, neuroscience, and artificial intelligence, with an emphasis on human-centered speech and assistive technologies, including new paradigms for bio-inspired spoken language technologies and automated methods for the early diagnosis, characterization, and rehabilitation of disordered speech. With colleagues at UCSF, he also develops methods to advance our understanding of the neural mechanisms underlying speech and language function in healthy people, as well as brain-computer interfaces that decode speech and language directly from the brain to augment lost function in paralyzed patients.