Short Bio and Recent Focuses

Jiatong Shi is currently an audio researcher at Anuttacon. He received his Ph.D. in Language Technologies from the Language Technologies Institute at Carnegie Mellon University, where he was advised by Dr. Shinji Watanabe. His research focuses on speech representation learning and its applications across a range of speech processing tasks. He has authored over 100 publications in leading speech and machine learning conferences and has received multiple honors, including the Best Paper Award at ISCA Interspeech 2024, the Best Paper Award at EMNLP 2024, and the CMU Presidential Fellowship. Jiatong is also a strong advocate for open-source research and has made significant contributions to major toolkits such as ESPnet, Muskits, and VERSA. He has played a key role in curating and releasing open datasets, including ML-SUPERB, SingMOS, KiSing, and several endangered language corpora, which have supported advances in speech and music processing.

Here is a brief introduction to my recent focuses and work. I’m currently working on several projects in speech and music processing, including ESPnet and VERSA.