{"id":17,"date":"2018-08-30T09:50:52","date_gmt":"2018-08-30T01:50:52","guid":{"rendered":"http:\/\/39.106.170.37\/?page_id=17"},"modified":"2025-03-11T12:25:23","modified_gmt":"2025-03-11T04:25:23","slug":"publications","status":"publish","type":"page","link":"http:\/\/shijt.site\/index.php\/publications\/","title":{"rendered":"Publications &#038; Awards"},"content":{"rendered":"<p><strong>Publications <\/strong>(* for Equal Contribution)<\/p>\n<ol>\n<li><strong>Jiatong Shi<\/strong>, Hye-jin Shim, Jinchuan Tian, Siddhant Arora, Haibin Wu, Darius Petermann, Jia Qi Yip, You Zhang, Yuxun Tang, Wangyou Zhang, Dareen Safar Alharthi, Yichen Huang, Koichi Saito, Jionghao Han, Yiwen Zhao, Chris Donahue, and Shinji Watanabe. 2025. &#8220;VERSA: A Versatile Evaluation Toolkit for Speech, Audio, and Music&#8221;. NAACL Demo. <a href=\"https:\/\/arxiv.org\/abs\/2412.17667\">[details here]<\/a><\/li>\n<li>Jinchuan Tian, Chunlei Zhang, <strong>Jiatong Shi<\/strong>, Hao Zhang, Jianwei Yu, Shinji Watanabe, and Dong Yu. 2025. &#8220;Preference Alignment Improves Language Model-Based TTS&#8221;. ICASSP. <a href=\"https:\/\/arxiv.org\/abs\/2409.12403\">[details here]<\/a><\/li>\n<li><strong>Jiatong Shi<\/strong>, Jinchuan Tian, Yihan Wu, Jee-weon Jung, Jia Qi Yip, Yoshiki Masuyama, William Chen, Yuning Wu, Yuxun Tang, Massa Baali, Dareen Alharthi, Dong Zhang, Ruifan Deng, Tejes Srivastava, Haibin Wu, Alexander Liu, Bhiksha Raj, Qin Jin, Ruihua Song, and Shinji Watanabe. 2024. &#8220;ESPnet-Codec: Comprehensive Training and Evaluation of Neural Codecs for Audio, Music, and Speech&#8221;. SLT. <a href=\"https:\/\/arxiv.org\/abs\/2409.15897\">[details here]<\/a><\/li>\n<li>Yifeng Yu, <strong>Jiatong Shi<\/strong>, Yuning Wu, and Shinji Watanabe. 2024. &#8220;VISinger2+: End-to-End Singing Voice Synthesis Augmented by Self-Supervised Learning Representation&#8221;. SLT. 
<a href=\"https:\/\/arxiv.org\/abs\/2406.08761\">[details here]<\/a><\/li>\n<li>Shih-Heng Wang, <strong>Jiatong Shi<\/strong>, Chien-yu Huang, Shinji Watanabe, and Hung-yi Lee. 2024. &#8220;Fusion of Discrete Representations and Self-Augmented Representations for Multilingual Automatic Speech Recognition&#8221;. SLT. <a href=\"https:\/\/www.arxiv.org\/abs\/2411.18107\">[details here]<\/a><\/li>\n<li>Masao Someki, Kwanghee Choi, Siddhant Arora, William Chen, Samuele Cornell, Jionghao Han, Yifan Peng, <strong>Jiatong Shi<\/strong>, Vaibhav Srivastav, and Shinji Watanabe. 2024. &#8220;ESPnet-EZ: Python-only ESPnet for Easy Fine-tuning and Integration&#8221;. SLT. <a href=\"https:\/\/www.arxiv.org\/abs\/2409.09506\">[details here]<\/a><\/li>\n<li>You Zhang, Yongyi Zang, <strong>Jiatong Shi<\/strong>, Ryuichi Yamamoto, Tomoki Toda, and Zhiyao Duan. 2024. &#8220;SVDD 2024: The Inaugural Singing Voice Deepfake Detection Challenge&#8221;. SLT. <a href=\"https:\/\/arxiv.org\/abs\/2408.16132\">[details here]<\/a><\/li>\n<li>William Chen, Wangyou Zhang, Yifan Peng, Xinjian Li, <strong>Jiatong Shi<\/strong>, Jinchuan Tian, Xuankai Chang, Soumi Maiti, Karen Livescu, and Shinji Watanabe. 2024. &#8220;Towards Robust Speech Representation Learning for Thousands of Languages&#8221;. EMNLP. <a href=\"https:\/\/aclanthology.org\/2024.emnlp-main.570\/\">[details here]<\/a><\/li>\n<li>Kristin Qi, <strong>Jiatong Shi<\/strong>, Caroline Summerour, John Batsis, and Xiaohui Liang. 2024. &#8220;Exploiting Longitudinal Speech Data via Voice Assistant Systems for Early Detection of Cognitive Decline&#8221;. IEEE Healthcom 2024. 
<a href=\"https:\/\/arxiv.org\/abs\/2410.12885\">[details here]<\/a><\/li>\n<li>Ibrahim Said Ahmad, Antonios Anastasopoulos, Ond\u0159ej Bojar, Claudia Borg, Marine Carpuat, Roldano Cattoni, Mauro Cettolo, William Chen, Qianqian Dong, Marcello Federico, Barry Haddow, D\u00e1vid Javorsk\u00fd, Mateusz Krubi\u0144ski, Tsz Kim Lam, Xutai Ma, Prashant Mathur, Evgeny Matusov, Chandresh Maurya, John McCrae, Kenton Murray, Satoshi Nakamura, Matteo Negri, Jan Niehues, Xing Niu, Atul Kr. Ojha, John Ortega, Sara Papi, Peter Pol\u00e1k, Adam Posp\u00ed\u0161il, Pavel Pecina, Elizabeth Salesky, Nivedita Sethiya, Balaram Sarkar, <strong>Jiatong Shi<\/strong>, Claytone Sikasote, Matthias Sperber, Sebastian St\u00fcker, Katsuhito Sudoh, Brian Thompson, Alex Waibel, Shinji Watanabe, Patrick Wilken, Petr Zem\u00e1nek, and Rodolfo Zevallos. 2024. &#8220;Findings of the IWSLT 2024 Evaluation Campaign&#8221;. IWSLT. <a href=\"https:\/\/aclanthology.org\/2024.iwslt-1.1\/\">[details here]<\/a><\/li>\n<li><strong>Jiatong Shi<\/strong>, Shih-Heng Wang, William Chen, Martijn Bartelds, Vanya Bannihatti Kumar, Jinchuan Tian, Xuankai Chang, Dan Jurafsky, Karen Livescu, Hung-yi Lee, and Shinji Watanabe. 2024. &#8220;ML-SUPERB 2.0: Benchmarking Multilingual Speech Models Across Modeling Constraints, Languages, and Datasets&#8221;. Interspeech.<a href=\"https:\/\/arxiv.org\/abs\/2406.08641\"> [details here]<\/a><\/li>\n<li><strong>Jiatong Shi<\/strong>, Xutai Ma, Hirofumi Inaguma, Anna Sun, and Shinji Watanabe. 2024. &#8220;MMM: Multi-Layer Multi-Residual Multi-Stream Discrete Speech Representation from Self-supervised Learning Model&#8221;. Interspeech. <a href=\"https:\/\/arxiv.org\/abs\/2406.09869\">[details here]<\/a><\/li>\n<li><strong>Jiatong Shi<\/strong>*, Yueqian Lin*, Xinyi Bai, Keyi Zhang, Yuning Wu, Yuxun Tang, Yifeng Yu, Qin Jin, and Shinji Watanabe. 2024. &#8220;Singing Voice Data Scaling-up: An Introduction to ACE-Opencpop and ACE-KiSing&#8221;. 
Interspeech. <a href=\"https:\/\/arxiv.org\/pdf\/2401.17619\">[details here]<\/a><\/li>\n<li>Kalvin Chang, Yi-Hui Chou, <strong>Jiatong Shi<\/strong>, Hsuan-Ming Chen, Nicole Holliday, Odette Scharenborg, and David R. Mortensen. 2024. &#8220;Self-supervised Speech Representations Still Struggle with African American Vernacular English&#8221;. Interspeech. <a href=\"https:\/\/www.arxiv.org\/abs\/2408.14262\">[details here]<\/a><\/li>\n<li>Tejes Srivastava, <strong>Jiatong Shi<\/strong>, William Chen, and Shinji Watanabe. 2024. &#8220;EFFUSE: Efficient Self-Supervised Feature Fusion for E2E ASR in Multilingual and Low Resource Scenarios&#8221;. Interspeech. <a href=\"https:\/\/arxiv.org\/abs\/2310.03938\">[details here]<\/a><\/li>\n<li>Jee-weon Jung*, Wangyou Zhang*, <strong>Jiatong Shi<\/strong>*, Zakaria Aldeneh, Takuya Higuchi, Barry-John Theobald, Ahmed Hussen Abdelaziz, and Shinji Watanabe. 2024. &#8220;ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models&#8221;. Interspeech. <a href=\"https:\/\/arxiv.org\/abs\/2401.17230\">[details here]<\/a><\/li>\n<li>Yifan Peng, Jinchuan Tian, William Chen, Siddhant Arora, Brian Yan, Yui Sudo, Muhammad Shakeel, Kwanghee Choi, <strong>Jiatong Shi<\/strong>, Xuankai Chang, Jee-weon Jung, and Shinji Watanabe. 2024. &#8220;OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer&#8221;. Interspeech. <a href=\"https:\/\/arxiv.org\/pdf\/2401.16658\">[details here]<\/a><\/li>\n<li>Shuhua Li, Qirong Mao, and <strong>Jiatong Shi<\/strong>. 2024. &#8220;PL-TTS: A Generalizable Prompt-based Diffusion TTS Augmented by Large Language Model&#8221;. Interspeech. 
<a href=\"https:\/\/www.isca-archive.org\/interspeech_2024\/li24y_interspeech.pdf\">[details here]<\/a><\/li>\n<li>Xuankai Chang, <strong>Jiatong Shi<\/strong>, Jinchuan Tian, Yuning Wu, Yuxun Tang, Yihan Wu, Shinji Watanabe, Yossi Adi, Xie Chen, and Qin Jin. 2024. &#8220;The Interspeech 2024 Challenge on Speech Processing Using Discrete Units&#8221;. Interspeech. <a href=\"https:\/\/arxiv.org\/abs\/2406.07725\">[details here]<\/a><\/li>\n<li>Yongyi Zang, <strong>Jiatong Shi<\/strong>, You Zhang, Ryuichi Yamamoto, Jionghao Han, Yuxun Tang, Shengyuan Xu, Wenxiao Zhao, Jing Guo, Tomoki Toda, and Zhiyao Duan. 2024. &#8220;CtrSVDD: A Benchmark Dataset and Baseline Analysis for Controlled Singing Voice Deepfake Detection&#8221;. Interspeech. <a href=\"https:\/\/arxiv.org\/abs\/2406.02438\">[details here]<\/a><\/li>\n<li>Yuxun Tang, Yuning Wu, <strong>Jiatong Shi<\/strong>, and Qin Jin. 2024. &#8220;SingOMD: Singing Oriented Multi-resolution Discrete Representation Construction from Speech Models&#8221;. Interspeech. <a href=\"https:\/\/arxiv.org\/abs\/2406.08905\">[details here]<\/a><\/li>\n<li>Yuning Wu, Chunlei Zhang, <strong>Jiatong Shi<\/strong>, Yuxun Tang, and Qin Jin. 2024. &#8220;TokSing: Singing Voice Synthesis based on Discrete Tokens&#8221;. Interspeech. <a href=\"https:\/\/arxiv.org\/abs\/2406.08416\">[details here]<\/a><\/li>\n<li>Yuxun Tang, <strong>Jiatong Shi<\/strong>, Yuning Wu, and Qin Jin. 2024. &#8220;An Exploration on Singing MOS Prediction&#8221;. ISCSLP. <a href=\"https:\/\/ieeexplore.ieee.org\/document\/10800519\">[details here]<\/a><\/li>\n<li>Yuning Wu, <strong>Jiatong Shi<\/strong>, Yifeng Yu, Yuxun Tang, Qian Tao, Yueqian Lin, Jionghao Han, Xinyi Bai, Shinji Watanabe, and Qin Jin. 2024. &#8220;Muskits-ESPnet: a Comprehensive Toolkit for Singing Voice Synthesis in New Paradigm&#8221;. ACMMM. <a href=\"https:\/\/arxiv.org\/abs\/2409.07226\">[details here]<\/a><\/li>\n<li>Yuning Wu*, Yifeng Yu*, <strong>Jiatong Shi<\/strong>, Tao Qian, and Qin Jin. 2024. 
&#8220;A Systematic Exploration of Joint-training for Singing Voice Synthesis&#8221;. ISCSLP. <a href=\"https:\/\/arxiv.org\/abs\/2308.02867\">[details here]<\/a><\/li>\n<li>Taiqi He, Kwanghee Choi, Lindia Tjuatja, <strong>Jiatong Shi<\/strong>, Nate Robinson, Graham Neubig, Shinji Watanabe, David Mortensen, and Lori Levin. 2024. &#8220;WAV2GLOSS: Generating Interlinear Glossed Text from Speech&#8221;. ACL. <a href=\"https:\/\/arxiv.org\/abs\/2403.13169\">[details here]<\/a><\/li>\n<li>Rongjie Huang, Chunlei Zhang, Yongqi Wang, Dongchao Yang, Jinchuan Tian, Luping Liu, Zhenhui Ye, Ziyue Jiang, Xuankai Chang, <strong>Jiatong Shi<\/strong>, Chao Weng, Zhou Zhao, and Dong Yu. 2024. &#8220;Revisiting Voice Large Language Models as Scalable Multi-Lingual and Multi-Task Learners&#8221;. ACL.<\/li>\n<li>Dongchao Yang*, Jinchuan Tian*, Xu Tan, Rongjie Huang, Songxiang Liu, Xuankai Chang, <strong>Jiatong Shi<\/strong>, Sheng Zhao, Jiang Bian, Xixin Wu, Zhou Zhao, and Helen Meng. 2024. &#8220;UniAudio: An Audio Foundation Model Toward Universal Audio Generation&#8221;. ICML. <a href=\"https:\/\/arxiv.org\/abs\/2310.00704\">[details here]<\/a><\/li>\n<li>Shu-wen Yang, Heng-Jui Chang*, Zili Huang*, Andy T. Liu*, Cheng-I Lai*, Haibin Wu*, <strong>Jiatong Shi<\/strong>, Xuankai Chang, Hsiang-Sheng Tsai, Wen-Chin Huang, Tzu-hsun Feng, Po-Han Chi, Yist Y. Lin, Yung-Sung Chuang, Tzu-Hsien Huang, Wei-Cheng Tseng, Kushal Lakhotia, Abdelrahman Mohamed, Shang-Wen Li, Shinji Watanabe, and Hung-yi Lee. 2024. &#8220;A Large-Scale Evaluation of Speech Foundation Models&#8221;. TASLP. <a href=\"https:\/\/arxiv.org\/abs\/2404.09385\">[details here]<\/a><\/li>\n<li><strong>Jiatong Shi<\/strong>, Hirofumi Inaguma, Xutai Ma, Ilia Kulikov, and Anna Sun. 2024. &#8220;Multi-resolution HuBERT: Multi-resolution Speech Self-Supervised Learning with Masked Unit Prediction&#8221;. ICLR. 
<a href=\"https:\/\/openreview.net\/forum?id=kUuKFW7DIF\">[details here]<\/a><\/li>\n<li>Rongjie Huang*, Mingze Li*, Dongchao Yang*, <strong>Jiatong Shi<\/strong>*, Xuankai Chang, Zhenhui Ye, Yuning Wu, Zhiqing Hong, Jiawei Huang, Jinglin Liu, Yi Ren, Zhou Zhao, and Shinji Watanabe. 2024. &#8220;AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head&#8221;. AAAI. <a href=\"https:\/\/arxiv.org\/abs\/2304.12995\">[details here]<\/a><\/li>\n<li>Takashi Maekaku, <strong>Jiatong Shi<\/strong>, Xuankai Chang, Yuya Fujita, and Shinji Watanabe. 2024. &#8220;HuBERTopic: Enhancing semantic representation of HuBERT through self-supervision utilizing topic model&#8221;. ICASSP. <a href=\"https:\/\/arxiv.org\/abs\/2310.03975\">[details here]<\/a><\/li>\n<li>Chien-yu Huang, Ke-Han Lu, Shih-Heng Wang, Chi-Yuan Hsiao, Chun-Yi Kuan, Haibin Wu, Siddhant Arora, Kai-Wei Chang, <strong>Jiatong Shi<\/strong>, Yifan Peng, Roshan Sharma, Shinji Watanabe, Bhiksha Ramakrishnan, Shady Shehata, and Hung-yi Lee. 2024. &#8220;Dynamic-SUPERB: Towards A Dynamic, Collaborative, and Comprehensive Instruction-Tuning Benchmark for Speech&#8221;. ICASSP. <a href=\"https:\/\/arxiv.org\/abs\/2309.09510\">[details here]<\/a><\/li>\n<li>Xuankai Chang, Brian Yan, Kwanghee Choi, Jeeweon Jung, Yichen Lu, Soumi Maiti, Roshan Sharma, <strong>Jiatong Shi<\/strong>, Jinchuan Tian, Shinji Watanabe, Yuya Fujita, Takashi Maekaku, Pengcheng Guo, Yao-Fei Cheng, Pavel Denisov, Kohei Saijo, and Hsiu-Hsuan Wang. 2024. &#8220;Exploring Speech Recognition, Translation, and Understanding with Discrete Speech Units: A Comparative Study&#8221;. ICASSP. <a href=\"https:\/\/ieeexplore.ieee.org\/document\/10447929\">[details here]<\/a><\/li>\n<li>Yi-Hui Chou, Kalvin Chang, Meng-Ju Wu, Winston Ou, Alice Wen-Hsin Bi, Carol Yang, Bryan Y. Chen, Rong-Wei Pai, Po-Yen Yeh, Jo-Peng Chiang, Lu-Tshiann Phoann, Winnie Chang, Chenxuan Cui, Noel Chen, and <strong>Jiatong Shi<\/strong>. 
2023. &#8220;Evaluating Self-supervised Speech Models on a Taiwanese Hokkien Corpus&#8221;. ASRU. <a href=\"https:\/\/arxiv.org\/abs\/2312.06668\">[details here]<\/a><\/li>\n<li>Yifan Peng, Jinchuan Tian, Brian Yan, Dan Berrebbi, Xuankai Chang, Xinjian Li, <strong>Jiatong Shi<\/strong>, Siddhant Arora, William Chen, Roshan Sharma, Wangyou Zhang, Yui Sudo, Muhammad Shakeel, Jee-Weon Jung, Soumi Maiti, and Shinji Watanabe. 2023. &#8220;Reproducing Whisper-Style Pre-training Using an Open-Source Toolkit and Publicly Available Data&#8221;. ASRU. <a href=\"https:\/\/arxiv.org\/abs\/2309.13876\">[details here]<\/a><\/li>\n<li>William Chen, <strong>Jiatong Shi<\/strong>, Brian Yan, Dan Berrebbi, Wangyou Zhang, Yifan Peng, Xuankai Chang, Soumi Maiti, and Shinji Watanabe. 2023. &#8220;Joint Prediction and Denoising for Large-Scale Multilingual Self-Supervised Learning&#8221;. ASRU. <a href=\"https:\/\/arxiv.org\/abs\/2309.15317\">[details here]<\/a><\/li>\n<li><strong>Jiatong Shi<\/strong>, William Chen, Dan Berrebbi, Hsiu-Hsuan Wang, Wei-Ping Huang, En-Pei Hu, Ho-Lam Chuang, Xuankai Chang, Yuxun Tang, Shang-Wen Li, Abdelrahman Mohamed, Hung-yi Lee, and Shinji Watanabe. 2023. &#8220;Findings of the 2023 ML-SUPERB Challenge: Pre-Training and Evaluation over More Languages and Beyond&#8221;. ASRU. <a href=\"https:\/\/arxiv.org\/pdf\/2310.05513.pdf\">[details here]<\/a><\/li>\n<li>Wen-Chin Huang, Lester Phillip Violeta, Songxiang Liu, <strong>Jiatong Shi<\/strong>, and Tomoki Toda. 2023. &#8220;The Singing Voice Conversion Challenge 2023&#8221;. ASRU. <a href=\"https:\/\/arxiv.org\/abs\/2306.14422\">[details here]<\/a><\/li>\n<li>Yui Sudo, Shakeel Muhammad, Brian Yan, <strong>Jiatong Shi<\/strong>, and Shinji Watanabe. 2023. &#8220;4D: Joint Modeling of CTC, Attention, Transducer, and Mask-predict Decoders&#8221;. 
Interspeech. <a href=\"https:\/\/arxiv.org\/abs\/2212.10818\">[details here]<\/a><\/li>\n<li><strong>Jiatong Shi<\/strong>, Dan Berrebbi*, William Chen*, En-Pei Hu*, Wei-Ping Huang*, Ho Lam Chung*, Xuankai Chang, Shang-Wen (Daniel) Li, Abdelrahman Mohamed, Hung-yi Lee, and Shinji Watanabe. 2023. &#8220;ML-SUPERB: Multilingual Speech Universal PERformance Benchmark&#8221;. Interspeech. <a href=\"https:\/\/arxiv.org\/abs\/2305.10615\">[details here]<\/a><\/li>\n<li><strong>Jiatong Shi<\/strong>, Yun Tang, Hirofumi Inaguma, Hongyu Gong, Juan Pino, and Shinji Watanabe. 2023. &#8220;Exploration on HuBERT with Multiple Resolution&#8221;. Interspeech. <a href=\"https:\/\/arxiv.org\/pdf\/2306.01084.pdf\">[details here]<\/a><\/li>\n<li>Sweta Agrawal, Antonios Anastasopoulos, Luisa Bentivogli, Ond\u0159ej Bojar, Claudia Borg, Marine Carpuat, Roldano Cattoni, Mauro Cettolo, Mingda Chen, William Chen, Khalid Choukri, Alexandra Chronopoulou, Anna Currey, Thierry Declerck, Qianqian Dong, Kevin Duh, Yannick Est\u00e8ve, Marcello Federico, Souhir Gahbiche, Barry Haddow, Benjamin Hsu, Phu Mon Htut, Hirofumi Inaguma, D\u00e1vid Javorsk\u00fd, John Judge, Yasumasa Kano, Tom Ko, Rishu Kumar, Pengwei Li, Xutai Ma, Prashant Mathur, Evgeny Matusov, Paul McNamee, John P. McCrae, Kenton Murray, Maria Nadejde, Satoshi Nakamura, Matteo Negri, Ha Nguyen, Jan Niehues, Xing Niu, Atul Kr. Ojha, John E. Ortega, Proyag Pal, Juan Pino, Lonneke van der Plas, Peter Pol\u00e1k, Elijah Rippeth, Elizabeth Salesky, <strong>Jiatong Shi<\/strong>, Matthias Sperber, Sebastian St\u00fcker, Katsuhito Sudoh, Yun Tang, Brian Thompson, Kevin Tran, Marco Turchi, Alex Waibel, Mingxuan Wang, Shinji Watanabe, and Rodolfo Zevallos. 2023. &#8220;Findings of the IWSLT 2023 Evaluation Campaign&#8221;. IWSLT. 
<a href=\"https:\/\/aclanthology.org\/2023.iwslt-1.1\/\">[details here]<\/a><\/li>\n<li>Brian Yan*, <strong>Jiatong Shi<\/strong>*, Soumi Maiti, William Chen, Xinjian Li, Yifan Peng, Siddhant Arora, and Shinji Watanabe. 2023. &#8220;CMU\u2019s IWSLT 2023 Simultaneous Speech Translation System&#8221;. IWSLT. <a href=\"https:\/\/aclanthology.org\/2023.iwslt-1.20\/\">[details here]<\/a><\/li>\n<li>Brian Yan*, <strong>Jiatong Shi<\/strong>*, Yun Tang, Hirofumi Inaguma, Yifan Peng, Siddharth Dalmia, Peter Pol\u00e1k, Patrick Fernandes, Dan Berrebbi, Tomoki Hayashi, Xiaohui Zhang, Zhaoheng Ni, Moto Hira, Soumi Maiti, Juan Pino, and Shinji Watanabe. 2023. &#8220;ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit&#8221;. ACL demo. <a href=\"https:\/\/aclanthology.org\/2023.acl-demo.38\/\">[details here]<\/a><\/li>\n<li>Tao Qian, Fan Lou, <strong>Jiatong Shi<\/strong>, Yuning Wu, Shuai Guo, Xiang Yin, and Qin Jin. 2023. &#8220;UniLG: A Unified Structure-aware Framework for Lyrics Generation&#8221;. ACL. <a href=\"https:\/\/aclanthology.org\/2023.acl-long.56\/\">[details here]<\/a><\/li>\n<li>William Chen, Brian Yan, <strong>Jiatong Shi<\/strong>, Yifan Peng, Soumi Maiti, and Shinji Watanabe. 2023. &#8220;Improving Massively Multilingual ASR with Auxiliary CTC Objectives&#8221;. ICASSP. <a href=\"https:\/\/ieeexplore.ieee.org\/document\/10095326\">[details here]<\/a><\/li>\n<li><strong>Jiatong Shi<\/strong>, Chan-Jan Hsu, Holam Chung, Dongji Gao, Paola Garcia, Shinji Watanabe, Ann Lee, and Hung-yi Lee. 2023. &#8220;Bridging Speech and Text Pre-trained Models with Unsupervised ASR&#8221;. ICASSP. <a href=\"https:\/\/ieeexplore.ieee.org\/document\/10096827\">[details here]<\/a><\/li>\n<li><strong>Jiatong Shi<\/strong>, Yun Tang, Ann Lee, Hirofumi Inaguma, Changhan Wang, Juan Pino, and Shinji Watanabe. 2023. &#8220;Enhancing Speech-to-Speech Translation with Multiple TTS Targets&#8221;. ICASSP. 
<a href=\"https:\/\/ieeexplore.ieee.org\/document\/10095973\">[details here]<\/a><\/li>\n<li>Dongji Gao*, <strong>Jiatong Shi<\/strong>*, Shun-Po Chuang, Leibny Paola Garcia, Hung-yi Lee, Shinji Watanabe, and Sanjeev Khudanpur. 2023. &#8220;EURO: ESPnet Unsupervised ASR Open-source Toolkit&#8221;. ICASSP. <a href=\"https:\/\/ieeexplore.ieee.org\/abstract\/document\/10096977\">[details here]<\/a><\/li>\n<li>Yuning Wu, <strong>Jiatong Shi<\/strong>, Tao Qian, and Qin Jin. 2023. &#8220;PHONEix: Acoustic Feature Processing Strategy for Enhanced Singing Pronunciation with Phoneme Distribution Predictor&#8221;. ICASSP. <a href=\"https:\/\/arxiv.org\/abs\/2303.08607\">[details here]<\/a><\/li>\n<li>Tzu-hsun Feng, Annie Dong, Ching-Feng Yeh, Shu-wen Yang, Tzu-Quan Lin, <strong>Jiatong Shi<\/strong>, Kai-Wei Chang, Zili Huang, Haibin Wu, Xuankai Chang, Shinji Watanabe, Abdelrahman Mohamed, Shang-Wen Li, and Hung-yi Lee. 2022. &#8220;SUPERB @ SLT 2022: Challenge on Generalization and Efficiency of Self-Supervised Speech Representation Learning&#8221;. SLT. <a href=\"https:\/\/arxiv.org\/pdf\/2210.08634\">[details here]<\/a><\/li>\n<li>Yen Meng, Hsuan-Jui Chen, <strong>Jiatong Shi<\/strong>, Shinji Watanabe, Paola Garcia, Hung-yi Lee, and Hao Tang. 2022. &#8220;On Compressing Sequences for Self-Supervised Speech Models&#8221;. 
SLT. <a href=\"https:\/\/arxiv.org\/pdf\/2210.07189\">[details here]<\/a><\/li>\n<li>Antonios Anastasopoulos, Lo\u00efc Barrault, Luisa Bentivogli, Marcely Zanon Boito, Ond\u0159ej Bojar, Roldano Cattoni, Anna Currey, Georgiana Dinu, Kevin Duh, Maha Elbayad, Clara Emmanuel, Yannick Est\u00e8ve, Marcello Federico, Christian Federmann, Souhir Gahbiche, Hongyu Gong, Roman Grundkiewicz, Barry Haddow, Benjamin Hsu, D\u00e1vid Javorsk\u00fd, V\u0115ra Kloudov\u00e1, Surafel Lakew, Xutai Ma, Prashant Mathur, Paul McNamee, Kenton Murray, Maria N\u01cedejde, Satoshi Nakamura, Matteo Negri, Jan Niehues, Xing Niu, John Ortega, Juan Pino, Elizabeth Salesky, <strong>Jiatong Shi<\/strong>, Matthias Sperber, Sebastian St\u00fcker, Katsuhito Sudoh, Marco Turchi, Yogesh Virkar, Alexander Waibel, Changhan Wang, and Shinji Watanabe. 2022. &#8220;Findings of the IWSLT 2022 Evaluation Campaign&#8221;. IWSLT. <a href=\"https:\/\/aclanthology.org\/2022.iwslt-1.10\/\">[details here]<\/a><\/li>\n<li><strong>Jiatong Shi<\/strong>, George Saon, David Haws, Shinji Watanabe, and Brian Kingsbury. 2022. &#8220;VQ-T: RNN Transducers using Vector-Quantized Prediction Network States&#8221;. Interspeech. <a href=\"https:\/\/arxiv.org\/pdf\/2208.01818\">[details here]<\/a><\/li>\n<li><strong>Jiatong Shi<\/strong>, Shuai Guo, Tao Qian, Nan Huo, Tomoki Hayashi, Yuning Wu, Fangzheng Xu, Xuankai Chang, Huazhe Li, Peter Wu, Shinji Watanabe, and Qin Jin. 2022. &#8220;Muskits: an End-to-end Music Processing Toolkit for Singing Voice Synthesis&#8221;. Interspeech. <a href=\"https:\/\/arxiv.org\/pdf\/2205.04029\">[details here]<\/a><\/li>\n<li>Shuai Guo*, <strong>Jiatong Shi<\/strong>*, Tao Qian, Shinji Watanabe, and Qin Jin. 2022. &#8220;SingAug: Data Augmentation for Singing Voice Synthesis with Cycle-consistent Training Strategy&#8221;. Interspeech. <a href=\"https:\/\/arxiv.org\/pdf\/2203.17001\">[details here]<\/a><\/li>\n<li>Dan Berrebbi, <strong>Jiatong Shi<\/strong>, Brian Yan, Osbel L\u00f3pez-Francisco, Jonathan Amith, and Shinji Watanabe. 2022. &#8220;Combining Spectral and Self-Supervised Features for Low Resource Speech Recognition and Translation&#8221;. Interspeech. <a href=\"https:\/\/arxiv.org\/pdf\/2204.02470\">[details here]<\/a><\/li>\n<li>Keqi Deng, Shinji Watanabe, <strong>Jiatong Shi<\/strong>, and Siddhant Arora. 2022. &#8220;Blockwise Streaming Transformer for Spoken Language Understanding and Simultaneous Speech Translation&#8221;. Interspeech. <a href=\"https:\/\/arxiv.org\/pdf\/2204.08920\">[details here]<\/a><\/li>\n<li>Yui Sudo, Shakeel Muhammad, Kazuhiro Nakadai, <strong>Jiatong Shi<\/strong>, and Shinji Watanabe. 2022. &#8220;Streaming Automatic Speech Recognition with Re-blocking Processing Based on Integrated Voice Activity Detection&#8221;. Interspeech. <a href=\"https:\/\/www.researchgate.net\/profile\/Yui-Sudo\/publication\/363646348_Streaming_Automatic_Speech_Recognition_with_Re-blocking_Processing_Based_on_Integrated_Voice_Activity_Detection\/links\/632e2de06063772afd8935d8\/Streaming-Automatic-Speech-Recognition-with-Re-blocking-Processing-Based-on-Integrated-Voice-Activity-Detection.pdf\">[details here]<\/a><\/li>\n<li>Brian Yan, Patrick Fernandes, Siddharth Dalmia, <strong>Jiatong Shi<\/strong>, Yifan Peng, Dan Berrebbi, Xinyi Wang, Graham Neubig, and Shinji Watanabe. 2022. &#8220;CMU\u2019s IWSLT 2022 Dialect Speech Translation System&#8221;. IWSLT. <a href=\"https:\/\/aclanthology.org\/2022.iwslt-1.27\/\">[details here]<\/a><\/li>\n<li>Hsiang-Sheng Tsai, Heng-Jui Chang, Wen-Chin Huang, Zili Huang, Kushal Lakhotia, Shu-wen Yang, Shuyan Dong, Andy T. Liu, Cheng-I Lai, <strong>Jiatong Shi<\/strong>, Xuankai Chang, Phil Hall, Shang-Wen Li, Shinji Watanabe, Abdelrahman Mohamed, and Hung-yi Lee. 2022. &#8220;SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities&#8221;. ACL. <a href=\"https:\/\/aclanthology.org\/2022.acl-long.580.pdf\">[details here]<\/a><\/li>\n<li>Tao Qian, <strong>Jiatong Shi<\/strong>, Shuai Guo, Peter Wu, and Qin Jin. 2022. &#8220;Training strategies for automatic song writing: a perspective with a unified framework&#8221;. ICASSP. <a href=\"https:\/\/ieeexplore.ieee.org\/document\/9746818\">[details here]<\/a><\/li>\n<li>Chunlei Zhang, <strong>Jiatong Shi<\/strong>, Chao Weng, Meng Yu, and Dong Yu. 2022. &#8220;Towards End-to-end Speaker Diarization with Generalized Neural Speaker Clustering&#8221;. ICASSP. <a href=\"https:\/\/ieeexplore.ieee.org\/document\/9747301\">[details here]<\/a><\/li>\n<li><strong>Jiatong Shi<\/strong>, Chunlei Zhang, Chao Weng, Shinji Watanabe, Meng Yu, and Dong Yu. 2022. &#8220;An Investigation of Neural Uncertainty Estimation for Target Speaker Extraction Equipped RNN Transducer&#8221;. Computer Speech and Language (CSL). <a href=\"https:\/\/www.sciencedirect.com\/science\/article\/abs\/pii\/S0885230821001200\">[details here]<\/a><\/li>\n<li>Peter Wu, Paul Pu Liang, <strong>Jiatong Shi<\/strong>, Ruslan Salakhutdinov, Shinji Watanabe, and Louis-Philippe Morency. 2021. &#8220;Understanding the Tradeoffs in Client-side Privacy for Downstream Speech Tasks&#8221;. APSIPA. <a href=\"https:\/\/arxiv.org\/abs\/2101.08919\">[details here]<\/a><\/li>\n<li>Peter Wu, <strong>Jiatong Shi<\/strong>, Yifan Zhong, Shinji Watanabe, and Alan W Black. 2021. &#8220;Acoustic Cross-lingual Transfer using Language Similarity&#8221;. ASRU. <a href=\"https:\/\/arxiv.org\/abs\/2111.01326\">[details here]<\/a><\/li>\n<li>Hirofumi Inaguma, Brian Yan, Siddharth Dalmia, Pengcheng Guo, <strong>Jiatong Shi<\/strong>, Kevin Duh, and Shinji Watanabe. 2021. &#8220;ESPnet-ST IWSLT 2021 Offline Speech Translation System&#8221;. IWSLT. 
<a href=\"https:\/\/arxiv.org\/pdf\/2107.00636\">[details here]<\/a><\/li>\n<li>Shu-wen Yang, Po-Han Chi*, Yung-Sung Chuang*, Cheng-I Jeff Lai*, Kushal Lakhotia*, Yist Y Lin*, Andy T Liu*, <strong>Jiatong Shi<\/strong>*, Xuankai Chang, Guan-Ting Lin, Tzu-Hsien Huang, Wei-Cheng Tseng, Ko-tik Lee, Da-Rong Liu, Zili Huang, Shuyan Dong, Shang-Wen Li, Shinji Watanabe, Abdelrahman Mohamed, and Hung-yi Lee. 2021. &#8220;SUPERB: Speech processing Universal PERformance Benchmark&#8221;. Interspeech. <a href=\"https:\/\/arxiv.org\/pdf\/2105.01051\">[details here]<\/a><\/li>\n<li><strong>Jiatong Shi<\/strong>, Jonathan D. Amith, Xuankai Chang, Siddharth Dalmia, Brian Yan, and Shinji Watanabe. 2021. &#8220;Highland Puebla Nahuatl&#8211;Spanish Speech Translation Corpus for Endangered Language Documentation&#8221;. AmericasNLP. <a href=\"https:\/\/aclanthology.org\/2021.americasnlp-1.7.pdf\">[details here]<\/a><\/li>\n<li>Jonathan D. Amith, <strong>Jiatong Shi<\/strong>, and Rey Castillo Garc\u00eda. 2021. &#8220;End-to-End Automatic Speech Recognition: Its Impact on the Workflow for Documenting Yolox\u00f3chitl Mixtec&#8221;. AmericasNLP. <a href=\"https:\/\/aclanthology.org\/2021.americasnlp-1.8.pdf\">[details here]<\/a><\/li>\n<li><strong>Jiatong Shi<\/strong>, Chunlei Zhang, Chao Weng, Shinji Watanabe, Meng Yu, and Dong Yu. 2021. &#8220;Improving RNN Transducer with Target Speaker Extraction and Neural Uncertainty Estimation&#8221;. ICASSP. <a href=\"https:\/\/arxiv.org\/pdf\/2011.13393.pdf\">[details here]<\/a><\/li>\n<li><strong>Jiatong Shi<\/strong>*, Shuai Guo*, Nan Huo, Yuekai Zhang, and Qin Jin. 2021. &#8220;Sequence-to-sequence Singing Voice Synthesis with Perceptual Entropy Loss&#8221;. ICASSP. 
<a href=\"https:\/\/arxiv.org\/pdf\/2010.12024.pdf\">[details here]<\/a><\/li>\n<li>Pengcheng Guo, Florian Boyer, Xuankai Chang, Tomoki Hayashi, Yosuke Higuchi, Hirofumi Inaguma, Naoyuki Kamo, Chenda Li, Daniel Garcia-Romero, <strong>Jiatong Shi<\/strong>, Jing Shi, Shinji Watanabe, Kun Wei, Wangyou Zhang, and Yuekai Zhang. 2021. &#8220;Recent Developments on ESPnet Toolkit Boosted by Conformer&#8221;. ICASSP. <a href=\"https:\/\/arxiv.org\/pdf\/2010.13956.pdf\">[details here]<\/a><\/li>\n<li><strong>Jiatong Shi<\/strong>, Jonathan D. Amith, Rey Castillo Garc\u00eda, Esteban Guadalupe Sierra, Kevin Duh, and Shinji Watanabe. 2021. &#8220;Leveraging End-to-End ASR for Endangered Language Documentation: An Empirical Study on Yol\u00f3xochitl Mixtec&#8221;. EACL. <a href=\"http:\/\/shijt.site\/wp-content\/uploads\/2020\/12\/2021_eacl_EndangeredLanguageDocumentation_NoviceTranscriptionCorrection.pdf\">[details here]<\/a><\/li>\n<li><strong>Jiatong Shi<\/strong>, Kunlin Yang, Wei Xu, and Mingming Wang. 2021. &#8220;Leveraging deep learning with audio analytics to predict the success of crowdfunding projects&#8221;. <i>Journal of Supercomputing<\/i>. <a href=\"https:\/\/link.springer.com\/article\/10.1007\/s11227-020-03595-2\">[details here]<\/a><\/li>\n<li><strong>Jiatong Shi<\/strong>, Nan Huo, and Qin Jin. 2020. &#8220;Context-aware Goodness of Pronunciation for Computer Assisted Pronunciation Training&#8221;. Interspeech. <a href=\"https:\/\/www.isca-speech.org\/archive\/Interspeech_2020\/abstracts\/2953.html\">[details here]<\/a><\/li>\n<li>Wenxin Hou, Yue Dong, Bairong Zhuang, Longfei Yang, <strong>Jiatong Shi<\/strong>, and Takahiro Shinozaki. 2020. 
&#8220;Large-Scale End-to-End Multilingual Speech Recognition and Language Identification with Multi-Task Learning&#8221;. <i>Interspeech<\/i>. <a href=\"https:\/\/www.isca-speech.org\/archive\/Interspeech_2020\/abstracts\/2164.html\">[details here]<\/a><\/li>\n<li><strong>Jiatong Shi<\/strong>, Wei Du, and Wei Xu. 2018. &#8220;Identifying Impact Factors of Question Quality in Online Health Q&amp;A Communities: An Empirical Analysis on MedHelp&#8221;. <i>PACIS<\/i>. <a href=\"https:\/\/aisel.aisnet.org\/pacis2018\/173\/\">[details here]<\/a><\/li>\n<li><strong>Jiatong Shi<\/strong>. 2019. &#8220;Computer Assisted Language Learning System for Young English Learner&#8221;. Undergraduate Thesis, Renmin University of China.<\/li>\n<\/ol>\n<p><strong>Awards<\/strong><\/p>\n<table>\n<tbody>\n<tr>\n<td width=\"588\">CMU Presidential Fellowship<\/td>\n<td width=\"122\">2022<\/td>\n<\/tr>\n<tr>\n<td width=\"588\">PhD Fellowship at LTI, CMU<\/td>\n<td width=\"122\">2021<\/td>\n<\/tr>\n<tr>\n<td width=\"588\">Special Award in \u2018National University Data-driven Innovation &amp; Research Competition\u2019 (1\/594)<\/td>\n<td width=\"122\">2018<\/td>\n<\/tr>\n<tr>\n<td width=\"588\">National Level in \u2018Training Programs of Innovation and Entrepreneurship for Undergraduates\u2019<\/td>\n<td width=\"122\">2017<\/td>\n<\/tr>\n<tr>\n<td width=\"588\">Scholarship of Academic Excellence (Top 20%)<\/td>\n<td width=\"122\">2016 &amp; 2017 &amp; 2018<\/td>\n<\/tr>\n<tr>\n<td width=\"588\">Golden Prize in Beijing Art Festival of Undergraduates (Accordion Contest)<\/td>\n<td width=\"122\">2016<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n","protected":false},"excerpt":{"rendered":"<p>Publication (* for Equal Contribution) Jiatong Shi, Hye-jin Shim, Jinchuan Tian, Siddhant Arora, Haibin Wu, Darius Petermann, Jia Qi Yip, You Zhang, Yuxun Tang, Wangyou Zhang, Dareen Safar Alharthi, Yichen Huang, Koichi Saito, Jionghao Han, Yiwen Zhao, Chris Donahue, and Shinji Watanabe. 2025. 
&#8220;VERSA: A Versatile Evaluation Toolkit for Speech, Audio, and Music&#8221;. NAACL Demo. &hellip; <\/p>\n<p class=\"link-more\"><a href=\"http:\/\/shijt.site\/index.php\/publications\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Publications &#038; Awards&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":84,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":[],"_links":{"self":[{"href":"http:\/\/shijt.site\/index.php\/wp-json\/wp\/v2\/pages\/17"}],"collection":[{"href":"http:\/\/shijt.site\/index.php\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"http:\/\/shijt.site\/index.php\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"http:\/\/shijt.site\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/shijt.site\/index.php\/wp-json\/wp\/v2\/comments?post=17"}],"version-history":[{"count":46,"href":"http:\/\/shijt.site\/index.php\/wp-json\/wp\/v2\/pages\/17\/revisions"}],"predecessor-version":[{"id":426,"href":"http:\/\/shijt.site\/index.php\/wp-json\/wp\/v2\/pages\/17\/revisions\/426"}],"wp:featuredmedia":[{"embeddable":true,"href":"http:\/\/shijt.site\/index.php\/wp-json\/wp\/v2\/media\/84"}],"wp:attachment":[{"href":"http:\/\/shijt.site\/index.php\/wp-json\/wp\/v2\/media?parent=17"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}