I held a tutorial session at the JSALT2022 summer school with Leo Yang. During the workshop, Leo introduced S3PRL and I introduced ESPnet along with several of our recent updates. I am glad to know that many students plan to use ESPnet after the tutorial~
It has been a while since my PhD application. Though a personal statement (PS) carries only a small weight in the overall application, it may be the most time-consuming part. I received tons of help when writing the PS. Most of all, I want to thank Jonathan D. Amith, who is also a co-author on some of my papers. He weighed every word carefully and helped me through several rounds of revision. Many other people also kindly gave suggestions on the PS, including Shinji, Chao, Chunlei, and Xuankai. Meanwhile, I also learned a lot from people who shared their PS on the Internet. Therefore, I finally decided to post my PS on the website as well, in case it helps people who would like to know my story. The current version is NOT the final version, but it is very close, because in the last few days we decided to use Word for revision instead of LaTeX (which was easier for my “reviewers”).
Keyi (Kiki) Zhang, 张钶浥, is a talented Chinese female singer, composer, and lyricist. She has published around 30 songs in a variety of styles. The KiSing corpus, named after Kiki, mainly consists of some of her published songs. The songs with accompaniment can be found on both QQ Music and NetEase Cloud Music. Feel free to check them out!
Terms of Use
All the data in the corpus is licensed under Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0).
Jiatong Shi, The Johns Hopkins University, firstname.lastname@example.org
Keyi (Kiki) Zhang, the singer, composer, and lyricist
Zhaodong Yao, the writer for the music score (i.e., MIDI) annotation
The corresponding recipe for training a singing voice synthesis system will be released soon in Muskits.
The project is supported by the AIM3 lab at Renmin University of China. We would also like to thank Bingrong Shi and Yunhong Wei for correcting the phonetic alignment.
After so much effort on discussion and design, the first published homework is out now. It carries a heavy workload, but it is really worth it, since the Kaldi structure is really awesome! Hopefully, the new SLP students will benefit a lot from the homework.
The homework explores the basic functions of Kaldi. Although speech recognition nowadays usually relies on large datasets, we tried our best to find a tutorial with a small dataset that can also be handled on a laptop.
Here are samples from our self-made music generation system.
The system takes English sentences longer than 10 words as input. It analyzes their sentiment and maps it to one of 16 sentiment classes, then generates a piece of music for that class. Here are some samples from the system.
Input: “Let life be beautiful like summer flowers And Death like autumn leaves.”
Output: (the sample is generated based on Butterfly Lovers, with a variation in the second part)
Input: “If by life you were deceived, Don’t be dismal, don’t be wild! In the day of grief, be mild. Merry days will come, believe. Heart is living in tomorrow; Present is dejected here; In a moment, passes sorrow; That which passes will be dear.”
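For readers curious about the first two steps described above (checking the input length and mapping sentiment to one of 16 classes), here is a rough sketch. It is not the code of our system. The 4 x 4 valence/arousal grid, the function names, and the value ranges are all assumptions made for illustration; the post itself only states that there are 16 classes and that inputs must be longer than 10 words.

```python
from typing import List


def validate_input(sentence: str, min_words: int = 10) -> List[str]:
    """Split a sentence on whitespace and require more than `min_words` words."""
    words = sentence.split()
    if len(words) <= min_words:
        raise ValueError(
            f"expected more than {min_words} words, got {len(words)}"
        )
    return words


def sentiment_class(valence: float, arousal: float, bins: int = 4) -> int:
    """Quantize a (valence, arousal) pair into one of bins * bins classes.

    ASSUMPTION: sentiment is given as two scores in [-1, 1], and the
    16 classes form a 4 x 4 grid. The real class definition may differ.
    """

    def bucket(x: float) -> int:
        x = max(-1.0, min(1.0, x))  # clamp out-of-range scores
        # Map [-1, 1] to bucket indices 0 .. bins-1; x == 1.0 goes to the last one.
        return min(int((x + 1.0) / 2.0 * bins), bins - 1)

    return bucket(valence) * bins + bucket(arousal)


if __name__ == "__main__":
    text = "Let life be beautiful like summer flowers And Death like autumn leaves."
    print(len(validate_input(text)))
    # Hypothetical scores; a real sentiment analyzer would produce these.
    print(sentiment_class(0.3, -0.6))
```

The class index would then select which musical material to generate from; that part is not sketched here.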
Six months ago, I joined the AI group of the Youdao Business Department at NetEase as a Machine Learning intern. It’s time to say goodbye now.
During this period, I mainly worked on a Computer-Assisted Language Learning (CALL) project. Several improvements were achieved, not only for the company but also for myself.
To be specific about my progress, I tested two possible new models to improve on the original model (though they proved unsuccessful…). In addition, I designed a scoring model that scales the raw output of the acoustic model so that the scores follow a fine distribution. Moreover, I implemented stress detection and intonation detection algorithms in the system. In the last month, a severe alignment problem was detected. To address it, I applied an HCLG graph and reached a better result.
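To give a flavor of what scaling raw acoustic-model output can look like, here is a minimal sketch. It is not the scoring model I built at Youdao. It assumes the raw scores are real numbers (for example, log-likelihood-based pronunciation scores), standardizes them against a reference set, and squashes them into a 0 to 100 range with a logistic function. The function names and the 0 to 100 range are hypothetical.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def fit_scaler(reference_scores: Sequence[float]) -> Tuple[float, float]:
    """Estimate mean and standard deviation from reference raw scores."""
    mu = mean(reference_scores)
    sigma = stdev(reference_scores)
    if sigma == 0:
        raise ValueError("reference scores must not all be identical")
    return mu, sigma


def scale_score(raw: float, mu: float, sigma: float, max_score: float = 100.0) -> float:
    """Map a raw score to (0, max_score) with a logistic curve.

    ASSUMPTION: a higher raw score means better pronunciation. A raw score
    equal to the reference mean maps to the midpoint of the range.
    """
    z = (raw - mu) / sigma
    return max_score / (1.0 + math.exp(-z))


if __name__ == "__main__":
    # Made-up raw scores, for illustration only.
    mu, sigma = fit_scaler([-4.0, -2.0, 0.0])
    for raw in (-6.0, -2.0, 1.0):
        print(raw, round(scale_score(raw, mu, sigma), 1))
```

The logistic curve keeps the mapping monotonic while compressing extreme raw values, which is one simple way to get user-facing scores that spread out sensibly.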
It has been a tough time for me, learning so much about ASR, CALL, and signal processing. Besides, the internship gave me a clearer picture of medium-size system design, beyond what school assignments offer. Many thanks to Yixuan Xiao, my mentor during the internship, and thanks a lot to Youdao~
We are currently working on an eye-tracking project focusing on a sight-reading solution for the accordion. With the help of a Tobii eye tracker, fixations and eye movements can be measured accurately. Based on several eye movement features, we will further extract sight-reading patterns from our data and offer suggestions for current teaching theory. Many thanks to Beijing Children’s Palace for its abundant data support.